DWX2021: Open-Source Intelligence (OSINT) with Big Data on Azure: Can It Be Done?

In this session we present a research project that we carried out for a customer. The goal was to find out whether open-source intelligence, big data, and cloud technologies can be combined into a system that analyzes data from social media channels in real time. The technologies involved center on Azure Databricks, Data Lake, C#, and graph databases. We share our experiences, lessons learned, and results.
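
To make the idea of evaluating social media data with a graph model more concrete, here is a minimal Python sketch. It is not taken from the project: the record layout (`user`, `text`), the regular expression, and the sample posts are assumptions for illustration only. It counts who mentions whom, which is one simple way to turn a stream of posts into weighted graph edges.

```python
import re
from collections import Counter

# Simplified handle pattern (assumption): "@" followed by word characters.
# It would also match the domain part of an e-mail address.
MENTION = re.compile(r"@(\w+)")

def mention_edges(tweets):
    """Count (author, mentioned_account) pairs across tweet-like dicts."""
    edges = Counter()
    for tweet in tweets:
        author = tweet["user"].lower()
        for name in MENTION.findall(tweet["text"]):
            target = name.lower()
            if target != author:  # ignore self-mentions
                edges[(author, target)] += 1
    return edges

# Hypothetical sample data, not real posts.
sample = [
    {"user": "alice", "text": "Meeting @bob and @carol tomorrow"},
    {"user": "bob", "text": "@alice sounds good"},
    {"user": "alice", "text": "@bob see you"},
]
edges = mention_edges(sample)
```

Each key in `edges` is a directed edge and each value is its weight, so the result could be loaded into a graph database in a later step.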

Manuel Meyer

June 29, 2021

Transcript

  1. www.manuelmeyer.net/
    www.stefanko.ch
    @manumeyer1
    @koecse
    OSINT with Big Data on Azure: Can It Be Done?
    Developer Week 2021
    Manuel Meyer, Stefan Koch

  2. Manuel Meyer
    helps customers:
    • to kick-start the Azure journey.
    • to architect, implement and optimize their
    Azure Solutions
    www.manuelmeyer.net
    www.azurezurichusergroup.com
    @manumeyer1

  3. Stefan Koch
    • Earns his bread and butter at Trivadis as a BI
    Consultant
    • Can move silently through the cloud
    • As dexterous with the gun as with the
    keyboard
    @koecse

  4. Agenda
    ▪ OSINT?
    ▪ Project Morpheus
    ▪ OSINT in Azure
    ▪ OSINT Tools & Tutorials.

  8. https://www.youtube.com/watch?v=7C20JmCt_3Q

  9. 2016
    "celebrities who feared their
    phone conversations were
    being hacked"
    • Dual-Boot
    • End-to-end Encryption
    • Instant Messaging
    • Calls
    • Kill Code
    $2,000 for 6 months

  10. 60’000 users
    Gendarmerie Nationale:
    «90% of subscribers are criminals»
    British National Crime Agency:
    «No evidence of non-criminals using it»
    «The industry standard of organized crime»
    2020

  11. 2017 Gendarmerie discovers the first devices
    and starts the investigation
    2019 EU Funding
    Infiltration
    Distribution in the EU…
    https://en.wikipedia.org/wiki/EncroChat

  13. 746 arrests
    8 tonnes of cocaine
    1.2 tonnes of crystal meth
    19 drug labs
    100 weapons
    55 luxury cars
    EUR 63 million in cash
    1 torture chamber.
    The end of EncroChat!

  17. Public Sources
    • Most wanted
    • Social Media Profiles
    • Twitter
    • …

  18. OSINT?

  20. «… is a multi-factor methodology for
    collecting, analyzing and making decisions
    about publicly available data sources to be
    used in an intelligence context»
    OSINT

  21. «… is a multi-factor methodology for
    collecting, analyzing and making decisions
    about publicly available data sources to be
    used in an intelligence context»
    Intelligence

  22. «Information, especially secret information
    gathered about an actual or potential
    enemy or adversary»
    Intelligence

  24. The need for new technology

  26. Initial Project
    Project Morpheus

  28. The setup

  29. A data platform in a computer

  30. Index search with Elasticsearch

  31. Graph analysis with Neo4j
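
As a rough illustration of how relationship data could be prepared for Neo4j, the Python sketch below turns weighted edges into parameterized Cypher `MERGE` statements. The `Account` label, the `MENTIONS` relationship type, the `weight` property, and the sample edges are hypothetical and not taken from the slides.

```python
# Cypher template (assumed schema): MERGE creates nodes and the relationship
# only if they do not exist yet, so re-running the import stays idempotent.
MERGE_MENTION = (
    "MERGE (a:Account {handle: $src}) "
    "MERGE (b:Account {handle: $dst}) "
    "MERGE (a)-[r:MENTIONS]->(b) "
    "SET r.weight = $weight"
)

def to_cypher(edges):
    """Return (query, parameters) pairs for a dict of {(src, dst): weight}."""
    return [
        (MERGE_MENTION, {"src": src, "dst": dst, "weight": weight})
        for (src, dst), weight in sorted(edges.items())
    ]

# Hypothetical sample edges.
edges = {("alice", "bob"): 2, ("bob", "alice"): 1}
statements = to_cypher(edges)
```

With the official Neo4j Python driver, each pair could then be passed to `session.run(query, parameters)`. Using parameters instead of string concatenation avoids injecting untrusted handles into the query text.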

  32. Apache Spark

  34. we have felt the answer

  35. Morpheus on steroids
    (in the Azure Cloud)

  36. Twitter
    https://developer.twitter.com/en

  37. Streamsets
    https://streamsets.com/

  38. Demo Streamsets

  39. Azure Event Hub
    https://azure.microsoft.com/en-us/services/event-hubs/

  40. Azure Data Lake
    https://azure.microsoft.com/en-us/solutions/data-lake/

  41. Azure Data Lake
    https://azure.microsoft.com/en-us/solutions/data-lake/
    LEGACY ADLS v2
    (Storage Account with
    hierarchical file System)

  42. Azure Data Lake
    https://azure.microsoft.com/en-us/solutions/data-lake/
    ADLS v2
    (Storage Account with hierarchical file System)
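
Because ADLS v2 offers a hierarchical file system, incoming data is often laid out in date-partitioned folders. The Python sketch below shows one such layout. The container, account, and folder names are made up for illustration, and the `year=/month=/day=/hour=` convention is a common choice rather than something stated on the slides.

```python
from datetime import datetime, timezone

def lake_path(container, account, source, ts):
    """Build an abfss:// folder URI partitioned by UTC date and hour.

    Expects a timezone-aware datetime; naive values would be interpreted
    as local time by astimezone().
    """
    ts = ts.astimezone(timezone.utc)
    return (
        f"abfss://{container}@{account}.dfs.core.windows.net/"
        f"raw/{source}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/hour={ts:%H}"
    )

# Hypothetical names; replace with your own container and storage account.
path = lake_path(
    "osint", "examplelake", "twitter",
    datetime(2021, 6, 29, 14, 30, tzinfo=timezone.utc),
)
```

Partitioned folders like these let a Spark job read only the time range it needs instead of scanning the whole lake.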

  43. Azure Databricks
    https://databricks.com/

  44. Azure Databricks

  45. Demo Databricks

  46. How to get the Data out of the Lake?
    https://azure.microsoft.com/de-de/services/synapse-analytics/

  47. Azure Synapse Analytics
    https://azure.microsoft.com/de-de/services/synapse-analytics/

  48. Azure Synapse Analytics
    https://www.jamesserra.com/archive/2020/08/sql-on-demand-in-azure-synapse-analytics/

  49. The needle in the haystack?

  50. ElasticSearch
    https://www.elastic.co/

  51. Demo ElasticSearch
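
For readers who want to try indexing documents themselves, the Python sketch below builds a request body for the Elasticsearch `_bulk` API, which expects newline-delimited JSON: one action line followed by one document line, with a trailing newline at the end. The index name `tweets` and the document fields are assumptions for illustration, not the schema used in the demo.

```python
import json

def bulk_body(index, docs):
    """Serialize documents as NDJSON for a POST to /_bulk."""
    lines = []
    for doc in docs:
        # Action line: index the following document under its own id.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        # Source line: the document itself, kept on a single line.
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical documents.
docs = [
    {"id": "1", "user": "alice", "text": "Meeting tomorrow"},
    {"id": "2", "user": "bob", "text": "Sounds good"},
]
body = bulk_body("tweets", docs)
```

The resulting string would be sent with the `Content-Type: application/x-ndjson` header.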

  52. Azure Cognitive Services
    https://azure.microsoft.com/de-de/services/cognitive-services/

  53. Sentiment Analysis

  54. A global source means many languages

  55. Translator

  56. OSINT Tools
    Social Analyzer
    https://github.com/qeeqbox/social-analyzer

  57. OSINT Tools
    Social Analyzer
    https://github.com/qeeqbox/social-analyzer

  58. Social Analyzer

  59. OSINT Tutorial
    https://www.ehacking.net/2020/05/the-complete-osint-tutorial-to-find-personal-information-about-anyone.html

  60. Bellingcat.com
    https://www.bellingcat.com/

  61. Bellingcat.com
    https://www.bellingcat.com/

  62. Bellingcat.com
    https://www.bellingcat.com/

  63. Bellingcat.com
    https://www.bellingcat.com/

  64. Recap

  65. Lessons Learned
    ▪ You pay for what you use. And sometimes we
    used a lot…
    ▪ Not all police investigators are computer
    geeks.
    ▪ Getting data from social media is difficult:
    restricted APIs, Cambridge Analytica &
    Facebook, etc.

  66. Thank you!
    Manuel Meyer
    www.manuelmeyer.net
    @manumeyer1
    Stefan Koch
    www.stefanko.ch
    @koecse
