
DWX2021: Open-Source Intelligence (OSINT) with Big Data on Azure - Can It Be Done?

In this session we present a research project that we carried out for a customer. The goal was to find out whether open-source intelligence, big data, and cloud technologies can be combined into a system that analyzes data from social media channels in real time. The technologies used center on Azure Databricks, Data Lake, C#, and graph databases. We share our experiences, lessons learned, and results.

Manuel Meyer

June 29, 2021


  1. www.manuelmeyer.net/ www.stefanko.ch @manumeyer1 @koecse OSINT with Big Data on Azure - Can It Be Done? Developer Week 2021 Manuel Meyer, Stefan Koch
  2. Manuel Meyer helps customers: • to kick-start the Azure journey. • to architect, implement and optimize their Azure Solutions www.manuelmeyer.net www.azurezurichusergroup.com @manumeyer1
  3. Stefan Koch • Earns his bread and butter at Trivadis as a BI Consultant • Can move silently through the cloud • As dexterous with the gun as with the keyboard stefan.koch@trivadis.com @koecse
  4. Agenda ▪ OSINT? ▪ Project Morpheus ▪ OSINT in Azure ▪ OSINT Tools & Tutorials
  5. None
  6. None
  7. None
  8. https://www.youtube.com/watch?v=7C20JmCt_3Q

  9. 2016 "celebrities who feared their phone conversations were being hacked" • Dual-Boot • End-to-end Encryption • Instant Messaging • Calls • Kill Code 2’000$ for 6 Months
  10. 60’000 users Gendarmerie Nationale: «90% of subscribers are criminals» British National Crime Agency: «No evidence of non-criminals using it» «The industry standard of organized crime» 2020
  11. 2017 Gendarmerie discovers first devices and starts the investigation 2019 EU Funding Infiltration Distribution in the EU… https://en.wikipedia.org/wiki/EncroChat
  12. None
  13. 746 arrests • 8 tonnes of cocaine • 1.2 tonnes of crystal meth • 19 drug labs • 100 weapons • 55 luxury cars • EUR 63 million in cash • 1 torture chamber. The end of EncroChat!
  14. None
  15. None
  16. None
  17. Public Sources • Most wanted • Social Media Profiles • Twitter • …
  18. OSINT?

  19. None
  20. «… is a multi-factor methodology for collecting, analyzing and making decisions about publicly available data sources to be used in an intelligence context» OSINT
  21. «… is a multi-factor methodology for collecting, analyzing and making decisions about publicly available data sources to be used in an intelligence context» Intelligence
  22. «Information, especially secret information gathered about an actual or potential enemy or adversary» Intelligence
  23. None
  24. The need for new technology

  25. None
  26. Initial Project: Project Morpheus

  27. None
  28. The setup

  29. A Data Platform in a computer

  30. Index search with Elasticsearch

  31. Graph analysis with Neo4j

  32. Apache Spark

  33. None
  34. we have felt the answer

  35. Morpheus on steroids (in the Azure Cloud)

  36. Twitter https://developer.twitter.com/en

  37. Streamsets https://streamsets.com/

  38. Demo Streamsets

  39. Azure Event Hub https://azure.microsoft.com/en-us/services/event-hubs/

  40. Azure Data Lake https://azure.microsoft.com/en-us/solutions/data-lake/

  41. Azure Data Lake https://azure.microsoft.com/en-us/solutions/data-lake/ LEGACY ADLS v2 (Storage Account with hierarchical file system)
  42. Azure Data Lake https://azure.microsoft.com/en-us/solutions/data-lake/ ADLS v2 (Storage Account with hierarchical file system)
  43. Azure Databricks https://databricks.com/

  44. Azure Databricks

  45. Demo Databricks

  46. How to get the Data out of the Lake? https://azure.microsoft.com/de-de/services/synapse-analytics/

  47. Azure Synapse Analytics https://azure.microsoft.com/de-de/services/synapse-analytics/

  48. Azure Synapse Analytics https://www.jamesserra.com/archive/2020/08/sql-on-demand-in-azure-synapse-analytics/

  49. The needle in the haystack?

  50. ElasticSearch https://www.elastic.io/

  51. Demo ElasticSearch

  52. Azure Cognitive Services https://azure.microsoft.com/de-de/services/cognitive-services/

  53. Sentiment Analysis

  54. Global sources mean many languages

  55. Translator

  56. OSINT Tools Social Analyzer https://github.com/qeeqbox/social-analyzer

  57. OSINT Tools Social Analyzer https://github.com/qeeqbox/social-analyzer

  58. Social Analyzer

  59. OSINT Tutorial https://www.ehacking.net/2020/05/the-complete-osint-tutorial-to-find-personal-information-about-anyone.html

  60. Bellingcat.com https://www.bellingcat.com/

  61. Bellingcat.com https://www.bellingcat.com/

  62. Bellingcat.com https://www.bellingcat.com/

  63. Bellingcat.com https://www.bellingcat.com/

  64. Recap

  65. Lessons Learned ▪ You pay for what you use. And sometimes we used a lot… ▪ Not all police investigators are computer geeks. ▪ Getting data from social media is difficult: restricted APIs, Cambridge Analytica & Facebook, etc.
  66. Thank you! Manuel Meyer www.manuelmeyer.net @manumeyer1 manuel.meyer@trivadis.com Stefan Koch www.stefanko.ch @koecse stefan.koch@trivadis.com
  67. None