Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Building Security Data Lake
Search
Richard Fan
December 13, 2023
Technology
0
16
Building Security Data Lake
vBrownBag podcast
Building Security Data Lake
https://youtu.be/6qQ7_asdI4I?si=CSsn0jz2vo00Y02Q
Richard Fan
December 13, 2023
Tweet
Share
More Decks by Richard Fan
See All by Richard Fan
JAWS Pankration 2024 - Achieve software supply chain security using AWS Nitro Enclaves and GitHub Actions
richardfan1126
0
44
Preserving privacy on data collaboration with AWS Clean Rooms
richardfan1126
0
43
Achieve software supply chain security using AWS Nitro Enclaves and GitHub Actions
richardfan1126
0
130
When Data Collaboration Meets Privacy: Privacy-enhancing Technologies on AWS
richardfan1126
0
55
AWS Security Hub Central Configuration - An Easy way to monitor your Organization security posture
richardfan1126
0
64
Create your first AWS Nitro Enclaves application
richardfan1126
0
55
Other Decks in Technology
See All in Technology
セマンティックHTMLによる アクセシビリティ品質向上の基礎
zozotech
PRO
0
120
仕様は“書く”より“語る” - 分断を超えたチーム開発の実践 / 20251115 Naoki Takahashi
shift_evolve
PRO
1
1k
LINEスキマニ/LINEバイトにおけるバックエンド開発
lycorptech_jp
PRO
0
310
現地速報!Microsoft Ignite 2025 M365 Copilotアップデートレポート
kasada
1
1.2k
Dart and Flutter MCP serverで実現する AI駆動E2Eテスト整備と自動操作
yukisakai1225
0
570
大規模プロダクトで実践するAI活用の仕組みづくり
k1tikurisu
4
1.6k
Perlの生きのこり - YAPC::Fukuoka 2025
kfly8
0
120
なぜThrottleではなくDebounceだったのか? 700並列リクエストと戦うサーバーサイド実装のすべて
yoshiori
13
4.7k
身近なCSVを活用する!AWSのデータ分析基盤アーキテクチャ
koosun
0
1.8k
大規模モノレポの秩序管理 失速しない多言語化フロントエンドの運用 / JSConf JP 2025
shoota
0
240
FFMとJVMの実装から学ぶJavaのインテグリティ
kazumura
0
130
AI時代の戦略的アーキテクチャ 〜Adaptable AI をアーキテクチャで実現する〜 / Enabling Adaptable AI Through Strategic Architecture
bitkey
PRO
7
1.8k
Featured
See All Featured
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
Building an army of robots
kneath
306
46k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
31
2.9k
Designing for Performance
lara
610
69k
Building a Modern Day E-commerce SEO Strategy
aleyda
45
8.1k
CoffeeScript is Beautiful & I Never Want to Write Plain JavaScript Again
sstephenson
162
15k
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
35
3.2k
The Art of Programming - Codeland 2020
erikaheidi
56
14k
Mobile First: as difficult as doing things right
swwweet
225
10k
Put a Button on it: Removing Barriers to Going Fast.
kastner
60
4.1k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
RailsConf & Balkan Ruby 2019: The Past, Present, and Future of Rails at GitHub
eileencodes
140
34k
Transcript
Building Security Data Lake Richard Fan March 29, 2023
EXPRESSVPN Richard Fan Security Engineer from ExpressVPN A Builder and
Tech advocate AWS Community Builder • https://dev.to/richardfan1126 • https://medium.com/@richardfan1126 • https://github.com/richardfan1126 Who am I?
Security Data
EXPRESSVPN What is security data? Security Data
EXPRESSVPN Why do we need Security logs? • Detect threat
• Incident response • Compliance • Vulnerability management Security Data
EXPRESSVPN Security Data Storing (or Not storing) locally • No
correlation • Difficult to track • Time consuming during incident response Send to SIEM • Centralized • Analytics / threat detection • Strong query capability Capturing security logs
EXPRESSVPN The growing amount/complexity of security logs • Shift-left •
Adoption of cloud • 2 common approaches ◦ Drop less-important events ◦ Scale-up SIEM and send all events to it Security Data
EXPRESSVPN SIEM is not an ultimate solution • Too expensive
• Short retention period • Difficult to integrate with other data processor Security Data
Data Lake
EXPRESSVPN Data Lake comes in Data Lake • Store data
in large scale • Centralize data repository • Turn raw data into useful data • NOT a data archive • NOT a database (Security) Data Lake Security Data Lake • Threat detection • Event context • Real-time alert
EXPRESSVPN How to start? • Identify all your data sources
• Identify ingestion methods • Evaluate your situation ◦ Engineering ◦ Threat hunting ◦ SIEM options • Decide where SIEM fits in Data Lake
EXPRESSVPN Connector split Source split Data Lake SIEM in Security
data lake
EXPRESSVPN Data lake to SIEM SIEM to Data lake Data
Lake SIEM in Security data lake
Threat hunting
EXPRESSVPN Threat hunting life cycle in data lake Threat hunting
EXPRESSVPN Detection as Code • Better documentation • Code repository
/ Code Review (GitOps) • Common language • Vendor agnostic Threat hunting
EXPRESSVPN Detection as Code - Sigma Threat hunting
EXPRESSVPN Detection as Code - Sigma Threat hunting Splunk Elasticsearch
Our Story
EXPRESSVPN How do we build security data lake? Technology •
Ingestion • Storage ◦ S3 • Analytics • Detection-as-code ◦ Sigma Our story
EXPRESSVPN How do we build security data lake? Company •
SOC team • IT team • Security Engineering team • Cross-team collaboration • Security knowledge Our story
EXPRESSVPN Takeaway • Evaluate your current state • Start small
• Estimate cost • Embrace IaC / DaC • Don’t forget about people Our story
Thank you