Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enabling Enterprise Search Platform with Elasti...
Search
Kosho Owa
June 27, 2016
Technology
0
2.4k
Enabling Enterprise Search Platform with Elastic Stack
ElasticsearchとLogstashで始めるEnterprise Search Platform
Elasticsearch勉強会スライド
2016年6月27日
Kosho Owa
June 27, 2016
Tweet
Share
More Decks by Kosho Owa
See All by Kosho Owa
Introducing Machine Learning for the Elastic Stack
kosho
2
12k
Elastic Stack X-Pack 5.0 for IT Security Workshop
kosho
1
340
Elastic Stack X-Pack 5.0 for IT Ops Workshop
kosho
0
340
[Developers Summit 2017] Anomaly Detection with the Elastic Stack
kosho
1
720
Anomaly Detection with the Elastic Stack
kosho
1
1.8k
Getting Started with Elastic Cloud and Beats for Log Analytics
kosho
0
130
Elastic{ON} Seminar Tokyo 2016 Product Update
kosho
0
180
Introducing Elastic Cloud
kosho
0
80
Gearing Up for Elastic Stack, X-Pack 5.0 Releases
kosho
0
150
Other Decks in Technology
See All in Technology
GitHub Issue Templates + Coding Agentで簡単みんなでIaC/Easy IaC for Everyone with GitHub Issue Templates + Coding Agent
aeonpeople
1
170
MySQLのJSON機能の活用術
ikomachi226
0
150
仕様書駆動AI開発の実践: Issue→Skill→PRテンプレで 再現性を作る
knishioka
2
560
FinTech SREのAWSサービス活用/Leveraging AWS Services in FinTech SRE
maaaato
0
120
プロポーザルに込める段取り八分
shoheimitani
0
150
変化するコーディングエージェントとの現実的な付き合い方 〜Cursor安定択説と、ツールに依存しない「資産」〜
empitsu
4
1.3k
AI時代、1年目エンジニアの悩み
jin4
1
160
あたらしい上流工程の形。 0日導入からはじめるAI駆動PM
kumaiu
5
750
顧客の言葉を、そのまま信じない勇気
yamatai1212
1
320
toCプロダクトにおけるAI機能開発のしくじりと学び / ai-product-failures-and-learnings
rince
6
5.5k
【インシデント入門】サイバー攻撃を受けた現場って何してるの?
shumei_ito
0
1.5k
What happened to RubyGems and what can we learn?
mikemcquaid
0
230
Featured
See All Featured
Speed Design
sergeychernyshev
33
1.5k
The Limits of Empathy - UXLibs8
cassininazir
1
210
ReactJS: Keep Simple. Everything can be a component!
pedronauck
666
130k
sira's awesome portfolio website redesign presentation
elsirapls
0
140
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
46
2.7k
Efficient Content Optimization with Google Search Console & Apps Script
katarinadahlin
PRO
0
310
Digital Ethics as a Driver of Design Innovation
axbom
PRO
1
170
The Myth of the Modular Monolith - Day 2 Keynote - Rails World 2024
eileencodes
26
3.3k
Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation
inesmontani
PRO
3
2k
Between Models and Reality
mayunak
1
180
Joys of Absence: A Defence of Solitary Play
codingconduct
1
280
Performance Is Good for Brains [We Love Speed 2024]
tammyeverts
12
1.4k
Transcript
‹#› Kosho Owa, Solutions Architect, Elastic Elasticsearch Meetup Tokyo, 2016-06-27
ElasticsearchͱLogstashͰ࢝ΊΔ Enterprise Search Platform
Elastic StackͷϢʔεέʔε 2 ϩά + ੳ ݕࡧ
ElasticsearchͱLogstashͰ࢝ΊΔEnterprise Search • ϢʔβSambaͰڞ༗͞ΕͨετϨʔδʹυΩϡϝϯτΛอଘ͢Δ • υΩϡϝϯτͷՃɺߋ৽ɺআλΠϜϦʔʹै͠ɺυΩϡϝϯτΛ ΠϯσοΫε͢Δ • ϢʔβΣϒͷΠϯλʔϑΣΠεΛ௨ͯ͡ݕࡧ͢Δ (ࠓճͷείʔϓ֎)
3
ϑϩʔ 4 ࡞ɺߋ৽ɺআ͞ ΕͨϑΝΠϧͷࢹ ϑΝΠϧͷύʔε υΩϡϝϯτͷ࡞ɺ ߋ৽ɺআ ݕࡧΠϯλʔϑΣΠ εͷఏڙ σʔλͷೖ
υΩϡϝϯτͷ࡞ɺߋ৽ɺআͷݕग़ 5 ΠϯλʔϑΣΠε ಛ inotify ଟ͘ͷσΟετϦϏϡʔγϣϯͰར༻Մೳ ࢹରσΟϨΫτϦશͯྻڍ͢Δඞཁ͕͋Δ fanotify มߋ͕ߦΘΕΔલʹڐՄɾෆڐՄΛܾΊΒΕΔ ରԠ͍ͯ͠ͳ͍σΟετϦϏϡʔγϣϯ͕ଟ͍
Linux Security Module ϑΝΠϧʹର͢Δଟ͘ͷΦϖϨʔγϣϯʹରԠ͍ͯ͠Δ ΧʔωϧϞδϡʔϧͱͯ͠࡞͢Δඞཁ͕͋Δ Samba VFS - full_audit ଟ͘ͷσΟετϦϏϡʔγϣϯͰར༻Մೳ ࠪϩάʹϑΝΠϧͷมߋ͕ग़ྗ͞ΕΔ
vfs_full_audit ग़ྗαϯϓϧ 6 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|open|ok|w|sample.txt Jun
18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|kernel_flock|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|create_file|ok|0x12019f|file| open_if|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|stat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|sys_acl_get_file|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|get_nt_acl|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_lock|ok|sample.txt:0-9:0 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|pread|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_unlock|ok|sample.txt: 0-9:0 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|ntimes|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_lock|ok|sample.txt:0-9:1 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|pwrite|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_unlock|ok|sample.txt: 0-9:1 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|kernel_flock|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|close|ok|sample.txt
Sambaͷઃఆ • SambaͰڞ༗͍ͯ͠ΔྖҬͷvfs objectͱͯ͠full_auditΛࢦఆ • pwrite, rename, unlink ΠϕϯτͷΈࢹ͢Δ 7
# /etc/samba/smb.conf [public] comment = Public Stuff path = /home/samba … vfs objects = full_audit full_audit:success = pwrite rename unlink full_audit:failure = none
υΩϡϝϯτͷύʔε Mapper Attachments Plugin 8 $ cd elasticserach $ bin/plugin
install mapper-attachments • Elasticsearch Plugins and Integrations > Mapper Plugins > Mapper Attachments Plugin -https:// www.elastic.co/guide/en/elasticsearch/plugins/2.3/mapper-attachments.html • ElasticsearchͷΠϯετʔϧ • PPT, XLS, PDFͳͲͷҰൠతͳϑΥʔϚοτͷυΩϡϝϯτΛTikaΛ༻ ͯ͠ςΩετใΛൈ͖ग़͠ɺΠϯσοΫε͢Δ • Elasticsearch 5.0.0Ͱingest-attachmentʹஔ͖͑ݟࠐΈ
Mappingͷ࡞ 9 $ curl -d localhost:9200/docs -d ‘ { "mappings"
: { "_default_" : { "properties" : { "file" : { "type" : "attachment", "fields" : { "content" : { "type" : "string", "store" : true, "term_vector" : "with_positions_offsets", "analyzer" : "kuromoji" }, "author" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" }, "title" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" }, "name" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" }, "date" : { "type" : "string", "store" : true }, "keywords" : { "type" : "string", "store" : true }, "content_type" : { "type" : "string", "store" : true }, "content_length" : { "type" : "string", "store" : true }, "language" : { "type" : "string", "store" : true } }}}}}}’
σʔλͷೖ • ϑΝΠϧͷύεͷSHA-1ΛElasticsearchͷυΩϡϝϯτͷ_idͱͯ͠࠾༻͢Δ • ϑΝΠϧͷ༰ΛBase64ͰΤϯίʔυͯ͠ɺ_contentϑΟʔϧυͷ༰ͱ͠ ͯPUT͢Δ • ϑΝΠϧ͕আ໊લ͕มߋ͞Εͨ߹ʹݹ͍υΩϡϝϯτΛDELETE͢Δ 10 $
curl localhost:9200/docs/doc/`echo sample.txt | sha1sum | cut -d’ ‘ -f1` -d ‘ { "_content": “ewogICJtYXBwaW5nc....gfQp9Cg==" }' $ curl -XDELETE localhost:9200/docs/doc/`echo sample.txt | md5sum | cut -d’ ‘ -f1`
ΠϕϯτͷύΠϓϥΠϯॲཧ Logstash 11 input filter output stdin file syslog jdbc
kafka s3 … grok geoip anonymize date mutate ruby csv … elasticsearch file csv http kafka stdout syslog …
Logstash - input {} ࠪϩάͷมߋΛࢹ͢Δ 12 input { file {
path => "/var/log/messages" } }
Logstash - filter {} 1/2 full_audit ϩάͷύʔεͱɺ_idͷܭࢉ 13 filter {
grok { match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} %{DATA:process}: % {USER:user}\|%{IP:clientip}\|%{DATA:operation}\|%{DATA:result}\|%{GREEDYDATA:file}" } add_field => { "file_hash" => "%{file}" } } anonymize { key => "something_secret" algorithm => "SHA1" fields => ["file_hash"] } …
Logstash - filter {} 2/2 ϑΝΠϧ໊มߋ࣌ͷ_idͷܭࢉ 14 filter { …
if "rename" in [operation] { grok { match => { "file" => "%{DATA:file_prev}\|%{GREEDYDATA:file_aft}" } add_field => { "file_prev_hash" => "%{file_prev}" "file_aft_hash" => "%{file_aft}" } } anonymize { key => "something_secret" algorithm => "SHA1" fields => ["file_prev_hash", "file_aft_hash"] } }
Logstash - output {} 1/3 ϑΝΠϧͷ࡞ɺߋ৽ 15 output { if
"pwrite" in [operation] { exec { command => "temp=$(mktemp) ; echo \{ \"file\": \{ \"_content\": \”$(base64 /home/samba/ {file})\”, \"_name\": \"%{file}\"\}\} > $temp ; curl -XPUT localhost:9200/docs/doc/%{file_hash} - d@$temp; rm $temp" } } … }
Logstash - output {} 2/3 ϑΝΠϧ໊ͷมߋ 16 output { …
if "rename" in [operation] { exec { command => "curl -XDELETE localhost:9200/docs/doc/%{file_prev_hash}" } exec { command => "temp=$(mktemp) ; echo \{ \"file\": \{ \"_content\": \"$(base64 /home/samba/% {file_aft})\”, \"_name\": \"%{file_aft}\"\}\} > $temp ; curl -XPUT localhost:9200/docs/doc/% {file_aft_hash} -d@$temp; rm $temp" } } … }
Logstash - output {} 2/3 ϑΝΠϧͷআ 17 output { …
if "unlink" in [operation] { exec { command => "curl -XDELETE localhost:9200/docs/doc/%{file_hash}" } } }
Search 18 $ curl “localhost:9200/docs/_search?q=*&fields=file.title” { "took": 17, "timed_out": false,
"_shards": { "total": 1, "successful": 1, "failed": 0 }, "hits": { "total": 13, "max_score": 1, "hits": [ { "_index": "docs", "_type": "doc", "_id": "dfb573e25b4b4b612b6694edc63b9ad17450daf0", "_score": 1, "fields": { "file.title": [ “PDFαϯϓϧϑΝΠϧ" ]
Future Work 19 ݕࡧUI ൚༻ੑ ෆ߹ͷղফ ηΩϡϦςΟ
Thank you! ϒϩάͷຊޠίϯςϯπੋඇ͝ཡ͍ͩ͘͞ 20