Kosho Owa
June 27, 2016
Enabling Enterprise Search Platform with Elastic Stack
Getting started with an Enterprise Search Platform using Elasticsearch and Logstash
Slides from the Elasticsearch study meetup (Elasticsearch勉強会), June 27, 2016
Transcript
Kosho Owa, Solutions Architect, Elastic
Elasticsearch Meetup Tokyo, 2016-06-27

Getting Started with an Enterprise Search Platform Using Elasticsearch and Logstash

Elastic Stack use cases
• Logs + analytics
• Search
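
Getting Started with Enterprise Search Using Elasticsearch and Logstash
• Users save documents to storage shared via Samba
• Document additions, updates, and deletions are tracked in a timely manner, and the documents are indexed
• Users search through a web interface (out of scope for this talk)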

Flow
• Monitor files that are created, updated, or deleted
• Parse the files
• Create, update, or delete documents
• Provide a search interface
• Data ingestion
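
Detecting document creation, updates, and deletion
Interface: characteristics
• inotify: available on many distributions; every directory to be monitored must be enumerated
• fanotify: can allow or deny a change before it is made; not supported by many distributions
• Linux Security Module: covers many file operations; must be written as a kernel module
• Samba VFS - full_audit: available on many distributions; file changes are written to the audit log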

vfs_full_audit sample output

Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|open|ok|w|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|kernel_flock|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|create_file|ok|0x12019f|file|open_if|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|stat|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|sys_acl_get_file|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|get_nt_acl|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_lock|ok|sample.txt:0-9:0
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|pread|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_unlock|ok|sample.txt:0-9:0
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|ntimes|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_lock|ok|sample.txt:0-9:1
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|pwrite|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_unlock|ok|sample.txt:0-9:1
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|kernel_flock|ok|sample.txt
Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|close|ok|sample.txt
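
The audit lines in this sample share one shape: a syslog prefix, the process name, then pipe-separated fields for user, client IP, operation, result, and the remainder. As a rough illustration of how such a line could be split outside Logstash, here is a minimal Python sketch. The regular expression and the function name parse_audit_line are assumptions made for this example; they are inferred from the sample output only and are not part of the original deck.

```python
import re
from typing import Dict, Optional

# Assumed line shape, inferred from the sample output only:
#   <syslog timestamp> <host> <process>: <user>|<client ip>|<operation>|<result>|<rest>
AUDIT_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<process>[^:]+): "
    r"(?P<user>[^|]*)\|(?P<clientip>[^|]*)\|(?P<operation>[^|]*)\|"
    r"(?P<result>[^|]*)\|(?P<file>.*)$"
)


def parse_audit_line(line: str) -> Optional[Dict[str, str]]:
    """Return the named fields of one audit line, or None if it does not match."""
    match = AUDIT_LINE.match(line.strip())
    return match.groupdict() if match else None


record = parse_audit_line(
    "Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: "
    "nobody|172.30.2.196|pwrite|ok|sample.txt"
)
print(record["operation"], record["file"])
```

Everything after the result field lands in "file", so an operation that logs extra arguments (for example open, which logs a mode before the name) yields a value such as "w|sample.txt" rather than a bare file name.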

Samba configuration
• Specify full_audit as the vfs object for the area shared via Samba
• Monitor only the pwrite, rename, and unlink events

# /etc/samba/smb.conf
[public]
   comment = Public Stuff
   path = /home/samba
   …
   vfs objects = full_audit
   full_audit:success = pwrite rename unlink
   full_audit:failure = none

Parsing documents: Mapper Attachments Plugin

$ cd elasticsearch
$ bin/plugin install mapper-attachments

• Elasticsearch Plugins and Integrations > Mapper Plugins > Mapper Attachments Plugin - https://www.elastic.co/guide/en/elasticsearch/plugins/2.3/mapper-attachments.html
• Installed into Elasticsearch
• Uses Tika to extract text from documents in common formats such as PPT, XLS, and PDF, and indexes it
• Expected to be replaced by ingest-attachment in Elasticsearch 5.0.0

Creating the mapping

$ curl -XPUT localhost:9200/docs -d '
{
  "mappings" : {
    "_default_" : {
      "properties" : {
        "file" : {
          "type" : "attachment",
          "fields" : {
            "content" : { "type" : "string", "store" : true, "term_vector" : "with_positions_offsets", "analyzer" : "kuromoji" },
            "author" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" },
            "title" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" },
            "name" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" },
            "date" : { "type" : "string", "store" : true },
            "keywords" : { "type" : "string", "store" : true },
            "content_type" : { "type" : "string", "store" : true },
            "content_length" : { "type" : "string", "store" : true },
            "language" : { "type" : "string", "store" : true }
          }
        }
      }
    }
  }
}'

Data ingestion
• Use the SHA-1 of the file path as the _id of the Elasticsearch document
• Base64-encode the file contents and PUT them as the value of the _content field
• DELETE the old document when a file is deleted or renamed

$ curl -XPUT localhost:9200/docs/doc/`echo sample.txt | sha1sum | cut -d' ' -f1` -d '
{
  "_content": "ewogICJtYXBwaW5nc....gfQp9Cg=="
}'
$ curl -XDELETE localhost:9200/docs/doc/`echo sample.txt | sha1sum | cut -d' ' -f1`
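
The two ingredients of the request above, a SHA-1-based document ID and a Base64-encoded body, can be sketched in Python as follows. The function names document_id and request_body are hypothetical, introduced only for illustration. The trailing newline is added on purpose because the shell example hashes the output of echo, which ends with a newline.

```python
import base64
import hashlib
import json


def document_id(path: str) -> str:
    """SHA-1 hex digest of the path, hashed the way `echo <path> | sha1sum` does.

    echo appends a newline, so the hashed bytes are the path plus "\n".
    """
    return hashlib.sha1((path + "\n").encode("utf-8")).hexdigest()


def request_body(content: bytes) -> str:
    """JSON body carrying the file contents as Base64 in a _content field."""
    encoded = base64.b64encode(content).decode("ascii")
    return json.dumps({"_content": encoded})


doc_id = document_id("sample.txt")
body = request_body(b"hello")
print(doc_id)
print(body)  # {"_content": "aGVsbG8="}
```

Whatever hashing convention is chosen (with or without the newline, plain or keyed), the same one has to be used for both the PUT and the later DELETE, otherwise the delete targets a different _id.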

Event pipeline processing: Logstash
• input: stdin, file, syslog, jdbc, kafka, s3, …
• filter: grok, geoip, anonymize, date, mutate, ruby, csv, …
• output: elasticsearch, file, csv, http, kafka, stdout, syslog, …

Logstash - input {}
Watch the audit log for changes

input {
  file {
    path => "/var/log/messages"
  }
}

Logstash - filter {} 1/2
Parse the full_audit log and compute the _id

filter {
  grok {
    match => {
      "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} %{DATA:process}: %{USER:user}\|%{IP:clientip}\|%{DATA:operation}\|%{DATA:result}\|%{GREEDYDATA:file}"
    }
    add_field => { "file_hash" => "%{file}" }
  }
  anonymize {
    key => "something_secret"
    algorithm => "SHA1"
    fields => ["file_hash"]
  }
  …

Logstash - filter {} 2/2
Compute the _id values when a file is renamed

filter {
  …
  if "rename" in [operation] {
    grok {
      match => { "file" => "%{DATA:file_prev}\|%{GREEDYDATA:file_aft}" }
      add_field => {
        "file_prev_hash" => "%{file_prev}"
        "file_aft_hash" => "%{file_aft}"
      }
    }
    anonymize {
      key => "something_secret"
      algorithm => "SHA1"
      fields => ["file_prev_hash", "file_aft_hash"]
    }
  }
}
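
For a rename event, the filter splits the file field into the old and new names and hashes each. The sketch below mirrors that in Python under two assumptions that are not confirmed by the deck: that the rename audit entry carries "old|new" in the file field (as the grok pattern implies), and that the anonymize filter's key turns the digest into a keyed HMAC rather than a plain SHA-1. The helper names split_rename and keyed_hash are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple


def split_rename(file_field: str) -> Tuple[str, str]:
    """Split "old|new" at the first pipe, like %{DATA:file_prev}\\|%{GREEDYDATA:file_aft}."""
    prev, _, aft = file_field.partition("|")
    return prev, aft


def keyed_hash(key: str, value: str) -> str:
    """HMAC-SHA1 hex digest of value under key (assumed behaviour of anonymize)."""
    return hmac.new(key.encode("utf-8"), value.encode("utf-8"), hashlib.sha1).hexdigest()


file_prev, file_aft = split_rename("old.txt|new.txt")
file_prev_hash = keyed_hash("something_secret", file_prev)
file_aft_hash = keyed_hash("something_secret", file_aft)
print(file_prev, file_prev_hash)
print(file_aft, file_aft_hash)
```

If the digest is indeed keyed, the IDs produced by the pipeline will differ from those of a plain sha1sum of the same path, so documents indexed by hand and documents indexed through Logstash would not share IDs.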

Logstash - output {} 1/3
File creation and update

output {
  if "pwrite" in [operation] {
    exec {
      command => "temp=$(mktemp) ; echo \{ \"file\": \{ \"_content\": \"$(base64 /home/samba/%{file})\", \"_name\": \"%{file}\"\}\} > $temp ; curl -XPUT localhost:9200/docs/doc/%{file_hash} -d@$temp; rm $temp"
    }
  }
  …
}

Logstash - output {} 2/3
File rename

output {
  …
  if "rename" in [operation] {
    exec {
      command => "curl -XDELETE localhost:9200/docs/doc/%{file_prev_hash}"
    }
    exec {
      command => "temp=$(mktemp) ; echo \{ \"file\": \{ \"_content\": \"$(base64 /home/samba/%{file_aft})\", \"_name\": \"%{file_aft}\"\}\} > $temp ; curl -XPUT localhost:9200/docs/doc/%{file_aft_hash} -d@$temp; rm $temp"
    }
  }
  …
}

Logstash - output {} 3/3
File deletion

output {
  …
  if "unlink" in [operation] {
    exec {
      command => "curl -XDELETE localhost:9200/docs/doc/%{file_hash}"
    }
  }
}
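
Taken together, the three output stages map each audited operation to one or two HTTP requests against the index. The following Python sketch restates that mapping as a pure function that only plans the requests and sends nothing. The name plan_requests and the event dictionary layout are assumptions for illustration; the field names follow the ones used in the filter stages.

```python
from typing import Dict, List, Tuple

# Index and type as used in the curl commands of the deck.
BASE_URL = "http://localhost:9200/docs/doc"


def plan_requests(event: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the (HTTP method, URL) pairs an event would trigger.

    pwrite -> PUT the document
    rename -> DELETE the old document, then PUT the new one
    unlink -> DELETE the document
    Anything else produces no request.
    """
    operation = event.get("operation", "")
    if operation == "pwrite":
        return [("PUT", f"{BASE_URL}/{event['file_hash']}")]
    if operation == "rename":
        return [
            ("DELETE", f"{BASE_URL}/{event['file_prev_hash']}"),
            ("PUT", f"{BASE_URL}/{event['file_aft_hash']}"),
        ]
    if operation == "unlink":
        return [("DELETE", f"{BASE_URL}/{event['file_hash']}")]
    return []


print(plan_requests({"operation": "rename",
                     "file_prev_hash": "aaa",
                     "file_aft_hash": "bbb"}))
```

This sketch compares the operation for equality, whereas the Logstash conditionals use "in", which on a string field is a substring test; equality is the stricter of the two choices.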

Search

$ curl "localhost:9200/docs/_search?q=*&fields=file.title"
{
  "took": 17,
  "timed_out": false,
  "_shards": { "total": 1, "successful": 1, "failed": 0 },
  "hits": {
    "total": 13,
    "max_score": 1,
    "hits": [
      {
        "_index": "docs",
        "_type": "doc",
        "_id": "dfb573e25b4b4b612b6694edc63b9ad17450daf0",
        "_score": 1,
        "fields": {
          "file.title": [ "PDFサンプルファイル" ]
…

Future Work
• Search UI
• Generality
• Fixing defects
• Security

Thank you!
Please also take a look at the Japanese-language content on the blog.