Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Enabling Enterprise Search Platform with Elasti...
Search
Kosho Owa
June 27, 2016
Technology
0
2.4k
Enabling Enterprise Search Platform with Elastic Stack
ElasticsearchとLogstashで始めるEnterprise Search Platform
Elasticsearch勉強会スライド
2016年6月27日
Kosho Owa
June 27, 2016
Tweet
Share
More Decks by Kosho Owa
See All by Kosho Owa
Introducing Machine Learning for the Elastic Stack
kosho
2
12k
Elastic Stack X-Pack 5.0 for IT Security Workshop
kosho
1
340
Elastic Stack X-Pack 5.0 for IT Ops Workshop
kosho
0
340
[Developers Summit 2017] Anomaly Detection with the Elastic Stack
kosho
1
720
Anomaly Detection with the Elastic Stack
kosho
1
1.8k
Getting Started with Elastic Cloud and Beats for Log Analytics
kosho
0
130
Elastic{ON} Seminar Tokyo 2016 Product Update
kosho
0
180
Introducing Elastic Cloud
kosho
0
80
Gearing Up for Elastic Stack, X-Pack 5.0 Releases
kosho
0
150
Other Decks in Technology
See All in Technology
AIと新時代を切り拓く。これからのSREとメルカリIBISの挑戦
0gm
0
1.2k
Digitization部 紹介資料
sansan33
PRO
1
6.8k
コスト削減から「セキュリティと利便性」を担うプラットフォームへ
sansantech
PRO
3
1.5k
顧客の言葉を、そのまま信じない勇気
yamatai1212
1
350
ブロックテーマでサイトをリニューアルした話 / 2026-01-31 Kansai WordPress Meetup
torounit
0
470
Bill One急成長の舞台裏 開発組織が直面した失敗と教訓
sansantech
PRO
2
380
Tebiki Engineering Team Deck
tebiki
0
24k
広告の効果検証を題材にした因果推論の精度検証について
zozotech
PRO
0
180
学生・新卒・ジュニアから目指すSRE
hiroyaonoe
2
620
データの整合性を保ちたいだけなんだ
shoheimitani
8
3.1k
量子クラウドサービスの裏側 〜Deep Dive into OQTOPUS〜
oqtopus
0
120
FinTech SREのAWSサービス活用/Leveraging AWS Services in FinTech SRE
maaaato
0
130
Featured
See All Featured
What Being in a Rock Band Can Teach Us About Real World SEO
427marketing
0
170
Code Review Best Practice
trishagee
74
20k
Everyday Curiosity
cassininazir
0
130
Taking LLMs out of the black box: A practical guide to human-in-the-loop distillation
inesmontani
PRO
3
2k
Java REST API Framework Comparison - PWX 2021
mraible
34
9.1k
Distributed Sagas: A Protocol for Coordinating Microservices
caitiem20
333
22k
Mind Mapping
helmedeiros
PRO
0
87
Mozcon NYC 2025: Stop Losing SEO Traffic
samtorres
0
140
Visualization
eitanlees
150
17k
Scaling GitHub
holman
464
140k
Practical Orchestrator
shlominoach
191
11k
Breaking role norms: Why Content Design is so much more than writing copy - Taylor Woolridge
uxyall
0
170
Transcript
‹#› Kosho Owa, Solutions Architect, Elastic Elasticsearch Meetup Tokyo, 2016-06-27
ElasticsearchͱLogstashͰ࢝ΊΔ Enterprise Search Platform
Elastic StackͷϢʔεέʔε 2 ϩά + ੳ ݕࡧ
ElasticsearchͱLogstashͰ࢝ΊΔEnterprise Search • ϢʔβSambaͰڞ༗͞ΕͨετϨʔδʹυΩϡϝϯτΛอଘ͢Δ • υΩϡϝϯτͷՃɺߋ৽ɺআλΠϜϦʔʹै͠ɺυΩϡϝϯτΛ ΠϯσοΫε͢Δ • ϢʔβΣϒͷΠϯλʔϑΣΠεΛ௨ͯ͡ݕࡧ͢Δ (ࠓճͷείʔϓ֎)
3
ϑϩʔ 4 ࡞ɺߋ৽ɺআ͞ ΕͨϑΝΠϧͷࢹ ϑΝΠϧͷύʔε υΩϡϝϯτͷ࡞ɺ ߋ৽ɺআ ݕࡧΠϯλʔϑΣΠ εͷఏڙ σʔλͷೖ
υΩϡϝϯτͷ࡞ɺߋ৽ɺআͷݕग़ 5 ΠϯλʔϑΣΠε ಛ inotify ଟ͘ͷσΟετϦϏϡʔγϣϯͰར༻Մೳ ࢹରσΟϨΫτϦશͯྻڍ͢Δඞཁ͕͋Δ fanotify มߋ͕ߦΘΕΔલʹڐՄɾෆڐՄΛܾΊΒΕΔ ରԠ͍ͯ͠ͳ͍σΟετϦϏϡʔγϣϯ͕ଟ͍
Linux Security Module ϑΝΠϧʹର͢Δଟ͘ͷΦϖϨʔγϣϯʹରԠ͍ͯ͠Δ ΧʔωϧϞδϡʔϧͱͯ͠࡞͢Δඞཁ͕͋Δ Samba VFS - full_audit ଟ͘ͷσΟετϦϏϡʔγϣϯͰར༻Մೳ ࠪϩάʹϑΝΠϧͷมߋ͕ग़ྗ͞ΕΔ
vfs_full_audit ग़ྗαϯϓϧ 6 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|open|ok|w|sample.txt Jun
18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|kernel_flock|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|create_file|ok|0x12019f|file| open_if|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|stat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|sys_acl_get_file|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|get_nt_acl|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_lock|ok|sample.txt:0-9:0 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|pread|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_unlock|ok|sample.txt: 0-9:0 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|ntimes|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_lock|ok|sample.txt:0-9:1 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|fstat|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|pwrite|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|strict_unlock|ok|sample.txt: 0-9:1 Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|kernel_flock|ok|sample.txt Jun 18 05:50:08 ip-172-30-2-93 smbd_audit: nobody|172.30.2.196|close|ok|sample.txt
Sambaͷઃఆ • SambaͰڞ༗͍ͯ͠ΔྖҬͷvfs objectͱͯ͠full_auditΛࢦఆ • pwrite, rename, unlink ΠϕϯτͷΈࢹ͢Δ 7
# /etc/samba/smb.conf [public] comment = Public Stuff path = /home/samba … vfs objects = full_audit full_audit:success = pwrite rename unlink full_audit:failure = none
υΩϡϝϯτͷύʔε Mapper Attachments Plugin 8 $ cd elasticserach $ bin/plugin
install mapper-attachments • Elasticsearch Plugins and Integrations > Mapper Plugins > Mapper Attachments Plugin -https:// www.elastic.co/guide/en/elasticsearch/plugins/2.3/mapper-attachments.html • ElasticsearchͷΠϯετʔϧ • PPT, XLS, PDFͳͲͷҰൠతͳϑΥʔϚοτͷυΩϡϝϯτΛTikaΛ༻ ͯ͠ςΩετใΛൈ͖ग़͠ɺΠϯσοΫε͢Δ • Elasticsearch 5.0.0Ͱingest-attachmentʹஔ͖͑ݟࠐΈ
Mappingͷ࡞ 9 $ curl -d localhost:9200/docs -d ‘ { "mappings"
: { "_default_" : { "properties" : { "file" : { "type" : "attachment", "fields" : { "content" : { "type" : "string", "store" : true, "term_vector" : "with_positions_offsets", "analyzer" : "kuromoji" }, "author" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" }, "title" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" }, "name" : { "type" : "string", "store" : true, "analyzer" : "kuromoji" }, "date" : { "type" : "string", "store" : true }, "keywords" : { "type" : "string", "store" : true }, "content_type" : { "type" : "string", "store" : true }, "content_length" : { "type" : "string", "store" : true }, "language" : { "type" : "string", "store" : true } }}}}}}’
σʔλͷೖ • ϑΝΠϧͷύεͷSHA-1ΛElasticsearchͷυΩϡϝϯτͷ_idͱͯ͠࠾༻͢Δ • ϑΝΠϧͷ༰ΛBase64ͰΤϯίʔυͯ͠ɺ_contentϑΟʔϧυͷ༰ͱ͠ ͯPUT͢Δ • ϑΝΠϧ͕আ໊લ͕มߋ͞Εͨ߹ʹݹ͍υΩϡϝϯτΛDELETE͢Δ 10 $
curl localhost:9200/docs/doc/`echo sample.txt | sha1sum | cut -d’ ‘ -f1` -d ‘ { "_content": “ewogICJtYXBwaW5nc....gfQp9Cg==" }' $ curl -XDELETE localhost:9200/docs/doc/`echo sample.txt | md5sum | cut -d’ ‘ -f1`
ΠϕϯτͷύΠϓϥΠϯॲཧ Logstash 11 input filter output stdin file syslog jdbc
kafka s3 … grok geoip anonymize date mutate ruby csv … elasticsearch file csv http kafka stdout syslog …
Logstash - input {} ࠪϩάͷมߋΛࢹ͢Δ 12 input { file {
path => "/var/log/messages" } }
Logstash - filter {} 1/2 full_audit ϩάͷύʔεͱɺ_idͷܭࢉ 13 filter {
grok { match => { "message" => "%{SYSLOGTIMESTAMP:timestamp} %{HOSTNAME:hostname} %{DATA:process}: % {USER:user}\|%{IP:clientip}\|%{DATA:operation}\|%{DATA:result}\|%{GREEDYDATA:file}" } add_field => { "file_hash" => "%{file}" } } anonymize { key => "something_secret" algorithm => "SHA1" fields => ["file_hash"] } …
Logstash - filter {} 2/2 ϑΝΠϧ໊มߋ࣌ͷ_idͷܭࢉ 14 filter { …
if "rename" in [operation] { grok { match => { "file" => "%{DATA:file_prev}\|%{GREEDYDATA:file_aft}" } add_field => { "file_prev_hash" => "%{file_prev}" "file_aft_hash" => "%{file_aft}" } } anonymize { key => "something_secret" algorithm => "SHA1" fields => ["file_prev_hash", "file_aft_hash"] } }
Logstash - output {} 1/3 ϑΝΠϧͷ࡞ɺߋ৽ 15 output { if
"pwrite" in [operation] { exec { command => "temp=$(mktemp) ; echo \{ \"file\": \{ \"_content\": \”$(base64 /home/samba/ {file})\”, \"_name\": \"%{file}\"\}\} > $temp ; curl -XPUT localhost:9200/docs/doc/%{file_hash} - d@$temp; rm $temp" } } … }
Logstash - output {} 2/3 ϑΝΠϧ໊ͷมߋ 16 output { …
if "rename" in [operation] { exec { command => "curl -XDELETE localhost:9200/docs/doc/%{file_prev_hash}" } exec { command => "temp=$(mktemp) ; echo \{ \"file\": \{ \"_content\": \"$(base64 /home/samba/% {file_aft})\”, \"_name\": \"%{file_aft}\"\}\} > $temp ; curl -XPUT localhost:9200/docs/doc/% {file_aft_hash} -d@$temp; rm $temp" } } … }
Logstash - output {} 2/3 ϑΝΠϧͷআ 17 output { …
if "unlink" in [operation] { exec { command => "curl -XDELETE localhost:9200/docs/doc/%{file_hash}" } } }
Search 18 $ curl “localhost:9200/docs/_search?q=*&fields=file.title” { "took": 17, "timed_out": false,
"_shards": { "total": 1, "successful": 1, "failed": 0 }, "hits": { "total": 13, "max_score": 1, "hits": [ { "_index": "docs", "_type": "doc", "_id": "dfb573e25b4b4b612b6694edc63b9ad17450daf0", "_score": 1, "fields": { "file.title": [ “PDFαϯϓϧϑΝΠϧ" ]
Future Work 19 ݕࡧUI ൚༻ੑ ෆ߹ͷղফ ηΩϡϦςΟ
Thank you! ϒϩάͷຊޠίϯςϯπੋඇ͝ཡ͍ͩ͘͞ 20