Visualizing Your E-mail with Elastic Stack

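This deck shows how to index messages received from "Mail Keishicho" (メールけいしちょう), the Tokyo Metropolitan Police Department's crime and crime-prevention information service, in Elasticsearch and visualize them with Kibana.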

Kosho Owa

April 20, 2016

Transcript

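  1. Target data
    • "Mail Keishicho" (メールけいしちょう), the Tokyo Metropolitan Police Department mail service (free registration)
      http://www.keishicho.metro.tokyo.jp/about_mpd/joho/mail_info.html
    • Delivers "crime occurrence information" and "crime prevention information" by e-mail
    • Provided under CC BY 2.1 JP

    Sample message (original Japanese, which the grok patterns on later slides match):

        Subject: 玉川警察署(子ども（暴行）)
        Body: 4月16日（土）、午後4時40分ころ、世田谷区奥沢１丁目の路上で、児童が通行中、男に突き飛ばされました。（犯人（男）の特徴については、５０歳代、170cm位、中肉、口ひげ、茶色っぽい上衣、黒色っぽいズボン）
        【問合せ先】玉川警察署 03-3705-0110（内線2612）

    English translation:

        Subject: Tamagawa Police Station (child (assault))
        Body: At around 4:40 p.m. on Saturday, April 16, a child walking along a street in Okusawa 1-chome, Setagaya-ku, was pushed over by a man. (Description of the suspect (male): in his 50s, about 170 cm tall, medium build, mustache, brownish jacket, blackish trousers)
        [Contact] Tamagawa Police Station 03-3705-0110 (ext. 2612)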
  2. Logstash Pipeline and Plugins
    A pluggable architecture and a developer-friendly ecosystem.

        input {}
        filter {}
        output {}

    • Input plugins: beats, file, graphite, http, imap, kafka, rss, redis, stdin, sqlite, s3, syslog, zenoss, etc.
    • Output plugins: csv, cloudwatch, email, elasticsearch, exec, file, graphite, http, kafka, mongodb, nagios, redis, s3, syslog, stdout, zabbix, etc.
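The three-stage shape of a pipeline can be illustrated with a small Python sketch. This is only a conceptual analogy, not how Logstash is implemented; the names `read_input`, `apply_filters`, `add_length`, and `run_pipeline` are hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, str]
Filter = Callable[[Event], Event]


def read_input(lines: Iterable[str]) -> Iterator[Event]:
    """Input stage: wrap each raw line in an event with a "message" field."""
    for line in lines:
        yield {"message": line}


def apply_filters(events: Iterable[Event], filters: List[Filter]) -> Iterator[Event]:
    """Filter stage: pass every event through each filter in order."""
    for event in events:
        for f in filters:
            event = f(event)
        yield event


def add_length(event: Event) -> Event:
    """Example filter (hypothetical): add the message length as a new field."""
    return {**event, "length": str(len(event["message"]))}


def run_pipeline(lines: Iterable[str], filters: List[Filter]) -> List[Event]:
    """Output stage: here the events are simply collected into a list."""
    return list(apply_filters(read_input(lines), filters))


result = run_pipeline(["a", "bcd"], [add_length])
```

Each stage only sees events, which is what lets plugins be swapped independently of one another.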
  3. Input Plugin - imap

        input {
          imap {
            host => "imap.gmail.com"
            port => 993
            user => "_IMAP_USER_"
            password => "_IMAP_PASSWORD_"
            folder => "_IMAP_FOLDER_"
            type => "_TYPE_"
            check_interval => 300
            codec => plain { charset => "ISO-2022-JP" }
          }
        }

    • Specify the encoding of the mail body with codec
    • Sort the messages into an IMAP folder beforehand
    • When processing several types of mail, add tags
    https://www.elastic.co/guide/en/logstash/current/plugins-inputs-imap.html
    • Note: this is a community plugin
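To see why the charset matters, the following Python sketch decodes ISO-2022-JP bytes the way a codec with that charset setting would need to. It uses Python's built-in `iso2022_jp` codec; the helper name `decode_mail_body` and the sample string are illustrative assumptions, not part of the deck.

```python
def decode_mail_body(raw: bytes, charset: str = "iso2022_jp") -> str:
    """Decode raw mail body bytes using the given charset."""
    return raw.decode(charset)


original = "玉川警察署"

# ISO-2022-JP is a 7-bit encoding that switches character sets with
# escape sequences, so the encoded bytes are all in the ASCII range.
raw = original.encode("iso2022_jp")

decoded = decode_mail_body(raw)

# Decoding the same bytes with the wrong charset does not raise an error
# here, but it yields unreadable text instead of the original string.
wrong = raw.decode("utf-8")
```

Because a wrong charset can fail silently like this, garbled fields in the index are often the first visible symptom of a missing or incorrect codec setting.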
  4. Filter Plugin
    • Fields extracted from the mail body: city, area, place
    • Fields extracted from the title (subject): police_station, incident
    • Used as the timestamp: datetime

        filter {
          grok {
            match => { "message" => "%{DATA:[@metadata][datetime]}ころ、%{NOTSPACE:city}(区|市)%{NOTSPACE:area}(の|付近)(%{NOTSPACE:place}|)で、%{GREEDYDATA}" }
          }
          date {
            match => ["[@metadata][datetime]", "M月d日（E）、aK時m分"]
            locale => ja
            timezone => "Asia/Tokyo"
          }
          grok {
            match => { "subject" => "%{NOTSPACE:police_station}警察署\(%{NOTSPACE:incident}\)" }
          }
        }
  5. Filter Plugin - grok
    • Associates strings that match a pattern with fields, turning unstructured data into structured data
    https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html

    Input data:

        "subject" => "玉川警察署(子ども（暴行）)"

    grok:

        grok {
          match => { "subject" => "%{NOTSPACE:police_station}警察署\(%{NOTSPACE:incident}\)" }
        }

    Output:

        "police_station" => "玉川"
        "incident" => "(子ども（暴行）)"
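As a rough illustration of how grok-style named captures structure a message body, here is a Python sketch that approximates the deck's body pattern with a plain regular expression. It assumes that DATA behaves like a lazy `.*?`, NOTSPACE like `\S+`, and GREEDYDATA like `.*`; the function name `parse_body` and the shortened sample sentence are illustrative, and Python's regex engine is not guaranteed to behave identically to grok in every case.

```python
import re
from typing import Dict, Optional

# Approximation of the grok pattern for the mail body:
#   <datetime>ころ、<city>(区|市)<area>(の|付近)(<place>|)で、<rest>
BODY_PATTERN = re.compile(
    r"(?P<datetime>.*?)ころ、"
    r"(?P<city>\S+)(?:区|市)"
    r"(?P<area>\S+)(?:の|付近)"
    r"(?:(?P<place>\S+)|)で、"
    r".*"
)


def parse_body(message: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the captured fields, or None if the message does not match."""
    match = BODY_PATTERN.search(message)
    if match is None:
        return None
    return match.groupdict()


sample = "4月16日（土）、午後4時40分ころ、世田谷区奥沢１丁目の路上で、児童が通行中、男に突き飛ばされました。"
fields = parse_body(sample)
```

For the sample sentence, the captures are the date string before "ころ、", the city before "区", the area before "の", and the place before "で、". A message that does not follow this sentence shape yields `None`, which corresponds to a grok match failure.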
  6. Filter Plugin - date
    Parses a date field and uses it as the Logstash event timestamp.

    Input data:

        "datetime" => "4月16日（土）、午前7時40分"

    date:

        date {
          match => ["[@metadata][datetime]", "M月d日（E）、aK時m分"]
          locale => ja
          timezone => "Asia/Tokyo"
        }

    Output:

        "@timestamp" => "2016-04-16T07:40:00.000Z"

    • If the date cannot be obtained, the processing time is used as @timestamp (also consider tag_on_failure => true)
    https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
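The date format `M月d日（E）、aK時m分` reads as month, day, weekday in parentheses, an AM/PM marker (午前 or 午後), then a 0-11 hour and the minute. A minimal Python sketch of the same parsing is shown below. It is an illustration only: the function name `parse_jp_datetime` is hypothetical, the year is passed in explicitly because the pattern carries none, the weekday character is accepted without being checked, and a fixed +09:00 offset stands in for the Asia/Tokyo time zone.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional

# Japan does not observe daylight saving time, so a fixed offset suffices here.
JST = timezone(timedelta(hours=9), "JST")

DATE_PATTERN = re.compile(
    r"(\d{1,2})月(\d{1,2})日（.）、(午前|午後)(\d{1,2})時(\d{1,2})分"
)


def parse_jp_datetime(text: str, year: int) -> Optional[datetime]:
    """Parse e.g. '4月16日（土）、午前7時40分'; return None if it does not match."""
    match = DATE_PATTERN.fullmatch(text)
    if match is None:
        return None
    month, day, ampm, hour, minute = match.groups()
    # 'K' is a 0-11 hour, so add 12 hours for 午後 (PM).
    hour24 = int(hour) % 12 + (12 if ampm == "午後" else 0)
    return datetime(year, int(month), int(day), hour24, int(minute), tzinfo=JST)


morning = parse_jp_datetime("4月16日（土）、午前7時40分", 2016)
afternoon = parse_jp_datetime("4月16日（土）、午後4時40分", 2016)
failed = parse_jp_datetime("16 April, 7:40", 2016)
```

Returning `None` on a mismatch mirrors the failure case in the bullet above, where the caller has to decide what timestamp to fall back to.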
  7. Output Plugin - elasticsearch

        output {
          stdout { codec => dots }
          elasticsearch {
            hosts => ["http://127.0.0.1:9200/"]
            index => "mail-%{+YYYY.MM}"
          }
        }

    • stdout { codec => dots } prints one dot for each event processed
    • Choose the index name so that each index ends up at an appropriate size
    https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html
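The index name `mail-%{+YYYY.MM}` produces one index per month. The Python sketch below mimics that naming under the assumption that the date part is taken from the event timestamp in UTC; the function name `monthly_index` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

JST = timezone(timedelta(hours=9), "JST")


def monthly_index(ts: datetime, prefix: str = "mail-") -> str:
    """Build a per-month index name from a timezone-aware timestamp."""
    return prefix + ts.astimezone(timezone.utc).strftime("%Y.%m")


april = monthly_index(datetime(2016, 4, 16, 7, 40, tzinfo=timezone.utc))

# Just after midnight on May 1 in Tokyo is still April 30 in UTC,
# so under the UTC assumption this event lands in the April index.
boundary = monthly_index(datetime(2016, 5, 1, 0, 30, tzinfo=JST))
```

A monthly suffix suits a low-volume feed such as this mail service; a higher-volume source might use a daily pattern instead to keep each index at a manageable size.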
  8. Logstash Tips
    • Display the pipeline's events at output time

        output { stdout { codec => rubydebug } }

    • Set the number of workers appropriately

        $ logstash -w [NUMBER OF WORKERS] -f [PATH TO CONFIG]

    • Sort different kinds of data before feeding them into Logstash
    • Use grok helper tools
      http://grokdebug.herokuapp.com
      http://grokconstructor.appspot.com
  9. Elasticsearch - Mapping
    • text (analyzed strings) and keyword (not_analyzed strings) fields are introduced in 5.0
    • Specify kuromoji as the analyzer for text fields
    • To run terms aggregations, define a keyword field using the multi-field feature

        PUT /_template/mail-1
        {
          "template": "mail-*",
          "mappings": {
            "_default_": {
              "properties": {
                "message": {
                  "type": "text",
                  "fields": {
                    "keyword": { "type": "keyword", "ignore_above": 256 }
                  },
                  "analyzer": "kuromoji"
                },...
              }
            }
          }
        }
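The text-plus-keyword multi-field layout can be generated rather than written by hand. The Python sketch below builds the same template body as a dictionary and serializes it to JSON. The helper names `text_with_keyword` and `build_template` are hypothetical, and only the `message` field is included because the other fields are elided in the slide.

```python
import json
from typing import Any, Dict


def text_with_keyword(analyzer: str, ignore_above: int = 256) -> Dict[str, Any]:
    """A text field for full-text search with a keyword sub-field for aggregations."""
    return {
        "type": "text",
        "analyzer": analyzer,
        "fields": {
            "keyword": {"type": "keyword", "ignore_above": ignore_above}
        },
    }


def build_template(pattern: str, properties: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap field definitions in the template structure shown on the slide."""
    return {
        "template": pattern,
        "mappings": {"_default_": {"properties": properties}},
    }


template = build_template("mail-*", {"message": text_with_keyword("kuromoji")})

# JSON body that could be sent with PUT /_template/mail-1.
body = json.dumps(template, ensure_ascii=False, indent=2)
```

With this layout, full-text queries go against `message`, while a terms aggregation would target the `message.keyword` sub-field.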