Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Elasticsearch for SQL Users
Search
Shaunak Kashyap
March 16, 2016
Technology
0
60
Elasticsearch for SQL Users
As presented at Great Wide Open 2016
Shaunak Kashyap
March 16, 2016
Tweet
Share
More Decks by Shaunak Kashyap
See All by Shaunak Kashyap
Best Practices for building HTTP APIs
ycombinator
2
83
Elasticsearch Hands On Workshop
ycombinator
0
41
Other Decks in Technology
See All in Technology
Large Vision Language Modelを用いた 文書画像データ化作業自動化の検証、運用 / shibuya_AI
sansan_randd
0
130
『OCI で学ぶクラウドネイティブ 実践 × 理論ガイド』 書籍概要
oracle4engineer
PRO
3
170
「使い方教えて」「事例教えて」じゃもう遅い! Microsoft 365 Copilot を触り倒そう!
taichinakamura
0
250
成長自己責任時代のあるきかた/How to navigate the era of personal responsibility for growth
kwappa
4
300
Where will it converge?
ibknadedeji
0
200
OpenAI gpt-oss ファインチューニング入門
kmotohas
2
1.1k
"プロポーザルってなんか怖そう"という境界を超えてみた@TSUDOI by giftee Tech #1
shilo113
0
170
なぜAWSを活かしきれないのか?技術と組織への処方箋
nrinetcom
PRO
1
170
【Kaigi on Rails 事後勉強会LT】MeはどうしてGirlsに? 私とRubyを繋いだRail(s)
joyfrommasara
0
210
20201008_ファインディ_品質意識を育てる役目は人かAIか___2_.pdf
findy_eventslides
2
590
Git in Team
kawaguti
PRO
3
330
Adminaで実現するISMS/SOC2運用の効率化 〜 アカウント管理編 〜
shonansurvivors
4
420
Featured
See All Featured
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.7k
For a Future-Friendly Web
brad_frost
180
9.9k
A Tale of Four Properties
chriscoyier
160
23k
Fashionably flexible responsive web design (full day workshop)
malarkey
407
66k
Optimising Largest Contentful Paint
csswizardry
37
3.4k
Writing Fast Ruby
sferik
629
62k
Imperfection Machines: The Place of Print at Facebook
scottboms
269
13k
A Modern Web Designer's Workflow
chriscoyier
697
190k
The Language of Interfaces
destraynor
162
25k
Build your cross-platform service in a week with App Engine
jlugia
232
18k
Faster Mobile Websites
deanohume
310
31k
Music & Morning Musume
bryan
46
6.8k
Transcript
1 Shaunak Kashyap Developer Advocate at Elastic @shaunak Elasticsearch for
SQL users
2 The Elastic Stack Elasticsearch Store, Index & Analyze Kibana
User Interface Security Monitoring Alerting Plugins Logstash Beats Ingest Elastic Cloud: Elasticsearch as a Service Hosted Service
3 Agenda Search queries Data modeling Architecture 1 2 3
2 4 Agenda Search queries Data modeling Architecture 1 3
5 Agenda Search queries Data modeling 1 2 3 Architecture
6 Search Queries https://www.flickr.com/photos/samhames/4422128094
7 CREATE TABLE IF NOT EXISTS emails ( sender VARCHAR(255)
NOT NULL, recipients TEXT, cc TEXT, bcc TEXT, subject VARCHAR(1024), body MEDIUMTEXT, datetime DATETIME ); CREATE INDEX emails_sender ON emails(sender); CREATE FULLTEXT INDEX emails_subject ON emails(subject); CREATE FULLTEXT INDEX emails_body ON emails(body); curl -XPOST 'http://localhost:9200/enron' -d' { "mappings": { "email": { "properties": { "sender": { "type": "string", "index": "not_analyzed" }, "recipients": { "type": "string", "index": "not_analyzed" }, "cc": { "type": "string", "index": "not_analyzed" }, "bcc": { "type": "string", "index": "not_analyzed" }, "subject": { "type": "string", "analyzer": "english" }, "body": { "type": "string", "analyzer": "english" } } } } Schemas
8 Loading the data
9 [LIVE DEMO] • Search for text in a single
field • Search for text in multiple fields • Search for a phrase https://github.com/ycombinator/es-enron
10 Other Search Features Stemming Synonyms Did you mean? •
Jump, jumped, jumping • Queen, monarch • Monetery => Monetary
11 Data Modeling https://www.flickr.com/photos/samhames/4422128094 https://www.flickr.com/photos/ericparker/7854157310
12 To analyze or not to analyze? PUT cities/city/1 {
"city": "Atlanta", "population": 447841 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } POST cities/_search { "query": { "match": { "city": "New Albany" } } } QUERY + = ?
13 To analyze or not to analyze? PUT cities/city/1 {
"city": "Atlanta", "population": 447841 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } Term Document IDs Albany 2 New 2,3 Atlanta 1 York 3
14 To analyze or not to analyze? PUT cities {
"mappings": { "city": { "properties": { "city": { "type": "string", "index": "not_analyzed" } } } } } MAPPING Term Document IDs New Albany 2 New York 3 Atlanta 1
PUT blog/post/1 { "author_id": 1, "title": "...", "body": "..." }
PUT blog/post/2 { "author_id": 1, "title": "...", "body": "..." } PUT blog/post/3 { "author_id": 1, "title": "...", "body": "..." } 15 Relationships: Application-side joins PUT blog/author/1 { "name": "John Doe", "bio": "..." } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY 1 POST blog/post/_search { "query": { "match": { "author_id": <each id from query 1 result> } } } QUERY 2
PUT blog/post/1 { "author_name": "John Doe", "title": "...", "body": "..."
} PUT blog/post/2 { "author_name": "John Doe", "title": "...", "body": "..." } 16 Relationships: Data denormalization POST blog/post/_search { "query": { "match": { "author_name": "John" } } } QUERY PUT blog/post/3 { "author_name": "John Doe", "title": "...", "body": "..." }
17 Relationships: Nested objects PUT blog/author/1 { "name": "John Doe",
"bio": "...", "blog_posts": [ { "title": "...", "body": "..." }, { "title": "...", "body": "..." }, { "title": "...", "body": "..." } ] } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY
18 Relationships: Parent-child documents PUT blog/author/1 { "name": "John Doe",
"bio": "..." } POST blog/post/_search { "query": { "has_parent": { "type": "author", "query": { "match": { "name": "John" } } } QUERY PUT blog { "mappings": { "author": {}, "post": { "_parent": { "type": "author" } } } } PUT blog/post/1?parent=1 { "title": "...", "body": "..." } PUT blog/post/2?parent=1 { "title": "...", "body": "..." } PUT blog/post/3?parent=1 { "title": "...", "body": "..." }
19 Architecture https://www.flickr.com/photos/samhames/4422128094 https://www.flickr.com/photos/haribote/4871284379/
20 RDBMS Triggers database by Creative Stall from the Noun
Project 1 2
21 Async replication to Elasticsearch 1 2 3 ESSynchronizer flow
by Yamini Ahluwalia from the Noun Project
22 Async replication to Elasticsearch with Logstash 1 2 3
23 Forked writes from application 1 2
24 Forked writes from application (more robust) 1 2 queue
by Huu Nguyen from the Noun Project ESSynchronizer 3 4
25 Forked writes from application (more robust with Logstash) 1
2 3 4
26 Questions? @shaunak https://www.flickr.com/photos/nicknormal/2245559230/