Lock in $30 Savings on PRO—Offer Ends Soon! ⏳
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Elasticsearch for SQL Users
Search
Shaunak Kashyap
March 16, 2016
Technology
0
60
Elasticsearch for SQL Users
As presented at Great Wide Open 2016
Shaunak Kashyap
March 16, 2016
Tweet
Share
More Decks by Shaunak Kashyap
See All by Shaunak Kashyap
Best Practices for building HTTP APIs
ycombinator
2
84
Elasticsearch Hands On Workshop
ycombinator
0
41
Other Decks in Technology
See All in Technology
Playwrightのソースコードに見る、自動テストを自動で書く技術
yusukeiwaki
13
5.2k
MLflowで始めるプロンプト管理、評価、最適化
databricksjapan
1
140
EM歴1年10ヶ月のぼくがぶち当たった苦悩とこれからへ向けて
maaaato
0
270
研究開発×プロダクトマネジメントへの挑戦 / ly_mlpm_meetup
sansan_randd
0
110
SSO方式とJumpアカウント方式の比較と設計方針
yuobayashi
7
590
今年のデータ・ML系アップデートと気になるアプデのご紹介
nayuts
1
260
学習データって増やせばいいんですか?
ftakahashi
2
300
MapKitとオープンデータで実現する地図情報の拡張と可視化
zozotech
PRO
1
130
Lessons from Migrating to OpenSearch: Shard Design, Log Ingestion, and UI Decisions
sansantech
PRO
1
120
AI時代の開発フローとともに気を付けたいこと
kkamegawa
0
2.9k
大企業でもできる!ボトムアップで拡大させるプラットフォームの作り方
findy_eventslides
1
700
寫了幾年 Code,然後呢?軟體工程師必須重新認識的 DevOps
cheng_wei_chen
1
1.3k
Featured
See All Featured
The Art of Delivering Value - GDevCon NA Keynote
reverentgeek
16
1.8k
Why Our Code Smells
bkeepers
PRO
340
57k
For a Future-Friendly Web
brad_frost
180
10k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
253
22k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
508
140k
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.8k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
285
14k
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.2k
I Don’t Have Time: Getting Over the Fear to Launch Your Podcast
jcasabona
34
2.5k
Site-Speed That Sticks
csswizardry
13
1k
The Language of Interfaces
destraynor
162
25k
Music & Morning Musume
bryan
46
7k
Transcript
1 Shaunak Kashyap Developer Advocate at Elastic @shaunak Elasticsearch for
SQL users
2 The Elastic Stack Elasticsearch Store, Index & Analyze Kibana
User Interface Security Monitoring Alerting Plugins Logstash Beats Ingest Elastic Cloud: Elasticsearch as a Service Hosted Service
3 Agenda Search queries Data modeling Architecture 1 2 3
2 4 Agenda Search queries Data modeling Architecture 1 3
5 Agenda Search queries Data modeling 1 2 3 Architecture
6 Search Queries https://www.flickr.com/photos/samhames/4422128094
7 CREATE TABLE IF NOT EXISTS emails ( sender VARCHAR(255)
NOT NULL, recipients TEXT, cc TEXT, bcc TEXT, subject VARCHAR(1024), body MEDIUMTEXT, datetime DATETIME ); CREATE INDEX emails_sender ON emails(sender); CREATE FULLTEXT INDEX emails_subject ON emails(subject); CREATE FULLTEXT INDEX emails_body ON emails(body); curl -XPOST 'http://localhost:9200/enron' -d' { "mappings": { "email": { "properties": { "sender": { "type": "string", "index": "not_analyzed" }, "recipients": { "type": "string", "index": "not_analyzed" }, "cc": { "type": "string", "index": "not_analyzed" }, "bcc": { "type": "string", "index": "not_analyzed" }, "subject": { "type": "string", "analyzer": "english" }, "body": { "type": "string", "analyzer": "english" } } } } Schemas
8 Loading the data
9 [LIVE DEMO] • Search for text in a single
field • Search for text in multiple fields • Search for a phrase https://github.com/ycombinator/es-enron
10 Other Search Features Stemming Synonyms Did you mean? •
Jump, jumped, jumping • Queen, monarch • Monetery => Monetary
11 Data Modeling https://www.flickr.com/photos/samhames/4422128094 https://www.flickr.com/photos/ericparker/7854157310
12 To analyze or not to analyze? PUT cities/city/1 {
"city": "Atlanta", "population": 447841 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } POST cities/_search { "query": { "match": { "city": "New Albany" } } } QUERY + = ?
13 To analyze or not to analyze? PUT cities/city/1 {
"city": "Atlanta", "population": 447841 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } Term Document IDs Albany 2 New 2,3 Atlanta 1 York 3
14 To analyze or not to analyze? PUT cities {
"mappings": { "city": { "properties": { "city": { "type": "string", "index": "not_analyzed" } } } } } MAPPING Term Document IDs New Albany 2 New York 3 Atlanta 1
PUT blog/post/1 { "author_id": 1, "title": "...", "body": "..." }
PUT blog/post/2 { "author_id": 1, "title": "...", "body": "..." } PUT blog/post/3 { "author_id": 1, "title": "...", "body": "..." } 15 Relationships: Application-side joins PUT blog/author/1 { "name": "John Doe", "bio": "..." } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY 1 POST blog/post/_search { "query": { "match": { "author_id": <each id from query 1 result> } } } QUERY 2
PUT blog/post/1 { "author_name": "John Doe", "title": "...", "body": "..."
} PUT blog/post/2 { "author_name": "John Doe", "title": "...", "body": "..." } 16 Relationships: Data denormalization POST blog/post/_search { "query": { "match": { "author_name": "John" } } } QUERY PUT blog/post/3 { "author_name": "John Doe", "title": "...", "body": "..." }
17 Relationships: Nested objects PUT blog/author/1 { "name": "John Doe",
"bio": "...", "blog_posts": [ { "title": "...", "body": "..." }, { "title": "...", "body": "..." }, { "title": "...", "body": "..." } ] } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY
18 Relationships: Parent-child documents PUT blog/author/1 { "name": "John Doe",
"bio": "..." } POST blog/post/_search { "query": { "has_parent": { "type": "author", "query": { "match": { "name": "John" } } } QUERY PUT blog { "mappings": { "author": {}, "post": { "_parent": { "type": "author" } } } } PUT blog/post/1?parent=1 { "title": "...", "body": "..." } PUT blog/post/2?parent=1 { "title": "...", "body": "..." } PUT blog/post/3?parent=1 { "title": "...", "body": "..." }
19 Architecture https://www.flickr.com/photos/samhames/4422128094 https://www.flickr.com/photos/haribote/4871284379/
20 RDBMS Triggers database by Creative Stall from the Noun
Project 1 2
21 Async replication to Elasticsearch 1 2 3 ESSynchronizer flow
by Yamini Ahluwalia from the Noun Project
22 Async replication to Elasticsearch with Logstash 1 2 3
23 Forked writes from application 1 2
24 Forked writes from application (more robust) 1 2 queue
by Huu Nguyen from the Noun Project ESSynchronizer 3 4
25 Forked writes from application (more robust with Logstash) 1
2 3 4
26 Questions? @shaunak https://www.flickr.com/photos/nicknormal/2245559230/