Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Elasticsearch for SQL Users
Search
Elastic Co
March 30, 2016
Programming
1
510
Elasticsearch for SQL Users
As given at Code PaLOUsa 2016
Elastic Co
March 30, 2016
Tweet
Share
More Decks by Elastic Co
See All by Elastic Co
Les Vendredis noirs : même pas peur ! - Breizhcamp
elastic
15
780
Confoo Montreal: Ingest node: enriching documents within Elasticsearch
elastic
16
810
Elastic{ON} 2018 - Sipping from the Firehose: Scalable Endpoint Data for Incident Response
elastic
6
4.2k
Elastic{ON} 2018 - A Security Analytics Platform for Today
elastic
3
11k
Elastic{ON} 2018 - The State of Geo in Elasticsearch
elastic
7
12k
Elastic{ON} 2018 - Reliable by design - Applying formal methods to distributed systems
elastic
5
4.7k
Elastic{ON} 2018 - Bigger, Faster, Stronger - Leveling Up Enterprise Logging
elastic
1
4.9k
Elastic{ON} 2018: Latest in Logstash
elastic
1
4.5k
Elastic{ON} 2018 - Lessons Learned from Workday's Search Application Journey from POC to Production
elastic
2
2.4k
Other Decks in Programming
See All in Programming
Fast JSX: Don't clone props object #28768
yossydev
1
100
Java 22 Overview
kishida
1
180
Ruby GitHub Packages
bkuhlmann
0
630
PHP8.3の機能を振り返る / Review of PHP 8.3 features
seike460
PRO
1
110
デフォルトにして至高、RubyMineの大好きな所
ruzia
0
380
Git Rebase
bkuhlmann
11
1.6k
スキーマ駆動開発による品質とスピードの両立 - 私達は何故、スキーマを書くのか
kentaroutakeda
0
170
大規模Reactアプリのリアーキテクチャ~8万行のTanStack Query移行の軌跡~
kj455
4
960
ONE WEDGE_company_guide
1wedge_one
0
480
GitHub Actionsで泣かないためにやっておきたい設定 / Recommended GHA settings to avoid crying
pinkumohikan
3
530
Snowflakeで眠ったデータを起こそう!
estie
0
120
VS Code をプロダクトにどう取り込むか
onomax
1
360
Featured
See All Featured
Building Applications with DynamoDB
mza
88
5.6k
RailsConf 2023
tenderlove
4
540
Refactoring Trust on Your Teams (GOTO; Chicago 2020)
rmw
25
2.3k
Easily Structure & Communicate Ideas using Wireframe
afnizarnur
187
16k
Side Projects
sachag
451
41k
Exploring the Power of Turbo Streams & Action Cable | RailsConf2023
kevinliebholz
2
3.4k
For a Future-Friendly Web
brad_frost
172
9k
Why You Should Never Use an ORM
jnunemaker
PRO
51
8.6k
The Cult of Friendly URLs
andyhume
74
5.7k
Bash Introduction
62gerente
604
210k
Mobile First: as difficult as doing things right
swwweet
216
8.6k
Building Effective Engineering Teams - LeadDev
addyosmani
28
1.8k
Transcript
1 Shaunak Kashyap Developer Advocate at Elastic @shaunak Elasticsearch for
SQL users
The Elastic Stack 2 Store, Index & Analyze User Interface
Plugins Ingest Hosted Service
3 Agenda Search queries Data modeling Architecture 1 2 3
2 4 Agenda Search queries Data modeling Architecture 1 3
5 Agenda Search queries Data modeling 1 2 3 Architecture
6 Search Queries https://www.flickr.com/photos/samhames/4422128094
7 CREATE TABLE IF NOT EXISTS emails ( sender VARCHAR(255)
NOT NULL, recipients TEXT, cc TEXT, bcc TEXT, subject VARCHAR(1024), body MEDIUMTEXT, datetime DATETIME ); CREATE INDEX emails_sender ON emails(sender); CREATE FULLTEXT INDEX emails_subject ON emails(subject); CREATE FULLTEXT INDEX emails_body ON emails(body); curl -XPOST 'http://localhost:9200/enron' -d' { "mappings": { "email": { "properties": { "sender": { "type": "string", "index": "not_analyzed" }, "recipients": { "type": "string", "index": "not_analyzed" }, "cc": { "type": "string", "index": "not_analyzed" }, "bcc": { "type": "string", "index": "not_analyzed" }, "subject": { "type": "string", "analyzer": "english" }, "body": { "type": "string", "analyzer": "english" } } } } Schemas
8 Loading the data
9 [LIVE DEMO] • Search for text in a single
field • Search for text in multiple fields • Search for a phrase https://github.com/ycombinator/es-enron
10 Other Search Features Stemming Synonyms Did you mean? •
Jump, jumped, jumping • Queen, monarch • Monetery => Monetary
11 Data Modeling https://www.flickr.com/photos/samhames/4422128094 https://www.flickr.com/photos/ericparker/7854157310
12 To analyze or not to analyze? PUT cities/city/1 {
"city": "Louisville", "population": 597337 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } POST cities/_search { "query": { "match": { "city": "New Albany" } } } QUERY + = ?
13 To analyze or not to analyze? PUT cities/city/1 {
"city": "Louisville", "population": 597337 } PUT cities/city/2 { "city": "New Albany", "population": 8829 } PUT cities/city/3 { "city": "New York", "population": 8406000 } Term Document IDs Albany 2 New 2,3 Louisville 1 York 3
14 To analyze or not to analyze? PUT cities {
"mappings": { "city": { "properties": { "city": { "type": "string", "index": "not_analyzed" } } } } } MAPPING Term Document IDs New Albany 2 New York 3 Louisville 1
PUT blog/post/1 { "author_id": 1, "title": "...", "body": "..." }
PUT blog/post/2 { "author_id": 1, "title": "...", "body": "..." } PUT blog/post/3 { "author_id": 1, "title": "...", "body": "..." } 15 Relationships: Application-side joins PUT blog/author/1 { "name": "John Doe", "bio": "..." } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY 1 POST blog/post/_search { "query": { "match": { "author_id": <each id from query 1 result> } } } QUERY 2
PUT blog/post/1 { "author_name": "John Doe", "title": "...", "body": "..."
} PUT blog/post/2 { "author_name": "John Doe", "title": "...", "body": "..." } 16 Relationships: Data denormalization POST blog/post/_search { "query": { "match": { "author_name": "John" } } } QUERY PUT blog/post/3 { "author_name": "John Doe", "title": "...", "body": "..." }
17 Relationships: Nested objects PUT blog/author/1 { "name": "John Doe",
"bio": "...", "blog_posts": [ { "title": "...", "body": "..." }, { "title": "...", "body": "..." }, { "title": "...", "body": "..." } ] } POST blog/author/_search { "query": { "match": { "name": "John" } } } QUERY
18 Relationships: Parent-child documents PUT blog/author/1 { "name": "John Doe",
"bio": "..." } POST blog/post/_search { "query": { "has_parent": { "type": "author", "query": { "match": { "name": "John" } } } QUERY PUT blog { "mappings": { "author": {}, "post": { "_parent": { "type": "author" } } } } PUT blog/post/1?parent=1 { "title": "...", "body": "..." } PUT blog/post/2?parent=1 { "title": "...", "body": "..." } PUT blog/post/3?parent=1 { "title": "...", "body": "..." }
19 Architecture https://www.flickr.com/photos/samhames/4422128094 https://www.flickr.com/photos/haribote/4871284379/
20 RDBMS Triggers database by Creative Stall from the Noun
Project 1 2
21 Async replication to Elasticsearch 1 2 3 ESSynchronizer flow
by Yamini Ahluwalia from the Noun Project
22 Async replication to Elasticsearch with Logstash 1 2 3
23 Forked writes from application 1 2
24 Forked writes from application (more robust) 1 2 queue
by Huu Nguyen from the Noun Project ESSynchronizer 3 4
25 Forked writes from application (more robust with Logstash) 1
2 3 4
26 Questions? @shaunak https://www.flickr.com/photos/nicknormal/2245559230/