Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
You're not using Elasticsearch?
Search
Timon Vonk
March 13, 2014
Technology
2
280
You're not using Elasticsearch?
An introduction to Elasticsearch 1.0 and how Tolq uses it for its translation memory
Timon Vonk
March 13, 2014
Tweet
Share
Other Decks in Technology
See All in Technology
Bet "Bet AI" - Accelerating Our AI Journey #BetAIDay
layerx
PRO
4
1.7k
Amazon Bedrock AgentCoreのフロントエンドを探す旅 (Next.js編)
kmiya84377
1
140
Amazon Qで2Dゲームを作成してみた
siromi
0
140
生成AI導入の効果を最大化する データ活用戦略
ham0215
0
160
AIに目を奪われすぎて、周りの困っている人間が見えなくなっていませんか?
cap120
1
640
家族の思い出を形にする 〜 1秒動画の生成を支えるインフラアーキテクチャ
ojima_h
3
1.1k
生成AIによるデータサイエンスの変革
taka_aki
0
3k
Infrastructure as Prompt実装記 〜Bedrock AgentCoreで作る自然言語インフラエージェント〜
yusukeshimizu
1
120
o11yツールを乗り換えた話
tak0x00
2
1.4k
Segment Anything Modelの最新動向:SAM2とその発展系
tenten0727
0
770
마라톤 끝의 단거리 스퍼트: 2025년의 AI
inureyes
PRO
1
750
【OptimizationNight】数理最適化のラストワンマイルとしてのUIUX
brainpadpr
2
480
Featured
See All Featured
Writing Fast Ruby
sferik
628
62k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
46
7.6k
Java REST API Framework Comparison - PWX 2021
mraible
33
8.8k
Product Roadmaps are Hard
iamctodd
PRO
54
11k
How GitHub (no longer) Works
holman
314
140k
Producing Creativity
orderedlist
PRO
347
40k
個人開発の失敗を避けるイケてる考え方 / tips for indie hackers
panda_program
110
19k
Mobile First: as difficult as doing things right
swwweet
223
9.9k
Six Lessons from altMBA
skipperchong
28
3.9k
How STYLIGHT went responsive
nonsquared
100
5.7k
Making the Leap to Tech Lead
cromwellryan
134
9.5k
Navigating Team Friction
lara
188
15k
Transcript
You’re not using ElasticSearch? Timon Vonk (@timonvonk)
About me • CTO @ Tolq, zero effort website translations
• Freelance hacker and consultant
None
The battlefield • Ferret • Xapian • Solr • Postgresql’s
TSVectors • Sphinx • Many more
What is ElasticSearch? • Search engine • Cloud in mind
• JSON API • Scriptable • Nosql • Great for things different than search
Why use it? • Cloud setup out of the box
• Fast indexing • Easy API • On the fly mappings • Very customisable
OMG WHAT IS LUCENE • Java library for search •
Only handles the search bit • Terms based vector algorithm
Making a query • Just send JSON: GET localhost:9200/example/peanuts/_search {
‘query’: { text: { ‘my_field’: ’many search terms’ }}} { took: 5, timed_out: false, _shards: { total: 5, successful: 5, failed: 0 }, hits: [ { _index: “example”, _type: “peanuts”, _score: 0.9, _source: { …data } } ] } }
Other types of queries • Terms, full text, boolean, fuzzy,
geolocation and lots more variants • Filters, aggregations, percolation, suggestions
Analysing • Pre-index and pre-search • This is when scoring
happens • Remove stop words, stemming, other normalisations • You can create your own analysers
Aggregations { “query”: … } { “aggregations”: { “rubyist_stats”: {
“stats”: { “field”: “meetup_visits” } } } } Aggregations are an upgrade over < 1.0 facets
Different kinds • min, max, avg, sum • stats, extended_stats
(all of the above + stdev/ mean) • percentile • counts • … all scriptable!
Also buckets! • Create subsets based on conditions • Nest
aggregations • … aaand scriptable
(Ruby) libraries • Good old Tire deprecated • Stretcher! •
elasticsearch-ruby • Also libraries for Go, Node, Javascript, etc • … it’s just json
How Tolq uses ElasticSearch • Suggest better translations for translators
• Fast access to text and translations for general search
Other fun stuff • Kibana • Hadoop • Hosted, autoscaling
providers
Also, it works great for text search! Thanks! We’ve launched!
http://www.tolq.com And hiring!