Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
solrとelasticsearchの比較
Search
genta kaneyama
November 26, 2012
Programming
15
5.6k
solrとelasticsearchの比較
elasticsearchの紹介です!
atnd.org/events/33718
genta kaneyama
November 26, 2012
Tweet
Share
More Decks by genta kaneyama
See All by genta kaneyama
MOSHでの生成AI活用の取り組み
penguinco
0
180
search and community in cookpad 2019
penguinco
2
1.9k
行動ログでプロダクトを改善するには/exploit user behavior for product
penguinco
4
9.2k
Solr @ CROSS2015 C4
penguinco
1
1.4k
how to improve search
penguinco
8
2k
Other Decks in Programming
See All in Programming
After go func(): Goroutines Through a Beginner’s Eye
97vaibhav
0
230
Serena MCPのすすめ
wadakatu
4
900
育てるアーキテクチャ:戦い抜くPythonマイクロサービスの設計と進化戦略
fujidomoe
1
150
ててべんす独演会〜Flowの全てを語ります〜
tbsten
1
220
階層構造を表現するデータ構造とリファクタリング 〜1年で10倍成長したプロダクトの変化と課題〜
yuhisatoxxx
3
920
CSC305 Lecture 03
javiergs
PRO
0
230
大規模アプリのDIフレームワーク刷新戦略 ~過去最大規模の並行開発を止めずにアプリ全体に導入するまで~
mot_techtalk
0
380
2分台で1500examples完走!爆速CIを支える環境構築術 - Kaigi on Rails 2025
falcon8823
3
3.2k
AIで開発生産性を上げる個人とチームの取り組み
taniigo
0
130
Чего вы не знали о строках в Python – Василий Рябов, PythoNN
sobolevn
0
160
SpecKitでどこまでできる? コストはどれくらい?
leveragestech
0
520
いま中途半端なSwift 6対応をするより、Default ActorやApproachable Concurrencyを有効にしてからでいいんじゃない?
yimajo
2
340
Featured
See All Featured
Java REST API Framework Comparison - PWX 2021
mraible
33
8.8k
The Cost Of JavaScript in 2023
addyosmani
53
9k
Building Applications with DynamoDB
mza
96
6.6k
Improving Core Web Vitals using Speculation Rules API
sergeychernyshev
19
1.2k
The MySQL Ecosystem @ GitHub 2015
samlambert
251
13k
What's in a price? How to price your products and services
michaelherold
246
12k
The Cult of Friendly URLs
andyhume
79
6.6k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
507
140k
Testing 201, or: Great Expectations
jmmastey
45
7.7k
Bash Introduction
62gerente
615
210k
The Web Performance Landscape in 2024 [PerfNow 2024]
tammyeverts
9
850
The Psychology of Web Performance [Beyond Tellerrand 2023]
tammyeverts
49
3.1k
Transcript
elasticsearchͱSolrͷൺֱ ݉ࢁ ݩଠ @penguinana_ Monday, November 26, 12
ࣗݾհ • ݉ࢁ ݩଠ @penguinana_ • ϨγϐݕࡧνʔϜ @ http://cookpad.com/ •
Solr4.0 Monday, November 26, 12
SolrͷόʔδϣϯΞοϓΛ ݕ౼͍ͯ͠Δͱ͖... Monday, November 26, 12
Elasticsearch ௐͨ΄͏͕͍͍ͷͰʁ Monday, November 26, 12
• Luceneϕʔε • HTTP API • ࢄݕࡧOK • ຊޠOK Monday,
November 26, 12
• Luceneϕʔε • HTTP API • ࢄݕࡧOK • ຊޠOK طࢹײ
Monday, November 26, 12
http://solr-vs-elasticsearch.com/ Monday, November 26, 12
ײ • ػೳ໘Ͱෆͳ͍ • API͕։ൃऀʹ͍͞͠ • ༰қʹशಘͰ͖Δ • େنࢄݕࡧҎ֎Ͱ༗༻ •
SolrΛͬͯͳ͚Εͬͪ͜Λຊ൪ʹ͍ͨ ͍ʂ Monday, November 26, 12
αϯϓϧΛͬͯ ͻͱ௨Γઆ໌͠·͢ Monday, November 26, 12
http://blog.livedoor.jp/techblog/archives/65836960.html Monday, November 26, 12
livedoorάϧϝ • Ϩετϥϯใ(21.4ສళ) • ళ໊ɺѻ͍ͬͯΔྉཧɺॅॴɺҢ ܦɺΞΫηεɺ࠷دΓฑߦ͖͔ Βͷڑɺetc... Monday, November 26,
12
livedoorάϧϝ • ϨϏϡʔใ(20.5ສϨϏϡʔ) • ૯߹ධՁʢ5ஈ֊ʣ • งғؾɺஈɺαʔϏεɺຯ • ϨϏϡʔίϝϯτ Monday,
November 26, 12
https://github.com/penguinco/ld_gourmet_search Monday, November 26, 12
ElasticsearchΛ͏ • 1݅ొͯ͠ɺ1݅ݕࡧ • ຊޠͷѻ͍Λఆٛ • εΩʔϚఆٛ • औΓࠐΈ •
ݕࡧ • είΞϦϯάͳͲͷௐ Monday, November 26, 12
PUT curl -XPUT http://localhost:9200/twitter/tweet/1 -d ' { "user": "kimchy", "post_date":
"2012-11-26T20:12:00", "message": "Trying out elasticsearch", "score": 5 } ' index type id Monday, November 26, 12
PUT curl -XPUT http://localhost:9200/twitter/user/kimchy -d ' { "name" : "Shay
Banon" } ' index type id Monday, November 26, 12
GET curl -XGET http://localhost:9200/twitter/tweet/1 { "user": "kimchy", "post_date": "2012-11-26T20:12:00", "message":
"Trying out elasticsearch", "score": 5 } } index type id Monday, November 26, 12
SEARCH curl -XGET http://localhost:9200/twitter/tweet/_search -d '{ "query" : { "term"
: { "user": "kimchy" } } }' index type id { "user": "kimchy", "post_date": "2012-11-26T20:12:00", "message": "Trying out elasticsearch", "score": 5 } Monday, November 26, 12
REST API • υΩϡϝϯτͷՃɾআ • ઃఆͷՃɾআ • શ෦HTTP APIͰͰ͖Δ •
εΩʔϚϑϦʔ Monday, November 26, 12
ຊޠ $ curl -XGET 'localhost:9200/_analyze?pretty' -d 'ਆઘ' { "tokens" :
[ { "token" : "ਆ", "start_offset" : 0, "end_offset" : 1, "type" : "<IDEOGRAPHIC>", "position" : 1 }, { "token" : "ઘ", "start_offset" : 1, "end_offset" : 2, "type" : "<IDEOGRAPHIC>", "position" : 2 } ] } Monday, November 26, 12
ຊޠ AnalyzerΛมߋ͢Δ͜ͱͰରԠ kuromoji͕͑·͢ʂ http://www.hirotakaster.com/archives/2012/11/ elasticsearch-kuromoji-plugin.php Monday, November 26, 12
kuromoji $ cd elasticsearch $ bin/plugin -install elasticsearch/elasticsearch-analysis-kuromoji/1.0.0 $ git
clone git://github.com/elasticsearch/elasticsearch-analysis- kuromoji.git $ cd elasticsearch-analysis-kuromoji/ $ mvn clean package $ cp target/elasticsearch-analysis-kuromoji-1.2.0-SNAPSHOT.jar ../plugins/ analysis-kuromoji/elasticsearch-analysis-kuromoji-1.0.0.jar # restart elasticsearch Monday, November 26, 12
add analyzer $ curl -XPUT 'localhost:9200/test/' -d ' { "index":{
"analysis":{ "tokenizer" : { "kuromoji" : { "type":"kuromoji_tokenizer", "mode":"search" } }, "analyzer" : { "kuromoji_analyzer" : { "type" : "custom", "tokenizer" : "kuromoji_tokenizer" } } } } } ‘ Monday, November 26, 12
kuromoji $ curl -XGET 'localhost:9200/test/_analyze? analyzer=kuromoji_analyzer&pretty' -d 'ਆઘ' { "tokens"
: [ { "token" : "ਆઘ", "start_offset" : 0, "end_offset" : 2, "type" : "word", "position" : 1 } ] } Monday, November 26, 12
_analyze $ curl -XGET 'localhost:9200/test/_analyze? analyzer=kuromoji_analyzer&pretty' -d 'ؔࠃࡍۭߓ' { "tokens"
: [ {"token" : "ؔ",}, {"token" : "ؔࠃࡍۭߓ",}, {"token" : "ࠃࡍ",}, {"token" : "ۭߓ",} ] } Monday, November 26, 12
kuromojiΛσϑΥϧτʹ • default͍ͬͯ͏໊લͰanalyzerΛએݴ Monday, November 26, 12
ಉٛޠ • Solrಉ༷ಉٛޠ͕ϑΝΠϧͰॻ͚Δ • +WordNetܗࣜ͑Δ Monday, November 26, 12
analyzer Monday, November 26, 12
ຊޠͷ৺͋Δఔย͍ͨʂ Monday, November 26, 12
εΩʔϚఆٛ • εΩʔϚϑϦʔʂ • JSONͷܕ͕࠾༻͞ΕΔ • ڧ੍తʹఆٛͰ͖Δ(mapping) Monday, November 26,
12
mappingྫ $ curl -XPUT 'http://localhost:9200/twitter/tweet/ _mapping' -d ' { "tweet"
: { "properties" : { "message" : {"type" : "string", "store" : "yes"} } } } ' Monday, November 26, 12
Solrͱͷࠩ • SolrͷDynamicFieldΑΓ؆୯ • type • 1ίΞʹෳछྨͷdocΛೖΕΔ͜ ͱΛఆͯ͋ͬͯ͠ศར Monday, November
26, 12
import(ruby) ratings = [] CSV.foreach("ratings.csv") do |row| ratings << {
:id => row[:id].to_i, :restaurant_id => row[:restaurant_id].to_i, :body => row[:body], :type => 'rating' } end Tire.index 'livedoor_gourmet' do import ratings end Monday, November 26, 12
ݕࡧ curl -X GET 'http://localhost:9200/livedoor_gourmet/ restaurant/_search?pretty' -d ' { "query":{
"query_string":{ "query":"ϥʔϝϯ" } }, "sort":[{"access_count":"desc"}], "filter":{ "term":{"closed":"0"} } } ' Monday, November 26, 12
Solrͱͷࠩ • DSL͕݁ߏҧ͏ • filter, facet, grouping, highlightαϙʔτ • είΞϦϯάεΫϦϓτݴޠͰఆٛ
Ͱ͖Δ Monday, November 26, 12
είΞϦϯά • PVॱͰฒͨΒ͏·͍ͬͨ͆͘ • ݱ࣮ͷ݁ߏ͜͏͍͏͜ͱଟ͍ Monday, November 26, 12
είΞϦϯά • ڵຯͷ͋Δํͥͻ • εΫϦϓτݴޠͰఆٛͰ͖Δ • google: elasticsearch guide scoring
Monday, November 26, 12
ײ • ػೳ໘Ͱෆͳ͍ • API͕։ൃऀʹ͍͞͠ • ༰қʹशಘͰ͖Δ • େنࢄݕࡧҎ֎Ͱ༗༻ Monday,
November 26, 12
API Monday, November 26, 12
config curl͚ͩͰͰ͖Δ →ΞϓϦέʔγϣϯʹఆٛΛஔ͚Δ Monday, November 26, 12
ίΞՃ curl͚ͩͰͰ͖Δ →։ൃऀͻͱΓͰ݁Ͱ͖Δ Monday, November 26, 12
༰қʹशಘͰ͖Δ • ΄ͱΜͲͷૢ࡞curlͰ݁ • Solrͱڞ௨ͷࣝଟ͍ • luceneͷΫΤϦ͕͑Δ • qury DSLͪΐͬͱোน…
Monday, November 26, 12
ࢄݕࡧ Monday, November 26, 12
ࢄݕࡧ • number_of_shards • number_of_replicas • replication • async/sync •
write consistency(one, quorum, all) Monday, November 26, 12
multi-tenant • open/close index • write I/O throttling • merge
policy control • shard allocation • number_of_replicas per index Monday, November 26, 12
plugin Monday, November 26, 12
plugin $ bin/plugin -install Aconex/elasticsearch-head Monday, November 26, 12
ύϑΥʔϚϯε • ࣄྫଟ͘ݟ͔ͭΔ • foursquare, soundcloud, bugsense ...etc • ΫΤϦΩϟογϡ͕ͳ͍
• nginx, varnishͳͲͰΩϟογϡ͢Δ Monday, November 26, 12
·ͱΊ • ࢄݕࡧΛ͏ͳΒelasticsearch • ࢄݕࡧΛΘͳͯ͘ར͕ଟ͍ • ࠓޙΘΕΔػձ͕͋Δ͔ Monday, November 26,
12
see also... • http://www.elasticsearch.org/ • http://www.elasticsearch.org/guide/ • http://solr-vs-elasticsearch.com/ • github.com/elasticsearch
• http://blog.sematext.com/ • #elasticsearch Monday, November 26, 12