Comparing Solr and elasticsearch
genta kaneyama
November 26, 2012
Programming
An introduction to elasticsearch!
atnd.org/events/33718
Transcript
Comparing elasticsearch and Solr
Genta Kaneyama (@penguinana_)
Monday, November 26, 12
About me
• Genta Kaneyama (@penguinana_)
• Recipe search team @ http://cookpad.com/
• Solr4.0
While considering a Solr version upgrade...
Shouldn't we take a look at Elasticsearch?
• Lucene-based
• HTTP API
• Distributed search: OK
• Japanese: OK
A sense of déjà vu
http://solr-vs-elasticsearch.com/
Impressions
• No shortcomings in features
• The API is developer-friendly
• Easy to learn
• Useful even outside large-scale distributed search
• If I were not already using Solr, I would want to run this in production!
I'll walk through everything using a sample.
http://blog.livedoor.jp/techblog/archives/65836960.html
livedoor Gourmet
• Restaurant data (214,000 restaurants)
• Name, cuisine served, address, latitude/longitude, access, distance from the nearest station, etc.
livedoor Gourmet
• Review data (205,000 reviews)
• Overall rating (5-point scale)
• Atmosphere, price, service, taste
• Review comments
https://github.com/penguinco/ld_gourmet_search
Using Elasticsearch
• Index one document, then search for it
• Define how Japanese is handled
• Define the schema
• Import
• Search
• Tune scoring and so on
PUT
    curl -XPUT http://localhost:9200/twitter/tweet/1 -d '
    {
      "user": "kimchy",
      "post_date": "2012-11-26T20:12:00",
      "message": "Trying out elasticsearch",
      "score": 5
    }
    '
URL path: index (twitter) / type (tweet) / id (1)
PUT
    curl -XPUT http://localhost:9200/twitter/user/kimchy -d '
    {
      "name" : "Shay Banon"
    }
    '
URL path: index (twitter) / type (user) / id (kimchy)
GET
    curl -XGET http://localhost:9200/twitter/tweet/1

    {
      "user": "kimchy",
      "post_date": "2012-11-26T20:12:00",
      "message": "Trying out elasticsearch",
      "score": 5
    }
URL path: index (twitter) / type (tweet) / id (1)
SEARCH
    curl -XGET http://localhost:9200/twitter/tweet/_search -d '{
      "query" : {
        "term" : { "user": "kimchy" }
      }
    }'

    {
      "user": "kimchy",
      "post_date": "2012-11-26T20:12:00",
      "message": "Trying out elasticsearch",
      "score": 5
    }
URL path: index (twitter) / type (tweet) / _search
REST API
• Add and delete documents
• Add and delete settings
• Everything can be done through the HTTP API
• Schema-free
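The document operations above all follow the same /index/type/id URL pattern shown in the curl slides. A minimal Python sketch of that pattern follows; the helper names (document_url, index_request, delete_request) are hypothetical, and the base URL assumes a local node as in the slides. It only builds request descriptions and sends nothing.

```python
import json

BASE_URL = "http://localhost:9200"  # assumed local node, as in the slides


def document_url(index, doc_type, doc_id, base_url=BASE_URL):
    """Build the /index/type/id URL used by the curl commands in the slides."""
    return "{}/{}/{}/{}".format(base_url, index, doc_type, doc_id)


def index_request(index, doc_type, doc_id, document):
    """Describe the PUT that adds (or replaces) one document."""
    return ("PUT", document_url(index, doc_type, doc_id), json.dumps(document))


def delete_request(index, doc_type, doc_id):
    """Describe the DELETE that removes one document."""
    return ("DELETE", document_url(index, doc_type, doc_id), None)


method, url, body = index_request("twitter", "user", "kimchy", {"name": "Shay Banon"})
print(method, url, body)
print(*delete_request("twitter", "user", "kimchy"))
```

Keeping the URL construction in one helper mirrors the point of the slide: adding and deleting differ only in the HTTP method.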
Japanese
    $ curl -XGET 'localhost:9200/_analyze?pretty' -d '神泉'
    {
      "tokens" : [ {
        "token" : "神",
        "start_offset" : 0,
        "end_offset" : 1,
        "type" : "<IDEOGRAPHIC>",
        "position" : 1
      }, {
        "token" : "泉",
        "start_offset" : 1,
        "end_offset" : 2,
        "type" : "<IDEOGRAPHIC>",
        "position" : 2
      } ]
    }
Japanese
Handled by changing the analyzer
kuromoji is available!
http://www.hirotakaster.com/archives/2012/11/elasticsearch-kuromoji-plugin.php
kuromoji
    $ cd elasticsearch
    $ bin/plugin -install elasticsearch/elasticsearch-analysis-kuromoji/1.0.0
    $ git clone git://github.com/elasticsearch/elasticsearch-analysis-kuromoji.git
    $ cd elasticsearch-analysis-kuromoji/
    $ mvn clean package
    $ cp target/elasticsearch-analysis-kuromoji-1.2.0-SNAPSHOT.jar ../plugins/analysis-kuromoji/elasticsearch-analysis-kuromoji-1.0.0.jar
    # restart elasticsearch
add analyzer
    $ curl -XPUT 'localhost:9200/test/' -d '
    {
      "index":{
        "analysis":{
          "tokenizer" : {
            "kuromoji" : {
              "type":"kuromoji_tokenizer",
              "mode":"search"
            }
          },
          "analyzer" : {
            "kuromoji_analyzer" : {
              "type" : "custom",
              "tokenizer" : "kuromoji_tokenizer"
            }
          }
        }
      }
    }
    '
kuromoji
    $ curl -XGET 'localhost:9200/test/_analyze?analyzer=kuromoji_analyzer&pretty' -d '神泉'
    {
      "tokens" : [ {
        "token" : "神泉",
        "start_offset" : 0,
        "end_offset" : 2,
        "type" : "word",
        "position" : 1
      } ]
    }
_analyze
    $ curl -XGET 'localhost:9200/test/_analyze?analyzer=kuromoji_analyzer&pretty' -d '関西国際空港'
    {
      "tokens" : [
        {"token" : "関西", ...},
        {"token" : "関西国際空港", ...},
        {"token" : "国際", ...},
        {"token" : "空港", ...}
      ]
    }
Making kuromoji the default
• Declare an analyzer under the name "default"
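The slide names the technique but shows no body for it. A hedged sketch of what such index settings might look like, written as a Python dict that serializes to the JSON body: it reuses the index/analysis layout and the kuromoji_tokenizer name from the "add analyzer" slide, and the function name default_kuromoji_settings is hypothetical.

```python
import json


def default_kuromoji_settings():
    """Index settings that register a kuromoji-based analyzer as "default".

    Assumption: an analyzer declared under the name "default" is applied to
    fields that do not specify an analyzer of their own.
    """
    return {
        "index": {
            "analysis": {
                "analyzer": {
                    "default": {
                        "type": "custom",
                        "tokenizer": "kuromoji_tokenizer",
                    }
                }
            }
        }
    }


# The JSON printed here would be the -d body of the index-creation PUT.
print(json.dumps(default_kuromoji_settings(), indent=2))
```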
Synonyms
• As with Solr, synonyms can be written in a file
• The WordNet format can be used as well
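A file-based synonym setup could be sketched as below, assuming the synonym token filter with its synonyms_path and format options. The file path analysis/synonym.txt, the analyzer name synonym_analyzer, and the function name are placeholders chosen for illustration, not values from the slides.

```python
import json


def synonym_analysis(synonyms_path, wordnet=False):
    """Build analysis settings with a file-backed synonym filter.

    synonyms_path is resolved by the server relative to its config directory;
    wordnet=True switches the expected file format to WordNet.
    """
    synonym_filter = {"type": "synonym", "synonyms_path": synonyms_path}
    if wordnet:
        synonym_filter["format"] = "wordnet"
    return {
        "index": {
            "analysis": {
                "filter": {"synonym": synonym_filter},
                "analyzer": {
                    # hypothetical analyzer name; tokenizer taken from the slides
                    "synonym_analyzer": {
                        "type": "custom",
                        "tokenizer": "kuromoji_tokenizer",
                        "filter": ["synonym"],
                    }
                },
            }
        }
    }


print(json.dumps(synonym_analysis("analysis/synonym.txt"), indent=2))
```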
analyzer
The worries about Japanese are mostly settled!
Schema definition
• Schema-free!
• The JSON types are adopted
• Fields can also be defined explicitly (mapping)
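To illustrate what "the JSON types are adopted" means in practice, here is a rough Python approximation of dynamic type detection. It is not the server's actual logic: the type names (string, long, double, boolean, date, object) and the single date pattern are assumptions based on the dynamic mapping behaviour of that era, and guess_field_type is a hypothetical helper.

```python
from datetime import datetime


def guess_field_type(value):
    """Approximate the field type a schema-free index might infer for a JSON value."""
    # bool must be tested before int because bool is a subclass of int in Python
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        try:
            # Assumed: ISO-like timestamps are detected as dates
            datetime.strptime(value, "%Y-%m-%dT%H:%M:%S")
            return "date"
        except ValueError:
            return "string"
    if isinstance(value, dict):
        return "object"
    raise TypeError("unsupported JSON value: {!r}".format(value))


tweet = {
    "user": "kimchy",
    "post_date": "2012-11-26T20:12:00",
    "message": "Trying out elasticsearch",
    "score": 5,
}
for field, value in tweet.items():
    print(field, guess_field_type(value))
```

An explicit mapping is the way to override any guess of this kind that turns out to be wrong.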
mapping example
    $ curl -XPUT 'http://localhost:9200/twitter/tweet/_mapping' -d '
    {
      "tweet" : {
        "properties" : {
          "message" : {"type" : "string", "store" : "yes"}
        }
      }
    }
    '
Differences from Solr
• Simpler than Solr's DynamicField
• type
• Designed for putting several kinds of documents into one core, which is convenient
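import (ruby)
    require 'csv'
    require 'tire'

    ratings = []
    # Read the header row so that columns can be accessed as row[:id]
    CSV.foreach("ratings.csv", :headers => true, :header_converters => :symbol) do |row|
      ratings << {
        :id => row[:id].to_i,
        :restaurant_id => row[:restaurant_id].to_i,
        :body => row[:body],
        :type => 'rating'
      }
    end

    # Bulk-import the collected documents into the index
    Tire.index 'livedoor_gourmet' do
      import ratings
    end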
Search
    curl -X GET 'http://localhost:9200/livedoor_gourmet/restaurant/_search?pretty' -d '
    {
      "query":{
        "query_string":{
          "query":"ラーメン"
        }
      },
      "sort":[{"access_count":"desc"}],
      "filter":{
        "term":{"closed":"0"}
      }
    }
    '
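The same query / sort / filter body can be assembled from application code. The sketch below builds it in Python and prepares, without sending, an HTTP request using only the standard library. The function names are hypothetical, the query text "ramen" is a placeholder, and POST is used instead of the GET shown in the curl command, on the assumption that the _search endpoint accepts both.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed local node, as in the slides


def search_body(text, sort_field="access_count", closed="0"):
    """Build the same query / sort / filter structure as the curl example."""
    return {
        "query": {"query_string": {"query": text}},
        "sort": [{sort_field: "desc"}],
        "filter": {"term": {"closed": closed}},
    }


def build_search_request(index, doc_type, body, base_url=BASE_URL):
    """Prepare (but do not send) the HTTP request for a _search call."""
    url = "{}/{}/{}/_search?pretty".format(base_url, index, doc_type)
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_search_request("livedoor_gourmet", "restaurant", search_body("ramen"))
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request).read()
```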
Differences from Solr
• The DSL is quite different
• filter, facet, grouping, and highlight are supported
• Scoring can be defined in a scripting language
Scoring
• Sorting by page views seemed to work well
• In practice this is quite often the case
Scoring
• Worth a look if you are interested
• Can be defined in a scripting language
• google: elasticsearch guide scoring
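As one illustration of script-defined scoring, the sketch below builds a custom_score query body. This assumes the custom_score query of the Elasticsearch releases current in 2012, which later versions replaced with other constructs. The script expression, the use of the access_count field, the "ramen" query text, and the function name are illustrative choices, not taken from the slides.

```python
import json


def custom_score_query(text, script):
    """Wrap a query_string query in a custom_score query with a scoring script."""
    return {
        "query": {
            "custom_score": {
                "query": {"query_string": {"query": text}},
                "script": script,
            }
        }
    }


# Hypothetical script: boost the text relevance score by the page-view count.
body = custom_score_query("ramen", "_score * doc['access_count'].value")
print(json.dumps(body, indent=2))
```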
Impressions
• No shortcomings in features
• The API is developer-friendly
• Easy to learn
• Useful even outside large-scale distributed search
API
config
Can be done with curl alone
→ The definitions can live on the application side
Adding a core
Can be done with curl alone
→ A single developer can do it end to end
Easy to learn
• Most operations can be completed with curl
• Much of the knowledge is shared with Solr
• Lucene queries can be used
• The query DSL is a bit of a hurdle…
Distributed search
Distributed search
• number_of_shards
• number_of_replicas
• replication
• async/sync
• write consistency (one, quorum, all)
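These options fall into two groups: index-level settings (number_of_shards, number_of_replicas) and per-write parameters (replication, consistency). A sketch of how they might be expressed follows, assuming the query-string parameter names of the 2012-era index API. The helper names and the base URL are hypothetical, and nothing is sent.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9200"  # assumed local node, as in the slides
REPLICATION_MODES = ("sync", "async")
CONSISTENCY_LEVELS = ("one", "quorum", "all")


def create_index_body(shards, replicas):
    """Settings body for creating an index with a given shard/replica layout."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


def write_url(index, doc_type, doc_id, replication="sync",
              consistency="quorum", base_url=BASE_URL):
    """URL for indexing one document with explicit replication and consistency."""
    if replication not in REPLICATION_MODES:
        raise ValueError("replication must be one of {}".format(REPLICATION_MODES))
    if consistency not in CONSISTENCY_LEVELS:
        raise ValueError("consistency must be one of {}".format(CONSISTENCY_LEVELS))
    params = urlencode([("replication", replication), ("consistency", consistency)])
    return "{}/{}/{}/{}?{}".format(base_url, index, doc_type, doc_id, params)


print(create_index_body(5, 1))
print(write_url("twitter", "tweet", 1, replication="async", consistency="one"))
```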
multi-tenant
• open/close index
• write I/O throttling
• merge policy control
• shard allocation
• number_of_replicas per index
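Two of these per-index controls map onto simple HTTP calls: opening or closing an index, and changing number_of_replicas for a single index. The sketch below only describes those requests as (method, path, body) tuples, assuming the _open, _close, and _settings endpoints. The function names and the index name used in the example are hypothetical.

```python
def close_index(index):
    """Request that closes an index so it stops serving reads and writes."""
    return ("POST", "/{}/_close".format(index), None)


def open_index(index):
    """Request that reopens a previously closed index."""
    return ("POST", "/{}/_open".format(index), None)


def set_replicas(index, replicas):
    """Request that changes number_of_replicas for one index only."""
    body = {"index": {"number_of_replicas": replicas}}
    return ("PUT", "/{}/_settings".format(index), body)


# Example: take one tenant's index offline, and scale replicas for another.
for request in (close_index("tenant_a"), open_index("tenant_a"), set_replicas("tenant_b", 2)):
    print(request)
```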
plugin
plugin
    $ bin/plugin -install Aconex/elasticsearch-head
Performance
• Many case studies can be found
• foursquare, soundcloud, bugsense, etc.
• There is no query cache
• Cache with nginx, varnish, or similar
Summary
• If you need distributed search, use elasticsearch
• There are many advantages even without distributed search
• It may well see more use going forward
see also...
• http://www.elasticsearch.org/
• http://www.elasticsearch.org/guide/
• http://solr-vs-elasticsearch.com/
• github.com/elasticsearch
• http://blog.sematext.com/
• #elasticsearch