motemote-data-science-2
kur0cky
August 01, 2020
Technology
This is a joke talk, but it shows how to use Elasticsearch from R and introduces some of Elasticsearch's handy features (lol).
Transcript
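Hack "mote" (being popular with dates) with data: the dinner edition
at Tokyo.R, kur0cky

Self-introduction
• Twitter: @kur0cky_y
• Affiliation: hogehoge at […] (… months)
• Hobbies: music, movies, drinking, typing, etc.
• Worry: I want to be popular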
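Last time: the movie date edition
• To provide a one-to-one movie date experience, the goal was to estimate a person's movie reviews from natural conversation.
• To that end, I built a polarity dictionary specific to the movie domain and used it for recommendations.
https://speakerdeck.com/kur0cky/motemote-data-science-1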
Was it really enough to end the date with just the movie?
Reality tells the story.
What went wrong?
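According to a marriage-hunting (konkatsu) site:
"After the movie is a perfect opportunity to communicate: you can enjoy the afterglow of the film and learn more about your date through conversation about the movie as a shared topic. To make the most of this chance, plan a meal or tea after the movie."
This is riddled with bias, but let's take it at face value for now.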
A one-to-one movie alone is not enough. The conversation afterwards is what matters.
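Role-playing
• "That was fun! Shall we just pop in somewhere?"
  This is a gamble. A beginner has no idea what "somewhere" should be.
• "I booked a nice Italian place!"
  Too well prepared, which comes across as heavy, and she may not be hungry.
• "Let's talk over some food! Shall we walk a bit and look for a nice place?"
  Depends heavily on the date's personality and on the location. Walking too much is a no-go. Risky.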
An app that lets you search quickly on the spot is what matters.
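R and Elasticsearch
at Tokyo.R, kur0cky

• A full-text search system
• Features: fast; distributed and scalable; open source; REST API; flexible data structures via JSON; customizable scoring.
Diagram labels: visualization and analysis; search; data collection.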
Installation and startup
• On a Mac:
    brew tap elastic/tap
    brew install elastic/tap/elasticsearch-full
    brew install elastic/tap/kibana-full
    brew install elastic/tap/logstash-full
    brew install elastic/tap/metricbeat-full
    elasticsearch &
    kibana &
• If http://localhost:9200 returns node information, startup succeeded.
• Kibana's Console makes it easy to try things out (Kibana listens on http://localhost:5601 by default).
Other operating systems: https://www.elastic.co/guide/en/elastic-stack/current/installing-elastic-stack.html
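Basic terminology (RDB analogue in parentheses)
• Index (database): where Elasticsearch stores the data it searches and analyzes
• Document type (table): a grouping within an index
• Document (record): a unit of data stored in Elasticsearch
• Field (column): an attribute contained in a document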
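Simple usage
• Configuration, indexing (loading data), and search are all done with JSON over the REST API.

• Search
    GET index_name/_search
    { "query" : { "match" : { "comment" : "デート" } } }

• Indexing
    POST index_name/_doc
    { "name" : "ラーメン二郎", "genre" : "バー・ダイニング" }

• Configuration
    PUT index_name/
    { "settings" : { hogehoge }, "mappings" : { fugafuga } }
  settings: various options such as morphological analysis and full-width/half-width normalization.
  mappings: which fields the incoming data can have, and their types.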
Calling Elasticsearch from R
• Use the elastic package.
• It can index a data frame directly.
• I want to do everything in R!
• Queries are written as R lists and converted to JSON with jsonlite.
• Basic operations
    1. conn <- connect(host="127.0.0.1", port=9200)
    2. docs_bulk(conn, df, index)
    3. Search(conn, index, body = <query>)
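The three basic operations can be sketched as a single R function. This is a minimal sketch, assuming the elastic package (version 1.0 or later, where each function takes the connection as its first argument) and a node listening on 127.0.0.1:9200. The function name `index_and_search`, the index name `restaurants`, and the `comment` field are hypothetical and do not come from the deck.

```r
# Sketch only: index a data frame and run a match query with the elastic package.
# `index_and_search`, the index name, and the `comment` field are hypothetical.
index_and_search <- function(df, index = "restaurants", term = "デート",
                             host = "127.0.0.1", port = 9200) {
  if (!requireNamespace("elastic", quietly = TRUE)) {
    stop("Please install the 'elastic' package first.")
  }
  # 1. Connect to the running Elasticsearch node.
  conn <- elastic::connect(host = host, port = port)
  # 2. Bulk-index every row of the data frame as one document.
  elastic::docs_bulk(conn, df, index = index)
  # 3. Search: the query body is written as an R list.
  query <- list(query = list(match = list(comment = term)))
  elastic::Search(conn, index = index, body = query)
}
```

Newly indexed documents become searchable only after the index refreshes (about one second by default), so a search issued immediately after `docs_bulk()` may return fewer hits than expected.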
In practice
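Goal
• A movie date comes down to the post-movie debrief.
• The role-play shows that several requirements have to be met.
• It is important to decide quickly, according to the situation on the spot.
• As a rule, do not make her walk much.
• Go to as good a restaurant as possible.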
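What I did
• Data used
  • […] restaurants around Shibuya
  • Shop name, genre, popular reviews, and other shop information
Architecture diagram labels: scraping; search; geocoding API; current location lookup; search UI; map with leaflet; clean table with DT.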
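Meeting the requirements
• Searching Japanese text: preprocessing with the kuromoji plugin (tokenization, full-width/half-width normalization, stemming, etc.)
• Matching the query: use the default score (BM25)
• Not making her walk much: use a decay function based on proximity
• Going to a good restaurant: use the Tabelog score
https://www.elastic.co/guide/en/elasticsearch/plugins/current/analysis-kuromoji-analyzer.html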
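The preprocessing requirement can be illustrated with an index definition that assigns the built-in `kuromoji` analyzer to the text fields. This is a sketch under assumptions: the analysis-kuromoji plugin is installed, the server uses typeless mappings (Elasticsearch 7 style), and the field names and types are hypothetical, since the deck does not show its actual mappings.

```r
# Sketch only: an index body whose Japanese text fields use the kuromoji analyzer.
# All field names and types here are hypothetical.
kuromoji_index_body <- function() {
  text_ja <- list(type = "text", analyzer = "kuromoji")
  list(
    mappings = list(properties = list(
      name      = text_ja,
      comment   = text_ja,
      genre     = list(type = "keyword"),
      score     = list(type = "float"),
      latitude  = list(type = "float"),
      longitude = list(type = "float")
    ))
  )
}

# Create the index through the elastic package; the body is sent as a JSON string.
create_restaurant_index <- function(conn, index = "restaurants") {
  body <- jsonlite::toJSON(kuromoji_index_body(), auto_unbox = TRUE)
  elastic::index_create(conn, index = index, body = body)
}
```

Storing latitude and longitude as two numeric fields matches the deck's choice of applying a separate decay to each. A single `geo_point` field would be the alternative design.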
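Decay function based on distance
• This time, a Gaussian is applied to latitude and longitude separately.
• Exponential and linear decay functions are also available.
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html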
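To see what the Gaussian decay does numerically, the formula from the Elasticsearch function_score documentation can be reproduced in base R. This is an illustrative sketch, not code from the deck. `gauss_decay` is a hypothetical helper, and `offset = 0` and `decay = 0.5` are Elasticsearch's documented defaults.

```r
# Gaussian decay as documented for Elasticsearch's function_score query:
#   score  = exp(-max(0, |value - origin| - offset)^2 / (2 * sigma^2))
#   sigma^2 = -scale^2 / (2 * log(decay))
# `decay` is the score a document gets at a distance of exactly `scale`.
gauss_decay <- function(value, origin, scale, offset = 0, decay = 0.5) {
  sigma2 <- -scale^2 / (2 * log(decay))
  d <- pmax(0, abs(value - origin) - offset)
  exp(-d^2 / (2 * sigma2))
}

# Latitude origin and scale as used in the deck's query (around Shibuya).
at_origin <- gauss_decay(35.6591, origin = 35.6591, scale = 0.003)
one_scale <- gauss_decay(35.6621, origin = 35.6591, scale = 0.003)
print(c(at_origin = at_origin, one_scale = one_scale))
```

A document at the origin keeps a multiplier of 1, and one that is a full `scale` away is halved. With `scale = 0.003` degrees, that halving distance is roughly 330 m in the north-south direction.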
Combining scores with function_score

    {
      "query": {
        "function_score": {
          "query": { (ordinary query goes here) },
          "functions": [
            { "gauss": { "latitude":  { "origin": [35.6591],  "scale": [0.003] } } },
            { "gauss": { "longitude": { "origin": [139.7003], "scale": [0.003] } } },
            { "field_value_factor": { "field": ["score"], "factor": [3], "modifier": ["log"], "missing": [1] } }
          ],
          "score_mode": ["multiply"]
        }
      },
      "size": [1000],
      "_source": ["name", "score", "genre", "tel_number"]
    }

• First function: decay on latitude
• Second function: decay on longitude
• Third function: uses the value of the field itself
• The product of these becomes the final score
Sorry that the JSON is hard to read...
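Since the deck writes queries as R lists and converts them with jsonlite, the same query body can be sketched in R. By default `jsonlite::toJSON` serializes length-one vectors as one-element arrays such as `[35.6591]`, and `auto_unbox = TRUE` emits plain scalars instead. The helper name `build_query` and the inner `match` query on a `comment` field are hypothetical stand-ins for the "ordinary query".

```r
library(jsonlite)

# Hypothetical helper: build the function_score body as a plain R list.
build_query <- function(term, lat, lon, scale = 0.003) {
  list(
    query = list(function_score = list(
      query = list(match = list(comment = term)),  # placeholder base query
      functions = list(
        list(gauss = list(latitude  = list(origin = lat, scale = scale))),
        list(gauss = list(longitude = list(origin = lon, scale = scale))),
        list(field_value_factor = list(field = "score", factor = 3,
                                       modifier = "log", missing = 1))
      ),
      score_mode = "multiply"
    )),
    size = 1000,
    `_source` = c("name", "score", "genre", "tel_number")
  )
}

body <- build_query("デート", lat = 35.6591, lon = 139.7003)
# auto_unbox = TRUE turns length-one vectors into JSON scalars, not arrays.
json <- toJSON(body, auto_unbox = TRUE, pretty = TRUE)
cat(json, "\n")
```

`score_mode` controls how the three function values are combined with each other. How that product is then combined with the base query's relevance score is governed by `boost_mode`, which defaults to `multiply` when it is omitted, as it is here.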
Demo
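Summary
• To provide a one-to-one movie date experience, smoothly quantifying the other person's preferences alone was not enough.
• Under the hypothesis that "a movie date comes down to the post-movie debrief," I built an app that quickly searches for good restaurants.
• In particular, the scoring takes into account the distance from the current location, the Tabelog score, and the match with the query.
Elasticsearch is great! (Official documentation in Japanese...)
Enjoy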