IVRy Engineers' Year-End Lightning Talk Meetup 2024: The Front Line of LLM Monitoring
Hiroyuki Moriya
December 11, 2024
Transcript
Slide 1 (confidential): The Front Line of LLM Monitoring. IVRy Engineer LT Meetup, 2024/12/11, Moriya Hiroyuki
Slides 2-6: My name is Kudo Shinichi, high school detective.
After I started working as an AI engineer, I witnessed a startup making a lot of money with LLMs.
"A little bit of development and I can rake it in!" I realized, so I founded a company and set about delivering one PoC project after another to clients.
I failed to notice the rate limits creeping up behind me, and the worsening latency.
Before I knew it, customers were cancelling my product one after another, and the company had gone bankrupt...
Slide 8: Today I will talk about what Kudo Shinichi can do to avoid ending up like this.
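Slide 9: Self-introduction. Moriya Hiroyuki, AI engineer. Joined IVRy in 2024/08. Previously worked as an SWE and machine learning engineer, among other roles. Joined IVRy because it looked like a service where LLMs would become the core.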
Slide 10: AI dialogue using LLMs at IVRy. End users and the LLM interact in real time over WebSocket.
Slides 11-12: LLM Fallback. We built a fallback mechanism on the premise of using multiple LLMs. Requests are routed based on API status, rate limit and data constraints ([…] constraints).
"Then all we have to do is monitor it!"
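The routing idea on the fallback slide can be sketched in a few lines of Python. This is only an illustration under assumptions, not IVRy's implementation: the `Provider` class, the `ProviderError` exception, the per-minute request budget, and the `healthy` flag are all hypothetical stand-ins for "API status" and "rate limit" as named on the slide.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


class ProviderError(Exception):
    """Hypothetical error raised when a provider's API call fails."""


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]     # sends a prompt, returns the completion
    max_requests_per_minute: int   # assumed rate limit for this provider
    healthy: bool = True           # assumed outcome of an API status check
    window_start: float = 0.0
    used_in_window: int = 0

    def has_capacity(self, now: float) -> bool:
        # Start a fresh one-minute window once the previous one has elapsed.
        if now - self.window_start >= 60.0:
            self.window_start = now
            self.used_in_window = 0
        return self.used_in_window < self.max_requests_per_minute


def complete_with_fallback(
    prompt: str, providers: List[Provider], now: Optional[float] = None
) -> Tuple[str, str]:
    """Try providers in priority order; skip unhealthy or rate-limited ones."""
    now = time.monotonic() if now is None else now
    for provider in providers:
        if not provider.healthy or not provider.has_capacity(now):
            continue
        provider.used_in_window += 1
        try:
            return provider.name, provider.call(prompt)
        except ProviderError:
            # Mark the provider as down and fall through to the next one.
            provider.healthy = False
    raise RuntimeError("all providers are unavailable")
```

The function returns the name of the provider that actually answered alongside the completion, which is the piece of information any monitoring layered on top would need.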
Slide 13: Method 1: Datadog LLM Observability. A feature specialized for LLM monitoring that Datadog is actively developing. It can capture latency, tokens, prompts and more.
Slide 14: "Now we can monitor OpenAI's latency!"
Slides 15-16: "Huh? That's odd..."
"Our product implements a fallback mechanism, yet we can only monitor OpenAI's latency."
Slide 17: Method 2: OpenLIT (OpenTelemetry). A tool specialized for LLM monitoring that follows the OpenTelemetry standard. It can monitor a variety of LLMs.
Slide 18: "Now we can monitor the latency of all sorts of models!"
Slides 19-20: "Huh? That's odd..."
"We use many different models, so we need to measure latency per provider, yet we only get the latency of LiteLLM as a whole."
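To make the gap concrete: timing the whole multi-provider call yields one number, whereas timing each provider call separately yields one series per provider. The following is a minimal sketch of the second approach, assuming a hypothetical `LatencyRecorder` helper; it does not use the OpenLIT or LiteLLM APIs, and the provider names are placeholders.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class LatencyRecorder:
    """Hypothetical helper that stores latency samples per provider."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock                  # injectable for testing
        self.samples_ms = defaultdict(list)  # provider name -> latencies in ms

    @contextmanager
    def track(self, provider: str):
        start = self._clock()
        try:
            yield
        finally:
            # Record the sample even if the wrapped call raised, so that
            # slow failures are visible as well as slow successes.
            elapsed_ms = (self._clock() - start) * 1000.0
            self.samples_ms[provider].append(elapsed_ms)


if __name__ == "__main__":
    recorder = LatencyRecorder()
    with recorder.track("openai"):
        time.sleep(0.01)   # stands in for a real API call
    with recorder.track("gemini"):
        time.sleep(0.02)
    for name, samples in recorder.samples_ms.items():
        print(name, [round(s, 1) for s in samples])
```

Wrapping the call site of each provider inside the fallback loop, rather than the loop itself, is what separates per-provider latency from end-to-end latency.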
Slide 21: Method 3: Datadog Inferred services. A mechanism built into Datadog that monitors requests going to services outside the app.
Slide 22: "Now we can monitor every model we use through LiteLLM!"
Slides 23-24: "Huh? That's odd..."
"We don't necessarily use just one model each on Gemini and OpenAI, yet we can't get the latency of individual models."
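What the speaker is asking for here amounts to keying latency by model as well as by provider. As a rough sketch, assuming latency samples are already available as `(provider, model, latency_ms)` tuples (a hypothetical format, with placeholder model names), a per-model 95th percentile could be computed like this:

```python
import math
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Sample = Tuple[str, str, float]  # (provider, model, latency in ms)


def p95_by_model(samples: Iterable[Sample]) -> Dict[Tuple[str, str], float]:
    """Group samples by (provider, model) and return the p95 of each group."""
    grouped: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for provider, model, latency_ms in samples:
        grouped[(provider, model)].append(latency_ms)

    result = {}
    for key, values in grouped.items():
        values.sort()
        # Nearest-rank percentile: the smallest value with at least
        # 95% of the samples at or below it (1-based rank).
        rank = math.ceil(0.95 * len(values))
        result[key] = values[rank - 1]
    return result


if __name__ == "__main__":
    data = [
        ("openai", "model-a", 100.0),
        ("openai", "model-a", 300.0),
        ("openai", "model-b", 50.0),
        ("gemini", "model-c", 80.0),
    ]
    for key, p95 in sorted(p95_by_model(data).items()):
        print(key, p95)
```

Grouping on the `(provider, model)` pair keeps two models from the same provider apart, which a host-level view of outbound requests cannot do.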
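Slide 25: Summary. LLM monitoring is still very much a developing field, and there is little established knowledge! In the process of mastering AI and LLMs and building them into products, we have to blaze the trail ourselves. Let's work on AI monitoring together!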