datadog dash 2025 LLM observability for reliability and stability

IVRy's Moriya spoke at "DASH by Datadog," held on 2025/06/11.

■ Links
・IVRy corporate site: https://ivry.jp/company/
・IVRy careers page: https://ivry-jp.notion.site/
・IVRy event list: https://ivry.connpass.com/event/
・IVRy Tech X account: https://x.com/IVRy_tech

Transcript

  1. LLMs can do anything. Easily build amazing products. Perfect self-driving cars within a few years. Automate everything so humans don’t need to work anymore.
  2. LLMs can do anything. Easily build amazing products. Perfect self-driving cars within a few years. Automate everything so humans don’t need to work anymore. Not really.
  3. Speaker intro: Hiroyuki Moriya, AI engineer / SRE. About me: develop products that integrate LLM APIs; monitor & optimize LLM APIs for reliability and performance.
  4. Company info: IVRy inc. “Revolutionizing the telephone experience and boosting productivity for businesses.” Founded/HQ: 2019, Tokyo, Japan. Number of employees: 200+ people. Product: AI/LLM-based phone communication service. Reach: 30,000+ accounts & 40 million+ incoming calls in total.
  5. Phone calls are still important communication tools in Japan. Source: Rakuten Communications, “Survey on Call Handling at Small and Medium-Sized Businesses.”
  6. IVRy in action: We power phone communication with AI for businesses of all sizes. Medical appointments, restaurant reservations, hotel bookings, FAQ inquiries.
  7. Three key challenges for AI phone service. Challenge #1: Minimizing hallucinations. Challenge #2: Ensuring natural conversation pace. Challenge #3: Robust fault detection & recovery.
  8. Three solutions for AI phone service. Solution #1: Minimizing hallucinations. Solution #2: Ensuring natural conversation pace. Solution #3: Robust fault detection & recovery.
  9. Example AI workflow: Break down a task into multiple specialized AI components. → Better validation and error analysis, leading to more stable & reliable results.
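The idea of splitting one task into specialized components, each with its own validation, could be sketched as below. This is only an illustration under stated assumptions: the talk does not show IVRy's actual workflow, so `Step`, `run_workflow`, `classify_intent`, and `extract_party_size` are hypothetical names, and the two stub functions stand in for real LLM API calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    run: Callable[[str], str]        # stand-in for one specialized LLM call
    validate: Callable[[str], bool]  # cheap deterministic check on that call's output


def run_workflow(steps: list[Step], user_input: str) -> dict[str, str]:
    """Run every step on the caller's utterance; fail fast and name the step that broke."""
    results: dict[str, str] = {}
    for step in steps:
        output = step.run(user_input)
        if not step.validate(output):
            raise ValueError(f"step '{step.name}' produced invalid output: {output!r}")
        results[step.name] = output
    return results


# Hypothetical stubs: a real system would call an LLM API in each of these.
def classify_intent(text: str) -> str:
    return "reservation" if "table" in text.lower() else "faq"


def extract_party_size(text: str) -> str:
    digits = [token for token in text.split() if token.isdigit()]
    return digits[0] if digits else ""


STEPS = [
    Step("intent", classify_intent, lambda out: out in {"reservation", "faq"}),
    Step("party_size", extract_party_size, lambda out: out.isdigit()),
]

result = run_workflow(STEPS, "I would like a table for 4 people")
print(result)
```

Because each step is validated separately, a failure points at one small component instead of at a single large prompt, which is the "better validation and error analysis" benefit the slide describes.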
  10. Problem: Outputs from LLM APIs can change due to silent model updates ("Output has changed").
  11. Solution: Monitor LLM API consistency every day. (1) Test cases, (2) run consistency tests, (3) notify / record results.
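The three steps above (test cases, run consistency tests, notify / record results) might look roughly like the following sketch. It is not the speaker's implementation: `ConsistencyCase`, `run_consistency_tests`, `build_report`, and `fake_model` are hypothetical, `fake_model` replaces a real LLM API call, the normalized exact-match comparison is an assumed pass criterion, and `print` stands in for a real notification channel.

```python
import json
from dataclasses import dataclass
from typing import Callable


@dataclass
class ConsistencyCase:
    prompt: str
    expected: str


def normalize(text: str) -> str:
    """Ignore case and whitespace so only meaningful drift is flagged."""
    return " ".join(text.lower().split())


def run_consistency_tests(call_model: Callable[[str], str],
                          cases: list[ConsistencyCase]) -> list[dict]:
    """Step 2: send every prompt to the model and compare with the expected answer."""
    results = []
    for case in cases:
        actual = call_model(case.prompt)
        results.append({
            "prompt": case.prompt,
            "expected": case.expected,
            "actual": actual,
            "passed": normalize(actual) == normalize(case.expected),
        })
    return results


def build_report(results: list[dict]) -> dict:
    """Step 3: summarize the run so it can be recorded and alerted on."""
    failures = [r for r in results if not r["passed"]]
    return {"total": len(results), "failed": len(failures), "failures": failures}


# Hypothetical stand-in for an LLM API; the last answer simulates a silent model change.
def fake_model(prompt: str) -> str:
    canned = {
        "Classify: 'I want to book a table'": "reservation",
        "Classify: 'What are your opening hours?'": "FAQ ",
        "Classify: 'Cancel my booking'": "reservation",
    }
    return canned[prompt]


# Step 1: the fixed test cases that are replayed on every run.
CASES = [
    ConsistencyCase("Classify: 'I want to book a table'", "reservation"),
    ConsistencyCase("Classify: 'What are your opening hours?'", "faq"),
    ConsistencyCase("Classify: 'Cancel my booking'", "cancellation"),
]

report = build_report(run_consistency_tests(fake_model, CASES))
print(json.dumps(report, indent=2))  # record the run
if report["failed"]:
    print(f"ALERT: {report['failed']} of {report['total']} consistency checks failed")
```

Running this on a daily schedule and keeping each report makes it possible to see on which day an output first changed.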
  12. (no text on slide)
  13. (no text on slide)
  14. (no text on slide)

  15. Executing phone E2E tests after code merge: Merge code → Deploy latest code → Execute automated phone E2E tests → Monitor on Datadog LLM Observability.
  16. Summary: To minimize hallucinations, (1) Divide and conquer: divide one task into multiple, easier steps. (2) Trust, but verify: verify LLM API responses regularly.
  17. Three solutions for AI phone service. Solution #1: Minimizing hallucinations. Solution #2: Ensuring natural conversation pace. Solution #3: Robust fault detection & recovery.
  18. Stability & performance > latest models. We choose fast, proven models over cutting-edge but slow ones: better latency, fewer rate limits, lower cost. (Fast, stable, and cheap vs. slower, more $$$.)
  19. Summary: To ensure natural conversation pace, (1) Done is better than perfect: choose the model that aligns with your case. (2) See the forest for the trees: see the overall metrics for each client.
  20. Three solutions for AI phone service. Solution #1: Minimizing hallucinations. Solution #2: Ensuring natural conversation pace. Solution #3: Robust fault detection & recovery.
  21. LLM fallback strategy: Built a robust fallback system using multiple LLMs. It routes requests based on API statuses.
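A status-based fallback router of the kind described above could be sketched as follows. The slide gives no implementation details, so everything here is an assumption for illustration: the provider names, the `"ok"` / `"degraded"` status values, the `ProviderError` exception, and the extra rule that a failed call also falls through to the next provider.

```python
from typing import Callable


class ProviderError(Exception):
    """Raised by a provider stub when its API call fails."""


def route_with_fallback(
    providers: list[tuple[str, Callable[[str], str]]],
    statuses: dict[str, str],
    prompt: str,
) -> tuple[str, str]:
    """Try providers in priority order, skipping any whose reported status is not 'ok'."""
    for name, call in providers:
        if statuses.get(name) != "ok":
            continue  # status check says this API is unhealthy or unknown
        try:
            return name, call(prompt)
        except ProviderError:
            continue  # status looked fine but the call failed; try the next provider
    raise RuntimeError("no LLM provider available")


# Hypothetical provider stubs standing in for real LLM API clients.
def primary(prompt: str) -> str:
    raise ProviderError("rate limited")


def secondary(prompt: str) -> str:
    return f"[secondary] {prompt}"


def tertiary(prompt: str) -> str:
    return f"[tertiary] {prompt}"


PROVIDERS = [("primary", primary), ("secondary", secondary), ("tertiary", tertiary)]
STATUSES = {"primary": "ok", "secondary": "degraded", "tertiary": "ok"}

used, reply = route_with_fallback(PROVIDERS, STATUSES, "Hello")
print(used, reply)
```

In this run the primary provider fails at call time and the secondary is skipped because of its status, so the request ends up at the third provider. If every provider is unavailable the function raises, which gives the caller one place to apply a last-resort behavior.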
  22. Summary: To implement robust fault detection and recovery, prepare for the worst: think through the worst-case scenario and implement a robust recovery system.
  23. Key lessons for operating LLM APIs. 01: Divide and conquer / Trust, but verify, for minimizing hallucinations. 02: Done is better than perfect / See the forest for the trees, for ensuring natural conversation pace. 03: Prepare for the worst, for robust fault detection & recovery.