[mercari GEARS 2025] Evals for LLMApps/Agents
mercari
November 14, 2025
Technology
Transcript
Evals for LLMApps/Agents Jehandad Kamal Mercari / Engineer @ AI/LLM Team
Mercari AI Listing Support https://about.mercari.com/en/press/news/articles/20240910_aisupport/
Agents
Agent Development Loop
First SDLC for Agents ai.mercari.com
Types of Evals (by technique)
Types of Evals (by perspective)
Improved SDLC for Agents
Thank You!