Evaluation Metrics for AI Agents and Agent GPA

© 2026 Snowflake Inc. All Rights Reserved
Evaluation Methods for AI Agents and Agent GPA
Sho Tanaka, Feb 2026

tsho / 田中翔 (Sho Tanaka)
Lead Developer Advocate @ Snowflake
Linkedin.com/in/tsho
- Gives talks and builds demos on AI/ML and data
- ex-Google gTech Ads, ML/Data
- MLOps community organizer (2020~)
- Google Developer Expert, AI/ML

AI Agent use cases
- Mercari's data analytics AI agent "Socrates" and an ADK use case (Speaker Deck)
- Kokuyo, JINS, and others develop AI agents in-house as "Snowflake Intelligence" becomes available in Japan

Representative AI Agent / LLM evaluation metrics

Example: ADK evaluation metrics
Source: Why Evaluate Agents - Agent Development Kit (ADK)
- LLM-as-a-judge: final_response_match_v2, rubric_based_final_response_quality_v1, etc.
- Code-based / Deterministic (code- and rule-based, exact match): tool_trajectory_avg_score
- Traditional NLP Metrics: response_match_score
- Human Evaluation: no explicit "metric" is provided as a feature, but the Web UI (Trace View) supports it
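To make the deterministic and traditional NLP categories above concrete, here is a minimal sketch in plain Python. It does not use ADK and is not ADK's implementation: the function names `tool_trajectory_avg` and `unigram_f1` are hypothetical, and the unigram-overlap F1 is only a simple ROUGE-1-style stand-in for the kind of text-overlap score a traditional NLP metric computes.

```python
from collections import Counter


def tool_trajectory_avg(expected_runs, actual_runs):
    """Fraction of runs whose tool-call sequence exactly matches the expected one.

    Each run is a list of tool names in call order (an assumption for this
    sketch). Runs missing from actual_runs count as mismatches.
    """
    if not expected_runs:
        return 0.0
    matches = sum(1 for exp, act in zip(expected_runs, actual_runs) if exp == act)
    return matches / len(expected_runs)


def unigram_f1(reference, candidate):
    """ROUGE-1-style F1 over whitespace-separated, lowercased tokens."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Multiset intersection counts each shared token at most as often
    # as it appears in both texts.
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    expected = [["search", "summarize"], ["lookup"]]
    actual = [["search", "summarize"], ["search"]]
    print(tool_trajectory_avg(expected, actual))
    print(unigram_f1("the cat sat", "the cat"))
```

Both scores are cheap and reproducible, which is why they suit regression checks. Judging whether a free-form answer is actually good is where the LLM-as-a-judge and human evaluation categories come in.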

Agent GPA and TruLens

CS 329T: Trustworthy Machine Learning

What is an AI Agent?

Note: the term AIOps also has a separate definition that Gartner introduced around 2016.

Evaluation methods

The Agent GPA paper

Also available as OSS: https://github.com/truera/trulens

https://www.trulens.org/getting_started/quickstarts/web-search-agent-evaluation/#10-add-evaluations

Closing

On Snowflake: Private

References

https://learn.deeplearning.ai/

THANK YOU