Slide 18
Evals matter too (**or rather, matter more**)
“Using evals strategically can make a customer-facing product or internal tool more
reliable at scale, decrease high-severity errors, protect against downside risk, and give an
organization a measurable path to higher ROI.”
– OpenAI
“Good evaluations help teams ship AI agents more confidently. Without them, it’s easy to
get stuck in reactive loops — catching issues only in production, where fixing one failure
creates others.”
– Anthropic
https://openai.com/index/evals-drive-next-chapter-of-ai/
https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents