Why Eval-Driven Development Is Your Path To Production https://www.forbes.com/councils/forbestechcouncil/2025/04/04/escaping-ai-demo-hell-why-eval-driven-development-is-your-path-to-production/
to Solve the #1 Blocker for Getting AI Agents in Production | LangChain Interrupt https://interrupt.langchain.com/videos/building-reliable-agents-agent-evaluations
- Training | Microsoft Learn https://learn.microsoft.com/en-us/training/modules/characterize-devops-continous-collaboration-improvement/3-explore-continuous-improvement
[2404.12272] Who Validates the Validators? Aligning LLM-Assisted Evaluation of LLM Outputs with Human Preferences https://arxiv.org/abs/2404.12272 (The criteria for evaluating LLM outputs change or are refined by the users themselves as evaluation proceeds.)