in Tool Learning? (not covered)
• ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
Reasoning
• Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought
• Test-time Computing: from System-1 Thinking to System-2 Thinking
Learning
• Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
• Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments
• AgentRefine: Enhancing Agent Generalization through Refinement Tuning
• TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action
Memory
• ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning
Self-Evolution
• Lifelong Learning of Large Language Model based Agents: A Roadmap
Agent for Generalized Applications
• A Multimodal Social Agent
• SOP-Agent: Empower General Purpose AI Agent with Domain-Specific SOPs
• Authenticated Delegation and Authorized AI Agents
• Agents Are Not Enough
• Efficient Multi-Agent Collaboration with Tool Use for Online Planning in Complex Table Question Answering
• Enhancing LLM Reasoning with Multi-Path Collaborative Reactive and Reflection agents
• Agentic Systems: A Guide to Transforming Industries with Vertical AI Agents
• Large language models for artificial general intelligence (AGI): A survey of foundational principles and approaches
Agentic AI Systems
• AIOpsLab: A Holistic Framework to Evaluate AI Agents for Enabling Autonomous Clouds
• Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems
• User Simulation in the Era of Generative AI: User Modeling, Synthetic Data Generation, and System Evaluation
• OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System
on Agentic RAG
• OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking
• Search-o1: Agentic Search-Enhanced Large Reasoning Models
Software Agents
• Towards Advancing Code Generation with Large Language Models: A Research Roadmap
• Training Software Engineering Agents and Verifiers with SWE-Gym (not covered)
• SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution (not covered)
Digital Agents
• UI-TARS: Pioneering Automated GUI Interaction with Native Agents
• Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
• OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
• A3: Android Agent Arena for Mobile GUI Agents (not covered)
• InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection (not covered)
and Analysis of small Uncrewed Aerial Systems
Data Agents
• Towards Human-Guided, Data-Centric LLM Co-Pilots
• MDSF: Context-Aware Multi-Dimensional Data Storytelling Framework based on Large language Model
Research Agents
• PaSa: An LLM Agent for Comprehensive Academic Paper Search
• DOLPHIN: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback (not covered)
• Agent Laboratory: Using LLM Agents as Research Assistants
• LLM4SR: A Survey on Large Language Models for Scientific Research
Embodied Agents
• EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents (not covered)
Multi Agent Systems
• Multi-Agent Collaboration Mechanisms: A Survey of LLMs
Scheduled tasks in ChatGPT
• Introducing Citations on the Anthropic API
• Perplexity now has a mobile assistant on Android
• Perplexity launches Sonar, an API for AI search
build agents
Blogs
• 3 Predictions for the Future of AI Agents in 2025
• AI Agents 2024 Rewind - A Year of Building and Learning
• The Agentic AI Era: After the Dawn, Here’s What to Expect
• Introducing Agentic Document Workflows
• Integrating AI Agents into Companies