Slide 3
Papers: June
Planning
• Octo-planner: On-device Language Model for Planner-Action Agents
• FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents
• Ask-before-Plan: Proactive Language Agents for Real-World Planning
• CoAct: A Global-Local Hierarchy for Autonomous Agent Collaboration
• SELFGOAL: Your Language Agents Already Know How to Achieve High-level Goals
• NATURAL PLAN: Benchmarking LLMs on Natural Language Planning
• Graph-enhanced Large Language Models in Asynchronous Plan Reasoning
• A Human-Like Reasoning Framework for Multi-Phases Planning Task with Large Language Models
• Meta-Task Planning for Language Agents
Long-context understanding
• Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA
• LLM In-Context Recall is Prompt Dependent
• Needle In A Multimodal Haystack
• Multimodal Needle in a Haystack: Benchmarking Long-Context Capability of Multimodal Large Language Models
• BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
• DrVideo: Document Retrieval Based Long Video Understanding
• Can Long-Context Language Models Subsume Retrieval, RAG, SQL, and More?
• Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
• Are Long-LLMs A Necessity For Long-Context Tasks?