AI Agents of the Week: Papers You Should Know About
Get ahead of the curve with LLM Watch
Key themes this week include continual learning via memory, more adaptive planning strategies, and practical frameworks to scale agent performance:
Continual Learning & Memory: New approaches enable LLM-based agents to improve over time without retraining their base models. One framework (“Memento”) uses an episodic memory and reinforcement learning to let agents learn from past trials, achieving state-of-the-art results on challenging “deep research” tasks without any model fine-tuning. Another system (“SEDM”) transforms memory from a passive knowledge base into an active, self-optimizing module. By verifying what gets stored, consolidating useful information, and diffusing general insights, it boosts reasoning accuracy while avoiding endless memory growth.
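To make the episodic-memory idea concrete, here is a minimal sketch of case-based recall: store (task, plan, reward) triples from past runs and retrieve plans from similar successful tasks. This is an illustration only, not Memento's actual implementation (the paper learns retrieval with reinforcement learning; here we substitute simple token overlap, and all names are hypothetical):

```python
def tokens(text):
    return set(text.lower().split())

class EpisodicMemory:
    def __init__(self):
        self.episodes = []  # (task, plan, reward) triples from past runs

    def write(self, task, plan, reward):
        self.episodes.append((task, plan, reward))

    def recall(self, task, k=2):
        """Return plans from the k most similar *successful* past tasks."""
        scored = [(len(tokens(task) & tokens(t)), p)
                  for t, p, r in self.episodes if r > 0]
        return [p for s, p in sorted(scored, reverse=True)[:k] if s > 0]

mem = EpisodicMemory()
mem.write("summarize arxiv paper on diffusion", "fetch pdf; outline; draft", 1.0)
mem.write("book a flight to Paris", "open site; search; pay", 0.0)
mem.write("summarize blog post on agents", "fetch html; outline; draft", 1.0)

# Recall favors successful, similar episodes; failures are filtered out.
print(mem.recall("summarize arxiv paper on agents"))
```

The key design point survives even in this toy: the base model never changes; the agent improves only because its retrieval store grows with experience.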
Smarter Planning & Collaboration: Two papers tackle the when and how of reasoning. The first (“Jupiter”) turns complex data analysis into a search problem, using Monte Carlo Tree Search (MCTS) to plan code-based solutions. This allowed modest-sized models (7B–14B) to solve ~77–86% of tasks on a new data analysis benchmark, matching or beating GPT-4 on those tasks. The second (“Meta-Policy Deliberation”) improves multi-agent reasoning by letting agents dynamically choose to persist, refine, or concede in a debate. A new RL algorithm (SoftRankPO) trains agents to make these meta-cognitive choices, yielding a 4–5% absolute accuracy gain across challenging reasoning benchmarks vs. prior multi-agent methods.
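For readers less familiar with MCTS, the loop is: select a promising node via UCB1, expand it, simulate to the end, and backpropagate the reward. The toy below searches for arithmetic moves rather than code plans; it is a generic MCTS sketch under that stated assumption, not Jupiter's code (states, actions, and constants here are all illustrative):

```python
import math, random

# Toy problem: starting from 1, reach TARGET in at most MAX_DEPTH moves
# using "+1" or "*2". Jupiter runs the same search loop over code-plan
# steps instead of arithmetic moves.
TARGET, MAX_DEPTH, C = 10, 5, 1.4
ACTIONS = [lambda x: x + 1, lambda x: x * 2]

class Node:
    def __init__(self, state, depth, parent=None):
        self.state, self.depth, self.parent = state, depth, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb1(node):
    if node.visits == 0:
        return float("inf")  # always try unvisited children first
    return node.value / node.visits + C * math.sqrt(
        math.log(node.parent.visits) / node.visits)

def rollout(state, depth):
    # Random playout to a terminal state; reward 1 if TARGET reached.
    while depth < MAX_DEPTH and state < TARGET:
        state = random.choice(ACTIONS)(state)
        depth += 1
    return 1.0 if state == TARGET else 0.0

def mcts(root_state, iters=500):
    root = Node(root_state, 0)
    for _ in range(iters):
        node = root
        while node.children:                      # 1. selection
            node = max(node.children, key=ucb1)
        if node.depth < MAX_DEPTH and node.state < TARGET:
            node.children = [Node(a(node.state), node.depth + 1, node)
                             for a in ACTIONS]    # 2. expansion
            node = random.choice(node.children)
        reward = rollout(node.state, node.depth)  # 3. simulation
        while node:                               # 4. backpropagation
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda n: n.visits).state

random.seed(0)
print(mcts(1))  # most-visited first move from state 1
```

Swapping the toy actions for "write this code cell next" and the rollout reward for "did the analysis succeed" gives the flavor of how search lets a small model explore many solution paths instead of committing to its first draft.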
Practical Performance Gains: A new system-level framework (“Auras”) tackles the speed bottleneck in embodied agents. By decoupling perception from generation and running them in parallel, it achieves over 2.5x higher throughput for real-time decision-making without any loss of accuracy. This suggests agents can be both fast and smart, handling high-frequency inputs in dynamic environments.
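The decoupling idea can be sketched with two threads and a one-slot queue: perception keeps encoding fresh observations while generation consumes only the latest one, so slow decoding never blocks intake. This is an illustrative pattern, not Auras' architecture (all names and the drop-stale policy are assumptions for the sketch):

```python
import queue, threading, time

latest = queue.Queue(maxsize=1)  # holds only the freshest perception

def perceive(frames):
    for frame in frames:
        encoded = f"feat({frame})"  # stand-in for a vision encoder
        try:
            latest.get_nowait()     # drop a stale feature if one is waiting
        except queue.Empty:
            pass
        latest.put(encoded)
    latest.put(None)                # sentinel: input stream finished

def generate(decisions):
    while True:
        feat = latest.get()
        if feat is None:
            break
        time.sleep(0.01)            # stand-in for slow LLM decoding
        decisions.append(f"act_on({feat})")

decisions = []
t1 = threading.Thread(target=perceive, args=(range(100),))
t2 = threading.Thread(target=generate, args=(decisions,))
t1.start(); t2.start(); t1.join(); t2.join()
print(len(decisions), "decisions for 100 frames")
```

Because stale frames are discarded rather than queued, the agent always acts on recent input; a serial design would instead fall further behind with every slow generation step.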
In summary, autonomous AI agents are becoming continuous learners with more efficient reasoning and execution. Below we dive into five notable papers, detailing their core innovations, why they matter for autonomous agents, the specific problems they address (memory, planning, collaboration, etc.), and the new capabilities or future possibilities they unlock.