LLM Watch

The Week in AI Agents: Papers You Should Know About

Keeping up with AI doesn't have to be tedious

Pascal Biese
Apr 27, 2025

It was another exciting week in AI agents, one that showcased both new capabilities and a deeper understanding of their limits.

Researchers demonstrated for the first time an AI coding agent that can rewrite and improve its own code, boosting its performance dramatically without human intervention.

Meanwhile, a “Text-to-Decision” framework showed that teaching robots through plain language descriptions can enable zero-shot mastery of new tasks, bridging the gap between human instructions and autonomous action.

On the decision-making front, scientists pinpointed why large language model (LLM) agents often act greedily or fail to use their knowledge, and showed how a bit of reinforcement learning can make them far more strategic.

In embodied environments, a neuro-symbolic agent nicknamed “WALL-E 2.0” combined logical world knowledge with LLM planning to achieve near-perfect success in complex open-world tasks, leapfrogging past prior methods with minimal training.

Finally, a comprehensive study of multi-LLM systems revealed why simply adding more agents doesn’t always help: the team cataloged 14 distinct failure modes (from miscommunication to lack of oversight) that explain why multi-agent frameworks often barely outperform single agents.

Together, these advances paint a clear picture of the state of AI agents: they are becoming more autonomous, teachable, and efficient. That is also why experts urged caution, calling for “minimum safeguards” to ensure that as agents gain the ability to self-improve and act independently, they remain under meaningful human control.

Let’s start the deep dive!

© 2025 Pascal Biese