Quick Glossary (for the uninitiated)
- System prompt: The initial instructions that tell an AI model how to behave throughout a conversation - like giving a new employee their job description
- Context engineering: The art of structuring information and instructions to get better AI outputs (evolved from "prompt engineering")
- Planning tools: Functions that help AI break down complex tasks into steps - sometimes just forcing it to think before acting
- Sub-agents: Specialized AI instances spawned to handle specific subtasks, like hiring contractors for parts of a project
- No-op: A programming term for an operation that does nothing - like a "think about it" button that just makes you pause
Here's something that should sound familiar: the AI community has once again rebranded something we've been doing all along, and somehow, this time it might actually matter. Remember when, just recently, "prompt engineering" suddenly became "context engineering"? Well, we're watching it happen again with "deep agents."
But before you roll your eyes at yet another buzzword, let me tell you why this particular relabeling could represent something genuinely important - a qualitative shift in how we think about AI systems that mirrors other evolutionary moments in computing.
The Pattern We Keep Missing
A recent LangChain blog post introduced the concept of "deep agents" along with a new library, and it simply reflects a collective "aha" moment a lot of researchers and practitioners have been having over the last few months: while everyone was obsessing over the latest models, it was actually the systems built around them that made all the difference.
You probably know the feeling - when you realize that what seems revolutionary is actually the natural convergence of tools we've had all along. It's like watching someone discover that peanut butter and chocolate taste good together. Of course they do. We just needed someone to point it out.
The components of a "deep agent" read like a grocery list of things we've been using separately:
- A detailed system prompt (we've been crafting these since GPT-3)
- A planning tool (even if it's just a no-op that forces the model to think)
- Sub-agents for task delegation
- Access to a file system for memory and collaboration
Sound familiar? It should. Claude Code, OpenAI's Deep Research, and Manus aren't doing anything magically new - they're mostly combining these ingredients in a way that suddenly makes agents feel, well, deeper.
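To make that grocery list concrete, here's a minimal sketch of how the four ingredients might be wired together in plain Python. Every name in it (call_llm, write_todos, run_subagent) is an illustrative placeholder rather than the API of LangChain's library or Claude Code - treat it as a shape, not an implementation.

```python
from pathlib import Path

# Ingredient 1: a detailed system prompt that spells out the workflow.
SYSTEM_PROMPT = """You are a research agent.
Before acting, record a plan with the write_todos tool.
Delegate narrow subtasks to sub-agents and save intermediate
results to files so you can reread them later."""

def call_llm(system: str, user: str) -> str:
    """Stand-in for whatever model API you actually use."""
    return f"[model response to: {user[:40]}]"

# Ingredient 2: a planning tool that is deliberately a no-op.
def write_todos(todos: list[str]) -> str:
    """Does nothing except echo the plan back into the context window."""
    return "Plan recorded:\n" + "\n".join(f"- {t}" for t in todos)

# Ingredient 3: a sub-agent is just a fresh model call with narrow instructions.
def run_subagent(task: str) -> str:
    return call_llm("You handle exactly one subtask. Report back concisely.", task)

# Ingredient 4: file system access doubles as memory and a collaboration surface.
def write_file(workdir: Path, name: str, content: str) -> None:
    workdir.mkdir(parents=True, exist_ok=True)
    (workdir / name).write_text(content)

if __name__ == "__main__":
    workspace = Path("agent_workspace")
    print(write_todos(["survey sources", "delegate summaries", "draft report"]))
    write_file(workspace, "notes.md", run_subagent("Summarize source #1"))
```

The point isn't this particular code - it's that nothing here is exotic. The depth comes from the orchestration, not from any single piece.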
Fun fact: The same can be said about the original ChatGPT. As Yann LeCun pointed out at the time, there was “nothing revolutionary”. And despite that, we all know by now how much of an impact ChatGPT had on the world of AI and beyond.
What "Deep" Actually Means
The shift from "shallow" to "deep" agents is fundamentally about changing our approach. A shallow agent is like a smart assistant who can only handle one request at a time: "Set a timer for 10 minutes." Done. Next request. It might take several steps to fulfill the request, but it never deviates from a straightforward, predefined workflow. A deep agent is more like a research assistant you'd actually want to hire: give them a complex task, and they'll break it down, pursue multiple threads, take notes, and come back with something substantial.
The key realization is that the difference lies in the scaffolding around the LLM, not in the model itself. We're seeing context engineering principles applied to whole agent behaviors rather than to individual prompts.
One could say that I just described a regular agent and that this has always been the baseline - and I wouldn't disagree. But with the rampant agentwashing going on, it might not be the worst idea to have one more term to differentiate more complex agents from all the rebranded chatbots. I know it can be annoying, but I see it as the lesser evil.
The Todo List That Does Nothing (And Why That's All It Takes)
One particularly interesting detail from the deep agents implementation is the planning tool that literally “does nothing”. Claude Code uses a "TodoWrite" tool that's just a no-op - it exists solely to force the agent to articulate its plan. This is peak "working with AI" wisdom: sometimes the best tools are the ones that trick the model into better behavior.
This reminds me of rubber duck debugging, where programmers explain their code to an inanimate object. The duck doesn't do anything, but the act of explanation often reveals the solution. Deep agents sometimes use the same principle - forcing structured thinking through tool use, even when the tool itself is meaningless. Of course, the next step would be to come up with tooling that actually adds to this process. But if all it takes to get from 0 to 80% is "nothing", that seems like a great start. A practical way to look at it: it's a quick reminder in a stressful situation. You might be a great performer on stage, but there's always a chance you'll forget your lines - and that's what prompters are for.
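For the curious, here's roughly what the write_todos placeholder from the earlier sketch can look like when exposed through function calling. The schema below follows the generic JSON-schema shape most tool-use APIs expect (exact field names vary by provider); it's a sketch of the pattern, not Claude Code's actual TodoWrite definition.

```python
# The "work" is entirely in the schema: to call this tool, the model has to
# spell out its plan as structured arguments, which then land back in the
# context window for it to follow. The handler itself does nothing.
WRITE_TODOS_TOOL = {
    "name": "write_todos",
    "description": "Record your step-by-step plan before doing anything else.",
    "input_schema": {  # JSON-schema style; exact key names differ per provider
        "type": "object",
        "properties": {
            "todos": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Ordered list of steps you intend to take.",
            }
        },
        "required": ["todos"],
    },
}

def handle_write_todos(todos: list[str]) -> str:
    # The rubber duck: no state, no side effects, just echo the plan back.
    return "Todos noted:\n" + "\n".join(f"{i + 1}. {t}" for i, t in enumerate(todos))
```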
What This Means for the Rest of Us
Here's where it gets interesting for anyone building with AI. The emergence of deep agents signals that we've crossed a threshold in system design sophistication. The models haven't suddenly gotten smarter; we've gotten better at making them useful.
Think about it this way: we've had all the ingredients for French cooking for centuries, but it took someone to write down the recipes and techniques before everyone could make a decent béarnaise. Deep agents are similar - they're a recipe for combining tools we already have into something more powerful than the sum of its parts.
The Real Innovation Is the Mindset
What's most interesting to the engineers among us is what the deep agent label represents about our growing sophistication in working with AI. We're moving from "what can this model do?" to "how can we orchestrate multiple models and tools to accomplish complex goals?"
The fact that an open-source framework replicating much of what the best available tools (e.g., Claude Code) do can be built over a weekend tells you everything you need to know. This is about design patterns that anyone can implement, not proprietary breakthroughs. Obviously, "anyone can implement" is often hypothetical. It's similar to how "anyone" could - from a purely technical standpoint - replicate a lot of modern art. Yet I would bet that most of you aren't selling duct-taped bananas for $6.2 million. The new bottleneck is knowing what to build and in what context.
So What Should You Do?
If you're building with AI, the lesson is clear: stop thinking about individual model capabilities and start thinking about systems. The winners in the AI application space won't be those with access to the best models (those are increasingly commoditized), but those who best understand how to combine simple components into sophisticated behaviors.
Try this experiment: take any complex task you currently do manually and break it down into steps a patient but not-too-bright assistant could follow. Add a planning phase, some way to take notes, and the ability to delegate subtasks. Congratulations - you've just designed a deep agent.
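As a purely hypothetical example, here's what that design might look like on paper for one concrete chore - turning a week of customer feedback into a report. Everything here (the task, the file name, the subtasks) is made up for illustration; the point is the shape: plan, notes, delegation.

```python
# Hypothetical deep agent "design sheet" for a single recurring task.
feedback_reporter = {
    "system_prompt": (
        "You turn raw customer feedback into a weekly report. "
        "Plan first, keep notes in files, delegate per-channel summaries."
    ),
    # The planning phase - even a no-op planning tool is enough to force this.
    "plan": [
        "collect feedback exports from email, chat, and app reviews",
        "delegate one summary sub-agent per channel",
        "merge summaries, flag recurring complaints, draft the report",
    ],
    # Some way to take notes: a scratch file the agent rereads between steps.
    "notes_file": "notes/weekly_feedback.md",
    # Subtasks to hand off to sub-agents.
    "subtasks": [
        "summarize email feedback",
        "summarize chat transcripts",
        "summarize app store reviews",
    ],
}
```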
The tools are all there. We just needed someone to point out that they taste good together.
The shift from shallow to deep agents is less a technological revolution and more a collective "aha" moment about system design. Like the evolution from prompt to context engineering, it's a recognition that the real gains come not from better models, but from better ways of using them. And that's something we can all participate in, starting today.
What might make the next buzzword more than meaningless noise is if it focused on a specific part of agent design, such as the iterative self-reflection we've seen in Google's IMO Gold Medal exercise.