Agent Report #1: AI Agents Are Here to Stay
OpenAI Agents SDK, Gemini 2.5 Pro, Oracle AI Agent Studio and Manus AI
The field of artificial intelligence appears to be undergoing a paradigm shift.
While generative AI and large language models dominated headlines throughout 2023 and 2024, we're now witnessing the emergence of a new frontier: autonomous AI agents. This past month has been particularly eventful, with several developments that signal a move from theoretical possibilities to practical implementations.
In this report, we'll explore the most significant developments in autonomous AI agents from the past month, analyzing both the technological breakthroughs and their implications for the future of AI and human-machine collaboration.
More specifically, we’ll cover:
The emergence of general-purpose autonomous agents
Advancements in reasoning capabilities powering the latest agents
New platforms and developer tools democratizing agent creation
Industry applications and real-world impact
Cutting-edge research shaping the future of autonomous agents
Ethical considerations in the age of autonomous AI
The Dawn of General-Purpose Autonomous Agents
March has witnessed what might be remembered as a turning point in AI history with the introduction of several systems claiming unprecedented levels of autonomy. The most talked-about development has been China's Manus AI, developed by Butterfly Effect.
Manus AI: A Step Toward AGI?
Manus has generated significant buzz by positioning itself as a general-purpose AI capable of performing an impressive range of tasks - from complex operations like purchasing property and developing video games to more routine activities such as booking holidays. What makes Manus particularly noteworthy is its purported ability to bridge the gap between conceptualization and execution, a crucial step toward more general artificial intelligence.
The developers have boldly claimed that Manus represents a new era of human-machine collaboration, with capabilities approaching what many would consider early-stage Artificial General Intelligence (AGI). These claims have sparked intense debate within the AI community about whether we're truly witnessing a fundamental breakthrough or simply an impressive integration of existing technologies.
Reality Check: Performance in Practice
Initial reviews of Manus have been mixed, revealing both impressive capabilities and significant limitations. Some experts have described working with the system as akin to collaborating with a "highly intelligent intern" - capable of independent work but still requiring oversight. Others have reported instances where the agent stumbles on seemingly simple tasks, makes incorrect assumptions, or gets caught in feedback loops.
These experiences highlight the continued gap between ambitious promises and technological reality. The inconsistent performance raises important questions about how we define and measure true autonomy in AI systems. Is Manus truly autonomous in the sense that it can reliably complete complex tasks without human intervention? Or does its performance suggest that we're still in the early stages of what could eventually become truly autonomous systems?
Enhanced Reasoning: The Cognitive Infrastructure of Autonomous Agents
An autonomous agent's effectiveness is fundamentally tied to its underlying reasoning capabilities. This month saw significant advancements in this area, most notably with Google's release of Gemini 2.5 Pro Experimental.