Technology writer
Large language models (LLMs) have transformed what’s possible with artificial intelligence, demonstrating an incredible breadth of capability. But while these models can handle almost any task, the real opportunity for startups lies in the specialization—in taking this powerful, general-purpose technology and focusing it to solve specific problems exceptionally well.
Benedict Evans, a technology analyst, argues that “An LLM by itself is not a product—it’s a technology that can enable a tool or a feature, and it needs to be unbundled or rebundled into new framings, UX, and tools to become useful.”
Increasingly, it appears that AI agents are the answer to that need, evolving from simple task handlers into more sophisticated forms: single agents focused on specific domains and multi-agent systems that work in concert. This evolution represents both the rebundling that Evans points out as necessary and the opportunity that startups are seeking in a crowded market. In this article, we’ll explain single-agent vs. multi-agent systems, explore the industry’s excitement behind these technologies, and walk through the potential impact on businesses.
Excited about AI agents but not sure where to start?
Explore the DigitalOcean GenAI Platform, which offers a fully managed service, easy implementation, and flexible customization for building and deploying AI agents. Features include:
RAG workflows: Create intelligent agents that reference your data.
Guardrails: Create safer, enjoyable, on-brand agent experiences.
Function calling: Give your agents the ability to answer with real-time information.
Agent routing: Create agents that can take on multiple tasks.
Fine-tuned models: Create custom models with your data.
Submit a form to learn more about our pricing and the potential to receive free credits for testing.
Single AI agents are intelligent systems that make independent decisions, adapt to their environments, and pursue varying means to achieve predefined goals. Unlike AI chatbots, which rely on human input and prompt-based instructions, single-agent systems are autonomous and require little to no human involvement. These systems excel at specialized tasks, whether that’s analyzing vast data sets, automating software testing, or managing customer service tickets with a human-like understanding. Think of a single agent as a highly skilled specialist who can work independently within its domain of expertise, learning from experience and improving its performance (and understanding) over time.
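To make the idea concrete, here is a minimal sketch of a single agent working toward a predefined goal without a human prompt at each step. Everything in it is hypothetical: the `TicketAgent` class, its `decide` rule, and the sample tickets are illustrative stand-ins, and a real agent would likely call a language model where this one uses a keyword check.

```python
from dataclasses import dataclass, field


@dataclass
class TicketAgent:
    """Toy single agent that clears a queue of support tickets on its own.

    Hypothetical example: decide() stands in for model-driven reasoning.
    """
    goal_open_tickets: int = 0                  # predefined goal: empty queue
    memory: list = field(default_factory=list)  # record of past decisions

    def decide(self, ticket: str) -> str:
        # Stand-in for reasoning: choose an action from the ticket text.
        if "refund" in ticket.lower():
            return "escalate"
        return "reply"

    def run(self, tickets: list) -> list:
        queue = list(tickets)
        # Keep acting until the goal is met; no human input per step.
        while len(queue) > self.goal_open_tickets:
            ticket = queue.pop(0)
            action = self.decide(ticket)
            self.memory.append((ticket, action))
        return self.memory


agent = TicketAgent()
log = agent.run(["Password reset", "Refund for order 12", "Login issue"])
for ticket, action in log:
    print(f"{ticket}: {action}")
```

The loop is the part that separates this from a chatbot: the agent chooses its own next action and keeps going until the goal condition holds.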
Unlike single-agent systems that operate independently, multi-agent systems orchestrate multiple AI agents to work together toward shared goals. These systems create a collaborative environment where specialized agents can communicate, coordinate their actions, and divide complex tasks into manageable portions.
Human teams often outperform individual workers. Similarly, multi-agent systems take advantage of the collective capabilities of various agents to tackle problems that would be too complex (or time-consuming) for a single agent. The power of these systems is their ability to craft solutions that emerge only when multiple specialists combine their expertise and efforts.
To the uninitiated, the shift of attention from AI chat interfaces to AI agents and single-agent systems can sound like empty hype. But this shift has been a long time coming.
A decade ago, the Society of Automotive Engineers visualized the evolution of automation through a five-step process leading from no automation to full automation. Their visualization focuses on self-driving cars, but the five-step process has become canonical for other industries because it helps map out the ways AI products evolve from assisting to automating.
In this context, AI chat interfaces were closer to stage two, whereas AI agents promise the potential to take AI through stages three, four, and five.
As swyx and Alessio Fanelli write, “The relationship between automation and autonomy is subtle but important.” In stages three and four, humans will still be kicking off automated processes and potentially approving steps in those processes along the way. Through stages four and five, single agents start to become capable of autonomy, meaning human involvement is minimal.
The potential here is hard to overstate: An AI coding assistant like GitHub Copilot, for example, accelerates software development by providing suggestions and autocompleting new lines of code. But a truly intelligent agent could handle much of the coding itself, meaning the work of software developers would change on a more fundamental level.
As Steve Newman, the co-founder of Writely, which eventually became Google Docs, writes, “When AIs can successfully pursue long-term goals—planning, reacting, adapting, interacting, solving problems, asking for help, coordinating with people and other AIs—that’s when the world begins to change.”
AI agents might seem like science fiction, but there are many examples of single agents in practice.
Operator, from OpenAI, will launch in 2025 and can direct a user’s computer to act on that user’s behalf by writing code, booking travel, and more.
Reflexion is an autonomous AI agent that uses dynamic memory and self-reflection and has been reported to outperform GPT-4 on some benchmarks.
Claude, from Anthropic, gained the ability to function as an agent in Claude 3.5 Sonnet. It can now move a cursor around a computer screen, click through menus, and input information through a virtual keyboard.
Second provides AI agents that autonomously migrate code and perform upgrades.
There are also numerous startups that provide software for building AI agents, such as Spell, Lindy, and Fixie.
For more examples, see the AI agents market map from Letta below.
The excitement around single agents points to another stage of development: multi-agent systems.
AI agents are discrete, individual services, whereas a multi-agent system refers, as the name implies, to various ways of weaving single agents together into systems that can accomplish more than the sum of their parts. Multi-agent systems put a team of single agents together.
Because single agents are individually capable of autonomously achieving goals, the agents in a multi-agent system can communicate and collaborate with one another to achieve shared goals. Like collaborating humans, multiple interacting intelligent agents, especially specialized ones, can get more done more efficiently by dividing the workload and completing processes in parallel.
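As a rough sketch of dividing a workload and running the pieces in parallel, the example below hands one task to two specialist agents at the same time and passes their results to a third agent that combines them. The agent functions (`research_agent`, `pricing_agent`, `summary_agent`) and the `run_team` coordinator are hypothetical placeholders that return fixed strings; they are not taken from any of the frameworks named in this article.

```python
from concurrent.futures import ThreadPoolExecutor


def research_agent(topic: str) -> str:
    # Hypothetical specialist: gathers background on the topic.
    return f"notes on {topic}"


def pricing_agent(topic: str) -> str:
    # Hypothetical specialist: estimates cost for the topic.
    return f"price estimate for {topic}"


def summary_agent(parts: list) -> str:
    # Hypothetical coordinator step: merges the specialists' outputs.
    return " | ".join(parts)


def run_team(topic: str) -> str:
    specialists = [research_agent, pricing_agent]
    with ThreadPoolExecutor(max_workers=len(specialists)) as pool:
        # Each specialist runs in its own thread; map() keeps input order.
        results = list(pool.map(lambda agent: agent(topic), specialists))
    return summary_agent(results)


report = run_team("solar panels")
print(report)
```

Threads are used here only to show the parallel shape of the work. In practice, agents that wait on network calls to a model might be better served by `asyncio` or a task queue.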
Aura Ventures even considers multi-agent systems (here referred to as AI agent fleets) to be the next stage of autonomous AI agent development.
Multi-agent systems available today include:
AutoGen, an open-source framework for building AI agent systems.
CrewAI, a multi-agent platform that helps companies automate workflows.
MetaGPT, a multi-agent framework that allows developers to assign different roles to different GPTs so that they can collaborate on complex tasks.
Ken Collins, VP of Product at Torc, warns, however, that “Multi-agent systems are complex and hard to get right.” Single-agent systems are already complex, and combining them multiplies that complexity unless the overall architecture is designed well. As such, it’s helpful for builders and onlookers alike to examine the blockers and milestones on the way to truly autonomous multi-agent systems.
Just recently, Ilya Sutskever, co-founder of OpenAI, said that the results from scaling up pre-training have plateaued. That doesn’t mean AI progress itself has plateaued, but points to the fact that the evolution of AI will follow a twisting path, not a straight one.
In the past few years, the industry has witnessed a remarkable pace of AI development, but that rapid pace doesn’t mean the evolution of AI will be linear or predictable. There will be bumps in the road, hard limits that companies will work around, opportunities for generalized systems and opportunities for specialized agents, and surprising breakthroughs that change the industry’s next steps.
For example, Kanjun Qiu, co-founder and CEO of AI research lab Imbue, found in her work that “Reinforcement learning is not a good vehicle for planning and reasoning.” Reinforcement learning has been an essential component of machine learning and LLM development until now, but the methodology also poses limitations.
As Qiu says, the agents her lab was teaching were able to perform a variety of low-level actions thanks to reinforcement learning but couldn’t perform higher-level actions. It remains to be seen whether the industry will reach fully autonomous AI agents through iteration or through a larger paradigm shift in how LLMs are trained and AI agents are built. It also remains to be seen just how difficult combining highly specialized agents will be.
Startups developing AI agents, multiple agents, and multi-agent systems face a range of obstacles, but as each is overcome, onlookers can track important milestones toward substantial progress.
Reliability: AI chatbots still sometimes produce incorrect information or AI hallucinations. Until AI agents can be trusted to be as accurate as human agents, businesses won’t trust them to run autonomously, much less collaboratively.
Exploration and iteration: Current LLM products, as Newman writes, “are trained to produce a final product in a single pass; they aren’t really taught to iterate, revise, or replan.” AI agents need to be able to explore and iterate as humans do, and multi-agent systems will need to be able to link those efforts together.
Creativity: AI output has already proven useful, but much of it—particularly when it comes to text output—is bland. AI agents and multi-agent systems need to become creative and insightful before they can compete with humans.
Reasoning skills: Current LLMs are still relatively easy to trick (seemingly simple questions, like “How many R’s in ‘strawberry’?” still trip up modern models, for example). AI agents and multi-agent systems will have to demonstrate much more robust and resilient reasoning skills to be trusted in business environments.
Benchmarks, one of the most common ways the industry tracks what AI systems can and can’t do, will be essential to supporting and measuring the development of AI agents and multi-agent systems. For now, however, the available benchmarks are limited.
Qiu says that the “biggest thing holding the field back is lack of benchmarks that let us explore things like planning and curiosity.” As a result, she says, “it’s important to use both public and internal benchmarks.”
Public benchmarks will allow startups to communicate the abilities of their agents in ways the public can understand, but while AI agents and multi-agent systems remain emerging technologies, startups will also have to develop their own internal benchmarks to measure and track progress.
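An internal benchmark can start very small. The sketch below, offered as one possible shape rather than an established method, scores an agent by exact-match accuracy over a handful of test cases. The `toy_agent` lookup, the `run_benchmark` helper, and the cases themselves are all made up for illustration; a real suite would use the startup's own agent and tasks.

```python
def toy_agent(question: str) -> str:
    """Hypothetical agent under test; answers by simple lookup."""
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(question, "unknown")


def run_benchmark(agent, cases) -> float:
    """Return the fraction of cases where the agent's answer matches exactly."""
    passed = sum(1 for question, expected in cases if agent(question) == expected)
    return passed / len(cases)


# Illustrative internal cases, including one the toy agent cannot answer.
INTERNAL_CASES = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("How many R's in 'strawberry'?", "3"),
]

score = run_benchmark(toy_agent, INTERNAL_CASES)
print(f"accuracy: {score:.2f}")
```

Tracking a score like this across versions gives a team a private progress measure while public benchmarks for planning and multi-agent behavior mature.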
When ChatGPT launched in 2022, people got a glimpse of the future but not a full picture. To get an accurate sense of the future, we need to look past ChatGPT and see the precedent that came before it.
As Evans writes, “Machine learning’s breakthrough was over a decade ago now, and yet we are still inventing new use-cases for it—people are still creating companies based on realizing that X or Y is a problem, realizing that it can be turned into pattern recognition, and then going out and selling that problem.”
The future is here, but that doesn’t mean all of it is here. For founders, this is exciting news: There are still many more problems to be solved. The immediate future, as Evans implies, will involve years of finding new use cases and building new solutions on previous breakthroughs.
DigitalOcean’s new GenAI Platform empowers developers to easily integrate AI agent capabilities into their applications without managing complex infrastructure. This fully managed service streamlines the process of building and deploying sophisticated AI agents, allowing you to focus on innovation rather than backend complexities.
Key features of the GenAI Platform include:
Direct access to foundational models from Meta, Mistral AI, and Anthropic
Intuitive tools for customizing agents with your own data and knowledge bases
Robust safety features and performance optimization tools
Ready to supercharge your applications with AI? Sign up for early access to DigitalOcean’s GenAI Platform today!
Sign up and get $200 in credit for your first 60 days with DigitalOcean.*
*This promotional offer applies to new accounts only.