This week a startup called Cognition AI caused a bit of a stir by releasing a demo showing an artificial intelligence program called Devin performing work usually done by well-paid software engineers. Chatbots like ChatGPT and Gemini can generate code, but Devin went further, planning how to solve a problem, writing the code, and then testing and implementing it.
Devin’s creators brand it as an “AI software developer.” When asked to test how Meta’s open source language model Llama 2 performed when accessed through the different companies hosting it, Devin drew up a step-by-step plan for the project, wrote the code needed to access the APIs and run benchmarking tests, and created a website summarizing the results.
It’s always hard to judge staged demos, but Cognition has shown Devin handling a wide range of impressive tasks. It wowed investors and engineers on X, receiving plenty of endorsements, and even inspired a few memes—including some predicting Devin will soon be responsible for a wave of tech industry layoffs.
Devin is just the latest, most polished example of a trend I’ve been tracking for a while: the emergence of AI agents that, instead of just providing answers or advice about a problem presented by a human, can take action to solve it. A few months back I test-drove Auto-GPT, an open source program that attempts to do useful chores by taking actions on a person’s computer and on the web. Recently I tested another program called vimGPT to see how the visual skills of new AI models can help these agents browse the web more efficiently.
I was impressed by my experiments with those agents. Yet for now, just like the language models that power them, they make quite a few errors. And when a piece of software is taking actions, not just generating text, one mistake can mean total failure—and potentially costly or dangerous consequences. Narrowing the range of tasks an agent can do to, say, a specific set of software engineering chores seems like a clever way to reduce the error rate, but there are still many potential ways to fail.
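To see why a single slip matters so much, consider a rough back-of-the-envelope sketch. It assumes, purely for illustration, that an agent's task is a chain of independent steps that each succeed with the same probability. The function name `chain_success_rate` and the 95 percent figure are hypothetical, not measurements of Devin or any other agent.

```python
def chain_success_rate(step_success: float, steps: int) -> float:
    """Chance that every step in a task succeeds.

    Assumes each step succeeds independently with the same probability,
    which is a simplification; real agent errors are often correlated.
    """
    if not 0.0 <= step_success <= 1.0:
        raise ValueError("step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return step_success ** steps


if __name__ == "__main__":
    # Hypothetical per-step success rate of 95 percent.
    for steps in (1, 5, 10, 20, 50):
        rate = chain_success_rate(0.95, steps)
        print(f"{steps:>2} steps: {rate:.1%} chance of finishing without an error")
```

Under those assumptions, a 20-step task would finish cleanly only about 36 percent of the time, which is one way to see why shortening or narrowing an agent's tasks can help.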
Startups are not the only ones building AI agents. Earlier this week I wrote about an agent called SIMA, developed by Google DeepMind, which plays video games including the truly bonkers title Goat Simulator 3. SIMA learned from watching human players how to do more than 600 fairly complicated tasks such as chopping down a tree or shooting an asteroid. Most significantly, it can do many of these actions successfully even in an unfamiliar game. Google DeepMind calls it a “generalist.”
I suspect that Google hopes these agents will eventually go to work outside of video games, perhaps using the web on a user’s behalf or operating software for them. But video games make a good sandbox for developing agents, because they provide complex environments in which agents can be tested and improved. “Making them more precise is something that we’re actively working on,” Tim Harley, a research scientist at Google DeepMind, told me. “We’ve got various ideas.”
You can expect a lot more news about AI agents in the coming months. Demis Hassabis, the CEO of Google DeepMind, recently told me that he plans to combine large language models with the work his company has previously done training AI programs to play video games, in order to develop more capable and reliable agents. “This definitely is a huge area. We’re investing heavily in that direction, and I imagine others are as well,” Hassabis said. “It will be a step change in capabilities of these types of systems, when they start becoming more agent-like.”
Artificial Intelligence (AI) has been making significant strides in recent years, revolutionizing various industries and transforming the way we live and work. One area that has gained considerable attention is the development of chatbots, which are computer programs designed to simulate human conversation. While chatbots have proven to be useful in certain applications, many experts believe that the future of AI lies in AI agents, a more advanced form of technology that goes beyond simple chatbot interactions.
Chatbots have become increasingly popular in customer service, providing quick and automated responses to frequently asked questions. They are designed to understand natural language and provide relevant information or assistance. However, chatbots have limitations. They often struggle with complex queries or understanding context, leading to frustrating user experiences. This is where AI agents come into play.
AI agents are more sophisticated AI systems that can perform a wide range of tasks beyond basic conversation. They can understand and interpret complex data, make decisions, and even take actions on behalf of the user. Unlike chatbots, which are primarily text-based, AI agents can interact through various mediums such as voice, images, and videos. This opens up a whole new realm of possibilities for AI applications.
One area where AI agents are expected to have a significant impact is personal assistants. Companies such as Amazon, with Alexa, and Google, with Assistant, have already made strides in this field. These AI agents can perform tasks such as setting reminders, playing music, ordering products online, and even controlling smart home devices. As the technology advances, AI agents will likely become more capable of understanding user preferences and providing personalized recommendations.
Another promising application for AI agents is in healthcare. With the ability to analyze vast amounts of medical data, AI agents can assist doctors in diagnosing diseases, predicting outcomes, and suggesting treatment plans. This can lead to more accurate diagnoses and improved patient care. Additionally, AI agents can provide support to patients by answering questions, monitoring vital signs, and reminding them to take medication.
AI agents also have the potential to revolutionize the field of education. They can provide personalized learning experiences, adapting to each student’s unique needs and pace of learning. AI agents can assess students’ strengths and weaknesses, offer tailored feedback, and suggest additional resources for further study. This individualized approach can greatly enhance the effectiveness of education and help bridge the gap between traditional classroom teaching and online learning.
Furthermore, AI agents can play a crucial role in the field of cybersecurity. With the increasing number of cyber threats, AI agents can continuously monitor networks, detect anomalies, and respond to potential attacks in real-time. They can also analyze patterns and trends to identify vulnerabilities and suggest preventive measures. By leveraging AI agents, organizations can strengthen their security infrastructure and protect sensitive data from malicious actors.
In conclusion, while chatbots have been a stepping stone in the development of AI technology, the future lies in AI agents. These advanced systems have the potential to revolutionize various industries, from customer service to healthcare, education, and cybersecurity. With their ability to understand complex data, make decisions, and take actions, AI agents offer a more sophisticated and personalized user experience. As technology continues to advance, we can expect AI agents to become an integral part of our daily lives, making tasks easier, more efficient, and ultimately enhancing our overall well-being.