Insights & Thoughts – Mar 19, 2024

Exploring AI Agents: The Evolution from Chatbots to Intelligent Autonomous Systems

We recently had the chance to sit down with our friend and entrepreneur Stelio Tzonis to discuss the growing role of Artificial Intelligence (AI) in venture capital, at a time when reports show AI accounting for 25% of all U.S. venture investments. We discussed how the emergence of Large Language Models (LLMs) is affecting the stock market, as evidenced by NVIDIA's skyrocketing valuation and the broader "hype" around AI technologies. Stelio, who has a reputation for spotting trends and applications of new technologies well ahead of the pack, shared his insights on concrete impacts of AI on our society, and it's clear we're on the cusp of something much bigger. He explained how AI is evolving from passive applications that require human input (such as ChatGPT) to autonomous agents that can plan, reason, and act independently to handle complex tasks, keeping humans in the loop where needed.

In this article, we would like to share the key takeaways from our conversation, along with some examples of how you can concretely build skills in this area. Hopefully, these tips will help you become more productive by building custom agents for your own use, or at least give you a glimpse of what's to come.

Chatbots

Since OpenAI launched ChatGPT in November 2022, chatbots have become a regular part of our daily lives, both personally and professionally. These tools are designed to answer questions based on the data set on which they were trained. However, a significant limitation was their access to only a restricted amount of information; recall the early days of ChatGPT, when lawyers faced scrutiny after filing a motion containing fictitious case citations caused by AI hallucinations.

APIs (tools)

Fast forward to March 2023: OpenAI unveiled its GPT-4 model alongside the GPT-4 API, allowing developers to incorporate the model into their own applications and expand the capabilities of LLMs. As a result, external tools and internet data became accessible to these models; for instance, ChatGPT could browse the web through the Bing API. Yet despite these advancements, user prompts are still required to operate these applications, and AI-generated misinformation continues to create a frenzy on social media (hence the disclaimers from application providers seeking to avoid liability).
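To make this concrete, here is a minimal sketch of how a developer can expose a tool to the model through the OpenAI API. The get_weather tool and its schema are our own illustrative inventions, and exact parameters may vary between API versions; an API key is assumed to be configured.

```python
# Minimal sketch of exposing a "tool" to GPT-4 via the OpenAI API.
# The get_weather tool is a made-up example; OPENAI_API_KEY is assumed set.
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool the model may choose to call
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Zurich?"}],
    tools=tools,  # the model can now decide to call get_weather
)

# If the model chose to use the tool, it returns a structured call instead of text.
print(response.choices[0].message.tool_calls)
```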

Retrieval-Augmented Generation (RAG)

In 2020, Meta introduced the Retrieval-Augmented Generation (RAG) framework, which enhances LLMs by enabling them to pull in specific, relevant information from vast data pools. Instead of relying solely on pre-learned information, RAG actively searches for and incorporates specific, up-to-date details to generate responses, making the AI more accurate, informative, and contextually aware.

One highly disruptive use case for this framework is in biotech, where RAG can significantly accelerate drug discovery and research. RAG can assist researchers by quickly retrieving relevant studies, data sets, and existing drug information related to a specific disease or genetic marker. This use of RAG not only speeds up the research process but also enhances the ability to identify novel drug candidates, ultimately leading to more effective treatments.
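To make the retrieve-then-generate pattern concrete, here is a toy Python sketch. The bag-of-words embed() stands in for a real embedding model, and the sample biotech documents are invented; a production system would retrieve from a real literature index.

```python
# Toy sketch of the RAG pattern: retrieve the most relevant passages,
# then prepend them to the prompt sent to the LLM.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Placeholder "embedding": word counts instead of a neural vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = lambda c: math.sqrt(sum(v * v for v in c.values()))
    return dot / (norm(a) * norm(b)) if a and b else 0.0

documents = [  # invented examples standing in for a research corpus
    "Compound X showed strong binding affinity to the BRCA1 marker.",
    "Trial results for drug Y in cardiovascular patients were inconclusive.",
    "A 2019 study links gene Z expression to early-onset Alzheimer's.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

query = "Which compounds target the BRCA1 genetic marker?"
context = "\n".join(retrieve(query))
# The augmented prompt grounds the model in retrieved, up-to-date context.
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```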

Into the World of AI Agents [The new kids on the block]

Nowadays, research is moving beyond simple command-based interactions to something far more sophisticated—AI agents. This tech is still in its infancy; however, as Stelio argues, it's moving much faster than anyone could have expected.

So, what exactly is an AI agent? The essence lies in its capacity for independent thought and action. Unlike traditional models that operate on direct user prompts, AI agents are goal-oriented. They have a "brain" in the form of an LLM, combined with prompts that allow them to think through problems, and they take actions using tools exposed through APIs. The agent follows an iterative loop of planning, thinking, and acting, adapting its plan based on observations of the outcomes of its actions until it reaches the goal set by the user. Along the way, the agent can involve the user to obtain specific information or validation before proceeding; in such cases, the agent treats the human as just another agent (a human proxy).
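For intuition, here is a framework-free sketch of that plan-act-observe loop in Python. The llm callable, the tool registry, and the FINISH/ASK_HUMAN conventions are illustrative assumptions of ours, not any particular product's API.

```python
# Framework-free sketch of the plan-act-observe agent loop described above.
# llm is any callable that maps a prompt string to the model's reply string.
def run_agent(goal: str, tools: dict, llm, max_steps: int = 10) -> str:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # The LLM "brain" plans the next step from everything observed so far.
        decision = llm("\n".join(history) + "\nNext action as tool:input, or FINISH:answer")
        name, _, payload = decision.partition(":")
        name, payload = name.strip(), payload.strip()
        if name == "FINISH":
            return payload                      # goal reached
        if name == "ASK_HUMAN":
            payload = input(payload + " ")      # human in the loop as a proxy agent
            history.append(f"Human: {payload}")
            continue
        observation = tools[name](payload)      # act via a tool (e.g. an API call)
        history.append(f"Action: {decision}\nObservation: {observation}")
    return "Stopped: step limit reached."
```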

There are two main types of AI agents: autonomous AI agents and multi-agent systems (MAS).

Autonomous AI Agents

An autonomous AI agent is usually a single entity designed to function independently and respond to its surroundings. Common examples of LLM-based agents include customer-service chatbots that operate on their own, financial advisor bots, legal research assistants, and intelligent personal assistants for scheduling and reminders.

Multi-agent systems (MAS)

Multi-agent systems (MAS) introduce a collaborative or competitive dynamic among agents, each specializing in distinct tasks. These systems tackle more complex problems than a single agent could.

Frameworks such as LangChain and CrewAI are leading the way in offering tooling for building customized AI agent systems. For instance, in his blog post, João Moura, the creator and maintainer of CrewAI on GitHub, demonstrates how researcher, analyst, and investment advisor agents can collaborate to produce a recommendation on any publicly traded stock. Other examples he highlights include building a landing page from a single prompt and using agents to handle trips end to end, from planning to booking.

A further tip: if you want to try CrewAI, you can use the OpenAI custom GPT CrewAI Assist to help build your own custom AI agents; a minimal sketch of the framework follows below.
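In the spirit of the stock-analysis example above, here is a minimal CrewAI sketch of two collaborating agents. The roles, prompts, and ticker are our own inventions, CrewAI's API may differ slightly between versions, and an LLM API key is required to actually run it.

```python
# Minimal CrewAI sketch: two agents collaborating on a stock recommendation.
# Illustrative only; assumes an LLM API key is configured in the environment.
from crewai import Agent, Task, Crew

researcher = Agent(
    role="Research Analyst",
    goal="Collect recent news and filings about a given stock",
    backstory="You dig up relevant, up-to-date company information.",
)
advisor = Agent(
    role="Investment Advisor",
    goal="Turn the research into a clear buy/hold/sell recommendation",
    backstory="You write concise recommendations for non-experts.",
)

research = Task(
    description="Summarize the latest developments for ticker NVDA.",
    expected_output="A bullet-point research brief.",
    agent=researcher,
)
recommend = Task(
    description="Based on the research brief, draft a recommendation.",
    expected_output="A one-paragraph recommendation with rationale.",
    agent=advisor,
)

# Tasks run in order by default, so the advisor builds on the researcher's output.
crew = Crew(agents=[researcher, advisor], tasks=[research, recommend])
print(crew.kickoff())
```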

Multimodal Systems

Parallel to the development of autonomous agents is the emergence of multimodal AI.

This technology transcends the limitations of text-based interaction by incorporating visual, auditory, and sensor-based data. Integrating multiple data types allows AI to understand the world better, leading to more accurate interpretations and actions, and promises to unlock new applications and efficiencies across a broad spectrum of industries, from robotics to personalized digital interfaces.

For instance, a multimodal AI system in the healthcare sector could combine data from medical imaging (visual), patient voice recordings (auditory), and wearable health monitors (sensor data) to provide a more comprehensive and nuanced patient assessment. This integrated approach enables the AI to identify subtle health patterns or anomalies that might be overlooked when each data type is analyzed in isolation, supporting earlier diagnosis and personalized treatment plans.
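As a toy illustration of how such a system might combine modalities, here is a "late fusion" sketch: each modality is encoded separately, and the vectors are concatenated before a single model scores the result. The encoders and weights are random placeholders, purely illustrative, not a clinical system.

```python
# Toy "late fusion" sketch: encode each modality separately, concatenate
# the vectors, and score the fused representation with one model.
import numpy as np

rng = np.random.default_rng(0)
def encode_image(path):   return rng.normal(size=8)       # stand-in for an imaging model
def encode_audio(path):   return rng.normal(size=8)       # stand-in for a speech model
def encode_sensor(vals):  return np.asarray(vals, float)  # e.g. wearable vitals

def assess(scan, recording, vitals):
    fused = np.concatenate(
        [encode_image(scan), encode_audio(recording), encode_sensor(vitals)]
    )
    weights = rng.normal(size=fused.size)          # placeholder for trained weights
    return 1.0 / (1.0 + np.exp(-weights @ fused))  # toy risk score in (0, 1)

print(assess("chest_ct.png", "patient_note.wav", [72, 118, 36.6]))
```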

A Game-Changer Across the Board

However, the potential of multimodal AI extends far beyond healthcare. Imagine smart homes that adjust to our moods, autonomous vehicles that easily navigate the complexities of the road, and personalized digital interfaces that understand us better than ever before. Multimodal AI agents, equipped with the ability to perceive the world through multiple senses, promise a future where technology is more intuitive, responsive, and aligned with human needs.

Multimodal AI represents a significant leap forward in our quest to create machines that understand and interact with the world in a more human-like manner. As we stand on the brink of this new era, the possibilities seem limitless. Multimodal AI is not just a technological upgrade; it's a transformative force that will redefine our relationship with technology.

"An agent will be able to help you with all your activities if you want it to. With permission to follow your online interactions and real-world locations, it will develop a powerful understanding of the people, places, and activities you engage in. It will get your personal and work relationships, hobbies, preferences, and schedule. You'll choose how and when it steps in to help with something or ask you to make a decision." - Bill Gates, November 09, 2023. source

The Future Landscape Shaped by AI

Our conversation with Stelio painted a vivid picture of a future where AI agents and multimodal AI are integral to our daily lives. From personal assistants that manage our schedules and preferences to legal and financial advisors that navigate vast amounts of data to provide personalized advice, the potential applications are boundless.

However, this future also comes with its own set of challenges and ethical considerations. Issues such as data privacy, intellectual property rights, and the potential for misuse call for a careful approach to developing and deploying these technologies. Still, Stelio emphasized that he remains optimistic, viewing these challenges as opportunities for innovation and improvement.

Conclusion

As we stand on the brink of a new era in AI, these insights offer both excitement and caution. The evolution of autonomous agents and multimodal AI is not just a technological revolution; it’s a paradigm shift in how we envision the role of machines in society. While the journey ahead is fraught with challenges, the potential for these technologies to enhance our lives, streamline our workflows, and open up new avenues for innovation is undeniable. As we navigate this uncharted territory, the focus on ethical, secure, and responsible AI will be more crucial than ever.

This article was co-written by Olivier Laplace, Managing Partner at Vi Partners, and Bruno Trivelli, EPFL robotics engineer, based on conversations with Stelio Tzonis. Your reactions and suggestions are welcome; please contact us directly.

Vi Partners has been investing in Technology and Healthcare for the past two decades, and increasingly in SaaS & AI, with investments in, for example, Vara, sibyllabiotech, Unique, Picterra, Acodis and Morgen.
