15 Agentic AI Frameworks & Systems from 2024

This year, we started our “AI Agents and Agentic Workflows” series to explore AI agents step by step: the vocabulary, how they work, and how to build them. The strong interest in the series and the large number of studies on agents showed that this was one of the most popular and important themes of the year. In 2025, agents will most likely reach new highs, and we will be covering that for you. Now, let’s review the agentic systems that emerged this year.

Here is a list of 15 agentic systems and frameworks of 2024:

GUI & Computer-Use Agents

GUI Agents: A Survey. GUI agents mimic how humans interact with software by clicking, typing, and navigating screens. This survey organizes their key features (how they see, think, plan, and act) and highlights challenges, future improvements, and essential tools for researchers. Claude 3.5's computer use feature, which lets the model click, type, and navigate screens like a human, was announced in October 2024; we covered it first in our weekly digest. → GUI Agents: A Survey
Agent S is an open agentic framework that helps AI interact with computers through GUIs, automating complex tasks like a human user would. It learns new systems, plans long tasks, and adapts to changing interfaces, improving task execution and reasoning with smart planning tools and multimodal AI models. → Agent S: An Open Agentic Framework that Uses Computers Like a Human
AgentStore is a platform designed to automate complex computer tasks by combining various digital agents, similar to an app store for AI. It allows users to add third-party agents and uses a “MetaAgent” with the “AgentToken” strategy to manage them, balancing specialized and general skills for different tasks. → AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant

Reasoning & Multi-Agent Frameworks

MALT improves reasoning through multi-agent LLM training. It shows how multiple LLMs can collaborate on complex problems by taking on different roles (generator, verifier, and refiner) and improving each other's outputs step by step. → MALT: Improving Reasoning with Multi-Agent LLM Training
PRefLexOR helps AI models teach themselves to reason better by refining their thinking through repeated steps. Combining ideas from preference optimization and reinforcement learning, it uses tools like dynamic knowledge graphs and sampling to refine answers for deep, reflective reasoning. This makes it useful for tasks like scientific research and cross-domain applications. → PRefLexOR: Preference-based Recursive Language Modeling for Exploratory Optimization of Reasoning and Agentic Thinking
Bel Esprit is a conversational agent that helps create AI pipelines by combining multiple models to handle complex tasks. It uses a multi-agent system where smaller agents work together to understand user needs, select suitable models, and build the pipeline. → Bel Esprit: Multi-Agent Framework for Building AI Model Pipelines
DynaSaur is an agent framework where agents dynamically create and execute actions as programs in a general-purpose language. These actions can be reused later, making the system more adaptive. DynaSaur can effectively handle unexpected scenarios and achieves top performance on the GAIA leaderboard. → DynaSaur: Large Language Agents Beyond Predefined Actions
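
The generator, verifier, and refiner roles described for MALT can be pictured with a small sketch. This is only an illustration of the role split at inference time, not the paper's training method: the three functions below are hypothetical stand-ins for LLM calls, and the arithmetic task is invented for the example.

```python
# Hypothetical stand-ins for three LLM roles; a real system would call models here.

def generator(question: str) -> str:
    """Draft a first answer (deliberately wrong for '2 + 2' to exercise the loop)."""
    return "5" if question == "2 + 2" else "unknown"


def verifier(question: str, answer: str) -> str:
    """Return a critique of the draft, or 'ok' when nothing is flagged."""
    if question == "2 + 2" and answer != "4":
        return "arithmetic error: 2 + 2 is 4"
    return "ok"


def refiner(question: str, answer: str, critique: str) -> str:
    """Revise the draft using the critique; pass it through if the critique is 'ok'."""
    if critique == "ok":
        return answer
    return "4" if "2 + 2 is 4" in critique else answer


def solve(question: str) -> dict:
    """Run the three roles in sequence and keep every intermediate output."""
    draft = generator(question)
    critique = verifier(question, draft)
    final = refiner(question, draft, critique)
    return {"draft": draft, "critique": critique, "final": final}


if __name__ == "__main__":
    print(solve("2 + 2"))
```

Keeping the draft and critique alongside the final answer makes each role's contribution visible, which is the kind of intermediate record a multi-agent setup would need in order to judge the roles separately.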

Autonomous Research & Data Science

Agent K v1.0 is an autonomous agent that manages the entire data science lifecycle. It learns from experience using a flexible reasoning framework to tackle diverse tasks. Without human input, it optimizes data processing, feature engineering, and model tuning, achieving a 92.5% success rate across domains like NLP and computer vision. → Kolb-Based Experiential Learning for Generalist Agents with Human-Level Kaggle Data Science Performance
The AI Scientist is a framework that enables AI to conduct scientific research independently. It generates ideas, writes code, runs experiments, visualizes results, and drafts full research papers, even simulating peer reviews. Applied to machine learning topics, it produces papers at a cost of under $15 each. → The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery
AutoKaggle is a framework that uses a multi-agent system to automate coding, debugging, and testing for tabular data tasks, while allowing user input at any stage. With tools for data cleaning, feature engineering, and modeling, it streamlines workflows. → AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions
AgentInstruct is a Microsoft Research framework that automatically generates high-quality synthetic data from simple inputs like text or code. It created a dataset to teach models skills like writing, coding, and comprehension, which led to large performance gains across benchmarks. → AgentInstruct: Toward Generative Teaching with Agentic Flows

World Modeling & Simulation

Automated Design of Agentic Systems (ADAS) allows a "meta-agent" to write and improve agents using code. This method created agents that outperformed current designs and worked well across different tasks, showing the potential for smarter, more adaptable AI systems. → Automated Design of Agentic Systems
WALL-E is an agent system that combines an LLM with learned rules to understand and navigate environments. It aligns LLM predictions with real-world dynamics and uses Model-Predictive Control (MPC) to plan actions efficiently. → WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents
“Generative Agent Simulations of 1,000 People” proposes a new agent architecture that simulates human behavior by modeling the attitudes and actions of 1,052 real individuals. Using LLMs and interview data, these agents replicate participants' survey responses 85% as accurately as the participants replicate their own answers two weeks later. → LLM Agents Grounded in Self-Reports Enable General-Purpose Simulation of Individuals
Generative World Explorer (Genex) is a framework that enables AI agents to mentally explore 3D environments, like urban scenes, updating their understanding without constant physical exploration, similar to how humans imagine unseen parts of the world. → Generative World Explorer
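
To make the Model-Predictive Control idea mentioned for WALL-E more concrete, here is a minimal sketch. It is not WALL-E's implementation: the one-dimensional world, the `world_model` and `apply_rules` functions, and the `WALL` and `GOAL` constants are all assumptions made up for illustration. The point is only the loop: predict outcomes with a model, correct predictions with a rule, plan over a short horizon, execute the first action, then replan.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # move left, stay, move right
WALL = 5              # hypothetical rule: positions above 5 are unreachable
GOAL = 4              # hypothetical target position


def world_model(state: int, action: int) -> int:
    """Toy predictor standing in for an LLM world model."""
    return state + action


def apply_rules(predicted: int) -> int:
    """Toy 'learned rule' that corrects predictions violating the environment."""
    return min(predicted, WALL)


def predict(state: int, action: int) -> int:
    """Model prediction aligned with the rule."""
    return apply_rules(world_model(state, action))


def cost(state: int) -> int:
    """Distance to the goal; lower is better."""
    return abs(state - GOAL)


def plan(state: int, horizon: int = 3) -> int:
    """Return the first action of the cheapest action sequence over the horizon."""
    best_action, best_cost = 0, float("inf")
    for seq in product(ACTIONS, repeat=horizon):
        s, total = state, 0
        for a in seq:
            s = predict(s, a)
            total += cost(s)
        if total < best_cost:
            best_action, best_cost = seq[0], total
    return best_action


def run(start: int, steps: int = 6) -> list:
    """Replan at every step; here the toy environment behaves like the model."""
    state, trajectory = start, [start]
    for _ in range(steps):
        state = predict(state, plan(state))
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print(run(0))
```

Enumerating every action sequence is only practical for tiny action sets and short horizons; it is used here because it keeps the planning step easy to read.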

For a 2025 update, see our deep dive on the Qwen-Agent framework.
For a practitioner's perspective on deploying agents safely, see our interview with Rubrik's CPO Anneka Gupta.