Memory is a core component of modern AI agents, and it is gaining more attention as agents tackle longer tasks and more complex environments. It helps agents store past experiences, retrieve useful information, keep track of context, and use what happened before to make better decisions later. To map the current landscape, we've compiled a list of fresh memory architectures and frameworks shaping how AI agents remember, learn, and reason over time:
Agentic Memory (AgeMem)
This framework unifies short-term memory (STM) and long-term memory (LTM) inside the agent itself, so memory management becomes part of the agent's decision-making process. The agent decides what to store, retrieve, summarize, or discard. Plus, training with reinforcement learning improves performance and memory efficiency on long tasks. → Read more
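The idea of treating memory operations as explicit agent actions can be illustrated with a minimal sketch. The class and method names below are hypothetical, and the "summarize" step is a naive string join standing in for an LLM call:

```python
from collections import deque

class AgenticMemory:
    """Minimal sketch: unified short-term + long-term memory where
    store / summarize / retrieve / discard are explicit agent actions."""

    def __init__(self, stm_capacity=3):
        self.stm = deque(maxlen=stm_capacity)  # short-term: recent observations
        self.ltm = {}                          # long-term: keyed summaries

    def store(self, observation):
        self.stm.append(observation)

    def summarize(self, key):
        # Compress STM into one LTM entry; a real agent would ask
        # the LLM to produce the summary instead of joining strings.
        self.ltm[key] = " | ".join(self.stm)
        self.stm.clear()

    def retrieve(self, key):
        return self.ltm.get(key)

    def discard(self, key):
        self.ltm.pop(key, None)

mem = AgenticMemory()
mem.store("user asked about flights")
mem.store("agent found 3 options")
mem.summarize("trip-planning")
```

An RL-trained policy would learn *when* to call each of these operations; here they are just invoked by hand.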
Memex
An indexed experience memory mechanism that stores full interactions in an external memory database and keeps only compact summaries and indices in context. The agent can retrieve exact past information when needed. This improves long-horizon reasoning while keeping context small. → Read more
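The split between an external store of full interactions and compact in-context (index, summary) pairs can be sketched as follows. This is an illustrative toy, not Memex's actual implementation, and all names are assumptions:

```python
class IndexedExperienceMemory:
    """Sketch: full records live in an external store; only small
    (index, summary) pairs stay in the agent's context window."""

    def __init__(self):
        self.db = {}        # external database: full interaction records
        self.context = []   # in-context: compact (index, summary) pairs

    def record(self, interaction, summary):
        idx = len(self.db)
        self.db[idx] = interaction
        self.context.append((idx, summary))
        return idx

    def recall(self, idx):
        # Retrieve the exact past interaction only when it is needed
        return self.db[idx]

mem = IndexedExperienceMemory()
i = mem.record("full 40-turn debugging transcript ...",
               "fixed auth bug in login flow")
```

The context grows by one short summary per interaction, while the full transcript stays out of the prompt until `recall` is called.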
MemRL
Helps AI agents improve over time using episodic memory instead of retraining. The system stores past experiences and learns which strategies work best through reinforcement learning. This way, MemRL separates stable reasoning from flexible memory and lets agents adapt and get better without updating model weights. → Read more
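Learning which stored strategies work best, without touching model weights, can be sketched as a running value estimate over (situation, strategy) episodes. This is a simplification under assumed names, not MemRL's actual algorithm:

```python
class EpisodicMemoryRL:
    """Sketch: episodic memory entries carry a running value estimate
    updated from reward; the model itself is never retrained."""

    def __init__(self, lr=0.5):
        self.values = {}  # (situation, strategy) -> estimated return
        self.lr = lr

    def update(self, situation, strategy, reward):
        key = (situation, strategy)
        old = self.values.get(key, 0.0)
        # Incremental value update, as in simple tabular RL
        self.values[key] = old + self.lr * (reward - old)

    def best_strategy(self, situation):
        candidates = {s: v for (sit, s), v in self.values.items()
                      if sit == situation}
        return max(candidates, key=candidates.get) if candidates else None

memrl = EpisodicMemoryRL()
memrl.update("flaky test failure", "rerun the suite", reward=0.2)
memrl.update("flaky test failure", "isolate the test", reward=0.9)
```

All adaptation lives in the memory table; the frozen model only consumes the retrieved best strategy.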
UMA (Unified Memory Agent)
An RL-trained agent that actively manages its memory while answering questions. It uses a dual memory system: a compact global summary plus a structured key–value Memory Bank supporting create, update, delete, and reorganize operations. This improves long-horizon reasoning and state tracking. → Read more
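The dual memory with create/update/delete/reorganize operations can be sketched like this. The class shape and the string-joining "reorganize" are assumptions for illustration only:

```python
class MemoryBank:
    """Sketch: compact global summary + structured key-value bank
    that the agent edits with explicit operations."""

    def __init__(self):
        self.summary = ""   # compact global summary of the episode
        self.bank = {}      # structured key-value entries

    def create(self, key, value):
        self.bank[key] = value

    def update(self, key, value):
        if key in self.bank:
            self.bank[key] = value

    def delete(self, key):
        self.bank.pop(key, None)

    def reorganize(self, merged_key, keys):
        # Fold several entries into one consolidated entry
        merged = "; ".join(self.bank[k] for k in keys if k in self.bank)
        for k in keys:
            self.bank.pop(k, None)
        self.bank[merged_key] = merged

bank = MemoryBank()
bank.create("fact1", "Alice moved to Paris")
bank.create("fact2", "Bob moved to Rome")
bank.reorganize("relocations", ["fact1", "fact2"])
```

In UMA these edits are chosen by the RL-trained policy rather than scripted by hand.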
Pancake
A high-performance hierarchical memory system for LLM agents that speeds up large-scale vector memory retrieval. It combines 3 techniques: 1) multi-level index caching (to exploit access patterns), 2) a hybrid graph index shared across multiple agents, and 3) coordinated GPU–CPU execution for fast updates and search. → Read more
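The first technique, index caching that exploits repeated access patterns, can be illustrated with a per-agent LRU cache sitting in front of a shared index. The dictionary lookup below is a stand-in for a real ANN graph search, and all names are hypothetical:

```python
from collections import OrderedDict

class CachedVectorIndex:
    """Sketch: per-agent LRU cache in front of a shared index, so
    repeated queries skip the expensive search path."""

    def __init__(self, shared_index, capacity=2):
        self.shared = shared_index      # index shared across agents
        self.cache = OrderedDict()      # per-agent LRU cache
        self.capacity = capacity
        self.hits = self.misses = 0

    def search(self, query):
        if query in self.cache:
            self.hits += 1
            self.cache.move_to_end(query)   # mark as recently used
            return self.cache[query]
        self.misses += 1
        result = self.shared[query]         # stand-in for graph search
        self.cache[query] = result
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict least recently used
        return result

index = CachedVectorIndex({"q1": ["doc3", "doc7"], "q2": ["doc1"]})
index.search("q1")
index.search("q1")  # second lookup served from cache
```

Pancake layers this kind of caching across multiple index levels and coordinates it with GPU–CPU execution; the sketch shows only the single-level idea.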
Conditional memory
A model/agent selectively looks up stored knowledge during inference instead of activating everything. This is implemented with techniques like sparse memory tables (e.g., Engram N-gram lookup), key–value memory slots, routing/gating networks that decide when to query memory, and hashed indexing for O(1) retrieval. This lets agents access specific knowledge cheaply without increasing model size or context. → Read more in our article
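Two of these ingredients, hashed O(1) indexing over a sparse table and a gate that decides when to query it, can be sketched together. The bucket scheme and the toy capitalization-based gate are illustrative assumptions, not any specific system's design:

```python
class HashedNgramMemory:
    """Sketch: sparse memory table keyed by hashed n-grams,
    giving O(1) reads and writes."""

    def __init__(self, num_buckets=1024):
        self.table = [None] * num_buckets
        self.num_buckets = num_buckets

    def _bucket(self, ngram):
        # Hashed indexing: constant-time mapping of n-gram -> slot
        return hash(ngram) % self.num_buckets

    def write(self, ngram, value):
        self.table[self._bucket(ngram)] = value

    def read(self, ngram):
        return self.table[self._bucket(ngram)]

def gate(tokens):
    # Toy routing rule: only query memory when the input contains
    # entity-like (capitalized) tokens; a real gate is a learned network.
    return any(t[:1].isupper() for t in tokens)

ngram_mem = HashedNgramMemory()
ngram_mem.write(("New", "York"), "city in the US")
```

Because the table is only consulted when the gate fires, most inference steps pay no memory cost at all.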
Multi-Agent Memory from a Computer Architecture Perspective
A short but interesting paper that envisions memory for multi-agent LLM systems as a computer architecture. It introduces ideas such as shared vs. distributed memory and a three-layer memory hierarchy (I/O, cache, memory), highlights missing protocols for cache sharing and memory access between agents, and emphasizes memory consistency as a key challenge. → Read more
