Agent Memory in 2026: What Actually Works for Persistent AI

977 repos, 5 domains, 10+ names for the same concept. A decision guide for builders navigating the most fragmented category in AI infrastructure.

Graham Rowe · April 01, 2026 · Updated daily with live data
Tags: rag · agents · vector-db · mcp · embeddings

Your AI agent forgets everything between sessions. You know this is a problem. You search GitHub for "agent memory" and find 977 repos across five different domains, using ten different names for roughly the same concept. Welcome to the most fragmented category in AI infrastructure.

This guide exists because the fragmentation IS the story. Agent memory in 2026 is pre-nucleation: builders are shipping faster than the ecosystem can agree on vocabulary, architecture, or even which database to use. PT-Edge tracks all 977 of these repos and scores them daily on quality, maintenance, and adoption. Here's what that data says about what actually works.

The context window paradox: why memory systems aren't optional

Start with the assumption most builders make: "I have a 1M token context window, so I don't need a memory system." The data says otherwise.

Newsletter coverage from the last 90 days tells a clear story. Claude supports 1M tokens. GPT-5.4 supports 1M tokens. But Latent Space reported in March 2026 that GPT-5.4 accuracy degrades to 36% at 512K-1M tokens. Context windows have plateaued at 1M for two years running, constrained by HBM and DRAM limits. And even within that 1M, practical performance drops off a cliff above 256K tokens.

This is the paradox: the context window is large enough that you think you don't need memory, but too unreliable at scale to actually replace it. Every project in this guide exists because that gap is real. The 18 newsletter articles covering memory and context in the last 90 days — from Latent Space, Zvi, Simon Willison, and One Useful Thing — all converge on the same conclusion: persistent memory systems are becoming THE way to give agents long-term state.

The landscape at scale: 977 repos, no agreed vocabulary

PT-Edge classifies agent memory repos under the subcategory "agent-memory-systems" across five domains: RAG (132 repos), agents (122), MCP (358), vector-db (146), and embeddings (219). Add in adjacent categories like agent-memory-infrastructure (208 repos), claude-code-memory (70), and session-context-memory (59), and the total universe exceeds 1,300 repos touching memory in some form.

The category names tell you everything: agent-memory-systems, agent-memory-infrastructure, agent-memory-architectures, session-context-memory, claude-code-memory, ai-session-persistence. That's just six of the ten-plus subcategory names in use for variations on the same problem. When an ecosystem can't agree on what to call itself, that's a classic pre-nucleation signal: the space is real, the demand is real, but consolidation hasn't happened yet.

The creation velocity confirms it: 55 new MCP memory repos appeared in the last 7 days alone. Session-context-memory shows 12x acceleration week-over-week. All of this is "creation without buzz" — builders are shipping memory infrastructure before the narrative catches up.

The leaders: four projects defining the category

| Project | Score | Stars | Commits (30d) | Approach |
| --- | --- | --- | --- | --- |
| mem0 | 72/100 | 49,646 | 180 | Multi-level memory (user, session, agent state) |
| claude-mem | 87/100 | 34,460 | 82 | Progressive disclosure, skill-based search |
| memvid | 57/100 | 13,421 | 5 | Single-file .mv2 format, sub-5ms retrieval |
| cognee | 80/100 | 13,204 | 372 | Graph + vector hybrid, ontology grounding |

mem0 (72/100, 49,646 stars) is the category definer. Apache-2.0 licensed, 180 commits in the last 30 days, with Python and JavaScript SDKs plus integrations into LangGraph and CrewAI. It implements multi-level memory — user, session, and agent state — and claims 26% higher accuracy with 90% lower token usage compared to full-context approaches. If you're starting from zero and want the most established option, mem0 is where you begin.
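To make the multi-level idea concrete, here's a toy sketch of user-, session-, and agent-scoped stores in plain Python. Everything here is illustrative: mem0's real SDK scopes memories via parameters like `user_id` on `Memory.add()` and `Memory.search()`, and extracts facts with an LLM rather than storing raw text.

```python
from collections import defaultdict

class MultiLevelMemory:
    # Illustrative only -- not mem0's actual API. Real systems also
    # extract and deduplicate facts rather than storing raw strings.
    def __init__(self):
        self.stores = defaultdict(list)  # (level, id) -> list of memories

    def add(self, text, *, user_id=None, session_id=None, agent_id=None):
        if user_id:
            scope = ("user", user_id)        # durable user facts
        elif session_id:
            scope = ("session", session_id)  # per-conversation state
        else:
            scope = ("agent", agent_id)      # the agent's own learned state
        self.stores[scope].append(text)

    def search(self, keyword, *, user_id):
        # Naive keyword match; real systems use embeddings plus reranking.
        return [m for m in self.stores[("user", user_id)]
                if keyword.lower() in m.lower()]

mem = MultiLevelMemory()
mem.add("Prefers TypeScript for new services", user_id="alice")
mem.add("Currently debugging a flaky CI job", session_id="s-42")
print(mem.search("typescript", user_id="alice"))
```

The point of the separation: session state can be discarded cheaply, while user-level facts persist across sessions and agents.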

claude-mem (34,460 stars) is the Claude Code ecosystem play. It captures everything Claude does, compresses it with AI, and provides progressive disclosure — surfacing relevant memories without flooding the context window. With 82 commits and 8 releases in the last 30 days, it's the fastest-growing memory plugin in the coding agent space.
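The progressive-disclosure pattern itself is simple to sketch: search returns cheap one-line summaries, and the full record is expanded only on demand. The class below is a hypothetical illustration of that pattern, not claude-mem's actual plugin API.

```python
class ProgressiveMemory:
    # Hypothetical sketch of progressive disclosure, not claude-mem's API.
    def __init__(self):
        self._memories = {}  # id -> (one-line summary, full text)
        self._next_id = 0

    def capture(self, summary, full_text):
        self._next_id += 1
        self._memories[self._next_id] = (summary, full_text)
        return self._next_id

    def search(self, keyword):
        # Surface only cheap summaries so the context window isn't flooded.
        return [(mid, summary) for mid, (summary, _) in self._memories.items()
                if keyword.lower() in summary.lower()]

    def expand(self, memory_id):
        # Pay the full token cost only for memories the agent actually wants.
        return self._memories[memory_id][1]

pm = ProgressiveMemory()
mid = pm.capture("Refactored the auth middleware",
                 "Moved JWT validation out of each handler into shared "
                 "middleware; updated the test fixtures to match.")
print(pm.search("auth"))
print(pm.expand(mid))
```

The design choice worth noticing: the expensive artifact (the full trace) is stored once, but the agent only ever pays context tokens for the summaries it browses.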

Cognee (13,204 stars) takes a different approach: graph plus vector hybrid retrieval with ontology grounding and audit trails. If your use case requires understanding relationships between memories, not just retrieving similar ones, Cognee's knowledge graph approach is the most mature option. Its 372 commits in the last 30 days make it the most actively developed project in the entire category.

Memvid (13,421 stars) is the radical simplicity play. It stores memories in a single .mv2 file with sub-5ms retrieval — no database server, no network calls, no infrastructure. Written in Rust with Node.js, Python, and Rust SDKs. At 13.4K stars with only 5 commits in 30 days, it's a case where the core design was right enough that it doesn't need constant iteration.

Architecture decision 1: storage backend

The most fundamental choice in agent memory is where memories live. The research reveals four distinct approaches, each with real trade-offs.

Vector database (Qdrant, Chroma, Milvus, LanceDB)

The established path. Store memories as embeddings, retrieve by semantic similarity. Projects like GPTCache (7,963 stars, 467,367 downloads/month) and memory-lancedb-pro (2,280 stars, 250 commits in 30 days) build on this foundation. The advantage is mature tooling and well-understood retrieval semantics. The risk is that similarity search alone misses temporal relationships, causal chains, and structured knowledge.
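The retrieval core of this approach fits in a few lines. The sketch below uses a toy hashing embedding so it runs with no model or database; a real system would swap in an embedding model and a vector store such as Qdrant or LanceDB, but the add/embed/rank-by-cosine loop is the same.

```python
import math

def embed(text, dim=256):
    # Toy bag-of-words hashing embedding -- a stand-in for a real
    # embedding model, used here only so the example is self-contained.
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Vectors are unit-normalised, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

class VectorMemory:
    def __init__(self):
        self.items = []  # (text, embedding) pairs

    def add(self, text):
        self.items.append((text, embed(text)))

    def search(self, query, k=2):
        qv = embed(query)
        return sorted(self.items, key=lambda item: -cosine(qv, item[1]))[:k]

vm = VectorMemory()
vm.add("User prefers dark mode in the editor")
vm.add("Deployment runs on Kubernetes in us-east-1")
top = vm.search("which editor theme does the user prefer", k=1)
print(top[0][0])
```

This also makes the weakness visible: similarity is all the index knows. Nothing here records that one memory happened before another, or that one caused another.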

Graph database (Cognee, Graphiti, Memary)

Graphiti (23,665 stars) from Zep builds real-time knowledge graphs for agents. Memary (2,576 stars) combines entity knowledge graphs with episodic memory streams. The graph approach excels when your agent needs to reason about relationships — "what projects does this user work on?" rather than "what's similar to this query?" The trade-off is operational complexity and the need for graph database expertise.

SQL-native (Memori, Memobase)

Memori (12,351 stars) makes the case that you don't need a specialised database at all. It intercepts LLM conversations automatically, stores memories in SQL, and claims 81.95% accuracy at approximately 5% of the token usage of full-context approaches. Memobase (2,599 stars) takes a similar SQL-first approach with user profiles and timestamped timelines, delivering sub-100ms retrieval and 40-50% token cost reduction. If you already run Postgres or MySQL, the argument for adding a vector database just for memory becomes harder to justify.
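A minimal SQL-first memory layer needs little more than one table. The sketch below uses SQLite and a `LIKE` query to stay dependency-free; the table name and schema are illustrative, not Memori's or Memobase's real schema, and production systems would use SQLite FTS5 or Postgres full-text search for retrieval.

```python
import sqlite3

# Illustrative schema, not any project's actual one.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE memories (
        id INTEGER PRIMARY KEY,
        user_id TEXT NOT NULL,
        content TEXT NOT NULL,
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def remember(user_id, content):
    conn.execute("INSERT INTO memories (user_id, content) VALUES (?, ?)",
                 (user_id, content))

def recall(user_id, keyword, limit=5):
    # LIKE keeps the example self-contained; swap in FTS5 or tsvector
    # for anything beyond a toy.
    rows = conn.execute(
        "SELECT content FROM memories WHERE user_id = ? AND content LIKE ? "
        "ORDER BY created_at DESC, id DESC LIMIT ?",
        (user_id, f"%{keyword}%", limit))
    return [r[0] for r in rows]

remember("alice", "Works on the billing service, owns invoice generation")
remember("alice", "Prefers rebase over merge commits")
print(recall("alice", "billing"))
```

The appeal is operational: memories live in the same database as everything else, with transactions, backups, and SQL joins for free.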

File-based (Memvid, ReMe, Acontext)

The emerging counter-narrative. Memvid's single-file approach, ReMe's human-readable Markdown persistence (2,185 stars, 52 commits in 30 days), and Acontext's skills-as-Markdown-files pattern all point the same direction: sometimes the simplest storage is a file. No server, no schema migration, no connection pooling. A Hacker News post titled "You Don't Need a Vector Database" (March 2026) resonated because this sentiment is real.
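The whole file-based pattern can be sketched in a few lines: append notes to per-topic Markdown files, retrieve by scanning them. Paths and layout here are illustrative, not ReMe's or Acontext's actual on-disk format.

```python
import tempfile
from pathlib import Path

# A temp dir keeps the example self-contained; real setups would use
# a project-local directory checked into (or alongside) the repo.
MEMORY_DIR = Path(tempfile.mkdtemp())

def remember(topic, note):
    # One human-readable Markdown file per topic; appends are cheap,
    # and the files are diffable and hand-editable.
    path = MEMORY_DIR / f"{topic}.md"
    with path.open("a") as f:
        f.write(f"- {note}\n")

def recall(keyword):
    # Retrieval is a plain scan -- viable because agent memories number
    # in the hundreds, not millions.
    hits = []
    for path in sorted(MEMORY_DIR.glob("*.md")):
        for line in path.read_text().splitlines():
            if keyword.lower() in line.lower():
                hits.append((path.name, line.removeprefix("- ")))
    return hits

remember("project-conventions", "Tests live next to source files, not in a tests/ dir")
remember("user-preferences", "Use British English in documentation")
print(recall("tests"))
```

Debuggability is the quiet win: when a memory is wrong, you open the file and fix it.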

Architecture decision 2: memory structure

Beyond storage, you need to decide what kind of memories your agent keeps. The research shows three distinct types, and the most sophisticated systems implement all three.

| Memory type | What it stores | Best projects |
| --- | --- | --- |
| Episodic | What happened: session logs, conversation traces, interaction history | mem0, Memary, EverMemOS |
| Semantic | What things mean: entity relationships, knowledge graphs, user profiles | Cognee, Graphiti, Memobase |
| Procedural | How to do things: learned skills, preferences, workflows | OpenMemory, MemMachine, MemOS |

MemMachine (4,826 stars) implements all three explicitly: episodic memory backed by graph storage, profile memory in SQL, and working memory for the current session. OpenMemory (3,604 stars) takes a similar multi-sector approach with temporal reasoning built in. MemOS (6,790 stars, 283 commits in 30 days) reports 43.70% accuracy gains over OpenAI Memory and 35.24% token reduction.

The practical takeaway: if your agent just needs to remember past conversations, episodic memory alone works fine (mem0, Memvid). If your agent needs to build and maintain a model of its user or domain, you need semantic memory (Cognee, Graphiti). If your agent needs to learn and improve at tasks over time, you need procedural memory — and the projects implementing it are the newest and least proven.
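The three-type split can be sketched as a single agent-facing structure. The names below are illustrative, not any listed project's actual API; the point is only that each type has a different shape and a different write path.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    # Illustrative structure, not MemMachine's or OpenMemory's real API.
    episodic: list = field(default_factory=list)    # what happened
    semantic: dict = field(default_factory=dict)    # what things mean
    procedural: dict = field(default_factory=dict)  # how to do things

    def log_event(self, event):
        self.episodic.append(event)

    def learn_fact(self, entity, fact):
        self.semantic.setdefault(entity, []).append(fact)

    def learn_skill(self, name, steps):
        self.procedural[name] = steps

m = AgentMemory()
m.log_event("User asked to refactor the payments module")
m.learn_fact("payments-module", "depends on the billing service")
m.learn_skill("safe-refactor",
              ["write characterisation tests", "refactor", "rerun tests"])
print(m.semantic["payments-module"])
```

Episodic entries are append-only logs, semantic entries are keyed by entity, and procedural entries are replaceable recipes, which is why systems that implement all three tend to back each with different storage.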

The Claude Code memory ecosystem

A rich ecosystem has formed specifically around giving Claude Code and similar coding agents persistent memory. This matters because coding agents have unique memory requirements: they need to remember project structure, coding conventions, past decisions, and the reasoning behind architectural choices.

| Project | Score | Stars | What it does |
| --- | --- | --- | --- |
| claude-mem | 87/100 | 34,460 | A Claude Code plugin that automatically captures everything Claude does... |
| OpenViking | 85/100 | 7,606 | OpenViking is an open-source context database designed specifically for AI... |
| MemOS | 69/100 | 6,790 | AI memory OS for LLM and Agent systems (moltbot, clawdbot, openclaw), enabling... |
| memory-lancedb-pro | 60/100 | 2,280 | Enhanced LanceDB memory plugin for OpenClaw — Hybrid Retrieval (Vector +... |
| cipher | 67/100 | 3,578 | Byterover Cipher is an opensource memory layer specifically designed for... |

OpenViking (7,606 stars, 371 commits in 30 days) from ByteDance's Volcengine provides a three-tier hierarchical storage system (L0/L1/L2) with both directory-based and semantic retrieval. Cipher (3,578 stars) implements dual-layer memory specifically for code concepts and AI reasoning traces, supporting multiple LLMs and vector stores. The 70 repos in the claude-code-memory subcategory and 19 new ones per week show this ecosystem is still expanding rapidly.

The "you don't need a vector database" counter-narrative

Not everyone is adding infrastructure. A growing contingent argues that agent memory should be simpler, not more complex.

The evidence: Memvid stores everything in a single file. ReMe persists memories as human-readable Markdown. Acontext (3,154 stars from memodb-io) extracts conversation traces into editable Markdown files with no embeddings needed. SimpleMem (3,182 stars) uses a three-stage semantic compression pipeline and claims 64% better performance than Claude's native memory.

On Hacker News, "You Don't Need a Vector Database" generated 24 comments in March 2026. A "file-based agent memory framework" Show HN got engagement. A Google PM open-sourced an "Always On Memory Agent" that ditches vector DBs entirely. The pattern: builders who've tried the full vector-DB-plus-embeddings stack are finding that simpler approaches often work as well or better for agent memory specifically, because agent memories are typically short, structured, and relatively few compared to document corpora.

This doesn't mean vector databases are wrong. It means the default assumption that agent memory requires one is worth questioning. If your agent handles dozens to thousands of memories, a file or SQL table may outperform a vector database on latency, operational simplicity, and debuggability.

What's shipping: release velocity as a quality signal

Talk is cheap. Releases aren't. Here's what the last 30 days of release activity looks like for the most active memory projects:

| Project | Score | Releases (30d) | Commits (30d) | Downloads/mo |
| --- | --- | --- | --- | --- |
| mcp-memory-service | 73/100 | 25 | 153 | |
| claude-mem | 87/100 | 8 | 82 | 7,318 |
| cognee | 80/100 | 7 | 372 | |
| mem0 | 72/100 | 5 | 180 | |
| memvid | 57/100 | 2 | 5 | |

mcp-memory-service (1,504 stars) stands out: 25 releases in 30 days. That's nearly one release per day. It provides persistent memory for AI agent pipelines with Claude integration, and the shipping pace suggests a team responding to real user feedback at speed.

Cognee's 372 commits in 30 days and mem0's 180 commits tell a similar story. The top memory projects are not in maintenance mode. They're in rapid iteration, adding features, fixing bugs, and responding to an ecosystem that's pulling hard on memory infrastructure.

Where this is heading: platform absorption risk

The biggest open question for every project in this guide: will the platforms absorb them?

The signals are clear. Claude launched Memory Export in March 2026. Microsoft Copilot Tasks added persistent multi-session memory in February. IBM published research showing that reusing agent strategies (a form of procedural memory) improves task completion rates. When three of the largest AI companies ship memory features within 60 days of each other, that's not coincidence — that's platform convergence.

The newsletter signal — 18 articles in 90 days — points to memory becoming a first-class platform feature rather than a third-party add-on. The question isn't whether platforms will have memory, but whether purpose-built memory projects offer enough advantage to survive alongside platform-native options.

The likely survivors are projects that do things platforms won't: cross-platform memory that works across Claude, GPT, and Gemini (mem0's positioning); memory architectures that are fundamentally different from what platforms ship (Cognee's knowledge graphs, Memvid's single-file approach); and memory for specific verticals where the general platform solution isn't good enough (coding agents, autonomous agents, multi-agent systems).

The projects most at risk are thin memory wrappers that add a persistence layer without a differentiated retrieval or storage mechanism. If your entire value proposition is "remember things between sessions," the platform will do that natively.

How to use this data

Every project in this guide has a quality-scored page in our directory, updated daily.

Quality scores update daily from live GitHub, PyPI, and npm data. In a space moving this fast — 55 new repos per week, 25 releases per month from the most active project — yesterday's recommendation can be wrong tomorrow. The scores do the monitoring so you don't have to.
