JingbiaoMei/ATM-Bench

ATM-Bench: A benchmark for long-term personalized memory QA spanning ~4 years of multimodal data (images, videos, emails). Features referential queries, evidence-grounded answering, and multi-source reasoning. Paper: "According to Me: Long-Term Personalized Referential Memory QA"

/ 100

Experimental

No Package · No Dependents

Maintenance 13 / 25

Adoption 5 / 25

Maturity 9 / 25

Community 0 / 25

Stars

Forks

—

Language

Python

License

MIT

Category

agent-memory-systems

Last pushed

Mar 12, 2026

Commits (30d)

Agent Memory Systems · 122 tools

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/JingbiaoMei/ATM-Bench"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
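
The same request can be sketched in Python using only the standard library. This is a minimal sketch, not a documented client: the helper names `quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/rag"


def quality_url(owner: str, repo: str) -> str:
    """Build the quality endpoint URL for an owner/repo pair (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the path.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON. The page does not document the
    response schema, so the parsed value is returned as-is.
    """
    request = urllib.request.Request(
        quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# record = fetch_quality("JingbiaoMei", "ATM-Bench")
# print(json.dumps(record, indent=2))
```

Unauthenticated use is limited to 100 requests/day, so caching the result locally is a reasonable choice when polling many repositories.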

Featured in

Agent Memory in 2026: What Actually Works for Persistent AI

We Audited crewAI's AI Dependencies: Here's What the Data Says

Higher-rated alternatives

MemoriLabs/Memori

SQL Native Memory Layer for LLMs, AI Agents & Multi-Agent Systems

volcengine/OpenViking

OpenViking is an open-source context database designed specifically for AI Agents(such as...

zjunlp/LightMem

[ICLR 2026] LightMem: Lightweight and Efficient Memory-Augmented Generation

mem0ai/mem0

Universal memory layer for AI Agents

memodb-io/memobase

User Profile-Based Long-Term Memory for AI Chatbot Applications.
