zilliztech/GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
Implements semantic similarity matching using embeddings (e.g., ONNX models) and pluggable vector stores (e.g., Faiss) to cache conceptually similar queries, reducing redundant API calls beyond what exact-match caching can. Provides both a Python library with adapter support for OpenAI and a Docker server that exposes the cache over HTTP for language-agnostic integration, with configurable temperature-based cache-bypass logic.
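The core idea can be sketched in a few lines: embed each query, compare against cached entries by cosine similarity, and return a cached answer only above a threshold, with higher temperatures occasionally bypassing the cache. This is a minimal illustration of the technique, not GPTCache's implementation; the toy character-count embedding stands in for a real ONNX/transformer model, and the names (`SemanticCache`, `toy_embed`) are hypothetical.

```python
import math
import random

def toy_embed(text):
    # Toy bag-of-characters embedding (assumption: stand-in for a real
    # embedding model such as an ONNX transformer).
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already unit-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer); a real cache
                           # would use a vector index such as Faiss

    def put(self, query, answer):
        self.entries.append((toy_embed(query), answer))

    def get(self, query, temperature=0.0):
        # Temperature-based bypass: at higher temperatures, sometimes skip
        # the cache so responses stay varied (sketch of the idea only).
        if temperature > 0 and random.random() < min(temperature / 2.0, 1.0):
            return None
        q = toy_embed(query)
        best, best_sim = None, -1.0
        for emb, answer in self.entries:
            sim = cosine(emb, q)
            if sim > best_sim:
                best, best_sim = answer, sim
        return best if best_sim >= self.threshold else None

cache = SemanticCache(threshold=0.9)
cache.put("what is the capital of france", "Paris")
print(cache.get("capital of france, what is it"))  # cache hit: Paris
print(cache.get("explain quantum entanglement"))   # cache miss: None
```

A rephrased query still hits the cache because its embedding lands close to the stored one, while an unrelated query falls below the threshold and misses.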
7,963 stars and 467,367 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars: 7,963
Forks: 570
Language: Python
License: MIT
Last pushed: Jul 11, 2025
Monthly downloads: 467,367
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/zilliztech/GPTCache"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
aiming-lab/SimpleMem
SimpleMem: Efficient Lifelong Memory for LLM Agents
zilliztech/memsearch
A Markdown-first memory system, a standalone library for any AI agent. Inspired by OpenClaw.
ascottbell/maasv
Memory Architecture as a Service — cognition layer for AI assistants. 3-signal retrieval,...
TeleAI-UAGI/telemem
TeleMem is a high-performance drop-in replacement for Mem0, featuring semantic deduplication,...
RichmondAlake/memorizz
MemoRizz: A Python library serving as a memory layer for AI applications. Leverages popular...