zilliztech/GPTCache
Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
Implements semantic similarity matching using embeddings (e.g., ONNX models) and pluggable vector stores (e.g., Faiss) to cache conceptually similar queries, reducing redundant API calls beyond what exact-match caching can. Provides both a Python library with adapter support for OpenAI and a Docker server that exposes the cache over HTTP for language-agnostic integration, with configurable temperature-based cache-bypass logic.
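The core idea can be sketched in a few lines: embed each query, compare against cached entries by cosine similarity, and return a cached answer only above a threshold, with higher temperatures occasionally bypassing the cache. This is a minimal illustration of the technique, not GPTCache's implementation; the toy character-count embedding stands in for a real ONNX/transformer model, and the names (`SemanticCache`, `toy_embed`) are hypothetical.

```python
import math
import random

def toy_embed(text):
    # Toy bag-of-characters embedding (assumption: stand-in for a real
    # embedding model such as an ONNX transformer).
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord('a')] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a, b):
    # Vectors are already unit-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

class SemanticCache:
    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer); a real cache
                           # would use a vector index such as Faiss

    def put(self, query, answer):
        self.entries.append((toy_embed(query), answer))

    def get(self, query, temperature=0.0):
        # Temperature-based bypass: at higher temperatures, sometimes skip
        # the cache so responses stay varied (sketch of the idea only).
        if temperature > 0 and random.random() < min(temperature / 2.0, 1.0):
            return None
        q = toy_embed(query)
        best, best_sim = None, -1.0
        for emb, answer in self.entries:
            sim = cosine(emb, q)
            if sim > best_sim:
                best, best_sim = answer, sim
        return best if best_sim >= self.threshold else None

cache = SemanticCache(threshold=0.9)
cache.put("what is the capital of france", "Paris")
print(cache.get("capital of france, what is it"))  # cache hit: Paris
print(cache.get("explain quantum entanglement"))   # cache miss: None
```

A rephrased query still hits the cache because its embedding lands close to the stored one, while an unrelated query falls below the threshold and misses.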
7,963 stars and 467,367 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars: 7,963
Forks: 570
Language: Python
License: MIT
Last pushed: Jul 11, 2025
Monthly downloads: 467,367
Commits (30d): 0
Dependencies: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/zilliztech/GPTCache"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
aiming-lab/SimpleMem
SimpleMem: Efficient Lifelong Memory for LLM Agents
zilliztech/memsearch
A Markdown-first memory system, a standalone library for any AI agent. Inspired by OpenClaw.
ascottbell/maasv
Memory Architecture as a Service — cognition layer for AI assistants. 3-signal retrieval,...
TeleAI-UAGI/telemem
TeleMem is a high-performance drop-in replacement for Mem0, featuring semantic deduplication,...
RichmondAlake/memorizz
MemoRizz: A Python library serving as a memory layer for AI applications. Leverages popular...