messkan/prompt-cache
Cut LLM costs by up to 80% and unlock sub-millisecond responses with intelligent semantic caching. A drop-in, provider-agnostic LLM proxy written in Go.
Implements a two-stage semantic similarity verification system combining high/low thresholds for direct cache hits and misses, with optional LLM-based intent verification for ambiguous cases to prevent hallucination risks. Integrates with multiple embedding providers (OpenAI, Mistral, Claude/Voyage) via a plugin architecture, exposing an OpenAI-compatible API endpoint (`/v1`) for transparent middleware deployment. Includes production monitoring with Prometheus metrics, structured logging, cache management API, and graceful shutdown for Kubernetes/containerized environments.
Stars: 209
Forks: 19
Language: Go
License: MIT
Category:
Last pushed: Jan 25, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/messkan/prompt-cache"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
RediSearch/RediSearch
A query and indexing engine for Redis, providing secondary indexing, full-text search, vector...
redis/redis-vl-python
Redis Vector Library (RedisVL) -- the AI-native Python client for Redis.
redis-developer/redis-ai-resources
✨ A curated list of awesome community resources, integrations, and examples of Redis in the AI ecosystem.
luyug/GradCache
Run Effective Large Batch Contrastive Learning Beyond GPU/TPU Memory Constraint
redis-developer/redis-product-search
Visual and semantic vector similarity with Redis Stack, FastAPI, PyTorch and Huggingface.