messkan/prompt-cache

Cut LLM costs by up to 80% and unlock sub-millisecond responses with intelligent semantic caching. A drop-in, provider-agnostic LLM proxy written in Go with sub-millisecond response times.

Quality score: 47 / 100 (Emerging)

Implements a two-stage semantic similarity verification system combining high/low thresholds for direct cache hits and misses, with optional LLM-based intent verification for ambiguous cases to prevent hallucination risks. Integrates with multiple embedding providers (OpenAI, Mistral, Claude/Voyage) via a plugin architecture, exposing an OpenAI-compatible API endpoint (`/v1`) for transparent middleware deployment. Includes production monitoring with Prometheus metrics, structured logging, cache management API, and graceful shutdown for Kubernetes/containerized environments.

209 stars.

No package published · No dependents
Maintenance: 10 / 25
Adoption: 10 / 25
Maturity: 13 / 25
Community: 14 / 25


Stars: 209
Forks: 19
Language: Go
License: MIT
Last pushed: Jan 25, 2026
Commits (30d): 0

Get this data via API:

curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/messkan/prompt-cache"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.