EfficientContext/ContextPilot

Accelerating Long Context LLM Inference with Accuracy-Preserving Context Optimization in SGLang, vLLM, llama.cpp, RAG, and Agentic AI.

Established

Maintains a **Context Index** of cached content blocks and applies **reordering and deduplication** to align overlapping context into common prefixes, maximizing KV cache hits across requests. Integrates transparently with vLLM, SGLang, and llama.cpp via hooks and OpenAI-compatible APIs, with optional GPU-accelerated index computation for production-scale inference and validated support for RAG, agentic AI, and memory-augmented chat workloads.
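The reorder-and-deduplicate idea described above can be illustrated with a small, self-contained sketch. This is not ContextPilot's API or its actual algorithm: the function names (`dedupe`, `align_to_common_prefix`, `shared_prefix_len`) and the frequency-based ordering are hypothetical, chosen only to show why a canonical block order lets different requests share a longer prefix, which is what a prefix-based KV cache can reuse. The sketch also ignores the accuracy-preservation side of the problem, which a real system would have to handle.

```python
from collections import Counter
from typing import List


def dedupe(blocks: List[str]) -> List[str]:
    """Drop repeated block IDs, keeping the first occurrence of each."""
    return list(dict.fromkeys(blocks))


def align_to_common_prefix(requests: List[List[str]]) -> List[List[str]]:
    """Reorder each request so blocks shared by more requests come first.

    Hypothetical heuristic: sort by descending cross-request frequency,
    breaking ties by block ID so the order is deterministic.
    """
    deduped = [dedupe(r) for r in requests]
    freq = Counter(block for r in deduped for block in r)
    return [sorted(r, key=lambda b: (-freq[b], b)) for r in deduped]


def shared_prefix_len(a: List[str], b: List[str]) -> int:
    """Count the leading blocks that two requests have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Two requests whose retrieved context overlaps but arrives in different order.
requests = [
    ["doc_c", "doc_a", "doc_b", "doc_a"],  # doc_a is retrieved twice
    ["doc_b", "doc_d", "doc_a"],
]
aligned = align_to_common_prefix(requests)

before = shared_prefix_len(requests[0], requests[1])  # 0 shared leading blocks
after = shared_prefix_len(aligned[0], aligned[1])     # 2 shared leading blocks
print(before, after)
```

In this toy input the two requests share no leading blocks as submitted, while after alignment both start with `doc_a, doc_b`, so a serving engine with prefix caching could reuse the cached state for those two blocks. A frequency sort is only a greedy stand-in; it does not guarantee the longest possible shared prefixes across many requests.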

Available on PyPI.

Maintenance 13 / 25

Adoption 13 / 25

Maturity 20 / 25

Community 6 / 25

Language: Python

License: Apache-2.0

Related agents

ultracontext/ultracontext

Open Source Context infrastructure for AI agents. Auto-capture and share your agents' context everywhere.

dunova/ContextGO

Local-first context & memory runtime for multi-agent AI coding teams. MCP-free. Rust/Go accelerated.

dgenio/contextweaver

Budget-aware context compilation and context firewall for tool-heavy AI agents.

LogicStamp/logicstamp-context

A Context Compiler for TypeScript. Deterministic, diffable architectural contracts and...

astrio-ai/atlas

Coding agent for legacy code modernization
