llmware-ai/llmware
Unified framework for building enterprise RAG pipelines with small, specialized models
Brings together prepackaged quantized models (50+ specialized for RAG tasks such as extraction, classification, and summarization) and a modular RAG pipeline with multi-format document parsing, vector embedding with multiple backends (ChromaDB, Milvus), and hybrid queries (text, semantic, and metadata filters). The unified ModelCatalog interface abstracts over several inference engines (GGUF, OpenVINO, ONNX Runtime, HuggingFace), so the same code runs on-device across CPUs, GPUs, and NPUs on Windows, Mac, and Linux. Prompt objects orchestrate end-to-end knowledge retrieval and generation, automatically batching sources to fit model context windows while tracking provenance for fact-checking against source materials.
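The batching-with-provenance idea described above can be illustrated with a small, library-independent sketch. This is not llmware's implementation or API: the `Source` and `Batch` types, the `batch_sources` and `provenance` helpers, and the whitespace-based token estimate are all hypothetical names and simplifications chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Source:
    """A retrieved passage plus the metadata needed to trace it back (hypothetical)."""
    doc_id: str
    page: int
    text: str


@dataclass
class Batch:
    """A group of sources whose combined size fits one model call (hypothetical)."""
    sources: List[Source] = field(default_factory=list)
    tokens: int = 0


def estimate_tokens(text: str) -> int:
    # Assumption: a crude whitespace count stands in for a real tokenizer.
    return len(text.split())


def batch_sources(sources: List[Source], max_tokens: int) -> List[Batch]:
    """Greedily pack sources, in order, into batches of at most max_tokens."""
    batches: List[Batch] = []
    current = Batch()
    for src in sources:
        n = estimate_tokens(src.text)
        if n > max_tokens:
            raise ValueError(f"source {src.doc_id} p.{src.page} exceeds the budget")
        # Start a new batch when adding this source would overflow the budget.
        if current.sources and current.tokens + n > max_tokens:
            batches.append(current)
            current = Batch()
        current.sources.append(src)
        current.tokens += n
    if current.sources:
        batches.append(current)
    return batches


def provenance(batch: Batch) -> List[Tuple[str, int]]:
    """Return (doc_id, page) pairs so an answer can be checked against its sources."""
    return [(s.doc_id, s.page) for s in batch.sources]


docs = [
    Source("a.pdf", 1, "alpha beta gamma"),
    Source("a.pdf", 2, "delta epsilon zeta"),
    Source("b.pdf", 7, "eta theta iota"),
]
batches = batch_sources(docs, max_tokens=6)
```

Each batch would be sent to the model as one prompt, and the `(doc_id, page)` pairs kept alongside the response make it possible to compare a generated answer with the passages it was drawn from.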
14,864 stars and 1,177 monthly downloads. Actively maintained with 12 commits in the last 30 days. Available on PyPI.
Stars: 14,864
Forks: 2,964
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 21, 2026
Monthly downloads: 1,177
Commits (30d): 12
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/rag/llmware-ai/llmware"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
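The same request can be made from Python with only the standard library. The sketch below rests on two assumptions that the listing does not confirm: that the endpoint returns JSON, and that the path follows a `quality/<category>/<owner>/<repo>` layout generalized from the single URL shown. The helper names `build_url` and `fetch_quality` are hypothetical, and how an API key would be passed is not shown here because the source does not describe it.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the layout beyond it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name'."""
    return f"{BASE_URL}/{quote(category, safe='')}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repo. Assumes the response body is JSON."""
    request = urllib.request.Request(
        build_url(category, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Anonymous access is limited to 100 requests/day, so call sparingly.
    print(fetch_quality("rag", "llmware-ai/llmware"))
```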
Related tools
Sinapsis-AI/sinapsis-chatbots
Monorepo for sinapsis templates supporting LLM based Agents
aimclub/ProtoLLM
Framework for prototyping of LLM-based applications
Azure-Samples/azureai-foundry-finetuning-raft
A recipe that will walk you through using either Meta Llama 3.1 405B or OpenAI GPT-4o deployed...
pkargupta/taxoadapt
Dynamically constructs and adapts an LLM-generated taxonomy to a given corpus across multiple dimensions.
xi029/Qwen3-VL-MoeLORA
Compares the results of several LoRA fine-tuning methods on Qwen3-VL-4B-Instruct, Qwen's latest multimodal image-text model, with deployment via LangChain + RAG + multi-agent.