eth-sri/lmql
A language for constraint-guided and efficient LLM programming.
Embeds LLM calls natively within Python syntax and applies declarative constraints via `where` clauses to guide model output, giving language-level control over token length, stopping phrases, and datatypes. Executes programs with a choice of decoding strategies (beam search, best_k, argmax, sample) and speeds up inference through optimizations such as speculative execution and tree-based caching. Supports OpenAI, Azure OpenAI, and Hugging Face Transformers models with async/parallel execution, and integrates with LangChain and LlamaIndex.
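To make the idea of constraint-guided decoding concrete, here is a toy sketch in plain Python. It does not use LMQL or a real model, and it is not how LMQL is implemented. It only illustrates the general pattern of filtering candidate tokens against constraints before picking the highest-scoring one (argmax). All names (`toy_scores`, `constrained_argmax`, `max_length`, `digits_only`) are hypothetical and exist only for this illustration.

```python
from typing import Callable, Dict, List

# A constraint receives the text generated so far plus a candidate token
# and reports whether appending that token is still allowed.
Constraint = Callable[[str, str], bool]


def max_length(limit: int) -> Constraint:
    """Allow a token only if the output stays within `limit` characters."""
    return lambda text, token: len(text + token) <= limit


def digits_only() -> Constraint:
    """Allow only digit tokens (a stand-in for an integer datatype constraint)."""
    return lambda text, token: token.isdigit()


def toy_scores(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for a language model: hand-written next-token scores."""
    if text == "":
        return {"Unsure": 0.6, "4": 0.4}
    if text.endswith("4"):
        return {"2": 0.7, ".": 0.3}
    return {".": 0.8, "2": 0.2}


def constrained_argmax(
    score_fn: Callable[[str], Dict[str, float]],
    constraints: List[Constraint],
    stop_phrase: str = "",
    max_steps: int = 20,
) -> str:
    """Greedy decoding that drops candidates violating any constraint."""
    text = ""
    for _ in range(max_steps):
        scores = score_fn(text)
        # Keep only the tokens that every constraint accepts.
        allowed = {
            token: score
            for token, score in scores.items()
            if all(check(text, token) for check in constraints)
        }
        if not allowed:
            break  # no valid continuation is left
        text += max(allowed, key=allowed.get)
        if stop_phrase and text.endswith(stop_phrase):
            break  # stopping phrase reached
    return text


unconstrained = constrained_argmax(toy_scores, [], stop_phrase=".")
constrained = constrained_argmax(toy_scores, [digits_only(), max_length(2)])
```

Without constraints the toy model drifts toward its most likely free-text answer. With the datatype and length constraints applied, the same scores yield a two-digit number, because every other candidate is filtered out at each step.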
4,161 stars. No commits in the last 6 months.
Stars: 4,161
Forks: 219
Language: Python
License: Apache-2.0
Category:
Last pushed: May 22, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/eth-sri/lmql"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
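The same request can be made from Python with the standard library. This is a minimal sketch: it assumes the URL path is the category followed by `owner/repo`, which is inferred only from the single `curl` example above, and it returns the raw response body because the response format is not documented here. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL (assumed layout: <base>/<category>/<owner>/<repo>)."""
    return f"{BASE_URL}/{category}/{repo}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text; no assumption is made about its format."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request):
# print(fetch_quality("llm-tools", "eth-sri/lmql"))
```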
Higher-rated alternatives
thu-pacman/chitu
High-performance inference framework for large language models, focusing on efficiency,...
sophgo/LLM-TPU
Run generative AI models in sophgo BM1684X/BM1688
NotPunchnox/rkllama
Ollama alternative for Rockchip NPU: An efficient solution for running AI and Deep learning...
Deep-Spark/DeepSparkHub
DeepSparkHub selects hundreds of application algorithms and models, covering various fields of...
HuaizhengZhang/AI-Infra-from-Zero-to-Hero
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for...