EricLBuehler/mistral.rs

Fast, flexible LLM inference

62 / 100 (Established)

Supports any Hugging Face model with zero configuration, including multimodal inputs (text, vision, video, audio) and quantized formats (GGUF, GPTQ, AWQ, FP8). Uses continuous batching, FlashAttention, PagedAttention, and optional multi-GPU tensor parallelism to improve throughput. Provides Python and Rust SDKs, an integrated web UI, hardware auto-tuning, and agentic capabilities including tool calling and MCP client support.

6,681 stars. Actively maintained with 18 commits in the last 30 days.

No package. No dependents.
Maintenance 17 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 19 / 25
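
The four category scores above appear to add up to the overall score (17 + 10 + 16 + 19 = 62), with each category capped at 25 for a maximum of 100. The page does not state the formula, so the plain summation in this short sketch is an inference from the numbers shown, and the names `SUBSCORES`, `MAX_PER_CATEGORY`, and `total_score` are hypothetical.

```python
# Category scores copied from the listing above.
SUBSCORES = {
    "Maintenance": 17,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 19,
}

# Each category is shown as "N / 25" on the page.
MAX_PER_CATEGORY = 25


def total_score(subscores: dict[str, int]) -> int:
    """Sum the category scores (assumed, not documented, aggregation rule)."""
    return sum(subscores.values())


if __name__ == "__main__":
    total = total_score(SUBSCORES)
    maximum = MAX_PER_CATEGORY * len(SUBSCORES)
    print(f"{total} / {maximum}")
```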

Stars: 6,681
Forks: 540
Language: Rust
License: MIT
Last pushed: Feb 27, 2026
Commits (30d): 18

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/transformers/EricLBuehler/mistral.rs"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000 requests/day.
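
The same request can be made from a script. The following is a minimal sketch using only the Python standard library. It assumes the URL path follows a category/owner/repository pattern (inferred from the single example URL above) and that the endpoint returns JSON; neither is confirmed by this page, and no response fields are assumed. The function names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the request URL, assuming a category/owner/repo path layout."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record and parse it as JSON (the JSON format is an assumption)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("transformers", "EricLBuehler", "mistral.rs")
# print(json.dumps(data, indent=2))
```

Because the unauthenticated limit is 100 requests per day, a script that polls this endpoint would likely want to cache responses locally rather than refetch on every run.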