mistralai/mistral-inference
Official inference library for Mistral models
Provides efficient multi-GPU distributed inference via `torchrun` for large models like Mixtral 8x7B/8x22B, with built-in support for function calling across all models. Leverages `xformers` for optimized transformer operations and exposes both Python and CLI interfaces (`mistral-demo`, `mistral-chat`) for interactive testing and deployment. Supports diverse model families including specialized variants (Codestral for code, Mathstral for math, Pixtral for vision) alongside standard base and instruction-tuned versions.
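A minimal sketch of the CLI workflow described above, assuming the model weights have already been downloaded to a local directory (the paths below are placeholders, not fixed by the library):

```shell
# Install the library (pulls in xformers and its CUDA dependencies)
pip install mistral-inference

# Single-GPU interactive chat with an instruction-tuned model
mistral-chat $HOME/mistral_models/7B-Instruct-v0.3 --instruct --max_tokens 256

# Multi-GPU distributed inference for a large model like Mixtral 8x7B:
# torchrun spawns one process per GPU; --no-python lets it launch the
# mistral-chat entry point directly
torchrun --nproc-per-node 2 --no-python \
    mistral-chat $HOME/mistral_models/Mixtral-8x7B-Instruct --instruct
```

`mistral-demo` takes the same model-path argument and runs a fixed prompt instead of an interactive session, which is handy for smoke-testing a deployment.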
Stars
10,705
Forks
1,024
Language
Jupyter Notebook
License
Apache-2.0
Category
Last pushed
Feb 26, 2026
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/mistralai/mistral-inference"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
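The endpoint above follows a `category/owner/repo` path pattern. A small sketch of building such URLs programmatically (the helper name is hypothetical; only the base URL and path layout come from the example above):

```python
# Base of the quality API shown in the curl example on this page
BASE = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category: str, owner: str, repo: str) -> str:
    """Hypothetical helper: build the quality-API URL for one repository."""
    return f"{BASE}/{category}/{owner}/{repo}"

print(quality_url("transformers", "mistralai", "mistral-inference"))
```

Without a key the endpoint allows 100 requests per day, so batch lookups should be rate-limited client-side or use a free key for the 1,000/day tier.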
Related models
dvmazur/mixtral-offloading
Run Mixtral-8x7B models in Colab or consumer desktops
open-compass/MixtralKit
A toolkit for inference and evaluation of 'mixtral-8x7b-32kseqlen' from Mistral AI
vicuna-tools/vicuna-installation-guide
The "vicuna-installation-guide" provides step-by-step instructions for installing and...
pleisto/yuren-13b
Yuren 13B is an information synthesis large language model that has been continuously trained...
hkproj/mistral-llm-notes
Notes on the Mistral AI model