Rust LLM Infrastructure Tools
Low-level Rust libraries and tools for building, running, and managing LLMs locally—including model merging, inference engines, tokenization, and architecture implementations. Does NOT include application frameworks, API clients, or higher-level orchestration platforms.
There are 104 Rust LLM infrastructure tools tracked. One scores above 50 (the established tier). The highest-rated is trymirai/uzu at 56/100 with 1,492 stars. Only 1 of the top 10 is actively maintained.
Get all 104 projects as JSON:

```shell
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=rust-llm-infrastructure&limit=20"
```

Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000 requests/day.
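A minimal sketch of consuming the endpoint from Python, using only the standard library. The shape of the JSON payload and the field names (`results`, `score`) are assumptions for illustration, not documented here:

```python
import json
import urllib.request

# The dataset endpoint from the curl example above.
API_URL = (
    "https://pt-edge.onrender.com/api/v1/datasets/quality"
    "?domain=llm-tools&subcategory=rust-llm-infrastructure&limit=20"
)


def fetch_projects(url: str = API_URL) -> list[dict]:
    """Fetch the dataset and return a list of project records."""
    with urllib.request.urlopen(url) as resp:
        payload = json.load(resp)
    # Assumed payload shape: either a bare list of records
    # or an object wrapping them under a "results" key.
    return payload if isinstance(payload, list) else payload.get("results", [])


def established(projects: list[dict]) -> list[dict]:
    """Keep only projects scoring above 50 (the established tier per this page)."""
    return [p for p in projects if p.get("score", 0) > 50]
```

For example, `established(fetch_projects())` would return just the top-tier entries, which for this dataset is a single project.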
| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | trymirai/uzu | A high-performance inference engine for AI models | 56 | Established |
| 2 | lipish/llm-connector | LLM Connector - A unified interface for connecting to various Large Language... | | Emerging |
| 3 | justrach/bhumi | ⚡ Bhumi – The fastest AI inference client for Python, built with Rust for... | | Emerging |
| 4 | rustformers/llm | [Unmaintained, see README] An ecosystem of Rust libraries for working with... | | Emerging |
| 5 | keyvank/femtoGPT | Pure Rust implementation of a minimal Generative Pretrained Transformer | | Emerging |
| 6 | kreuzberg-dev/liter-llm | Universal LLM API client — 142+ providers, 11 native language bindings,... | | Emerging |
| 7 | mplekh/rust-microgpt | Port of Andrej Karpathy's python microGPT to Rust | | Emerging |
| 8 | ShelbyJenkins/llm_client | The Easiest Rust Interface for Local LLMs and an Interface for Deterministic... | | Emerging |
| 9 | luckenco/rsai | Predictable development for unpredictable models. Let the compiler handle the chaos. | | Emerging |
| 10 | EggerMarc/tools-rs | Serialize your functions with tools-rs! | | Emerging |
| 11 | InfraWhisperer/llmtop | htop for your LLM inference cluster | | Emerging |
| 12 | haasonsaas/uranium | High-security storage vault for Large Language Model (LLM) weights with... | | Emerging |
| 13 | npuichigo/openai_trtllm | OpenAI compatible API for TensorRT LLM triton backend | | Emerging |
| 14 | visualstudioblyat/yule | Run AI models locally. Prove what ran. | | Emerging |
| 15 | antirez/gguf-tools | GGUF implementation in C as a library and a tools CLI program | | Emerging |
| 16 | microsoft/aici | AICI: Prompts as (Wasm) Programs | | Emerging |
| 17 | darkautism/llmserver-rs | A Rust-based, OpenAI-style API server for large language models (LLMs) | | Emerging |
| 18 | brontoguana/ktop | Terminal system resource monitor for hybrid LLM workloads | | Emerging |
| 19 | reinterpretcat/qwen3-rs | An educational Rust project for exporting and running inference on Qwen3 LLM family | | Emerging |
| 20 | Michael-A-Kuykendall/schoolmarm | Production-grade GBNF grammar-constrained decoding for LLMs. Zero... | | Emerging |
| 21 | fabriziopfannl/llm-autobatch | Turn single LLM calls into fast micro-batches. Rust core, Python API. | | Emerging |
| 22 | FerrisMind/inspector-gguf | A powerful GGUF file inspection tool with a graphical and command-line interface | | Emerging |
| 23 | Mattbusel/llm_affector | An async Rust library for LLM-based content analysis, providing... | | Emerging |
| 24 | rosarp/llm-lsp | Language Server Protocol for accessing Large Language Models | | Emerging |
| 25 | Lallapallooza/gpt.rs | Rust LLM playground: build, train, generate on pluggable backends | | Emerging |
| 26 | yigitkonur/cli-batch-requester | 10K+ req/s batch API client for LLM endpoints — Rust, async, load-balanced | | Emerging |
| 27 | tmetsch/rusty_llm | Rust based AI LLM inference service | | Experimental |
| 28 | GammaTauAI/opentau | Using Large Language Models for Repo-wide Type Prediction | | Experimental |
| 29 | uky007/FerrugoCC | Rust-based reverse optimization (code obfuscation) C Compiler | | Experimental |
| 30 | Mattbusel/llm-sync | CRDT and vector clock primitives for distributed LLM agent state synchronization | | Experimental |
| 31 | paiml/apr-cookbook | Examples of .apr format models | | Experimental |
| 32 | Mattbusel/llm-wasm | LLM inference primitives for WebAssembly — cache, retry, routing, guards,... | | Experimental |
| 33 | Mattbusel/llm-diff | Output diffing and versioning for LLM outputs — semantic diff, version... | | Experimental |
| 34 | aprxi/talu | Talu is a single-binary, local-first LLM runtime with a Zig core and... | | Experimental |
| 35 | codito/arey | Simple large language model playground app | | Experimental |
| 36 | netdur/hugind | vLLM for poor GPUs | | Experimental |
| 37 | rodmarkun/flyllm | A Rust library for unifying LLM backends as an abstraction layer with load... | | Experimental |
| 38 | jaggederest/locque | Locque, a dependently-typed LLM first programming language | | Experimental |
| 39 | okayasl/normy | Ultra-fast, zero-copy text normalization for Rust NLP pipelines & tokenizers | | Experimental |
| 40 | Jack17432/positivity | A Rust crate that provides a generic method to determine non-negativity for... | | Experimental |
| 41 | hyperpolymath/patallm-gallery | Gallery of LLM patterns and implementations | | Experimental |
| 42 | antoineMoPa/rust-text-experiments | Tiny LLM in rust / candle | | Experimental |
| 43 | lspecian/crabinfer | Safe, fast, memory-aware on-device LLM inference SDK for iOS — built in Rust... | | Experimental |
| 44 | chenhunghan/mlx-training-rs | A CLI in Rust to generate synthetic data for MLX friendly training | | Experimental |
| 45 | richardanaya/epistemology | A simple and clear way of hosting llama.cpp as a private HTTP API using Rust | | Experimental |
| 46 | yybit/pllm | Portable LLM - A rust library for LLM inference | | Experimental |
| 47 | jondot/awesome-rust-llm | 🦀 A curated list of Rust tools, libraries, and frameworks for working with... | | Experimental |
| 48 | HelgeSverre/sema | A Lisp with first-class LLM primitives, implemented in Rust | | Experimental |
| 49 | qora-protocol/QORA-LLM-3B | Pure Rust inference engine for the SmolLM3-3B language model. No Python... | | Experimental |
| 50 | usemarbles/langmail | Email preprocessing for LLMs. Fast, typed, Rust-powered. | | Experimental |
| 51 | TomOst-Sec/BlueOS | GPU-first LLM inference runtime in Rust + CUDA. Tiered virtual VRAM,... | | Experimental |
| 52 | sizzlecar/ferrum-infer-rs | Rust-native LLM inference engine. Single binary, no Python. Chat locally or... | | Experimental |
| 53 | greysquirr3l/heretic-rs | Abliterate LLMs in pure Rust — zero Python, single static binary, runs on Colab | | Experimental |
| 54 | GoWtEm/llm-model-selector | A high-performance Rust utility that analyzes your system hardware to... | | Experimental |
| 55 | wassemgtk/llm-training-rust | llm training rust | | Experimental |
| 56 | qwrtgvdsdf/ternary-tools | 🔍 Explore and validate GGUF files effortlessly with ternary-tools, a... | | Experimental |
| 57 | Ranjitbarnala0/rai | CPU-native LLM inference engine — hand-written SIMD kernels, 4-bit... | | Experimental |
| 58 | proj-airi/candle-examples | 🦀 Rust powered LLM, Whisper, Embedding inference, backed by 🤗 candle from HuggingFace | | Experimental |
| 59 | cukas/KERNlang | The language LLMs think in. Write one .kern file, ship 7 targets. 70% fewer tokens | | Experimental |
| 60 | petlukk/Cougar | Fast, dependency-free LLM engine in Rust with custom SIMD kernels | | Experimental |
| 61 | defai-digital/ax-engine | Mac-native Rust inference engine for running larger local GGUF models with... | | Experimental |
| 62 | PCfVW/candle-mi | Mechanistic interpretability for language models in Rust, built on candle | | Experimental |
| 63 | TheRadDani/VectorPrime | VectorPrime takes a model file and your hardware, then finds the fastest way... | | Experimental |
| 64 | SundryAPI/sundry | Sundry is an intelligent context provider API designed specifically for... | | Experimental |
| 65 | nkypy/candle-rwkv | RWKV models and examples powered by candle. | | Experimental |
| 66 | pwh-pwh/couplet_gen | Use Rust to generate couplets | | Experimental |
| 67 | t81dev/ternary-tools | file(1) of the ternary age — balanced-ternary-aware GGUF inspector and... | | Experimental |
| 68 | coconut-os/coconutOS | Rust microkernel for GPU-isolated AI inference | | Experimental |
| 69 | Defilan/gguf-parser | A Rust library and CLI for parsing GGUF model file headers — extract... | | Experimental |
| 70 | ahoylabs/gguf.js | A Javascript library (with Typescript types) to parse metadata of GGML based... | | Experimental |
| 71 | neuron-nexus-agregator/nn-yandex-foundation | Unified library for working with Yandex Foundation Models. Provides a simple... | | Experimental |
| 72 | yarenty/modelmux | ModelMux is a high-performance Rust proxy server that seamlessly converts... | | Experimental |
| 73 | menezis-ai/LDSI | White-box LLM stability benchmark using Kolmogorov complexity, Shannon... | | Experimental |
| 74 | tzervas/axolotl-rs | YAML-driven configurable fine-tuning toolkit for LLMs in Rust | | Experimental |
| 75 | chongliujia/fermi-infer | The Rust-native inference engine for Small Language Models (SLMs), Run... | | Experimental |
| 76 | AspadaX/secretary | Robustly create/extract structural data with LLMs | | Experimental |
| 77 | matthewhaynesonline/phile | Single file llm, but in _rust_. phi + file = phile. | | Experimental |
| 78 | santino-research/spell | A Programming Language Designed for Large Language Models | | Experimental |
| 79 | blueheron786/cpu_llm | A lightweight CPU-friendly neural language model from scratch, with hybrid... | | Experimental |
| 80 | srijitiyer/alloy | A fast Rust CLI for LLM model merging, diffing, and conversion. 10 merge... | | Experimental |
| 81 | cjroth/neuroscope | Real-time "x-ray vision" into LLMs' minds | | Experimental |
| 82 | msk/lumine | A high-level Rust interface for language models powered by the Candle ML... | | Experimental |
| 83 | JuliaMerz/pantry | Actor based multi-llm registry + runner. | | Experimental |
| 84 | abdulrahmanashraf5594/comprehensive-rust | 🦀 Explore Comprehensive Rust, a multi-day course that teaches Rust from... | | Experimental |
| 85 | reinterpretcat/zero-depend-pub | An educational Rust workspace featuring zero-dependency crates built using... | | Experimental |
| 86 | samkeen/llm-bridge | Rust SDK for interacting with various Large Language Model (LLM) APIs | | Experimental |
| 87 | StepfenShawn/ferris-grad | Pytorch-like autograd engine in Rust. | | Experimental |
| 88 | lipish/llm-providers | A unified source of truth for LLM providers, models, pricing, and... | | Experimental |
| 89 | Plarturer/llm-distributed-inference | High-performance distributed inference engine for LLMs using Rust and CUDA. | | Experimental |
| 90 | sanggi-wjg/LLML | LLML — Language for Large Model Logic. A programming language optimized for... | | Experimental |
| 91 | ltouati/tiny-llm | A tiny LLM written using Rust candle | | Experimental |
| 92 | eren23/synapse | Modular LLM inference engine in Rust + Zig SIMD kernels. Runs on desktop... | | Experimental |
| 93 | rhi-zone/sketchpad | Deep learning inference in pure Rust using Burn. Image generation (SD, SDXL,... | | Experimental |
| 94 | llmprogram/llmprogram-rs | llmprogram is a Rust crate that provides a structured and powerful way to... | | Experimental |
| 95 | gicrisf/microgpt-candle-rs | Rust implementation of Karpathy's Microgpt | | Experimental |
| 96 | kn0sys/adamo | Rust LLM proof-of-concept | | Experimental |
| 97 | magic003/llama2-rs | Inference Llama 2 in Rust | | Experimental |
| 98 | text-yoga/ask | WIP browser-based LLM question/answering for the web | | Experimental |
| 99 | tauseefk/streamformers | Wrap Rustformers' LLM inference in a stream. | | Experimental |
| 100 | kmolerov/llm-temp-scale | llm-temp-scale is a multiplatform library for normalizing and converting a... | | Experimental |
| 101 | zTgx/transformer-rust | Transformer With Rust & Candle | | Experimental |
| 102 | AshtonVaughan/prismllm | Any model. Any hardware. Any size. — Hardware-agnostic LLM inference with... | | Experimental |
| 103 | ramendrasingla/ml_algorithms_in_rust | Creating Machine Learning and Deep Learning Algorithms in Rust | | Experimental |
| 104 | mrcsparker/guanaco | Run local LLMs in Ruby | | Experimental |