trymirai/uzu
A high-performance inference engine for AI models
Leverages Apple Silicon's unified memory architecture with Metal-accelerated kernels for optimized on-device inference. Supports a custom model format with conversion tooling (via lalamo) for popular open-source models, and provides language bindings for Swift and TypeScript alongside a Rust core API. Includes built-in CLI utilities for model serving, benchmarking, and inference with configurable decoding parameters.
1,492 stars. Actively maintained with 69 commits in the last 30 days.
Stars: 1,492
Forks: 44
Language: Rust
License: MIT
Category:
Last pushed: Mar 13, 2026
Commits (30d): 69
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/trymirai/uzu"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
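The curl command above can also be issued programmatically. A minimal Python sketch, assuming only the endpoint shape shown above (the JSON response schema is not documented here, so the fetch helper returns the decoded body as-is):

```python
# Build and query the quality-API URL for a given repository.
# The endpoint shape comes from the curl example above.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Return the API URL for a repository's quality record."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality record (requires network access)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("trymirai", "uzu"))
# → https://pt-edge.onrender.com/api/v1/quality/llm-tools/trymirai/uzu
```

Within the free tier, `fetch_quality("trymirai", "uzu")` returns the same data as the curl request.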
Related tools
justrach/bhumi
⚡ Bhumi – The fastest AI inference client for Python, built with Rust for unmatched speed,...
keyvank/femtoGPT
Pure Rust implementation of a minimal Generative Pretrained Transformer
lipish/llm-connector
LLM Connector - A unified interface for connecting to various Large Language Model providers
ShelbyJenkins/llm_client
The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from...
rustformers/llm
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models