trymirai/uzu
A high-performance inference engine for AI models
Leverages Apple Silicon's unified memory architecture with Metal-accelerated kernels for optimized on-device inference. Supports a custom model format with conversion tooling (via lalamo) for popular open-source models, and provides language bindings for Swift and TypeScript alongside a Rust core API. Includes built-in CLI utilities for model serving, benchmarking, and inference with configurable decoding parameters.
1,492 stars. Actively maintained with 69 commits in the last 30 days.
Stars: 1,492
Forks: 44
Language: Rust
License: MIT
Category:
Last pushed: Mar 13, 2026
Commits (30d): 69
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/trymirai/uzu"
Open to everyone: 100 requests/day with no key needed; a free key raises the limit to 1,000/day.
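The curl command above can also be issued programmatically. A minimal Python sketch, assuming only the endpoint shape shown above (the JSON response schema is not documented here, so the fetch helper returns the decoded body as-is):

```python
# Build and query the quality-API URL for a given repository.
# The endpoint shape comes from the curl example above.
import json
import urllib.request

API_BASE = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"

def quality_url(owner: str, repo: str) -> str:
    """Return the API URL for a repository's quality record."""
    return f"{API_BASE}/{owner}/{repo}"

def fetch_quality(owner: str, repo: str) -> dict:
    """Fetch and decode the quality record (requires network access)."""
    with urllib.request.urlopen(quality_url(owner, repo)) as resp:
        return json.load(resp)

print(quality_url("trymirai", "uzu"))
# → https://pt-edge.onrender.com/api/v1/quality/llm-tools/trymirai/uzu
```

Within the free tier, `fetch_quality("trymirai", "uzu")` returns the same data as the curl request.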
Related tools
justrach/bhumi
⚡ Bhumi – The fastest AI inference client for Python, built with Rust for unmatched speed,...
keyvank/femtoGPT
Pure Rust implementation of a minimal Generative Pretrained Transformer
lipish/llm-connector
LLM Connector - A unified interface for connecting to various Large Language Model providers
ShelbyJenkins/llm_client
The Easiest Rust Interface for Local LLMs and an Interface for Deterministic Signals from...
rustformers/llm
[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models