Kokoro-FastAPI and Kokoros
These are ecosystem siblings: Kokoro-FastAPI provides a Python API service layer for the Kokoro model with flexible compute options (CPU or GPU), while Kokoros reimplements inference for the same base model in Rust for different performance and integration characteristics. They serve different deployment contexts rather than competing for the same use case.
About Kokoro-FastAPI
remsky/Kokoro-FastAPI
Dockerized FastAPI wrapper for the Kokoro-82M text-to-speech model with CPU (ONNX) and NVIDIA GPU (PyTorch) support, request handling, and automatic audio stitching
Provides phoneme-level control with per-word timestamped captions and supports voice mixing via weighted combinations, enabling fine-grained audio generation and synthesis customization. Implements an OpenAI-compatible Speech API endpoint for drop-in integration with existing applications while offering a built-in web UI for standalone use. Includes Kubernetes/Helm deployment support and integrations with popular AI frameworks like SillyTavern and OpenWebUI.
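Because the server exposes an OpenAI-compatible Speech endpoint, a client only needs to build a standard `/v1/audio/speech` JSON payload. A minimal sketch follows; the `"name(weight)+name(weight)"` voice-mixing syntax and the default port 8880 are assumptions about Kokoro-FastAPI's conventions, so check the project's README before relying on them.

```python
def build_speech_request(text, voices, model="kokoro", fmt="mp3"):
    """Build a JSON payload for an OpenAI-compatible /v1/audio/speech
    endpoint. `voices` maps voice names to mixing weights; the
    "name(weight)+name(weight)" blend syntax is an assumption about
    Kokoro-FastAPI's voice-combination format, not a documented fact."""
    voice = "+".join(f"{name}({weight})" for name, weight in voices.items())
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": fmt,
    }

# Example: a 2:1 blend of two bundled voices.
payload = build_speech_request("Hello!", {"af_bella": 2, "af_sky": 1})
# Would be sent with e.g.:
#   requests.post("http://localhost:8880/v1/audio/speech", json=payload)
```

Because the payload mirrors OpenAI's Speech API shape, existing OpenAI client code can usually be pointed at the local server by changing only the base URL.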
About Kokoros
lucasjinreal/Kokoros
🔥🔥 Kokoro in Rust. https://huggingface.co/hexgrad/Kokoro-82M Insanely fast, real-time TTS with the highest quality you'll ever hear.
Provides built-in phonemization and ONNX model inference without external dependencies, enabling end-to-end TTS in pure Rust. Supports style mixing, word-level timestamps, streaming output, and an OpenAI-compatible HTTP API with configurable parallel processing for both low-latency and high-throughput scenarios.
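For the streaming output mentioned above, a client's job is simply to accumulate (or play back) audio chunks as they arrive. Below is a minimal sketch of the client side; the endpoint path, port, and `stream` flag in the commented request are assumptions about Kokoros' OpenAI-compatible server, not verified parameters.

```python
import io

def collect_stream(chunks):
    """Accumulate streamed audio chunks (bytes) into a single buffer,
    the way a client of a streaming TTS endpoint would before saving
    or playing the audio."""
    buf = io.BytesIO()
    for chunk in chunks:
        if chunk:  # skip empty keep-alive chunks
            buf.write(chunk)
    return buf.getvalue()

# In real use the chunks would come from an HTTP response, e.g.
# (URL, port, and "stream" parameter are illustrative assumptions):
#   resp = requests.post("http://localhost:3000/v1/audio/speech",
#                        json={"model": "kokoro", "input": "Hi!",
#                              "voice": "af_sky", "stream": True},
#                        stream=True)
#   audio = collect_stream(resp.iter_content(chunk_size=4096))
```

Streaming lets playback start before synthesis finishes, which is what makes the low-latency scenario work; the high-throughput case instead batches many non-streaming requests across the server's parallel workers.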