gpustack/vox-box
A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.
Models can be pulled from Hugging Face or ModelScope repositories, in sizes ranging from tiny to large variants. The server runs on Linux, Windows, and macOS, with GPU acceleration via CUDA. It uses a stateless architecture that downloads and caches models automatically, supports both streaming (Paraformer-zh-streaming) and batch processing pipelines, and exposes CLI options for device binding and data directory management.
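Because the server is described as OpenAI API compatible, a client can in principle talk to it the same way it would talk to OpenAI's audio endpoints. The sketch below only builds (and does not send) a text-to-speech request against the OpenAI-style `/v1/audio/speech` route. The base URL, model name, and voice name are hypothetical placeholders, not values taken from this project.

```python
import json
import urllib.request


def build_speech_request(base_url, model, voice, text):
    """Build an OpenAI-style text-to-speech request without sending it."""
    # Field names follow the OpenAI audio speech request body.
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical address and identifiers; replace them with your deployment's values.
req = build_speech_request(
    "http://localhost:8080",
    "example-tts-model",
    "example-voice",
    "Hello from vox-box.",
)
```

Passing `req` to `urllib.request.urlopen` would send it; with a running server, the response body would be expected to contain the synthesized audio bytes.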
Stars: 200
Forks: 32
Language: Python
License: Apache-2.0
Category
Last pushed: Dec 23, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/gpustack/vox-box"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
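The same lookup can be scripted. The sketch below mirrors the curl call using only the Python standard library. It assumes the URL follows a category/owner/repo pattern (inferred from the single example above) and that the endpoint returns JSON; the response schema is not documented here, so the result is returned as-is.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the lookup URL; the path layout is inferred from one example."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for a repository (assumes a JSON response body)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "gpustack", "vox-box")` performs the network request and counts against the daily quota.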
Related tools
devnen/Chatterbox-TTS-Server
Self-host the powerful Chatterbox TTS model. This server offers a user-friendly Web UI, flexible...
daswer123/xtts-api-server
A simple FastAPI Server to run XTTSv2
jamiepine/voicebox
The open-source voice synthesis studio
Aivis-Project/AivisSpeech-Engine
AivisSpeech Engine: AI Voice Imitation System - Text to Speech Engine
jianchang512/ChatTTS-ui
A simple local web interface that uses ChatTTS to synthesize text into speech, and also exposes an API for external use.