FunAudioLLM/SenseVoice
Multilingual Voice Understanding Model
Combines ASR, emotion recognition, and audio event detection in a single non-autoregressive end-to-end model trained on 400k+ hours of audio across 50+ languages. Achieves 15× faster inference than Whisper-Large while supporting timestamp generation via CTC alignment, with export options for ONNX and libtorch. Integrates with the FunASR framework for streamlined deployment, with multi-concurrent request handling and client SDKs (Python, C++, Java, C#).
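Because the model folds emotion and event detection into one pass, its raw transcripts carry special tokens for language, emotion, and audio-event labels (e.g. `<|en|><|NEUTRAL|><|Speech|>hello`). A minimal sketch of splitting such output into tags and plain text; the exact token names are assumptions based on the model's published examples, not a definitive spec.

```python
import re

# SenseVoice-style rich transcripts prefix the text with special tokens,
# e.g. "<|en|><|NEUTRAL|><|Speech|>hello world". The tag names used here
# are illustrative, drawn from the model's public examples.
TAG_RE = re.compile(r"<\|([^|]+)\|>")

def parse_rich_transcript(raw: str) -> tuple[list[str], str]:
    """Split a raw rich transcript into its tag list and plain text."""
    tags = TAG_RE.findall(raw)      # e.g. ["en", "NEUTRAL", "Speech"]
    text = TAG_RE.sub("", raw).strip()
    return tags, text
```

This keeps downstream code simple: the same string carries both the transcription and the classification labels, and consumers that only want text can discard the tags.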
Stars: 7,691
Forks: 708
Language: Python
License: —
Category: —
Last pushed: Dec 30, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/FunAudioLLM/SenseVoice"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
travisvn/chatterbox-tts-api
Local, OpenAI-compatible text-to-speech (TTS) API using Chatterbox, enabling users to generate...
FunAudioLLM/CosyVoice
Multi-lingual large voice generation model, providing inference, training and deployment...
fishaudio/Bert-VITS2
vits2 backbone with multilingual-bert
OpenMOSS/MOSS-TTS
MOSS-TTS Family is an open-source speech and sound generation model family from MOSI.AI and the...
sfortis/openai_tts
Custom TTS component for Home Assistant. Utilizes the OpenAI speech engine or any compatible...