open-mmlab/Amphion
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and to help junior researchers and engineers get started in audio, music, and speech generation research and development.
Amphion implements task-specific generation pipelines for text-to-speech (TTS), singing voice conversion (SVC), voice conversion, and text-to-audio. The pipelines share common components, including neural vocoders and standardized evaluation metrics, so that results can be reproduced and compared. Built on PyTorch, its modular design supports both classical and foundation models, such as Vevo (zero-shot voice imitation with prosody control) and MaskGCT (non-autoregressive TTS), alongside large-scale training datasets like Emilia (200k+ hours). The toolkit integrates with Hugging Face and ModelScope for model sharing and deployment across speech generation tasks.
9,712 stars. No commits in the last 6 months.
Stars: 9,712
Forks: 796
Language: Python
License: MIT
Category:
Last pushed: May 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/open-mmlab/Amphion"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
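The curl request above can also be issued from a script. The following is a minimal Python sketch using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, the reading of the path as category/owner/repository is inferred from the example URL, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL in the same shape as the curl example.

    Assumption: the three path segments are category, owner, and repository.
    """
    # Percent-encode each segment so unusual characters cannot break the path.
    segments = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the endpoint returns JSON; the schema is not documented here,
    so the decoded object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; call fetch_quality(...) to perform the request.
    print(build_quality_url("voice-ai", "open-mmlab", "Amphion"))
```

Because unauthenticated access is limited to 100 requests/day, it may be worth caching the result of `fetch_quality` rather than calling it repeatedly.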
Related tools
whitphx/streamlit-stt-app
Real time web based Speech-to-Text app with Streamlit
saidsef/tika-document-to-text
Apache Tika extract text and metadata from any document format with this pre-built containerised...
hipnologo/EchoForge_Studio
Multi-LLM writing and voice production workspace built with Streamlit.
declare-lab/jamify
JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment
SiddhantSadangi/st_deepgram_playground
API playground for Deepgram built with Streamlit