open-mmlab/Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

/ 100

Emerging

Amphion implements task-specific generation pipelines for text-to-speech (TTS), singing voice conversion (SVC), voice conversion (VC), and text-to-audio (TTA). These pipelines share common components, including neural vocoders and a standardized set of evaluation metrics for reproducibility. Built on PyTorch, its modular design supports both classical and foundation models, such as Vevo (zero-shot voice imitation with prosody control) and MaskGCT (non-autoregressive TTS), alongside large-scale training datasets like Emilia (200k+ hours). The toolkit integrates with Hugging Face and ModelScope for sharing and deploying models across speech generation tasks.

9,712 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 2 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 19 / 25

Stars: 9,712

Forks: 796

Language: Python

License: MIT

Related tools

whitphx/streamlit-stt-app

Real time web based Speech-to-Text app with Streamlit

saidsef/tika-document-to-text

Apache Tika extract text and metadata from any document format with this pre-built containerised...

hipnologo/EchoForge_Studio

Multi-LLM writing and voice production workspace built with Streamlit.

declare-lab/jamify

JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment

SiddhantSadangi/st_deepgram_playground

API playground for Deepgram built with Streamlit
