lucasnewman/vocos-mlx

Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX

/ 100

Emerging

Bridges time-domain and Fourier-based vocoding by reconstructing high-fidelity audio from either Mel spectrograms or EnCodec tokens, giving two inference pathways for audio generation. Built on the MLX framework, which is optimized for Apple Silicon, it supports bandwidth-controlled EnCodec decoding and pretrained model checkpoints for 24 kHz audio synthesis. The EnCodec integration allows conditioning on neural-codec tokens as an alternative to the traditional spectrogram input.
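To make "bandwidth-controlled EnCodec decoding" concrete, the following is a minimal arithmetic sketch, not code from this repository. It assumes the published EnCodec 24 kHz configuration (75 token frames per second and 1024-entry codebooks); those figures come from EnCodec rather than from this page, and the helper name `codebooks_for_bandwidth` is hypothetical. Under those assumptions, choosing a target bandwidth amounts to choosing how many codebooks of tokens are decoded.

```python
import math

# Assumed EnCodec 24 kHz settings (not stated on this page).
FRAME_RATE_HZ = 75      # token frames per second
CODEBOOK_SIZE = 1024    # entries per codebook, i.e. 10 bits per token


def codebooks_for_bandwidth(kbps: float) -> int:
    """Hypothetical helper: number of codebooks implied by a target bandwidth."""
    # Each codebook contributes frame_rate * log2(codebook_size) bits per second.
    bits_per_codebook_per_second = FRAME_RATE_HZ * math.log2(CODEBOOK_SIZE)
    return round(kbps * 1000 / bits_per_codebook_per_second)


if __name__ == "__main__":
    for kbps in (1.5, 3.0, 6.0, 12.0):
        print(f"{kbps} kbps -> {codebooks_for_bandwidth(kbps)} codebooks")
```

Each codebook adds 750 bits per second under these assumptions, so a lower bandwidth setting means fewer codebooks and a coarser description of the audio for the vocoder to reconstruct.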

Used by 2 other packages. No commits in the last 6 months. Available on PyPI.

Stale 6m

Maintenance 0 / 25

Adoption 8 / 25

Maturity 18 / 25

Community 7 / 25

Language

Python

License

MIT

Higher-rated alternatives

TensorSpeech/TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for...

lucasnewman/nanospeech

A simple, hackable text-to-speech system in PyTorch and MLX

Tomiinek/Multilingual_Text_to_Speech

An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing,...

jxzhanggg/nonparaSeq2seqVC_code

Implementation code of non-parallel sequence-to-sequence VC

keonlee9420/STYLER

Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech...
