lucasnewman/vocos-mlx
Implementation of 'Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis', in MLX
Bridges time-domain and Fourier-based vocoding by reconstructing high-fidelity audio from either Mel spectrograms or EnCodec tokens, giving two inference paths: a traditional spectrogram-based one and one conditioned on EnCodec's neural-compression tokens. Built on the MLX framework, which is optimized for Apple Silicon. Supports bandwidth-controlled EnCodec decoding and pretrained model checkpoints for 24 kHz audio synthesis.
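To make "bandwidth-controlled" concrete, the sketch below works out how many EnCodec codebooks correspond to a target bandwidth. The constants (hop length of 320 samples, 1024-entry codebooks) are assumptions taken from EnCodec's published 24 kHz configuration, not from this page, and `codebooks_for_bandwidth` is a hypothetical helper, not part of vocos-mlx.

```python
# Assumed EnCodec 24 kHz constants (from EnCodec's published design, not this page).
SAMPLE_RATE_HZ = 24_000
HOP_LENGTH = 320        # encoder stride in samples, giving 75 frames per second
CODEBOOK_SIZE = 1024    # entries per codebook, i.e. 10 bits per token


def codebooks_for_bandwidth(bandwidth_kbps: float) -> int:
    """Hypothetical helper: number of codebooks needed for a target bandwidth."""
    frame_rate = SAMPLE_RATE_HZ // HOP_LENGTH            # 75 frames per second
    bits_per_token = (CODEBOOK_SIZE - 1).bit_length()    # 10 bits
    bits_per_codebook_per_second = frame_rate * bits_per_token  # 750 bit/s
    return round(bandwidth_kbps * 1000 / bits_per_codebook_per_second)


if __name__ == "__main__":
    for kbps in (1.5, 3.0, 6.0, 12.0):
        print(f"{kbps} kbps uses {codebooks_for_bandwidth(kbps)} codebooks")
```

Under these assumptions each codebook costs 750 bit/s, so a lower bandwidth setting simply means the decoder sees fewer codebooks per frame.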
Used by 2 other packages. No commits in the last 6 months. Available on PyPI.
Stars: 24
Forks: 2
Language: Python
License: MIT
Category:
Last pushed: Oct 30, 2024
Commits (30d): 0
Dependencies: 4
Reverse dependents: 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/lucasnewman/vocos-mlx"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
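The same request can be made from Python with only the standard library. This is a minimal sketch: the `category/owner/repo` path layout is inferred from the single URL shown above, the function names are hypothetical, and the response format is not documented on this page, so the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything after it is an inferred layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Hypothetical helper: build the endpoint URL from its three path segments."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the endpoint and return the response body as text (format assumed unknown)."""
    with urlopen(build_quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Performs a real network request; counts against the 100 requests/day limit.
    print(fetch_quality("voice-ai", "lucasnewman", "vocos-mlx"))
```

Keeping URL construction separate from the network call makes it easy to check the request target without spending any of the daily quota.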
Higher-rated alternatives
TensorSpeech/TensorFlowTTS
TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for...
lucasnewman/nanospeech
A simple, hackable text-to-speech system in PyTorch and MLX
Tomiinek/Multilingual_Text_to_Speech
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing,...
jxzhanggg/nonparaSeq2seqVC_code
Implementation code of non-parallel sequence-to-sequence VC
keonlee9420/STYLER
Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech...