modelscope/FunCodec

FunCodec is a research-oriented toolkit for audio quantization and downstream applications such as text-to-speech synthesis and music generation.

/ 100

Emerging

Implements multiple neural codec architectures (EnCodec, FreqCodec) with configurable quantization schemes and bitrates (250-16000 bps), and integrates with ModelScope and Hugging Face for model distribution. Data follows Kaldi-style organization (`wav.scp`) and can be supplied as waveform or ark files, which allows training on custom datasets as well as downstream tasks such as codec-based TTS (LauraTTS), which is reported to outperform VALL-E. Provides end-to-end batch inference pipelines, distributed training via torchrun, and recipe-based workflows for reproducibility across multiple open-source corpora.
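For readers unfamiliar with the Kaldi-style `wav.scp` convention mentioned above, here is a minimal sketch of writing and reading such a file. It assumes the common Kaldi layout of one utterance per line, with an utterance ID followed by whitespace and then a file location. The helper names `write_wav_scp` and `read_wav_scp` are hypothetical and are not part of FunCodec's API; the toolkit's own recipes may build this file differently.

```python
from pathlib import Path


def write_wav_scp(entries, scp_path):
    """Write (utt_id, wav_path) pairs as 'utt_id wav_path' lines.

    Hypothetical helper, not a FunCodec function. Entries are sorted by
    utterance ID, which Kaldi-style tooling conventionally expects.
    """
    with open(scp_path, "w", encoding="utf-8") as f:
        for utt_id, wav_path in sorted(entries):
            f.write(f"{utt_id} {wav_path}\n")


def read_wav_scp(scp_path):
    """Parse a wav.scp file into a dict mapping utt_id -> location.

    Hypothetical helper. Splits on the first run of whitespace only, so a
    location containing spaces is kept intact.
    """
    table = {}
    with open(scp_path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                raise ValueError(f"{scp_path}:{line_no}: expected '<utt_id> <location>'")
            utt_id, location = parts
            table[utt_id] = location
    return table


def scp_from_directory(wav_dir):
    """Collect (utt_id, path) pairs for every .wav under wav_dir.

    Uses the file stem as the utterance ID, which is an assumption made
    for illustration; real corpora often encode speaker IDs as well.
    """
    return [(p.stem, str(p)) for p in sorted(Path(wav_dir).rglob("*.wav"))]
```

A typical use would be `write_wav_scp(scp_from_directory("my_corpus"), "wav.scp")`, where `my_corpus` is a placeholder directory name, followed by passing the resulting file to whichever recipe stage expects it.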

442 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 9 / 25

Community 15 / 25

Stars

442

Forks

Language

Python

License

MIT

Higher-rated alternatives

shangeth/wavencoder

WavEncoder is a Python library for encoding audio signals, transforms for audio augmentation,...

fatchord/WaveRNN

WaveRNN Vocoder + TTS

kan-bayashi/ParallelWaveGAN

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with Pytorch

HAKORADev/VODER

Voice Operation and Design Engine with Reproduction capabilities

seungwonpark/melgan

MelGAN vocoder (compatible with NVIDIA/tacotron2)
