descriptinc/descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

/ 100

Verified

Built on improved RVQGAN architecture, the codec uses residual vector quantization to compress 44.1 kHz audio into discrete codes at 8 kbps bitrate while maintaining perceptual fidelity across all audio domains (speech, music, environmental sound). The universal model serves as a drop-in replacement for EnCodec in audio language modeling applications like AudioLM and MusicGen, offering both CLI tools and a Python API for integration into generative audio pipelines.

1,732 stars and 453,067 monthly downloads. Used by 4 other packages. Available on PyPI.

Maintenance 10 / 25

Adoption 24 / 25

Maturity 18 / 25

Community 20 / 25

How are scores calculated?

Stars

1,732

Forks

171

Language

Python

License

MIT

Compare

descript-audio-codec and focalcodec

Related tools

crlandsc/torch-log-wmse

logWMSE, an audio quality metric & loss function with support for digital silence target. Useful...

drethage/speech-denoising-wavenet

A neural network for end-to-end speech denoising

KyungsuKim42/tokensynth

The official implementation of TokenSynth (ICASSP 2025)

iver56/torch-audiomentations

Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.

YuanGongND/ast

Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".

Explore Voice AI Tools

All categories Trending Voice AI directory Insights