zhenye234/xcodec

AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

Score: 46 / 100 (Emerging)

Combines acoustic encoding with learned semantic representations (HuBERT, WavLM) through dual feature extraction and joint quantization, producing unified codec tokens that preserve both content and acoustic properties. Works with Hugging Face Transformers and ships pre-trained checkpoints for speech (LibriSpeech, MLS) and general audio. Existing codecs can be adapted by injecting semantic modules through projection layers and training with an MSE-based semantic preservation loss.
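The data flow described above (dual features, a joint quantizer, a projection back to semantic space, and an MSE term) can be illustrated with a toy, standard-library-only sketch. Everything in it is an assumption made for illustration: the dimensions, the random weights, the single codebook (a real codec would typically use a learned, multi-stage quantizer), and all function names are hypothetical and are not taken from the xcodec code base.

```python
import random


def linear(x, w):
    """Bias-free projection: x is a vector, w is a list of weight rows."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def rand_matrix(rows, cols, rng):
    """Random weights standing in for trained parameters (illustrative only)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def quantize(z, codebook):
    """Nearest-neighbour lookup: return (token index, code vector) for latent z."""
    def sqdist(code):
        return sum((a - b) ** 2 for a, b in zip(z, code))
    idx = min(range(len(codebook)), key=lambda i: sqdist(codebook[i]))
    return idx, codebook[idx]


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
ACOUSTIC_DIM, SEMANTIC_DIM, LATENT_DIM, CODEBOOK_SIZE = 4, 6, 3, 8  # assumed sizes

# Stand-ins for one frame of acoustic encoder output and one frame of
# HuBERT/WavLM-style semantic features.
acoustic = [rng.gauss(0, 1) for _ in range(ACOUSTIC_DIM)]
semantic = [rng.gauss(0, 1) for _ in range(SEMANTIC_DIM)]

w_in = rand_matrix(LATENT_DIM, ACOUSTIC_DIM + SEMANTIC_DIM, rng)  # input projection
w_sem_out = rand_matrix(SEMANTIC_DIM, LATENT_DIM, rng)            # semantic head
codebook = rand_matrix(CODEBOOK_SIZE, LATENT_DIM, rng)

# Dual features are concatenated, projected, and quantized jointly.
latent = linear(acoustic + semantic, w_in)
token, code = quantize(latent, codebook)

# The semantic head reconstructs the semantic features from the shared code;
# the MSE against the original features is the semantic preservation term.
semantic_hat = linear(code, w_sem_out)
semantic_loss = mse(semantic_hat, semantic)
```

In training, a term like `semantic_loss` would be added to the codec's usual reconstruction objectives so that the shared tokens keep semantic content as well as acoustic detail.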

No package, no dependents
Maintenance 6 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 14 / 25

Stars: 294
Forks: 23
Language: Python
License: MIT
Last pushed: Oct 12, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/zhenye234/xcodec"

Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
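The same request can be made from Python with only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the URL path follows a category/owner/repo pattern (inferred from the single example URL above). The function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="llm-tools"):
    """Build the endpoint URL; the category/owner/repo layout is an assumption."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(owner, repo, category="llm-tools", timeout=10):
    """Fetch the score data, assuming the response body is JSON."""
    url = build_quality_url(owner, repo, category)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("zhenye234", "xcodec")` issues the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit.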