zhenye234/xcodec

AAAI 2025: Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model

46 / 100

Emerging

Combines acoustic encoding with learned semantic representations (HuBERT, WavLM) through dual feature extraction and joint quantization, producing unified codec tokens that preserve both content and acoustic properties. Works with Hugging Face Transformers and provides pre-trained checkpoints for speech (LibriSpeech, MLS) and general audio. Existing codecs can be adapted by injecting semantic modules via projection layers and training with an MSE-based semantic preservation loss.
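The dual-feature idea in the description can be illustrated with a toy forward pass. This is only a sketch under stated assumptions, not the repository's implementation: NumPy stands in for a deep learning framework, random arrays stand in for the acoustic encoder output and the HuBERT/WavLM features, a single nearest-neighbour codebook stands in for whatever quantizer the codec uses, and every name and dimension here (`linear`, `quantize`, `D_AC`, `D_SEM`, `D_LAT`, `K`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Random (weight, bias) pair standing in for a trained projection layer."""
    return rng.normal(0.0, 0.1, (in_dim, out_dim)), np.zeros(out_dim)

def quantize(z, codebook):
    """Nearest-neighbour vector quantization over frames."""
    # Squared distance between every frame (T, D) and every code (K, D) -> (T, K)
    dists = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = dists.argmin(axis=1)
    return codebook[idx], idx

# Hypothetical sizes: frames, acoustic dim, semantic dim, latent dim, codebook size
T, D_AC, D_SEM, D_LAT, K = 50, 64, 96, 32, 256

acoustic = rng.normal(size=(T, D_AC))   # stand-in for acoustic encoder output
semantic = rng.normal(size=(T, D_SEM))  # stand-in for HuBERT/WavLM features

W_in, b_in = linear(D_AC + D_SEM, D_LAT)   # projection into the shared latent
W_sem, b_sem = linear(D_LAT, D_SEM)        # projection back to semantic space
codebook = rng.normal(size=(K, D_LAT))

# Dual feature extraction: concatenate both streams, then project jointly
fused = np.concatenate([acoustic, semantic], axis=-1) @ W_in + b_in

# Joint quantization: one token per frame carries both kinds of information
z_q, tokens = quantize(fused, codebook)

# Semantic preservation: reconstruct the semantic features from the quantized
# latent and penalize the mean squared error against the original features
semantic_hat = z_q @ W_sem + b_sem
semantic_loss = float(np.mean((semantic_hat - semantic) ** 2))
```

In a real training setup the projections and codebook would be learned, and this MSE term would be added to the codec's usual reconstruction objectives rather than used on its own.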

294 stars.

No Package, No Dependents

Maintenance 6 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 14 / 25


Stars: 294
Forks:
Language: Python
License: MIT

Related tools

zhuhanqing/APOLLO

APOLLO: SGD-like Memory, AdamW-level Performance; MLSys'25 Outstanding Paper Honorable Mention

Y-Research-SBU/CSRv2

Official Repository for CSRv2 - ICLR 2026

HITESHLPATEL/Mamba-Papers

Awesome Mamba Papers: A Curated Collection of Research Papers, Tutorials & Blogs

psychofict/llm-effective-context-length

Investigating Why the Effective Context Length of LLMs Falls Short (Based on STRING, ICLR 2025)

MouxiaoHuang/PPE

[ICLR 2026] Official code of PPE: Positional Preservation Embedding for Token Compression in...
