ExplainableML/ZerAuCap
[NeurIPS 2023 - ML for Audio Workshop (Oral)] Zero-shot audio captioning with audio-language model guidance and audio context keywords
No commits in the last 6 months.
Stars: 18
Forks: 1
Language: Python
License: —
Category:
Last pushed: Nov 30, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/ExplainableML/ZerAuCap"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
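The same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_quality` are hypothetical, the URL pattern (category, then owner, then repository) is inferred from the single curl example above, and the assumption that the response body is JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Assemble the endpoint URL (hypothetical helper).

    The category/owner/repo layout is inferred from the curl example.
    """
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository (hypothetical helper).

    Assumption: the body is JSON. If it is not, json.loads raises an error.
    """
    url = build_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "ExplainableML", "ZerAuCap")` issues the same GET request as the curl command. Since the unauthenticated tier is limited to 100 requests/day, caching the result locally is a sensible precaution when polling several repositories.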
Higher-rated alternatives
canopyai/Orpheus-TTS
Towards Human-Sounding Speech
lifeiteng/vall-e
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech), Reproduced Demo...
Plachtaa/VALL-E-X
An open source implementation of Microsoft's VALL-E X zero-shot TTS model. Demo is available in...
umbertocappellazzo/Omni-AVSR
Official PyTorch implementation of "Omni-AVSR: Towards Unified Multimodal Speech Recognition...
primepake/learnable-speech
Text-to-speech with a learnable audio encoder, without alignment to a transcript reference