coqui-ai/open-speech-corpora

💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

Score: 46 / 100 (Emerging)

Curated database organized by license type (CC-0, CC-BY, CC-BY-SA, etc.), with standardized metadata for each corpus: speaker count, duration, and language coverage. Lists more than 50 corpora across many languages for ASR, TTS, and emotional speech synthesis, ranging from single-speaker TTS datasets to large-scale parliamentary and crowdsourced collections. Community-driven, with an emphasis on truly open licenses and on usability for both commercial and research work.

1,390 stars. No commits in the last 6 months.

Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 20 / 25
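
The four sub-scores above add up to the overall score shown at the top. The short Python sketch below checks that arithmetic. It assumes the overall score is a plain sum of the four categories, which matches the numbers on this page but is not confirmed as the actual scoring method.

```python
# Sub-scores copied from the listing above; each category is out of 25.
sub_scores = {
    "Maintenance": 0,
    "Adoption": 10,
    "Maturity": 16,
    "Community": 20,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(sub_scores.values())
maximum = MAX_PER_CATEGORY * len(sub_scores)

print(f"{total} / {maximum}")  # prints "46 / 100"
```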


Stars: 1,390
Forks: 150
Language: (not listed)
License: MIT
Last pushed: Jun 06, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/voice-ai/coqui-ai/open-speech-corpora"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
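
For scripted access, the same request can be made from Python with only the standard library. This is a minimal sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical rather than part of any published client.

```python
import json
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path layout is category/owner/repo, inferred from the
    one example URL on this page.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Send a GET request and decode the body.

    Assumption: the endpoint returns JSON; the response schema is not
    documented on this page, so the result is returned as parsed.
    """
    request = urllib.request.Request(
        quality_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("voice-ai", "coqui-ai", "open-speech-corpora")` requests the same URL as the curl command. No API key is sent, so the keyless limit of 100 requests/day applies.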