apple/ml-spatial-librispeech
A large synthetic dataset of spatial audio with multiple labels
Synthesizes 650+ hours of first-order ambisonics audio by augmenting LibriSpeech recordings with 200k+ simulated acoustic conditions across 8k+ synthetic rooms. Includes rich spatial labels for source position, speaking direction, room acoustics, and geometry, plus optional distractor noise tracks. Provides a PyTorch dataloader and a Parquet-based metadata schema for straightforward integration into audio ML pipelines.
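To make "first-order ambisonics" and the source-position labels more concrete, here is a minimal sketch of how one mono sample arriving from a given direction maps onto four ambisonic channels. It assumes ACN channel order (W, Y, Z, X), SN3D normalization, and azimuth measured counterclockwise from the front; this listing does not state which convention the dataset uses, so treat these as assumptions. The function name `encode_foa` is hypothetical and is not part of the repository.

```python
import math


def encode_foa(sample, azimuth_deg, elevation_deg):
    """Encode one mono sample into 4 first-order ambisonics channels.

    Assumption: ACN order (W, Y, Z, X) with SN3D normalization.
    The dataset's own convention may differ.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left-right axis
    z = sample * math.sin(el)                 # up-down axis
    x = sample * math.cos(az) * math.cos(el)  # front-back axis
    return (w, y, z, x)


# A source straight ahead, and one 90 degrees to the left, both at ear height.
front = encode_foa(1.0, azimuth_deg=0.0, elevation_deg=0.0)
left = encode_foa(1.0, azimuth_deg=90.0, elevation_deg=0.0)
```

Under these assumptions, a frontal source excites only W and X, while a source to the left excites W and Y, which is the kind of directional information a model could learn to recover from the four channels.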
125 stars. No commits in the last 6 months.
Stars: 125
Forks: 8
Language: —
License: —
Category:
Last pushed: Oct 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/apple/ml-spatial-librispeech"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
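The same request can be made from Python with only the standard library. The sketch below assumes the path follows a category/owner/repo pattern, inferred from the single URL in the curl command, and that the endpoint returns JSON; neither is confirmed here. The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base path copied from the curl example; the segment layout is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Join the three path segments onto the base URL, percent-encoding each."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. Each call counts against the
    daily request limit, so cache results rather than polling.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("ml-frameworks", "apple", "ml-spatial-librispeech")
```

Calling `fetch_quality("ml-frameworks", "apple", "ml-spatial-librispeech")` would issue the same GET request as the curl command. How an API key is supplied is not described here, so the sketch makes unauthenticated requests only.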
Related frameworks
Ijwi-ry-Ikirundi-AI/Kirundi_Dataset
🇧🇮 The first large-scale, open-source speech and text dataset for Kirundi language. Building AI...
hstsethi/in-mob-prefix
Dataset, charts, and models of 4-digit mobile number prefixes in India, by state and operator name.
Jahangirbd23/WenetSpeech-Yue
📑 Explore WenetSpeech-Yue, a comprehensive Cantonese speech corpus with rich annotations,...
Nexdata-AI/359-Hours-Indonesian-Speech-Data-by-Mobile-Phone_Reading
Indonesian Speech Dataset
Nexdata-AI/207-Hours-Japanese-Speaking-English-Speech-Data-by-Mobile-Phone
Japanese Speaking English Speech Dataset