audio-data-pytorch and aac-datasets
These are ecosystem siblings. Both are PyTorch dataset utilities for audio tasks, but they serve different purposes: audio-data-pytorch provides general-purpose audio datasets and transforms, while aac-datasets provides datasets for audio captioning (audio-to-text). They can be used together in the same audio processing pipeline.
About audio-data-pytorch
archinetai/audio-data-pytorch
A collection of useful audio datasets and transforms for PyTorch.
This tool helps machine learning engineers and researchers efficiently manage and preprocess various types of audio data for training machine learning models. It takes raw audio files from local folders, web datasets, or online sources like YouTube, and outputs pre-processed audio waveforms and associated metadata ready for model training. It's designed for anyone building speech recognition, audio classification, or other audio-centric AI applications.
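To make the "local folder in, waveform plus metadata out" idea concrete, here is a minimal sketch of a map-style dataset written with only the Python standard library. It is not audio-data-pytorch's API: the class name `FolderWavDataset`, its `transform` argument, and the metadata keys are hypothetical, and it assumes 16-bit PCM WAV files. Any object with `__len__` and `__getitem__` like this one follows the map-style dataset protocol that PyTorch data loading expects.

```python
import struct
import wave
from pathlib import Path


class FolderWavDataset:
    """Hypothetical map-style dataset: one item per .wav file under `root`."""

    def __init__(self, root, transform=None):
        # Sort the paths so that indexing is deterministic across runs.
        self.paths = sorted(Path(root).rglob("*.wav"))
        self.transform = transform  # optional callable applied to the samples

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, index):
        path = self.paths[index]
        with wave.open(str(path), "rb") as wav:
            if wav.getsampwidth() != 2:
                raise ValueError("this sketch only handles 16-bit PCM")
            sample_rate = wav.getframerate()
            channels = wav.getnchannels()
            raw = wav.readframes(wav.getnframes())
        # Decode little-endian int16 samples and scale them to [-1.0, 1.0).
        ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
        samples = [value / 32768.0 for value in ints]
        if self.transform is not None:
            samples = self.transform(samples)
        metadata = {
            "path": str(path),
            "sample_rate": sample_rate,
            "channels": channels,
        }
        return samples, metadata
```

In a real pipeline the samples would typically be converted to a tensor, and the `transform` hook is where steps such as cropping or resampling would go.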
About aac-datasets
Labbeti/aac-datasets
Audio Captioning datasets for PyTorch.
This tool helps researchers and developers working on audio captioning projects to easily access and prepare large datasets. It takes raw audio and associated text descriptions, providing them in a structured format suitable for machine learning models. The primary users are machine learning engineers and AI researchers focused on multimodal audio-language tasks.
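As a rough illustration of what a "structured format" for captioning data can look like, the sketch below groups caption rows by audio file so that each clip becomes one item with a list of captions. It is not aac-datasets' API: the function `load_caption_items` and the CSV columns `file_name` and `caption` are assumptions made for this example only.

```python
import csv
from collections import defaultdict


def load_caption_items(csv_path):
    """Group caption rows by audio file into one item per clip.

    Assumes a hypothetical CSV with `file_name` and `caption` columns,
    where a clip with several captions appears on several rows.
    """
    captions = defaultdict(list)
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            captions[row["file_name"]].append(row["caption"].strip())
    # One dictionary per clip, ordered by file name for reproducibility.
    return [
        {"file_name": name, "captions": caps}
        for name, caps in sorted(captions.items())
    ]
```

Keeping all captions for a clip in one item makes it straightforward to sample one caption per clip during training and to score against all of them during evaluation.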