unsplash/datasets
🎁 6,500,000+ Unsplash images made available for research and machine learning
Provides two tiers: a 25k-photo Lite dataset (commercial-friendly) and a 6.5M-photo Full dataset (noncommercial). Each comes with rich relational tables, including 1M+ keywords and aggregated search analytics spanning 160M+ queries. The data is structured as PostgreSQL-compatible schemas and ships with Python ingestion examples, so it can be loaded directly into data pipelines and machine learning workflows. Releases are semantically versioned, so research can cite an exact dataset version. The dataset is kept separate from the Unsplash API, which remains the route for product integration.
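As a rough illustration of what Python ingestion of relational tables like these can look like, here is a minimal standard-library sketch. It assumes each table is delivered as one or more tab-separated part files named `<table>.tsv*` with a header row in every part. That file layout, the helper names `load_table` and `group_by`, and any column names you pass in are assumptions for illustration, not something this page confirms.

```python
import csv
import glob
import os
from collections import defaultdict


def load_table(directory, name):
    """Read every `<name>.tsv*` part file in `directory` into a list of row dicts.

    Assumed layout (hypothetical): tab-separated parts, each with a header row.
    """
    rows = []
    pattern = os.path.join(directory, name + ".tsv*")
    # Sort so that parts are read in a stable, predictable order.
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="", encoding="utf-8") as handle:
            # QUOTE_NONE treats quote characters as ordinary text, which suits TSV.
            reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
            rows.extend(reader)
    return rows


def group_by(rows, key):
    """Group row dicts by the value of column `key` (a one-to-many lookup)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return dict(grouped)
```

A call such as `group_by(load_table(data_dir, "keywords"), "photo_id")` would then give a per-photo list of keyword rows, assuming a `keywords` table with a `photo_id` column exists in the release you downloaded. For the Full dataset, loading into PostgreSQL is likely a better fit than holding all rows in memory.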
2,680 stars. No commits in the last 6 months.
Stars: 2,680
Forks: 135
Language: Jupyter Notebook
License: —
Category:
Last pushed: Apr 17, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/unsplash/datasets"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
open-edge-platform/datumaro
Dataset Management Framework, a Python library and a CLI tool to build, analyze and manage...
webdataset/webdataset
A high-performance Python-based I/O system for large (and small) deep learning problems, with...
explosion/ml-datasets
🌊 Machine learning dataset loaders for testing and example scripts
alan-turing-institute/CleverCSV
CleverCSV is a Python package for handling messy CSV files. It provides a drop-in replacement...
tensorflow/datasets
TFDS is a collection of datasets ready to use with TensorFlow, Jax, ...