sdv-dev/SDV
Synthetic data generation for tabular data
Supports multiple synthesis architectures, including statistical methods (GaussianCopula) and deep learning approaches (CTGAN), for single-table, multi-table, and sequential datasets. Includes built-in evaluation metrics that compare synthetic to real data across column distributions and correlations, plus constraint enforcement and PII anonymization during generation.
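To make "comparing synthetic to real data across column distributions" concrete, here is a minimal standard-library sketch of one common approach for a categorical column: one minus the total variation distance between the two frequency tables. It is an illustration of the idea only, not SDV's own implementation; the function name `tv_complement` and the sample values are hypothetical.

```python
from collections import Counter


def tv_complement(real_values, synthetic_values):
    """Return 1 - total variation distance between two categorical columns.

    1.0 means the category frequencies match exactly; 0.0 means the two
    columns share no categories. Hypothetical helper, not part of SDV.
    """
    if not real_values or not synthetic_values:
        raise ValueError("both columns must be non-empty")

    real_freq = Counter(real_values)
    synth_freq = Counter(synthetic_values)
    categories = set(real_freq) | set(synth_freq)
    n_real, n_synth = len(real_values), len(synthetic_values)

    # Total variation distance: half the sum of absolute frequency gaps.
    tvd = 0.5 * sum(
        abs(real_freq[c] / n_real - synth_freq[c] / n_synth)
        for c in categories
    )
    return 1.0 - tvd


# Made-up example columns, for illustration only.
real = ["a", "a", "b", "c"]
synthetic = ["a", "b", "b", "c"]
score = tv_complement(real, synthetic)
```

A score near 1.0 suggests the synthetic column reproduces the real column's category mix; numeric columns and pairwise correlations would need different statistics.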
3,439 stars and 150,480 monthly downloads. Used by 5 other packages. Actively maintained with 36 commits in the last 30 days. Available on PyPI.
Stars: 3,439
Forks: 417
Language: Python
License: not listed
Category: not listed
Last pushed: Mar 12, 2026
Monthly downloads: 150,480
Commits (30d): 36
Dependencies: 14
Reverse dependents: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/sdv-dev/SDV"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
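The same request can be made from Python with only the standard library. This is a sketch under assumptions: the `category/owner/repo` path layout is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical. No response fields are assumed.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred, not documented."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("generative-ai", "sdv-dev", "SDV")
# Network call left commented out so the snippet runs offline:
# data = fetch_quality("generative-ai", "sdv-dev", "SDV")
```

Each call to `fetch_quality` counts against the daily request limit, so caching the result locally is worth considering for repeated lookups.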
Related tools
sdv-dev/SDGym
Benchmarking synthetic data generation methods.
NVIDIA-NeMo/DataDesigner
🎨 NeMo Data Designer: A general library for generating high-quality synthetic data from scratch...
AlexanderVNikitin/tsgm
Generation and evaluation of synthetic time series datasets (also, augmentations,...
wwhenxuan/S2Generator
A series-symbol (S2) dual-modality data generation mechanism, enabling the unrestricted creation...
hitsz-ids/synthetic-data-generator
SDG is a specialized framework designed to generate high-quality structured tabular data.