SDV and SDGym

SDGym is a benchmarking framework for evaluating and comparing synthetic data generation methods. It complements SDV by letting practitioners measure the performance of SDV's synthesizers against alternative approaches.

                 SDV              SDGym
Score            94 (Verified)    72 (Verified)
Maintenance      23/25            13/25
Adoption         25/25            18/25
Maturity         25/25            18/25
Community        21/25            23/25
Stars            3,439            301
Forks            417              67
Downloads        150,480          1,273
Commits (30d)    36               0
Language         Python           Python
License
Risk flags       None             None

About SDV

sdv-dev/SDV

Synthetic data generation for tabular data

Supports multiple synthesis architectures including statistical methods (GaussianCopula) and deep learning approaches (CTGAN) for single, multi-table, and sequential datasets. Includes built-in evaluation metrics comparing synthetic to real data across column distributions and correlations, plus constraint enforcement and PII anonymization during generation.
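To make the idea of comparing column distributions concrete, here is a minimal sketch of one way to score how closely a synthetic categorical column matches the real one: 1 minus the total variation distance between their category frequencies. This is an illustration only, not SDV's code. The function names `category_frequencies` and `column_similarity` and the sample data are assumptions made for the example.

```python
from collections import Counter


def category_frequencies(values):
    """Return the relative frequency of each category in a column."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def column_similarity(real_column, synthetic_column):
    """Score in [0, 1]: 1 minus the total variation distance between the columns."""
    real_freq = category_frequencies(real_column)
    synth_freq = category_frequencies(synthetic_column)
    # A category missing from one column counts as frequency 0 there.
    categories = set(real_freq) | set(synth_freq)
    distance = 0.5 * sum(
        abs(real_freq.get(c, 0.0) - synth_freq.get(c, 0.0)) for c in categories
    )
    return 1.0 - distance


# Hypothetical toy columns, not taken from any real dataset.
real = ["a", "a", "b", "b"]
synthetic = ["a", "a", "a", "b"]
print(column_similarity(real, synthetic))
```

A score of 1.0 means the two columns have identical category frequencies, and 0.0 means they share no categories at all. A full quality report would combine scores like this across every column and also compare relationships between pairs of columns.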

About SDGym

sdv-dev/SDGym

Benchmarking synthetic data generation methods.
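The general shape of such a benchmark can be sketched as a loop that runs each generation method on the same data, times it, scores the output, and ranks the results. The sketch below does not use SDGym's API. The two generators, the scoring function, and all names in it are hypothetical stand-ins chosen to keep the example self-contained.

```python
import random
import time
from collections import Counter


def frequency_score(real_values, synthetic_values):
    """1 minus the total variation distance between two categorical samples."""
    real_counts = Counter(real_values)
    synth_counts = Counter(synthetic_values)
    categories = set(real_counts) | set(synth_counts)
    distance = 0.5 * sum(
        abs(real_counts[c] / len(real_values) - synth_counts[c] / len(synthetic_values))
        for c in categories
    )
    return 1.0 - distance


def bootstrap_generator(real_values, num_rows, rng):
    """Hypothetical generator: resample the real values with replacement."""
    return [rng.choice(real_values) for _ in range(num_rows)]


def constant_generator(real_values, num_rows, rng):
    """Hypothetical baseline: repeat the single most common real value."""
    most_common = Counter(real_values).most_common(1)[0][0]
    return [most_common] * num_rows


def run_benchmark(generators, real_values, num_rows, seed=0):
    """Time each generator, score its output, and rank by score (best first)."""
    results = []
    for name, generate in generators.items():
        rng = random.Random(seed)  # same seed for every generator
        start = time.perf_counter()
        synthetic_values = generate(real_values, num_rows, rng)
        elapsed = time.perf_counter() - start
        results.append({
            "generator": name,
            "score": frequency_score(real_values, synthetic_values),
            "seconds": elapsed,
        })
    return sorted(results, key=lambda row: row["score"], reverse=True)


# Hypothetical toy data: one categorical column with two balanced categories.
real_values = ["a"] * 5 + ["b"] * 5
results = run_benchmark(
    {"bootstrap": bootstrap_generator, "constant": constant_generator},
    real_values,
    num_rows=200,
)
for row in results:
    print(f'{row["generator"]}: score={row["score"]:.3f}, time={row["seconds"]:.6f}s')
```

Running every method against the same data, seed, and scoring function is what makes the resulting numbers comparable. A real benchmark would use complete datasets and richer quality metrics than this single-column score.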

Scores updated daily from GitHub, PyPI, and npm data.