SDV and SDGym

SDGym is a benchmarking framework for evaluating and comparing synthetic data generation methods. It complements SDV by letting practitioners measure SDV's synthesizers against alternative approaches.

                  SDV         SDGym
Status            Verified    Verified
Maintenance       23/25       13/25
Adoption          25/25       18/25
Maturity          25/25       18/25
Community         21/25       23/25
Stars             3,439       301
Forks             417         67
Downloads         150,480     1,273
Commits (30d)     36          0
Language          Python      Python
License           not listed  not listed

No risk flags

About SDV

sdv-dev/SDV

Synthetic data generation for tabular data

SDV supports multiple synthesis architectures, including statistical methods (GaussianCopula) and deep learning approaches (CTGAN), for single-table, multi-table, and sequential datasets. It includes built-in evaluation metrics that compare synthetic data to real data on column distributions and correlations, and it supports constraint enforcement and PII anonymization during generation.
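To make the idea of comparing column distributions concrete, here is a minimal standard-library sketch of one common approach: a score equal to 1 minus the two-sample Kolmogorov-Smirnov statistic for a numeric column. It is an illustration only, not SDV's own API; the function name `ks_complement` and the sample values are hypothetical.

```python
from bisect import bisect_right


def ks_complement(real, synthetic):
    """Return 1 - KS statistic for two numeric samples.

    1.0 means the empirical distributions match exactly;
    0.0 means they do not overlap at all.
    Hypothetical helper for illustration, not part of SDV.
    """
    if not real or not synthetic:
        raise ValueError("both samples must be non-empty")

    real_sorted = sorted(real)
    synth_sorted = sorted(synthetic)

    # The largest gap between two empirical CDFs always occurs at an
    # observed value, so checking every distinct data point is enough.
    max_gap = 0.0
    for x in set(real_sorted) | set(synth_sorted):
        cdf_real = bisect_right(real_sorted, x) / len(real_sorted)
        cdf_synth = bisect_right(synth_sorted, x) / len(synth_sorted)
        max_gap = max(max_gap, abs(cdf_real - cdf_synth))
    return 1.0 - max_gap


# Made-up "age" columns: one real sample, one synthetic sample.
real_ages = [23, 31, 35, 42, 47, 51, 58, 64]
synthetic_ages = [25, 30, 36, 40, 49, 52, 60, 61]
print(f"column shape score: {ks_complement(real_ages, synthetic_ages):.3f}")
```

A score near 1.0 suggests the synthetic column follows the real column's distribution closely. A per-column score like this says nothing about correlations between columns, which need a separate pairwise measure.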

About SDGym

sdv-dev/SDGym

Benchmarking synthetic data generation methods.
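As a rough illustration of what benchmarking generation methods involves, the sketch below runs two toy synthesizers over the same categorical column, scores each output against the real data, and ranks them. It does not use SDGym's API; every name here (`run_benchmark`, `tv_complement`, both synthesizers) and the data are hypothetical, and the score is 1 minus the total variation distance between category frequencies.

```python
import random
from collections import Counter


def tv_complement(real, synthetic):
    """1 - total variation distance between category frequencies."""
    real_freq = Counter(real)
    synth_freq = Counter(synthetic)
    categories = set(real_freq) | set(synth_freq)
    tvd = 0.5 * sum(
        abs(real_freq[c] / len(real) - synth_freq[c] / len(synthetic))
        for c in categories
    )
    return 1.0 - tvd


def bootstrap_synthesizer(real, n_rows, rng):
    """Toy method: resample observed values with replacement."""
    return [rng.choice(real) for _ in range(n_rows)]


def uniform_synthesizer(real, n_rows, rng):
    """Toy baseline: pick each category with equal probability."""
    categories = sorted(set(real))
    return [rng.choice(categories) for _ in range(n_rows)]


def run_benchmark(synthesizers, real, n_rows, seed=0):
    """Score every synthesizer on the same data; best score first."""
    results = []
    for name, synthesize in synthesizers.items():
        rng = random.Random(seed)  # same seed for each method
        synthetic = synthesize(real, n_rows, rng)
        results.append((name, tv_complement(real, synthetic)))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Made-up, skewed "payment method" column.
real_column = ["card"] * 80 + ["cash"] * 15 + ["voucher"] * 5
leaderboard = run_benchmark(
    {"bootstrap": bootstrap_synthesizer, "uniform": uniform_synthesizer},
    real_column,
    n_rows=1000,
)
for name, score in leaderboard:
    print(f"{name}: {score:.3f}")
```

Keeping the dataset, row count, and metric fixed across methods is what makes the resulting scores comparable. A real benchmark would typically also record timing and run over several datasets.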

Scores are updated daily from GitHub, PyPI, and npm data.