SDV
Synthetic Data Vault — generates statistically realistic synthetic datasets using probabilistic and deep learning models while preserving privacy.
synthetic-data-generation-frameworks
Recently released
Hero Score: 55
Popularity: 46
Performance: 15
Ecosystem: 50
Maturity: 100
Dev Experience: 62
⭐ 3,497 stars
⬇ 33.0K downloads/wk
First release: Jan 2016
Last release: May 2026
Async Support: No
Plugin Extensions: Medium
Speed: Slow
Doc Quality: Very high
Learning Curve: Medium
Pros
- Preserves statistical properties, correlations, and relationships in data
- Supports tabular, relational, time-series, and multi-table datasets
- Built-in evaluation, quality reporting, and privacy metrics
Cons
- Significantly slower setup and generation than simple fakers
- Requires real datasets to learn distributions from
- Business Source License (not fully open source)
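
The trade-off in the lists above (a model must first be fitted to real data, and in return the sampled rows keep the correlations of the original) can be illustrated with a toy sketch. This is not SDV's API and not its actual modeling approach: it is a plain bivariate Gaussian written with the Python standard library, and the function names `fit_bivariate_gaussian` and `sample_rows`, along with the age/income data, are hypothetical choices made only for illustration.

```python
import math
import random
import statistics


def fit_bivariate_gaussian(rows):
    """Learn means, standard deviations, and Pearson correlation from (x, y) rows."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.pstdev(xs), statistics.pstdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / len(rows)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_rows(model, n, rng):
    """Draw n synthetic (x, y) rows that follow the fitted correlation."""
    rho = model["rho"]
    # Guard against tiny negative values caused by floating-point rounding.
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = model["mx"] + model["sx"] * z1
        y = model["my"] + model["sy"] * (rho * z1 + residual * z2)
        out.append((x, y))
    return out


rng = random.Random(0)

# Hypothetical "real" dataset: age, and a noisy income that rises with age.
ages = [rng.uniform(20, 65) for _ in range(500)]
real = [(age, 1000 * age + rng.gauss(0, 8000)) for age in ages]

# Step 1: fitting requires real data (the "Cons" item above).
model = fit_bivariate_gaussian(real)

# Step 2: sampling produces new rows that were never in the real dataset.
synthetic = sample_rows(model, 500, rng)

# The correlation measured on the synthetic rows should be close to the real one.
print("real correlation:     ", round(model["rho"], 2))
print("synthetic correlation:", round(fit_bivariate_gaussian(synthetic)["rho"], 2))
```

A two-column Gaussian only captures linear correlation between numeric columns. Handling categorical columns, multiple related tables, and time series takes considerably more modeling, which is consistent with the slower setup and generation noted under Cons.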