SDV

Synthetic Data Vault (SDV) generates statistically realistic synthetic datasets using probabilistic and deep learning models while preserving privacy.

synthetic-data-generation-frameworks · Recently released
Hero Score: 55
Popularity: 46
Performance: 15
Ecosystem: 50
Maturity: 100
Dev Experience: 62
⭐ 3,497 stars · ⬇ 33.0K downloads/wk · First release: Jan 2016 · Last release: May 2026
Async Support: No · Plugin Extensions: Medium · Speed: Slow · Doc Quality: Very high · Learning Curve: Medium

Pros

  • Preserves statistical properties, correlations, and relationships in data
  • Supports tabular, relational, time-series, and multi-table datasets
  • Built-in evaluation, quality reporting, and privacy metrics
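
The first point above, preserving statistical properties and correlations, comes down to fitting a model to real data and then sampling from it. The sketch below illustrates that idea with a two-column Gaussian fit using only the Python standard library. It is not SDV's API or implementation: the toy columns `real_ages` and `real_incomes` and the helpers `pearson`, `fit_gaussian`, and `sample_synthetic` are hypothetical names chosen for illustration.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length columns."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))


def fit_gaussian(xs, ys):
    """Learn means, standard deviations, and correlation from real columns."""
    return {
        "mean_x": statistics.fmean(xs),
        "std_x": statistics.stdev(xs),
        "mean_y": statistics.fmean(ys),
        "std_y": statistics.stdev(ys),
        "rho": pearson(xs, ys),
    }


def sample_synthetic(model, n, rng):
    """Draw n synthetic (x, y) rows that share the learned correlation."""
    rho = model["rho"]
    # Weight of the independent noise term in the 2x2 Cholesky factor.
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = model["mean_x"] + model["std_x"] * z1
        y = model["mean_y"] + model["std_y"] * (rho * z1 + residual * z2)
        rows.append((x, y))
    return rows


# Toy stand-in for a "real" dataset (assumption: invented for this sketch).
rng = random.Random(0)
real_ages = [rng.uniform(20, 65) for _ in range(500)]
real_incomes = [1000 * age + rng.gauss(0, 8000) for age in real_ages]

model = fit_gaussian(real_ages, real_incomes)
synthetic = sample_synthetic(model, 5000, rng)
syn_ages = [row[0] for row in synthetic]
syn_incomes = [row[1] for row in synthetic]

print(f"real correlation:      {pearson(real_ages, real_incomes):.3f}")
print(f"synthetic correlation: {pearson(syn_ages, syn_incomes):.3f}")
```

A plain Gaussian fit like this reproduces means, spreads, and linear correlation, but not the shape of each column: the uniformly distributed ages come back bell-shaped. It also shows why a real dataset is required as input, since every parameter of the model is estimated from it.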

Cons

  • Significantly slower setup and generation than simple fakers
  • Requires real datasets to learn distributions from
  • Business Source License (not fully open source)
