Giskard

Open-source testing and evaluation platform for ML models and LLMs: test suites, bias/safety checks, and regression testing with a web UI.

Category: llm-evaluation-frameworks
Recently released
Hero Score: 51
Popularity: 46
Performance: 30
Ecosystem: 50
Maturity: 69
Dev Experience: 57
⭐ 5,411 stars
⬇ 5.0K downloads/wk
First release: Apr 2022
Last release: May 2026

Async Support: No
Plugin Extensions: Medium
Speed: Medium
Doc Quality: High
Learning Curve: Medium

Pros

  • UI-driven test authoring, datasets, and reports for LLMs/ML
  • Built-in bias, safety, and robustness checks
  • CI integration for model gating before deployment
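
To make the model-gating idea concrete, here is a minimal sketch of a CI gate in plain Python. It does not use Giskard's API: `CheckResult`, `gate`, and `min_pass_rate` are hypothetical names invented for illustration, and the assumption is that some earlier step has already produced a pass/fail outcome per check.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CheckResult:
    """Outcome of one check (hypothetical structure, not a Giskard type)."""
    name: str      # e.g. "bias" or "robustness"
    passed: bool


def gate(results: List[CheckResult], min_pass_rate: float = 1.0) -> int:
    """Return a process exit code: 0 to allow deployment, 1 to block it."""
    if not results:
        # No checks ran, so there is no evidence the model is safe to ship.
        return 1
    pass_rate = sum(r.passed for r in results) / len(results)
    return 0 if pass_rate >= min_pass_rate else 1


# Example: one failing check blocks the release under the default threshold.
example = [CheckResult("bias", True), CheckResult("robustness", False)]
exit_code = gate(example)
```

In a CI job, a script would typically pass such an exit code to `sys.exit`, so a non-zero value fails the pipeline step and stops the deployment.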

Cons

  • Heavier setup than simple metric libraries
  • Some enterprise features require additional configuration
  • Crafting good tests still needs domain expertise
