OpenLLM

BentoML-built CLI and Python framework for self-hosting open-weight LLMs as OpenAI-compatible API endpoints with one command.

model-serving-frameworksInactive
58
Hero Score
Popularity
28
Performance
85
Ecosystem
50
Maturity
61
Dev Experience
68
⭐ 12,342 stars⬇ 2.2K downloads/wkFirst release: May 2023Last release: Apr 2025
Async Support: YesPlugin Extensions: MediumSpeed: FastDoc Quality: HighLearning Curve: Easy

Pros

  • One-command serving of popular open-weight models with OpenAI-compatible endpoints
  • Built on BentoML, so it inherits adaptive batching and cloud deploy targets
  • CLI-first developer experience that's great for self-hosting experiments

Cons

  • Smaller weekly download volume than peer projects (newer/smaller community)
  • Model coverage is curated rather than universal
  • Production-scale orchestration relies on BentoCloud or the broader BentoML stack

Alternatives in model-serving-frameworks

Compare Python Packages with ease.