Chonkie

Lightweight, fast text chunking library for RAG pipelines with multiple strategies (token, sentence, semantic, neural) and 32+ integrations.

text-chunking-frameworks | Recently released
Scores

  • Hero Score: 74
  • Popularity: 58
  • Performance: 100
  • Ecosystem: 75
  • Maturity: 61
  • Dev Experience: 75
⭐ 4,116 stars | ⬇ 264.5K downloads/wk | First release: Nov 2024 | Last release: May 2026
Async Support: Yes | Plugin Extensions: High | Speed: Very fast | Doc Quality: High | Learning Curve: Easy

Pros

  • Extremely lightweight (~49MB installed, 505KB wheel) with modular install for only the integrations you need
  • Multiple chunking strategies including SIMD-accelerated FastChunker at 100+ GB/s and semantic/neural chunkers
  • 32+ integrations with embedding providers, vector databases, and LLM providers out of the box
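
To make the simplest strategy named above concrete, here is a minimal sketch of token-window chunking with overlap. It is not Chonkie's implementation or API: the function name `chunk_tokens`, its parameters, and the whitespace "tokenizer" are all assumptions chosen for illustration. A real pipeline would use a model tokenizer.

```python
from typing import List


def chunk_tokens(text: str, chunk_size: int = 8, overlap: int = 2) -> List[str]:
    """Split text into windows of whitespace tokens with a fixed overlap.

    Hypothetical helper for illustration only; not part of Chonkie.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    tokens = text.split()  # assumption: whitespace split stands in for a real tokenizer
    step = chunk_size - overlap  # how far each window advances
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(" ".join(window))
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the text
    return chunks


if __name__ == "__main__":
    for chunk in chunk_tokens("a b c d e f g h i j", chunk_size=4, overlap=1):
        print(chunk)
```

The overlap repeats the tail of each window at the head of the next, so a sentence that straddles a boundary still appears intact in at least one chunk, at the cost of some duplicated tokens in the index.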

Cons

  • Relatively new project (2024) with a smaller community compared to established frameworks
  • Advanced chunkers (semantic, neural, LLM-based) require additional heavy dependencies
  • No built-in document parsing; requires pre-extracted text as input
