chonkie-inc/chonkie

🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines

Score: 83 / 100 (Verified)

Provides pluggable chunking strategies—recursive, semantic, code-aware, and LLM-based—with composable pipeline workflows that chain multiple chunkers and refineries together. Integrates with 32+ tools across tokenizers (GPT-2, BPE), embeddings (OpenAI, Sentence Transformers), vector databases, and LLMs, while supporting 56 languages out-of-the-box through modular dependency installation.
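
To make the "recursive" strategy concrete, here is a minimal standard-library sketch of the general idea: split at the coarsest separator first and recurse with finer separators only for pieces that are still too long. This is not Chonkie's implementation or API. The function name `recursive_chunk`, the character-based size limit, and the separator list are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def recursive_chunk(
    text: str,
    max_chars: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", ". ", " "),
) -> List[str]:
    """Illustrative recursive chunker (hypothetical helper, not Chonkie's API)."""
    # Base case: the text already fits, so return it as a single chunk.
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    # No separators left: fall back to a hard cut every max_chars characters.
    if not separators:
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        # Greedily pack neighbouring pieces while they still fit together.
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # The piece alone is too long: recurse with the finer separators.
            chunks.extend(recursive_chunk(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return chunks


sample = "Alpha beta gamma. " * 20
for chunk in recursive_chunk(sample, max_chars=50):
    print(len(chunk), repr(chunk))
```

A production chunker would usually measure size in tokens rather than characters, which is where the tokenizer integrations mentioned above come in.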

3,829 stars. Used by 15 other packages. Actively maintained with 53 commits in the last 30 days. Available on PyPI.

Maintenance 25 / 25
Adoption 15 / 25
Maturity 25 / 25
Community 18 / 25
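
The four category scores above add up to the overall 83 / 100. The page does not state the formula, so the short check below assumes the total is a plain unweighted sum of the four categories, each capped at 25; the variable names are illustrative.

```python
# Category scores as listed on the page; each is out of 25.
subscores = {
    "Maintenance": 25,
    "Adoption": 15,
    "Maturity": 25,
    "Community": 18,
}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
maximum = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {maximum}")  # prints "83 / 100"
```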

Stars: 3,829
Forks: 256
Language: Python
License: MIT
Last pushed: Mar 12, 2026
Commits (30d): 53
Dependencies: 4
Reverse dependents: 15

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/chonkie-inc/chonkie"
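
The same request can be made from Python with only the standard library. This is a sketch under two assumptions that the page does not confirm: that the `rag` path segment is a category name that can vary, and that the endpoint returns a JSON body. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from typing import Any

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; treating 'rag' as a category is an assumption."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> Any:
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("rag", "chonkie-inc", "chonkie")
# print(data)
```

Because the response schema is not documented here, the sketch returns the decoded payload as-is rather than reading specific fields.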

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.