chonkie and chonkiejs
These two projects are ecosystem siblings: Chonkie is the Python reference implementation for document chunking in RAG pipelines, and ChonkieJS is its TypeScript/JavaScript port, which brings the same chunking approach to other runtime environments.
About chonkie
chonkie-inc/chonkie
🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines
Provides pluggable chunking strategies (recursive, semantic, code-aware, and LLM-based) along with composable pipeline workflows that chain multiple chunkers and refineries together. Integrates with 32+ tools across tokenizers (GPT-2, BPE), embeddings (OpenAI, Sentence Transformers), vector databases, and LLMs, and supports 56 languages out of the box, with optional dependencies installed modularly.
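To make the recursive strategy concrete, here is a minimal, self-contained sketch of the general idea: split on coarse separators first, and fall back to finer ones only when a piece is still too large. It does not use Chonkie and is not Chonkie's implementation. The names `count_tokens` and `recursive_chunk`, the separator list, and the whitespace-based token count are all assumptions made for illustration.

```python
from typing import List, Sequence


def count_tokens(text: str) -> int:
    """Stand-in token counter (assumption): one token per whitespace-separated word."""
    return len(text.split())


def recursive_chunk(
    text: str,
    chunk_size: int = 8,
    separators: Sequence[str] = ("\n\n", ". ", " "),
) -> List[str]:
    """Split text on progressively finer separators until every chunk fits chunk_size."""
    text = text.strip()
    if not text:
        return []
    if count_tokens(text) <= chunk_size:
        return [text]
    if not separators:
        # No separators left: fall back to a hard split on word boundaries.
        words = text.split()
        return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

    head, rest = separators[0], tuple(separators[1:])
    pieces = [p.strip() for p in text.split(head) if p.strip()]
    if len(pieces) == 1:
        # This separator did not divide the text; try the next, finer one.
        return recursive_chunk(text, chunk_size, rest)

    chunks: List[str] = []
    buffer = ""
    for piece in pieces:
        candidate = piece if not buffer else buffer + head + piece
        if count_tokens(candidate) <= chunk_size:
            # Greedily merge neighbouring pieces while they still fit.
            buffer = candidate
            continue
        if buffer:
            chunks.append(buffer)
            buffer = ""
        if count_tokens(piece) <= chunk_size:
            buffer = piece
        else:
            # The piece alone is too large: recurse with finer separators.
            chunks.extend(recursive_chunk(piece, chunk_size, rest))
    if buffer:
        chunks.append(buffer)
    return chunks
```

Calling `recursive_chunk(document, chunk_size=4)` on a short two-paragraph string yields chunks that each respect the token budget while preferring paragraph and sentence boundaries over arbitrary cuts. A real library would typically plug in a proper tokenizer in place of the whitespace counter.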
About chonkiejs
chonkie-inc/chonkiejs
🦛 CHONK your texts with Chonkie ✨ Type-friendly, light-weight, fast and super-simple chunking library
Supports multiple chunking strategies (recursive, token-based, semantic, and neural) through a modular package architecture, with optional HuggingFace tokenizer integration for improved accuracy. Built specifically for RAG pipelines, it provides on-the-fly chunking with token counting and offers cloud-based options via api.chonkie.ai for running advanced algorithms without local dependencies.
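Token-based chunking with token counting can be sketched as a sliding window over a token sequence. The example below is written in Python for illustration only and is not the ChonkieJS API. The `Chunk` dataclass, the `token_chunk` function, the `overlap` parameter, and the whitespace tokenizer are hypothetical stand-ins; a real setup would use an actual tokenizer such as one from HuggingFace.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """Hypothetical chunk record: the chunk text plus its token count."""
    text: str
    token_count: int


def token_chunk(text: str, chunk_size: int = 5, overlap: int = 2) -> List[Chunk]:
    """Slide a fixed-size window over whitespace tokens, repeating `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    tokens = text.split()  # assumption: whitespace tokens stand in for a real tokenizer
    step = chunk_size - overlap
    chunks: List[Chunk] = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(Chunk(text=" ".join(window), token_count=len(window)))
        if start + chunk_size >= len(tokens):
            # The window already reached the end; stop to avoid a redundant tail chunk.
            break
    return chunks
```

Each returned `Chunk` carries its own `token_count`, so downstream code can check a chunk against a model's context budget without re-tokenizing. The overlap keeps some shared context between neighbouring chunks, which is a common choice in retrieval pipelines.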