chonkie and SmartChunk

chonkie and SmartChunk both target semantic chunking for RAG pipelines. Chonkie is the more mature, production-ready option, with multiple chunking strategies and multilingual support. SmartChunk is an earlier-stage alternative focused on structure-aware semantic chunking.

chonkie (Verified)

Maintenance 25/25
Adoption 15/25
Maturity 25/25
Community 18/25

Stars: 3,829
Forks: 256
Downloads: n/a
Commits (30d): 53
Language: Python
License: MIT

No risk flags

SmartChunk (Emerging)

Maintenance 10/25
Adoption 5/25
Maturity 9/25
Community 7/25

Stars: 10
Forks: 1
Downloads: n/a
Commits (30d): 0
Language: Python
License: MIT

Risk flags: No Package, No Dependents

About chonkie

chonkie-inc/chonkie

🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines

Provides pluggable chunking strategies—recursive, semantic, code-aware, and LLM-based—with composable pipeline workflows that chain multiple chunkers and refineries together. Integrates with 32+ tools across tokenizers (GPT-2, BPE), embeddings (OpenAI, Sentence Transformers), vector databases, and LLMs, while supporting 56 languages out-of-the-box through modular dependency installation.
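To make the "recursive" strategy concrete, here is a minimal, standard-library sketch of the general idea: split on the coarsest separator first and recurse into any piece that is still too long. This is not Chonkie's implementation or API; the function name `recursive_chunk`, the `max_chars` parameter, and the separator list are assumptions chosen for illustration.

```python
def recursive_chunk(text, max_chars=200, separators=("\n\n", "\n", ". ", " ")):
    """Illustrative recursive splitter (hypothetical helper, not a library API)."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to fixed-width cuts.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    chunks, buf = [], ""
    for piece in text.split(sep):
        candidate = buf + sep + piece if buf else piece
        if len(candidate) <= max_chars:
            # Keep merging neighbouring pieces while they fit in one chunk.
            buf = candidate
            continue
        if buf.strip():
            chunks.append(buf)
        if len(piece) > max_chars:
            # Piece is still too long: retry with the next, finer separator.
            chunks.extend(recursive_chunk(piece, max_chars, finer))
            buf = ""
        else:
            buf = piece
    if buf.strip():
        chunks.append(buf)
    return chunks


if __name__ == "__main__":
    sample = "First paragraph. It has two sentences.\n\nSecond paragraph here."
    for chunk in recursive_chunk(sample, max_chars=40):
        print(repr(chunk))
```

Separators that fall on a chunk boundary are dropped in this sketch. A real library would typically count tokens rather than characters and may preserve the delimiters.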

About SmartChunk

ayush585/SmartChunk

SmartChunk is a lightweight, structure-aware semantic chunking toolkit designed to supercharge RAG (Retrieval-Augmented Generation) and LLM pipelines. Unlike naive splitters that break text arbitrarily, SmartChunk respects document structure (headings, lists, tables, code blocks) and semantic flow, ensuring cleaner, more coherent chunks.
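As a rough illustration of what "respecting document structure" can mean, the sketch below splits Markdown text at headings while keeping fenced code blocks intact. It is not SmartChunk's code or API; the function name `structure_aware_chunks` and the heading and fence rules are simplifying assumptions.

```python
import re

HEADING = re.compile(r"^#{1,6}\s")  # Markdown ATX heading, e.g. "## Title"
FENCE = "`" * 3                      # Markdown code-fence marker


def structure_aware_chunks(markdown):
    """Illustrative heading-based splitter (hypothetical helper, not a library API)."""
    chunks, current, in_fence = [], [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            # Track whether we are inside a fenced code block.
            in_fence = not in_fence
        elif HEADING.match(line) and not in_fence and current:
            # A heading outside a code block starts a new chunk.
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


if __name__ == "__main__":
    doc = "\n".join([
        "# Intro", "Some text.",
        "## Code", FENCE + "python", "# a comment, not a heading", "x = 1", FENCE,
    ])
    for chunk in structure_aware_chunks(doc):
        print(repr(chunk))
```

A naive splitter that cuts on every line starting with "#" would break the code block in this sample in two. Tracking the fence state is what keeps it whole.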

Related comparisons

chonkie and chunklet-py
chonkie and jchunk
chonkie and chonkiejs
chonkie and chonkify
chonkie and rag-chunk
chonkie and chunky

Scores updated daily from GitHub, PyPI, and npm data.