chonkie and jchunk

These are ecosystem siblings: Chonkie is a Python library for RAG ingestion and chunking, while JChunk provides comparable text chunking for Java applications. Teams can use them to apply similar chunking strategies across Python and Java stacks.

chonkie (Verified)

Scores: Maintenance 25/25, Adoption 15/25, Maturity 25/25, Community 18/25

Stars: 3,829

Forks: 256

Downloads: n/a

Commits (30d): 53

Language: Python

License: MIT

Risk flags: none

jchunk (Emerging)

Scores: Maintenance 10/25, Adoption 6/25, Maturity 9/25, Community 15/25

Stars: 17

Forks: 4

Downloads: n/a

Commits (30d): 0

Language: Java

License: Apache-2.0

Risk flags: No Package, No Dependents

About chonkie

chonkie-inc/chonkie

🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines

Provides pluggable chunking strategies—recursive, semantic, code-aware, and LLM-based—with composable pipeline workflows that chain multiple chunkers and refineries together. Integrates with 32+ tools across tokenizers (GPT-2, BPE), embeddings (OpenAI, Sentence Transformers), vector databases, and LLMs, while supporting 56 languages out-of-the-box through modular dependency installation.
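The idea of chaining a chunker with refineries can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Chonkie's actual API: `Chunk`, `word_chunker`, `overlap_refinery`, and `pipeline` are hypothetical names invented here, and the sketch chains a single chunker with any number of refineries.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Chunk:
    text: str
    context: str = ""  # extra text a refinery may attach (hypothetical field)

Chunker = Callable[[str], list[Chunk]]
Refinery = Callable[[list[Chunk]], list[Chunk]]

def word_chunker(max_words: int) -> Chunker:
    """Hypothetical chunker: group whitespace-separated words into fixed-size windows."""
    if max_words < 1:
        raise ValueError("max_words must be >= 1")
    def chunk(text: str) -> list[Chunk]:
        words = text.split()
        return [Chunk(" ".join(words[i:i + max_words]))
                for i in range(0, len(words), max_words)]
    return chunk

def overlap_refinery(n_words: int) -> Refinery:
    """Hypothetical refinery: attach the last n_words of the previous chunk as context."""
    if n_words < 1:
        raise ValueError("n_words must be >= 1")
    def refine(chunks: list[Chunk]) -> list[Chunk]:
        refined = []
        for i, c in enumerate(chunks):
            tail = " ".join(chunks[i - 1].text.split()[-n_words:]) if i > 0 else ""
            refined.append(Chunk(c.text, context=tail))
        return refined
    return refine

def pipeline(chunker: Chunker, *refineries: Refinery) -> Callable[[str], list[Chunk]]:
    """Compose one chunker with zero or more refineries, applied in order."""
    def run(text: str) -> list[Chunk]:
        chunks = chunker(text)
        for refine in refineries:
            chunks = refine(chunks)
        return chunks
    return run

run = pipeline(word_chunker(3), overlap_refinery(1))
result = run("one two three four five six seven")
```

Because each stage is an ordinary callable, swapping the chunker or adding a second refinery only changes the arguments passed to `pipeline`.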

About jchunk

jchunk-io/jchunk

JChunk is a lightweight and flexible library designed to provide multiple strategies for text chunking within Java applications.

Provides fixed-size, recursive character-based, and semantic chunking strategies as modular Maven dependencies, enabling RAG pipelines to select splitting approaches based on use case. The library's pluggable architecture allows independent selection and composition of chunking strategies, with semantic chunking offering context-aware splitting beyond simple delimiter or size-based methods.
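To make the difference between fixed-size and recursive character-based splitting concrete, here is a generic sketch written in Python for brevity. It does not reproduce JChunk's Java API or its exact algorithms; the function names, the separator order, and the choice to drop separators at chunk boundaries are all assumptions made for illustration. Semantic chunking is omitted because it requires an embedding model.

```python
def fixed_size_chunks(text: str, size: int, overlap: int = 0) -> list[str]:
    """Cut text into windows of `size` characters that share `overlap` characters."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    chunks, start, step = [], 0, size - overlap
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks

def recursive_chunks(text: str, max_chars: int,
                     separators: tuple[str, ...] = ("\n\n", "\n", ". ", " ")) -> list[str]:
    """Split on the coarsest separator first; recurse with finer ones for oversized pieces."""
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard character cut.
        return fixed_size_chunks(text, max_chars)
    sep, finer = separators[0], separators[1:]
    pieces = text.split(sep)
    if len(pieces) == 1:
        return recursive_chunks(text, max_chars, finer)
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= max_chars:
            current = piece
        else:
            chunks.extend(recursive_chunks(piece, max_chars, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks

sample = "Alpha beta. Gamma delta.\n\nEpsilon zeta eta theta."
by_size = fixed_size_chunks("0123456789", size=4, overlap=2)
by_structure = recursive_chunks(sample, max_chars=24)
```

The fixed-size splitter ignores document structure entirely, while the recursive one prefers paragraph and sentence boundaries and only falls back to hard cuts when no separator fits the size limit.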

Related comparisons

chonkie and chunklet-py

chonkie and chonkiejs

chonkie and chonkify

chonkie and rag-chunk

chonkie and chunky

chonkie and SmartChunk

Scores updated daily from GitHub, PyPI, and npm data.