speedyk-005/chunklet-py

One library to split them all: Sentence, Code, Docs. Chunk smarter, not harder — built for LLMs, RAG pipelines, and beyond.

Score: 51 / 100 (Established)

Supports 50+ languages with automatic detection. Chunking constraints (sentences, tokens, sections, lines, functions) are composable, and the architecture is pluggable, accepting custom tokenizers and processors. Each chunk carries rich metadata: source references, spans, and structural information, including AST details for code, which makes it well suited for RAG and LLM applications. Optional document processing modules handle PDF, DOCX, EPUB, Markdown, HTML, LaTeX, CSV, and Excel. It can be used as a CLI, as a library, or through a web-based visualization interface.
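To make "composable constraints" concrete, here is a minimal sketch of the idea in plain Python. It does not use chunklet-py, and every name in it (`split_sentences`, `count_tokens`, `chunk`, `max_sentences`, `max_tokens`) is hypothetical rather than taken from the library's API. It only illustrates how a sentence limit and a token limit can apply together, with a swappable token counter standing in for a custom tokenizer.

```python
import re
from typing import Callable, List

# Illustration only: none of these names come from chunklet-py.


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def count_tokens(text: str) -> int:
    """Stand-in tokenizer: counts whitespace-separated words."""
    return len(text.split())


def chunk(
    text: str,
    max_sentences: int,
    max_tokens: int,
    token_counter: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Greedily pack sentences into chunks that respect both limits.

    A chunk is closed as soon as adding the next sentence would break
    either constraint. A single sentence longer than max_tokens is kept
    whole as its own chunk rather than being split further.
    """
    chunks: List[str] = []
    current: List[str] = []
    for sentence in split_sentences(text):
        candidate = current + [sentence]
        over_sentences = len(candidate) > max_sentences
        over_tokens = token_counter(" ".join(candidate)) > max_tokens
        if current and (over_sentences or over_tokens):
            chunks.append(" ".join(current))
            current = [sentence]
        else:
            current = candidate
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Passing a different `token_counter` (for example, one backed by a model-specific tokenizer) changes how the token limit is measured without touching the packing logic, which is the general shape a pluggable design tends to take.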

Used by 1 other package. Available on PyPI.

Maintenance 13 / 25
Adoption 9 / 25
Maturity 24 / 25
Community 5 / 25
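
The four category scores add up to the 51 / 100 shown at the top of the page. The page does not state the formula, so treating the overall score as an unweighted sum is an assumption, but the arithmetic is consistent with it:

```python
# Category scores as listed on this page, each out of 25.
category_scores = {
    "Maintenance": 13,
    "Adoption": 9,
    "Maturity": 24,
    "Community": 5,
}

# Assumption: the overall score is the plain sum of the four categories.
overall = sum(category_scores.values())
max_overall = 25 * len(category_scores)

print(f"{overall} / {max_overall}")  # prints "51 / 100"
```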


Stars: 64
Forks: 2
Language: Python
License: MIT
Last pushed: Mar 13, 2026
Commits (30d): 0
Dependencies: 12
Reverse dependents: 1

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/rag/speedyk-005/chunklet-py"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
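
The same request can be made from Python using only the standard library. This is a sketch under two assumptions not confirmed by the page: that the path follows a `category/owner/repo` pattern (inferred from the single URL above) and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL; the path layout is inferred from one example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("rag", "speedyk-005", "chunklet-py")
```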