jparkerweb/semantic-chunking

🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows

Established

Performs semantic chunking by embedding sentences with ONNX models and grouping them based on cosine similarity scores, with configurable thresholds and optional chunk rebalancing. Supports multiple embedding models including quantized variants (q4, q8), and can return chunk embeddings for RAG workflows. Deployable as a Node.js library, microservice API, or Docker container with an included web UI for interactive configuration.
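
The grouping step described above can be illustrated with a minimal sketch. This is not the library's API: the function names `cosineSimilarity` and `groupBySimilarity` are hypothetical, the two-dimensional vectors are hand-made stand-ins for real ONNX model embeddings, and comparing each sentence only with the one before it is an assumed simplification. Chunk rebalancing is left out.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: walk the sentences in order and start a new chunk
// whenever the similarity to the previous sentence drops below `threshold`.
function groupBySimilarity(sentences, embeddings, threshold) {
  const chunks = [];
  let current = [];
  for (let i = 0; i < sentences.length; i++) {
    const isBreak =
      i > 0 && cosineSimilarity(embeddings[i - 1], embeddings[i]) < threshold;
    if (isBreak) {
      chunks.push(current.join(" "));
      current = [];
    }
    current.push(sentences[i]);
  }
  if (current.length > 0) {
    chunks.push(current.join(" "));
  }
  return chunks;
}

// Toy data: hand-made vectors, NOT the output of a real embedding model.
const sentences = [
  "Cats sleep most of the day.",
  "Kittens nap even more.",
  "Interest rates rose last quarter.",
  "Bond yields followed.",
];
const embeddings = [
  [1.0, 0.0],
  [0.9, 0.1],
  [0.0, 1.0],
  [0.1, 0.9],
];

const chunks = groupBySimilarity(sentences, embeddings, 0.5);
console.log(chunks);
```

With these toy vectors the only similarity below 0.5 is between the second and third sentences, so the sketch produces two chunks. Raising the threshold splits more aggressively; lowering it merges more sentences into each chunk.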

134 stars and 5,194 monthly downloads. Used by 1 other package. Available on npm.

Maintenance 10 / 25

Adoption 20 / 25

Maturity 18 / 25

Community 13 / 25

Stars: 134

Language: JavaScript

License: MIT

Related tools

drittich/SemanticSlicer

🧠✂️ SemanticSlicer — A smart text chunker for LLM-ready documents.

ndgigliotti/afterthoughts

Sentence-aware embeddings using late chunking with transformers.

smart-models/Normalized-Semantic-Chunker

Cutting-edge tool that unlocks the full potential of semantic chunking

cspnms/MSchunker

Smart text chunker for LLM preprocessing (sections → paragraphs → sentences → hard splits).

ReemHal/Semantic-Text-Segmentation-with-Embeddings

Uses GloVe embeddings and greedy sequence segmentation to semantically segment a text document...
