jparkerweb/semantic-chunking
🍱 semantic-chunking ⇢ semantically create chunks from large documents for passing to LLM workflows
Performs semantic chunking by embedding sentences with ONNX models and grouping them based on cosine similarity scores, with configurable thresholds and optional chunk rebalancing. Supports multiple embedding models including quantized variants (q4, q8), and can return chunk embeddings for RAG workflows. Deployable as a Node.js library, microservice API, or Docker container with an included web UI for interactive configuration.
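The grouping step described above can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not the library's API: `cosineSimilarity` and `chunkBySimilarity` are hypothetical helper names, the two-dimensional vectors are hand-written stand-ins for real ONNX model embeddings, and comparing each sentence only with the one before it is an assumption about how the similarity scores are applied.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as dissimilar to everything to avoid dividing by zero.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: start a new chunk whenever the similarity between
// adjacent sentence embeddings drops below the threshold.
function chunkBySimilarity(sentences, embeddings, threshold) {
  if (sentences.length !== embeddings.length) {
    throw new Error("sentences and embeddings must have the same length");
  }
  const chunks = [];
  let current = [];
  for (let i = 0; i < sentences.length; i++) {
    const isBreak =
      i > 0 && cosineSimilarity(embeddings[i - 1], embeddings[i]) < threshold;
    if (isBreak) {
      chunks.push(current.join(" "));
      current = [];
    }
    current.push(sentences[i]);
  }
  if (current.length > 0) chunks.push(current.join(" "));
  return chunks;
}

// Toy data: made-up 2-D vectors standing in for real sentence embeddings.
const sentences = ["Cats purr.", "Kittens nap.", "Stocks fell.", "Bonds rose."];
const embeddings = [[1, 0.1], [0.9, 0.2], [0.1, 1], [0.2, 0.9]];

const chunks = chunkBySimilarity(sentences, embeddings, 0.5);
console.log(chunks);
```

With these toy vectors the two animal sentences stay together and the two finance sentences form a second chunk. Raising the threshold produces more, smaller chunks; lowering it merges more sentences into each chunk.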
134 stars and 5,194 monthly downloads. Used by 1 other package. Available on npm.
Stars: 134
Forks: 14
Language: JavaScript
License: MIT
Category:
Last pushed: Feb 03, 2026
Monthly downloads: 5,194
Commits (30d): 0
Dependencies: 5
Reverse dependents: 1
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/jparkerweb/semantic-chunking"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
drittich/SemanticSlicer
🧠✂️ SemanticSlicer — A smart text chunker for LLM-ready documents.
ndgigliotti/afterthoughts
Sentence-aware embeddings using late chunking with transformers.
smart-models/Normalized-Semantic-Chunker
Cutting-edge tool that unlocks the full potential of semantic chunking
cspnms/MSchunker
Smart text chunker for LLM preprocessing (sections → paragraphs → sentences → hard splits).
ReemHal/Semantic-Text-Segmentation-with-Embeddings
Uses GloVe embeddings and greedy sequence segmentation to semantically segment a text document...