ALucek/QuicKB
Optimize Document Retrieval with Fine-Tuned KnowledgeBases
51 / 100
Established
Implements an end-to-end ML pipeline that combines semantic and token-based document chunking, synthetic QA pair generation with deduplication, and Sentence Transformers embedding fine-tuning with Matryoshka dimension reduction. Supports multiple LLM providers via LiteLLM, parallel processing, and training on CUDA, MPS, or CPU, with optional Hugging Face Hub integration for publishing datasets and models.
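To make the Matryoshka dimension reduction mentioned above more concrete, here is a minimal sketch of how such embeddings are typically used at query time: keep only the leading components of a vector, re-normalize, and compare. This is not QuicKB's code. The function names `truncate_embedding` and `cosine_similarity` and the toy vectors are hypothetical, chosen only for illustration.

```python
import math


def truncate_embedding(vector, dim):
    """Keep the first `dim` components and L2-normalize the result.

    Hypothetical helper: models trained with a Matryoshka-style objective
    are meant to stay useful when only a prefix of the vector is kept.
    """
    if not 0 < dim <= len(vector):
        raise ValueError("dim must be between 1 and len(vector)")
    head = vector[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to normalize
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors standing in for real model output (made up).
query = [0.9, 0.1, 0.3, -0.2, 0.05, 0.01, -0.03, 0.02]
doc = [0.8, 0.2, 0.25, -0.1, -0.04, 0.02, 0.01, -0.05]

# Compare the same pair at progressively smaller dimensions.
for dim in (8, 4, 2):
    q = truncate_embedding(query, dim)
    d = truncate_embedding(doc, dim)
    print(dim, round(cosine_similarity(q, d), 3))
```

Smaller prefixes reduce storage and search cost, usually at some loss in retrieval quality; how much depends on the model and the training setup.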
183 stars.
No Package
No Dependents
Maintenance
6 / 25
Adoption
10 / 25
Maturity
16 / 25
Community
19 / 25
Stars
183
Forks
32
Language
Python
License
MIT
Category
Last pushed
Nov 05, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ALucek/QuicKB"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
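The same request can be sketched in Python using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and reading the path as category, owner, and repository is an assumption inferred from the curl example rather than from any API documentation. The response is returned as raw text because its format is not stated here.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL (assumed layout: category/owner/repo)."""
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the endpoint and return the body as text.

    UTF-8 decoding is an assumption; the response format is not
    documented in this listing, so no parsing is attempted.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Only builds the URL; call fetch_quality(...) to perform the request.
print(build_quality_url("embeddings", "ALucek", "QuicKB"))
```

Keeping URL construction separate from the network call makes it easy to stay within the daily request limit while testing.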