xhluca/bm25s
Fast lexical search implementing BM25 in Python
Eagerly computes BM25 scores for every document token at index time and caches them in a sparse matrix, so a query is scored by looking up and summing precomputed values instead of recomputing term statistics, which enables sub-millisecond query scoring. Built on NumPy, with optional Numba JIT compilation for further acceleration, and integrates with lightweight stemming libraries such as PyStemmer for linguistic preprocessing. Positioned as a Python-native alternative to Elasticsearch and rank-bm25 for BM25 ranking, with no external service dependencies.
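The eager-scoring idea described above can be illustrated with a minimal standard-library sketch. This is not the bm25s implementation or its API: the function names `build_index` and `score`, the dict-of-dicts stand-in for a sparse matrix, the Lucene-style IDF, and the parameter values `k1=1.5` and `b=0.75` are all assumptions chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def build_index(corpus_tokens, k1=1.5, b=0.75):
    """Precompute a BM25 score for every (token, document) pair that occurs.

    Hypothetical helper for illustration; not part of bm25s.
    """
    n_docs = len(corpus_tokens)
    avg_len = sum(len(doc) for doc in corpus_tokens) / n_docs
    # Document frequency: number of documents containing each token.
    doc_freq = Counter(tok for doc in corpus_tokens for tok in set(doc))
    # Sparse "matrix": token -> {doc_id: score}; absent pairs are implicitly zero.
    index = defaultdict(dict)
    for doc_id, doc in enumerate(corpus_tokens):
        for tok, tf in Counter(doc).items():
            df = doc_freq[tok]
            # Lucene-style IDF (assumed variant); always positive.
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            norm = tf + k1 * (1.0 - b + b * len(doc) / avg_len)
            index[tok][doc_id] = idf * tf * (k1 + 1.0) / norm
    return dict(index), n_docs


def score(index, n_docs, query_tokens):
    """Score all documents by summing the cached per-token scores."""
    scores = [0.0] * n_docs
    for tok in query_tokens:
        for doc_id, cached in index.get(tok, {}).items():
            scores[doc_id] += cached
    return scores


corpus = ["a cat is a feline", "a dog is a canine", "a bird can fly"]
corpus_tokens = [doc.split() for doc in corpus]
index, n_docs = build_index(corpus_tokens)

scores = score(index, n_docs, "cat feline".split())
best = max(range(n_docs), key=scores.__getitem__)
print(best, corpus[best])
```

All BM25 arithmetic happens once in `build_index`; `score` only performs lookups and additions, which is why query time stays small even though some work still occurs at query time.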
1,560 stars and 1,192,545 monthly downloads. Used by 12 other packages. Actively maintained with 13 commits in the last 30 days. Available on PyPI.
Stars: 1,560
Forks: 93
Language: Python
License: MIT
Category:
Last pushed: Mar 06, 2026
Monthly downloads: 1,192,545
Commits (30d): 13
Dependencies: 1
Reverse dependents: 12
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/xhluca/bm25s"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.