ThoughtRiver/lmdb-embeddings

Fast word vectors with little memory usage in Python

Score: 40 / 100 (Emerging)

Uses the Lightning Memory-Mapped Database (LMDB) to give zero-load-time access to pre-trained embeddings with negligible memory overhead: a large model such as GloVe-840B needs only a few MB of memory, versus 4 GB when loaded the traditional way. Supports pluggable serialization backends (pickle, msgpack), includes an LRU cache for frequently accessed vectors, and works transparently with gensim models and custom embedding iterators.
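The idea behind the description above can be sketched with the Python standard library alone. This is not the lmdb-embeddings API and not LMDB itself: it is a minimal illustration, under stated assumptions, of why a memory-mapped store needs no load step and why an LRU cache helps with repeated lookups. The names `write_store`, `MmapVectorReader` and `demo` are hypothetical, the vectors are toy data, and the word-to-offset index is kept in a plain dict for brevity, whereas LMDB keeps its key index on disk as well.

```python
import mmap
import os
import struct
import tempfile
from functools import lru_cache

DIM = 4  # toy dimensionality; real embeddings typically use hundreds
RECORD = struct.Struct(f"<{DIM}f")  # one little-endian float32 vector


def write_store(path, vectors):
    """Write vectors to a flat binary file and return a word -> byte offset index."""
    index = {}
    with open(path, "wb") as handle:
        for word, vector in vectors.items():
            index[word] = handle.tell()
            handle.write(RECORD.pack(*vector))
    return index


class MmapVectorReader:
    """Hypothetical reader: maps the file instead of loading it into memory."""

    def __init__(self, path, index, cache_size=1024):
        self._file = open(path, "rb")
        # Mapping is effectively instant; pages are read from disk on first access.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        self._index = index
        # Per-instance LRU cache so frequently requested words skip decoding.
        self.get = lru_cache(maxsize=cache_size)(self._read)

    def _read(self, word):
        offset = self._index[word]  # raises KeyError for unknown words
        return RECORD.unpack_from(self._map, offset)

    def close(self):
        self._map.close()
        self._file.close()


def demo():
    """Store two toy vectors, look one up twice, and report cache hits."""
    vectors = {"king": (0.5, 0.25, -1.0, 2.0), "queen": (0.5, 0.75, -1.0, 2.0)}
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "vectors.bin")
        index = write_store(path, vectors)
        reader = MmapVectorReader(path, index)
        try:
            first = reader.get("king")   # decoded from the mapped file
            second = reader.get("king")  # served from the LRU cache
            hits = reader.get.cache_info().hits
        finally:
            reader.close()
    return first, second, hits
```

Calling `demo()` returns the same vector twice plus the number of cache hits. For the library's real reader and writer classes, consult the project's own README rather than this sketch.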

416 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 14 / 25


Stars 416
Forks 31
Language Python
License GPL-3.0
Last pushed Jun 26, 2021
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ThoughtRiver/lmdb-embeddings"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
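The same request can be made from Python. The sketch below uses only the standard library and assumes two things the page does not state: that the endpoint returns JSON, and that the URL path segments are category, owner and repository in that order (inferred from the curl example). The helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the segment order is inferred from the curl example."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record and parse it, assuming the response body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("embeddings", "ThoughtRiver", "lmdb-embeddings")
```

Since the response schema is not documented here, inspect the returned object before relying on any particular field.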