oborchers/Fast_Sentence_Embeddings
Compute Sentence Embeddings Fast!
Implements three lightweight algorithms (Average, SIF, and uSIF) that aggregate pre-trained word embeddings into sentence vectors using Cython-optimized routines, with a reported throughput of 300k-500k sentences/second on CPU. Works with Gensim's Word2Vec and FastText models, supports disk-streaming and RAM-to-disk training for massive corpora, and provides hub access to pre-trained embeddings including GloVe, Word2Vec, and FastText variants.
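To illustrate what "aggregating word embeddings into sentence vectors" means, here is a minimal pure-Python sketch of plain averaging and of SIF-style frequency weighting. It is not the library's code or API: the function names, the toy vectors, and the smoothing value `a=1e-3` are assumptions made for illustration, and the sketch omits the principal-component removal step that full SIF applies afterwards.

```python
from collections import Counter


def average_embedding(tokens, vectors):
    """Unweighted mean of the vectors of all in-vocabulary tokens."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None  # no known words, so no sentence vector
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def sif_weights(corpus, a=1e-3):
    """Weight a / (a + p(w)) per word, where p(w) is its corpus frequency."""
    counts = Counter(t for sentence in corpus for t in sentence)
    total = sum(counts.values())
    return {w: a / (a + c / total) for w, c in counts.items()}


def weighted_embedding(tokens, vectors, weights):
    """Weighted mean; words without a weight are treated as p(w) = 0."""
    known = [(weights.get(t, 1.0), vectors[t]) for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0][1])
    return [sum(w * v[i] for w, v in known) / len(known) for i in range(dim)]


# Toy 2-dimensional "pre-trained" vectors (hypothetical values).
vectors = {"the": [1.0, 1.0], "cat": [1.0, 0.0], "sat": [0.0, 1.0]}
corpus = [["the", "cat", "sat"], ["the", "cat"], ["the", "dog"]]

weights = sif_weights(corpus)
avg = average_embedding(corpus[0], vectors)
sif = weighted_embedding(corpus[0], vectors, weights)
```

Frequent words such as "the" receive a smaller weight than rare ones, so they contribute less to the weighted sentence vector than they do to the plain average.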
625 stars. No commits in the last 6 months.
Stars
625
Forks
84
Language
Jupyter Notebook
License
GPL-3.0
Category
Last pushed
Mar 02, 2023
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/oborchers/Fast_Sentence_Embeddings"
Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
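The same request can be made from Python with the standard library. This is a sketch under assumptions: the path pattern (category, owner, repository) is inferred from the single URL shown above, the response is assumed to be JSON, and the helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred, not documented."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and decode it, assuming the body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("embeddings", "oborchers", "Fast_Sentence_Embeddings")` would issue the same GET request as the curl command. Unauthenticated calls count against the 100 requests/day limit.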
Higher-rated alternatives
shibing624/text2vec
text2vec, text to vector....
ddangelov/Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
predict-idlab/pyRDF2Vec
Python Implementation and Extension of RDF2Vec
IntuitionEngineeringTeam/chars2vec
Character-based word embeddings model based on RNN for handling real world texts
IITH-Compilers/IR2Vec
Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings