lgalke/vec4ir
Word Embeddings for Information Retrieval
Implements multiple embedding-based retrieval models (Word Centroid Similarity, IDF-reweighted variants) integrated with gensim for training Skip-gram and GloVe embeddings. The framework provides a modular pipeline for matching and similarity scoring with built-in evaluation metrics. Its sklearn-inspired APIs are designed for extensibility, so custom retrieval models can be evaluated on standard IR benchmarks.
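As a rough illustration of the retrieval models named above, the following minimal sketch scores documents by Word Centroid Similarity: each text is represented by the normalised sum of its word vectors, and query and document are compared by cosine similarity. Passing an IDF table reweights each word vector before summing. This is not vec4ir's own API; the function names (`idf_table`, `centroid`, `wcs`), the toy 2-D vectors, the smoothed IDF formula, and the default weight of 1.0 for terms missing from the IDF table are all assumptions made for the example.

```python
import math
from collections import Counter

def idf_table(docs):
    """Smoothed IDF per term: ln((1 + N) / (1 + df)) + 1 (an assumed variant)."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + d)) + 1.0 for tok, d in df.items()}

def centroid(tokens, vectors, idf=None):
    """L2-normalised, optionally IDF-weighted, sum of the tokens' word vectors."""
    dim = len(next(iter(vectors.values())))
    acc = [0.0] * dim
    for tok, tf in Counter(tokens).items():
        if tok not in vectors:
            continue  # out-of-vocabulary tokens are skipped
        # Terms missing from the IDF table get weight 1.0 (assumption).
        weight = tf * (idf.get(tok, 1.0) if idf is not None else 1.0)
        for i, x in enumerate(vectors[tok]):
            acc[i] += weight * x
    norm = math.sqrt(sum(x * x for x in acc))
    return [x / norm for x in acc] if norm > 0 else acc

def wcs(query, doc, vectors, idf=None):
    """Cosine similarity between the query centroid and the document centroid."""
    q = centroid(query, vectors, idf)
    d = centroid(doc, vectors, idf)
    return sum(a * b for a, b in zip(q, d))

# Toy 2-D embeddings and corpus, invented purely for illustration.
VECTORS = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0], "the": [0.5, 0.5]}
DOCS = [["the", "cat"], ["the", "car"]]
QUERY = ["dog"]

IDF = idf_table(DOCS)
# Document indices ordered from most to least similar to the query.
ranking = sorted(range(len(DOCS)),
                 key=lambda i: wcs(QUERY, DOCS[i], VECTORS, IDF),
                 reverse=True)
```

In this toy setup the frequent word "the" receives a lower IDF weight than "cat" or "car", so the reweighted centroids lean further toward the distinctive terms. In practice the vectors would come from a trained embedding model rather than a hand-written dictionary.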
226 stars. No commits in the last 6 months.
Stars: 226
Forks: 41
Language: Python
License: MIT
Category:
Last pushed: Oct 04, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/lgalke/vec4ir"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
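The same request can be made from Python with only the standard library. The sketch below copies the URL verbatim from the curl command. It assumes the endpoint returns a JSON body, which the page does not state, and it makes no assumptions about the fields in the response. The helper names `build_request` and `fetch` are hypothetical.

```python
import json
import urllib.request

# URL copied verbatim from the curl command above.
URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings/lgalke/vec4ir"

def build_request(url: str = URL) -> urllib.request.Request:
    """Build an unauthenticated GET request (no API key)."""
    return urllib.request.Request(
        url, headers={"Accept": "application/json"}, method="GET"
    )

def fetch(url: str = URL, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch()` performs one network request, which counts against the daily limit described above.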
Higher-rated alternatives
shibing624/text2vec
text2vec, text to vector....
ddangelov/Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
predict-idlab/pyRDF2Vec
Python Implementation and Extension of RDF2Vec
IntuitionEngineeringTeam/chars2vec
Character-based word embeddings model based on RNN for handling real world texts
IITH-Compilers/IR2Vec
Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings