tca19/dict2vec
Dict2vec is a framework to learn word embeddings using lexical dictionaries.
Combines Wikipedia corpora with dictionary definition pairs (strong and weak semantic relationships) during training to improve embedding quality. The C-based implementation supports multi-threaded training and includes comprehensive evaluation against 13 word similarity benchmarks with Spearman correlation scoring. Provides pre-trained embeddings (100-300 dimensions) and utilities to fetch definitions from online dictionaries and generate training pairs automatically.
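The benchmark evaluation mentioned above can be illustrated with a minimal sketch. It is not the repository's own evaluation script. It assumes each benchmark is a list of (word1, word2, human score) triples and that pairs containing an out-of-vocabulary word are skipped. The names `evaluate`, `toy_embeddings`, and `toy_benchmark` are hypothetical and the data is made up for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def evaluate(embeddings, benchmark):
    """Correlate model cosine similarities with human similarity scores."""
    model_scores, human_scores = [], []
    for w1, w2, score in benchmark:
        # Assumption: pairs with a word missing from the vocabulary are skipped.
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(score)
    return spearman(model_scores, human_scores)


# Toy data, invented for this example only.
toy_embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "car": [0.0, 1.0],
    "truck": [0.2, 0.8],
}
toy_benchmark = [
    ("cat", "dog", 9.0),
    ("car", "truck", 8.5),
    ("cat", "car", 1.0),
    ("dog", "truck", 2.0),
    ("cat", "unicorn", 5.0),  # "unicorn" is out of vocabulary, so this pair is skipped
]

rho = evaluate(toy_embeddings, toy_benchmark)
print(f"Spearman correlation: {rho:.3f}")
```

Spearman correlation compares only the ordering of the scores, so the model's cosine similarities do not need to be on the same scale as the human ratings.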
115 stars. No commits in the last 6 months.
Stars
115
Forks
29
Language
Python
License
GPL-3.0
Category
Last pushed
Jan 08, 2021
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/tca19/dict2vec"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
shibing624/text2vec
text2vec, text to vector....
ddangelov/Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
predict-idlab/pyRDF2Vec
Python Implementation and Extension of RDF2Vec
IntuitionEngineeringTeam/chars2vec
Character-based word embeddings model based on RNN for handling real world texts
IITH-Compilers/IR2Vec
Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings