Word Embedding Methods NLP Tools

Tools, implementations, and evaluations of word embedding algorithms and techniques (Word2Vec, GloVe, PPMI, etc.). Does NOT include embedding applications for downstream tasks, multimodal embeddings, or language model embeddings.

This category tracks 58 word embedding tools. Three score above 50, which places them in the Established tier. The highest-rated is avidale/compress-fasttext at 56/100, with 183 stars and 13,409 monthly downloads.

Get all 58 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=word-embedding-methods&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
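The same request can be made from a script. The sketch below is a minimal example using only the Python standard library; it reuses the URL and query parameters from the curl command above. The helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the result is returned as parsed without assuming any field names. How the API treats `limit` values other than 20 is also an assumption left to the caller.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="word-embedding-methods", limit=20):
    """Build the query URL with the same parameters as the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the JSON body.

    The response schema is not documented on this page, so the parsed
    object is returned unchanged (hypothetical helper, not an official client).
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (performs a network request, so it is left commented out):
# data = fetch_projects(build_url())
# print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the query string before spending one of the daily requests.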

# Tool Score Tier
1 avidale/compress-fasttext

Tools for shrinking fastText models (in gensim format)

56
Established
2 dselivanov/text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

55
Established
3 vzhong/embeddings

Fast, DB-backed pretrained word embeddings for natural language processing.

53
Established
4 dccuchile/spanish-word-embeddings

Spanish word embeddings computed with different methods and from different corpora

49
Emerging
5 ncbi-nlp/BioSentVec

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words and sentences

48
Emerging
6 ibrahimsharaf/doc2vec

:notebook: Long(er) text representation and classification using Doc2Vec embeddings

46
Emerging
7 iarroyof/sentence_embedding

A sentence embedding method based on weighted series

35
Emerging
8 rguthrie3/MorphologicalPriorsForWordEmbeddings

Code for EMNLP 2016 paper: Morphological Priors for Probabilistic Word Embeddings

34
Emerging
9 WorksApplications/chiVe

Japanese word embedding with Sudachi and NWJC 🌿

34
Emerging
10 awslabs/sagemaker-privacy-for-nlp

A solution that helps apply a privacy preserving mechanism to NLP data,...

33
Emerging
11 CLARIN-PL/embeddings

Embeddings: State-of-the-art Text Representations for Natural Language...

31
Emerging
12 Kekkodf/pypantera

A Python Package for NLP obfuscation using Differential Privacy

30
Emerging
13 mkearney/wactor

Word Factor Vectors

29
Experimental
14 thoppe/transorthogonal-linguistics

Uses a distributed word representation to find words along the hyperchord...

29
Experimental
15 YuriyGuts/thrones2vec

Using Word2Vec to explore semantic similarities between the entities of "A...

26
Experimental
16 DanilBaibak/Harry_Potter_vs_Word2Vec

An example of analysing a corpus of texts using Word2Vec

26
Experimental
17 amitvikramraj/Medical-Embeddings-and-Clinical-Trial-Search-Engine

The Project aims to train SkipGram and FastText Models on COVID-19 Clinical...

25
Experimental
18 schoennenbeck/swem

PyTorch implementation of the simple word embedding model.

24
Experimental
19 jeffrichardchemistry/WordFP

A new way to encode words and calculate similarity

24
Experimental
20 neuro-symbolic-ai/multi_relational_hyperbolic_word_embeddings

Multi-Relational Hyperbolic Word Embeddings from Natural Language Definitions

23
Experimental
21 anishLearnsToCode/word-embeddings

Continuous Bag 💼 of Words Model to create Word embeddings for a word from a...

23
Experimental
22 seyedsaeidmasoumzadeh/Binary-Text-Classification-Doc2vec-SVM

A Python implementation of a binary text classifier using Doc2Vec and SVM

22
Experimental
23 Tuanpham1994/Word-embedding-and-prediction

Word embedding and prediction

22
Experimental
24 shimo-lab/sembei

:rice_cracker: Word embeddings without word segmentation :rice_cracker:

21
Experimental
25 SigSegvSquad/WordLink-PharmaSearch

Natural-language search for pharmaceutical research data. We aim...

20
Experimental
26 digitalprk/north_korean_embeddings

Word2Vec Word Vectors trained on a North Korean Corpus / 조선어 (북한어) 단어 임베딩

20
Experimental
27 viveksck/simplicity

Code and Data for Simple Models for Word Formation in English Slang

20
Experimental
28 EQTPartners/pause

🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by...

20
Experimental
29 Turkish-Word-Embeddings/Word-Embeddings-Repository-for-Turkish

Code for "A Comprehensive Analysis of Static Word Embeddings for Turkish"....

20
Experimental
30 unixpickle/wordembed

Word embeddings for natural language processing

19
Experimental
31 vaskonov/burvec

Word Embeddings for Low Resource Languages: The Case of Buryat

19
Experimental
32 ispras-texterra/word-embeddings-eval-hy

Pre-trained fastText, word2vec, GloVe embeddings for the Armenian language...

19
Experimental
33 SigSegvSquad/WordLink

In this project we try to establish a concrete mathematical relation between...

18
Experimental
34 brandonyph/Introduction-to-Word-Embedding-in-R

This page serves as the repository for the script file I used in my...

16
Experimental
35 AsadiAhmad/Word-Embeding-CNN

Word Embedding with CNN

16
Experimental
36 AsadiAhmad/Word-Embedding

Word Embedding with a simple model, w2v, simple RNN, and LSTM

16
Experimental
37 federicoarenasl/Evaluating-w-Embeddings

In this paper we compare and evaluate two simple embedding models which can...

16
Experimental
38 RonyAbecidan/PrivateWordEmbeddings

Study of the paper "Differentially Private Representation for NLP"

15
Experimental
39 BlackKakapo/Icelandic-Word-Embedding

Icelandic Word Embeddings. Here you can find pre-trained corpora of word...

15
Experimental
40 helboukkouri/embedding-visualization

This is a project for visualizing word embeddings based on the work of...

15
Experimental
41 shubham0204/glove-android

Power of GloVe word embeddings in Android

15
Experimental
42 shubham0204/glove.c

Simple, cross-platform port of GloVe embeddings, written in C

15
Experimental
43 trongdang09/word-embeddings

Application of Word Embeddings Model in Natural Language Processing

15
Experimental
44 akash18tripathi/Word-Embeddings

This GitHub repository contains implementations of three popular word...

15
Experimental
45 abdulsalam-bande/swifty

This is a work to improve molecular docking speed. Normally docking a ligand...

15
Experimental
46 alvations/vegetables

Collection of Repackaged Word Embeddings

13
Experimental
47 LeonardoEmili/Word-in-Context

Word-in-Context (WiC) as a binary classification task using static word...

12
Experimental
48 grusso98/sins_word_embeddings

7 sins diachronic analysis using CADE on W2V and GloVe embeddings

12
Experimental
49 Rajspeaks/Deep-Learning-Approach-to-Bengali-Text-Visualization-using-Word2Vec-Model

This repository consists of Bengali Text-Visualization using Word2Vec Model....

12
Experimental
50 Ritika2001/Word-Embedding-Models-for-Subjectivity-Analysis

An Empirical Evaluation of Word Embedding Models for Subjectivity Analysis Tasks

12
Experimental
51 CodeBoyPhilo/VocabOverfit

VocabOverfit: adopting the concept of embedding in memorising vocabularies

12
Experimental
52 keivanipchihagh/simple-word-embedding

A simple and custom word embedding algorithm

11
Experimental
53 jreades/ph-tutorial-code

Code to accompany the tutorial on clustering and visualising documents with word embeddings.

11
Experimental
54 Adit-Mugdha-das/Operations-on-Word-Vectors---Debiasing

Implementation of vector arithmetic on pre-trained word embeddings (GloVe)...

11
Experimental
55 CZboop/Word2Vec2SVG

Python script creating SVG polygons from word2vec vectors, and React app to...

11
Experimental
56 GunjanDhanuka/word2vec_vis

Semantic Word Embeddings Visualizer that has the option to train on your own...

10
Experimental
57 callforpapers-source/inter-word-embedding

a non-neural network approach for word embedding

10
Experimental
58 lyraphix/pushingPikachu

Pushing Pikachu is a project that uses a fine-tuned GloVe embedding to find...

10
Experimental