Word Embedding Implementations (Embedding Tools)

Custom implementations and training of word embedding models (Word2Vec, binary embeddings, etc.) from scratch or on specific datasets. Does NOT include pretrained models, sentence embeddings, or downstream applications of embeddings.

109 word embedding implementation tools are tracked. One scores above 70 (Verified tier). The highest-rated is shibing624/text2vec at 73/100, with 4,950 stars and 1,922 monthly downloads.

Get all 109 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=embeddings&subcategory=word-embedding-implementations&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
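The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, introduced here for illustration. The response schema is not documented on this page, so the decoded JSON is returned as-is and no field names are assumed. Whether a `limit` larger than 20 is accepted has not been verified.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="embeddings",
              subcategory="word-embedding-implementations",
              limit=20):
    """Build the request URL with the same query parameters as the curl example.

    The defaults mirror the documented request; other values are untested
    assumptions.
    """
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Send a GET request and return the decoded JSON body unchanged.

    No key is sent, so the anonymous quota of 100 requests/day applies.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request. Nothing is sent when the functions are only defined, so the URL can be inspected first with `print(build_url())`.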

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | shibing624/text2vec | text2vec, text to vector.... | 73 | Verified |
| 2 | ddangelov/Top2Vec | Top2Vec learns jointly embedded topic, document and word vectors. | 65 | Established |
| 3 | predict-idlab/pyRDF2Vec | 🐍 Python Implementation and Extension of RDF2Vec | 58 | Established |
| 4 | IntuitionEngineeringTeam/chars2vec | Character-based word embeddings model based on RNN for handling real world texts | 56 | Established |
| 5 | IITH-Compilers/IR2Vec | Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings | 56 | Established |
| 6 | stephantul/reach | Load embeddings and featurize your sentences. | 56 | Established |
| 7 | natasha/navec | Compact high quality word embeddings for Russian language | 51 | Established |
| 8 | dalinvip/cw2vec | cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information | 49 | Emerging |
| 9 | jaanli/food2vec | :hamburger: | 48 | Emerging |
| 10 | pnpnpn/dna2vec | dna2vec: Consistent vector representations of variable-length k-mers | 48 | Emerging |
| 11 | tca19/dict2vec | Dict2vec is a framework to learn word embeddings using lexical dictionaries. | 46 | Emerging |
| 12 | lgalke/vec4ir | Word Embeddings for Information Retrieval | 46 | Emerging |
| 13 | oborchers/Fast_Sentence_Embeddings | Compute Sentence Embeddings Fast! | 46 | Emerging |
| 14 | bnosac/doc2vec | Distributed Representations of Sentences and Documents | 45 | Emerging |
| 15 | persiyanov/skip-thought-tf | An implementation of skip-thought vectors in Tensorflow | 45 | Emerging |
| 16 | wikipedia2vec/wikipedia2vec | A tool for learning vector representations of words and entities from Wikipedia | 45 | Emerging |
| 17 | brannondorsey/GloVe-experiments | GloVe word vector embedding experiments (similar to Word2Vec) | 43 | Emerging |
| 18 | neuml/staticvectors | 🔢 Work with static vector models | 43 | Emerging |
| 19 | clips/dutchembeddings | Repository for the word embeddings experiments described in "Evaluating... | 41 | Emerging |
| 20 | fnielsen/wembedder | Wikidata embedding | 41 | Emerging |
| 21 | CyberZHG/keras-word-char-embd | Concatenate word and character embeddings in Keras | 41 | Emerging |
| 22 | jaredwinick/img2vec-keras | Image to dense vector embedding. Clone of... | 41 | Emerging |
| 23 | pommedeterresautee/fastrtext | R wrapper for fastText | 40 | Emerging |
| 24 | ThoughtRiver/lmdb-embeddings | Fast word vectors with little memory usage in Python | 40 | Emerging |
| 25 | bnosac/word2vec | Distributed Representations of Words using word2vec | 39 | Emerging |
| 26 | vecto-ai/vecto | Doing things with embeddings | 39 | Emerging |
| 27 | md-mq/philo2vec | An implementation of word2vec applied to [stanford philosophy... | 39 | Emerging |
| 28 | dwslab/jRDF2Vec | A high-performance Java Implementation of RDF2Vec | 39 | Emerging |
| 29 | midi-ld/midi2vec | MIDI2vec computes embeddings for representing MIDI data in vector space | 38 | Emerging |
| 30 | joisino/wordtour | Code for "Word Tour: One-dimensional Word Embeddings via the Traveling... | 38 | Emerging |
| 31 | sismetanin/word2vec-tsne | Google News and Leo Tolstoy: Visualizing Word2Vec Word Embeddings using t-SNE. | 37 | Emerging |
| 32 | FraLotito/pytorch-continuous-bag-of-words | The Continuous Bag-of-Words model (CBOW) is frequently used in NLP deep... | 37 | Emerging |
| 33 | aihpi/workshop-nlp-embeddings | Code for the KISZ-BB Workshop series "Working with embeddings" | 34 | Emerging |
| 34 | cmasch/word-embeddings-from-scratch | Creating word embeddings from scratch and visualize them on TensorBoard.... | 34 | Emerging |
| 35 | arsena-k/Word2Vec-bias-extraction | How are words loaded with meaning? Repository accompanying research by... | 34 | Emerging |
| 36 | zhaojishun/GenderBiasPapers | Must-read Papers on Gender Bias. | 33 | Emerging |
| 37 | Santosh-Gupta/Research2Vec | Representing research papers as vectors / latent representations. | 32 | Emerging |
| 38 | maxent-ai/lda2vec | Mixing Dirichlet Topic Models and Word Embeddings to Make lda2vec from this... | 32 | Emerging |
| 39 | MirunaPislar/emoji2vec | Train emoji embeddings based on emoji descriptions. | 31 | Emerging |
| 40 | franciszekparma/Word2Vec | From-scratch Word2Vec (skip-gram with negative sampling) fully implemented in PyTorch | 30 | Emerging |
| 41 | PlanTL-GOB-ES/Biomedical-Word-Embeddings-for-Spanish | Biomedical Word embeddings generated from Spanish Biomedical corpora. | 29 | Experimental |
| 42 | hassyGo/charNgram2vec | Pre-training character n-gram embeddings | 29 | Experimental |
| 43 | zgornel/Glowe.jl | Julia interface to GloVe | 28 | Experimental |
| 44 | stevend94/Feature2Vec | Code used in the paper, Feature2Vec: Distributional semantic modelling of... | 28 | Experimental |
| 45 | ChristophAlt/pytorch-starspace | PyTorch implementation of StarSpace as described in "StarSpace: Embed All... | 28 | Experimental |
| 46 | Rj7/Unsupervised-morphology-induction-word2vec | Implementation of Unsupervised Morphology Induction Using Word Embeddings | 27 | Experimental |
| 47 | warchildmd/game2vec | TensorFlow implementation of word2vec applied on... | 27 | Experimental |
| 48 | NURx2/pycode2vec | The tool for getting embeddings of Python 3 code chunks | 26 | Experimental |
| 49 | noobiegz/cw2vec | Implementation of the cw2vec model | 26 | Experimental |
| 50 | zgornel/ConceptnetNumberbatch.jl | Julia API for ConceptNetNumberbatch | 26 | Experimental |
| 51 | roopalgarg/brand_embedding | Generate word embeddings for commercial brand names to study similarity between them. | 25 | Experimental |
| 52 | marmarelis/QDiffusion.jl | Leveraging the full dimensionality of single-cell transcriptomics (among... | 25 | Experimental |
| 53 | brannondorsey/ChessEmbeddings | GloVe vector embeddings of chess moves | 25 | Experimental |
| 54 | Abhinavexists/Vectorlake | Trying to build embedding from Scratch | 24 | Experimental |
| 55 | Rajspeaks/Deep-Learning-Approach-to-Bengali-Word-Embedding-using-BengaliWord2Vec-from-BNLP | Bengali word embedding using BengaliWord2Vec from BNLP. A mini project under... | 24 | Experimental |
| 56 | ninalx/table2vec-lideng | Table2Vec: Neural Word and Entity Embeddings for Table Population and Retrieval | 24 | Experimental |
| 57 | zgornel/EmbeddingsAnalysis.jl | A package for embeddings processing | 24 | Experimental |
| 58 | danielcieslinski/curve2vec | Python package for generating vector embeddings of curves | 23 | Experimental |
| 59 | sshh12/Voice-Vector | A one-shot siamese approach to generating voice embeddings. | 23 | Experimental |
| 60 | dr-irani/Quantifying-Bias-Contextualized-Embeddings | Semester project for Machine Learning: Deep Learning, Spring 2020 | 23 | Experimental |
| 61 | Kirili4ik/code2vec | code2vec for Python 3 made for NL2ML project | 23 | Experimental |
| 62 | chanind/word2vec-gender-bias-explorer | A tool to show gender bias in words based on NLP word embeddings from Google News | 23 | Experimental |
| 63 | w-zm/python-sentence2vec | This tool provides some implementations of sentence to vector. (sentence2vec) | 23 | Experimental |
| 64 | eifuentes/skipgrammar | A framework for representing sequences as embeddings. | 22 | Experimental |
| 65 | CharlesGaydon/Dater-to-Vec | Collaborative filtering in dating. A NLP-based user embedding approach... | 22 | Experimental |
| 66 | jolivaresc/fastText-vecmap | bilingual word embeddings mapping using fastText | 22 | Experimental |
| 67 | MahmoudAbdelRahman/build2Vec | Building representation in the vector space | 22 | Experimental |
| 68 | BotCenter/spanishWordEmbeddings | Spanish Word Embeddings computed from large corpora and different sizes... | 22 | Experimental |
| 69 | brangerbriz/midi-glove | Create MIDI note vector embeddings using GloVe (Global Vectors for Word... | 21 | Experimental |
| 70 | ccmaymay/word2vec | word2vec, commented | 21 | Experimental |
| 71 | ksalama/data2cooc2emb2ann | Learning embeddings from item co-occurrence statistics, and building an... | 21 | Experimental |
| 72 | MartinoMensio/it_vectors_wiki_spacy | Word embeddings for Italian language, spacy2 prebuilt model | 21 | Experimental |
| 73 | cr0wley-zz/Embeddings | A study on the ingenious concept of word2vec. The repository contains a... | 20 | Experimental |
| 74 | chr1sbest/word2vec_explorer | Interactive REPL for exploring word2vec word embeddings - demonstrates the... | 19 | Experimental |
| 75 | SmartData-Polito/darkvec | This repo contains the codes and the notebooks used for the paper "DarkVec:... | 17 | Experimental |
| 76 | YannDubs/RAW-Embedings | Novel word embeddings based on a simple and intuitive rolling average. Still... | 17 | Experimental |
| 77 | rosasalberto/image2vec | Building applications on top of Image Embeddings. Recommendation Engine,... | 16 | Experimental |
| 78 | Ashly1991/word2vec-tf2 | Word2Vec Skipgram with negative sampling in TensorFlow 2. Self-supervised... | 15 | Experimental |
| 79 | LoicGrobol/fasttextlt | A pure Python FastText interface, to ensure that FastText model stay usable... | 15 | Experimental |
| 80 | srijansood/debias-word-embeddings | Tackling Gender and Race bias in Word Embeddings | 15 | Experimental |
| 81 | menon92/Bangla-Word2Vec | Bangla word2vec using skipgram approach | 15 | Experimental |
| 82 | vsoch/wikipedia-equations | word2vec embeddings for statistics and math equations from Wikipedia | 15 | Experimental |
| 83 | worldbeater/code-vecs | Code for the methods and algorithms described in the paper "Analysis of... | 15 | Experimental |
| 84 | acd17sk/Word2vec-CBOW-Negative-Sampling | This project provides a pure NumPy implementation of the Word2vec Continuous... | 14 | Experimental |
| 85 | mahb97/Wake2vec | Controlled style shift to Joyce via embedding surgery and Wake lexicon | 14 | Experimental |
| 86 | danaugrs/binary-word-embeddings | Generates binary word embeddings by analyzing Wikipedia | 14 | Experimental |
| 87 | mayankkejriwal/Geonames-embeddings | Embeddings for all geonames populated locations with population greater than 0 | 14 | Experimental |
| 88 | dkaslovsky/Not-Word2Vec | This is not word2vec | 14 | Experimental |
| 89 | hammi03/word2vec-numpy | Skip-gram Word2Vec with Negative Sampling in pure NumPy, trained on text8 | 14 | Experimental |
| 90 | undeluro/word2vec | Implementation of the core word2vec training loop in pure numpy. | 14 | Experimental |
| 91 | ben300694/word-embeddings | Repository for the seminar "Word Embedding Spaces", Master CS and Master AI... | 13 | Experimental |
| 92 | mantzaris/TextSpace.jl | A Julia package for text embeddings and related NLP transformations | 13 | Experimental |
| 93 | nocotan/skipgram_cpp | Skipgram with Hierarchical Softmax | 13 | Experimental |
| 94 | nadinejackson1/word-embeddings-visualization | Visualizing word embeddings generated by GloVe and Word2Vec models using the... | 13 | Experimental |
| 95 | maxi-w/image-vectors | Embed images easily | 13 | Experimental |
| 96 | boyanangelov/species2vec | Species vector representations | 12 | Experimental |
| 97 | BotCenter/spanish-sent2vec | Spanish Sentence Embeddings computed from large corpora using sent2vec. | 12 | Experimental |
| 98 | CentreForDigitalHumanities/Word2VecElastic | Collect sentences from ElasticSearch, preprocess and train diachronic Word2Vec models | 12 | Experimental |
| 99 | Hellisotherpeople/debate2vec | Word-vectors created from a large corpus of competative debate evidence | 12 | Experimental |
| 100 | AnasMohammad4321/Word2Vec-Pytorch | Implementation of Word2Vec for learning word embeddings using the Amazon... | 11 | Experimental |
| 101 | rtlee9/state-of-the-union | Paragraph vector analysis of state of the union addresses | 11 | Experimental |
| 102 | nisaharan/vector-embeddings-workshop-vavuniya | Intro to word embeddings & semantic search – workshop at University of Vavuniya | 11 | Experimental |
| 103 | vamsivallepu/Telugu-W2V | Word-2-Vec embeddings trained on Telugu text corpus | 11 | Experimental |
| 104 | Koziev/word_embedders | Character-level autoencoder models for words | 11 | Experimental |
| 105 | remeinium/Uganna_Siyabasa | A FastText Embedding model trained on Sinhala language. | 11 | Experimental |
| 106 | AaruranLog/Analogies | Analogy solver using Google's pretrained word vectors | 10 | Experimental |
| 107 | Marwolaeth/EmbeddingsTools.jl | Extra tools for working with word embeddings, such as those in... | 10 | Experimental |
| 108 | gumblex/lmtvec | Low Memory Text Vector | 10 | Experimental |
| 109 | japgarrido/Word2Vec-Embedding-Analysis | This project focuses on implementing and analyzing word embeddings using the... | 10 | Experimental |