Embedding Evaluation Benchmarks Embedding Tools
Tools and frameworks for evaluating, testing, and benchmarking embedding models across various dimensions (quality, stress-testing, cross-lingual performance). Does NOT include embedding generation, pre-trained models, or domain-specific embedding applications.
There are 58 embedding evaluation benchmark tools tracked. 1 scores above 70 (verified tier). The highest-rated is embeddings-benchmark/mteb at 99/100 with 3,159 stars and 1,555,633 monthly downloads. 1 of the top 10 is actively maintained.
Get all 58 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=embeddings&subcategory=embedding-evaluation-benchmarks&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
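The same request can be issued from Python. The following is a minimal sketch using only the standard library. The function names `build_url` and `fetch_projects` are illustrative, and the shape of the JSON response is not described on this page, so the decoded payload is returned unchanged. The default `limit=20` mirrors the curl command; whether the endpoint accepts a larger value to return all 58 projects in one call is an assumption to verify.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="embeddings",
              subcategory="embedding-evaluation-benchmarks",
              limit=20):
    """Build the query URL with the same parameters as the curl command."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the response body as JSON.

    The response schema is not documented here, so the decoded value
    is returned as-is, without assuming any field names.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_projects(build_url())` performs one unauthenticated request, which counts against the 100 requests/day allowance, so it is worth caching the result locally rather than refetching it.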
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | embeddings-benchmark/mteb | MTEB: Massive Text Embedding Benchmark | 99/100 | Verified |
| 2 | yannvgn/laserembeddings | LASER multilingual sentence embeddings as a pip package | | Established |
| 3 | harmonydata/harmony | The Harmony Python library: a research tool for psychologists to harmonise... | | Established |
| 4 | embeddings-benchmark/results | Data for the MTEB leaderboard | | Emerging |
| 5 | MilaNLProc/honest | A Python package to compute HONEST, a score to measure hurtful sentence... | | Emerging |
| 6 | fresh-stack/freshstack | This repository helps you evaluate your models on the FreshStack benchmark! | | Emerging |
| 7 | autonomio/signs | A suite of tools for text preparation, vectorization and processing for deep... | | Emerging |
| 8 | Hironsan/awesome-embedding-models | A curated list of awesome embedding models tutorials, projects and communities. | | Emerging |
| 9 | SeanLee97/AnglE | Train and Infer Powerful Sentence Embeddings with AnglE \| 🔥 SOTA on STS and... | | Emerging |
| 10 | flipz357/S3BERT | Semantically Structured Sentence Embeddings | | Emerging |
| 11 | etalab-ia/mediatech | Collection of public datasets from the French administration, vectorized and... | | Emerging |
| 12 | plasticityai/magnitude | A fast, efficient universal vector embedding utility package. | | Emerging |
| 13 | isaacus-dev/mleb | The code used to evaluate embedding models on the Massive Legal Embedding... | | Emerging |
| 14 | bheinzerling/bpemb | Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE) | | Emerging |
| 15 | ricsinaruto/dialog-eval | Evaluate your dialog model with 17 metrics! (see paper) | | Emerging |
| 16 | MaxwellRebo/awesome-2vec | Curated list of 2vec-type embedding models | | Emerging |
| 17 | wangyuxinwhy/uniem | unified embedding model | | Emerging |
| 18 | encord-team/ebind | A 5-way embedding model for text, audio, image, video, and 3D point clouds. | | Emerging |
| 19 | IndicoDataSolutions/Enso | Enso: An Open Source Library for Benchmarking Embeddings + Transfer Learning Methods | | Emerging |
| 20 | janluke/embfile | A package for reading/writing files containing pre-trained word embeddings... | | Emerging |
| 21 | isaacus-dev/open-australian-legal-embeddings-creator | The code used to create and update the Open Australian Legal Embeddings, the... | | Experimental |
| 22 | DeepK/hoDMD-experiments | EigenSent: Spectral sentence embeddings using higher-order Dynamic Mode Decomposition | | Experimental |
| 23 | ikergarcia1996/MetaVec | A monolingual and cross-lingual meta-embedding generation and evaluation framework | | Experimental |
| 24 | vered1986/NC_embeddings | Comparison between various noun compound embeddings | | Experimental |
| 25 | jfilter/hyperhyper | 🧮 Python package to construct word embeddings for small data using PMI and SVD | | Experimental |
| 26 | sberdevices/saf_vectorizers | Plugin for SmartApp Framework that performs vectorization (obtaining... | | Experimental |
| 27 | EloiZ/embedding_evaluation | Evaluate your word embeddings | | Experimental |
| 28 | semvec/embedstresstest | Stress Testing Embedding Models | | Experimental |
| 29 | sileod/embcomp | Composition of embeddings | | Experimental |
| 30 | louisbrulenaudet/tax-retrieval-benchmark | An implementation of the TaxRetrievalBenchmark task for the 🤗 Massive Text... | | Experimental |
| 31 | yanaiela/easyEmbed | downloading pre-trained embedding easily and keeping only the necessary... | | Experimental |
| 32 | s1mb1o/epg-embedding-benchmark | Evaluating sentence embedding models for cross-lingual TV program guide... | | Experimental |
| 33 | Sandipan99/POLAR | The POLAR Framework: polar Opposites Enable Interpretability of Pre-Trained... | | Experimental |
| 34 | MukundaKatta/EmbedBench | Embedding model comparison toolkit — benchmark TF-IDF, BoW, n-gram... | | Experimental |
| 35 | Hanscal/textembedding | A package of algorithms frequently needed when computing text similarity | | Experimental |
| 36 | AbdulSametTurkmenoglu/embedding_compare | Embedding Model Comparison for Turkish Medical Texts | | Experimental |
| 37 | ClimSocAna/tecb-de | German Text Embedding Clustering Benchmark | | Experimental |
| 38 | rafalposwiata/pl-mteb | PL-MTEB: Polish Massive Text Embedding Benchmark | | Experimental |
| 39 | neural-dialogue-metrics/EmbeddingBased | Embedding-based evaluation metrics for dialogue generation. | | Experimental |
| 40 | eifuentes/awesome-embeddings | 🪁A curated list of awesome resources around entity embeddings | | Experimental |
| 41 | OctaviusLeo/rag-lite-tfidf-eval | AI/SWE | | Experimental |
| 42 | guenthermi/table-embeddings | Tools for training schema-aware Web table embedding for unsupervised and... | | Experimental |
| 43 | busycaesar/Embeddings_And_Cosine_Similarity | Code for the presentation. | | Experimental |
| 44 | Paulescu/text-embedding-evaluation | Join 15k builders to the Real-World ML Newsletter ⬇️⬇️⬇️ | | Experimental |
| 45 | paithiov909/apportita | Utility for handling ‘magnitude’ pretrained word embeddings | | Experimental |
| 46 | TonioDominguez/dungeons_and_pythons_embeddings | A particular adaptation of text-based role-playing games with NLP technology... | | Experimental |
| 47 | kushmadlani/embedtrics | Word embedding evaluation package for word similarity, word analogies & word... | | Experimental |
| 48 | iamtatsuki05/MIREI | MIREI is a research workspace that builds encoder/decoder text-embedding... | | Experimental |
| 49 | dali-does/vse-probing | Code for COLING2020 paper: Probing Multimodal Embeddings for Linguistic Properties. | | Experimental |
| 50 | BYU-PCCL/regexv | Regex using word embeddings for text matching | | Experimental |
| 51 | France-Travail/embcompare | A simple python tool for embedding comparison | | Experimental |
| 52 | MinionAttack/conllu-conll-tool | Tool to convert CoNLL-U format files to CoNLL format files and manipulate... | | Experimental |
| 53 | abhimishra91/corpus-creator | This tool can be used to create a word corpus from locally available... | | Experimental |
| 54 | tahsinkoc/test-embrix-experimental | Comprehensive benchmark suite for evaluating embedding model performance... | | Experimental |
| 55 | aravpanwar/Embedding_Comparision | This repository provides a framework to benchmark the performance and... | | Experimental |
| 56 | metawake/awesome-text-embeddings | A curated list of text embedding models, benchmarks, and tools for semantic... | | Experimental |
| 57 | alecokas/subword-embedding | A tool for generating sub-word (phone or grapheme) level embeddings from an... | | Experimental |
| 58 | inkrement/StuffedTurkey | Distributed Embedding Aggregation | | Experimental |