embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
Provides standardized evaluation across 100+ tasks spanning classification, clustering, retrieval, and semantic textual similarity for both text and multimodal embeddings. Integrates with the Hugging Face ecosystem (SentenceTransformers, transformers) and offers a unified Python API plus a CLI for benchmarking custom or pretrained models against a curated leaderboard. Supports multilingual evaluation with automatic caching, batch processing, and reproducible result tracking across embedding model implementations.
3,159 stars and 1,555,633 monthly downloads. Used by 5 other packages. Actively maintained with 107 commits in the last 30 days. Available on PyPI.
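The Python API mentioned above amounts to pairing a task selection with a model that exposes an encode method. A minimal sketch, assuming a recent mteb release installed alongside sentence-transformers; the model checkpoint, task name, and output folder below are illustrative choices, not part of this listing:

# Minimal MTEB run (assumes: pip install mteb sentence-transformers).
import mteb
from sentence_transformers import SentenceTransformer

# Any embedding model with an encode() interface works; SentenceTransformers is the usual path.
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Select tasks from the benchmark by name (here a single classification task).
tasks = mteb.get_tasks(tasks=["Banking77Classification"])

evaluation = mteb.MTEB(tasks=tasks)
# Scores are written as JSON under the output folder, which is what enables the
# cached, reproducible result tracking described in the summary above.
results = evaluation.run(model, output_folder="results/all-MiniLM-L6-v2")

The CLI covers the same flow; in recent releases an invocation along the lines of mteb run -m sentence-transformers/all-MiniLM-L6-v2 -t Banking77Classification should be equivalent, though the exact flags vary by version.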
Stars: 3,159
Forks: 568
Language: Python
License: Apache-2.0
Category: Embeddings
Last pushed: Mar 12, 2026
Monthly downloads: 1,555,633
Commits (30d): 107
Dependencies: 13
Reverse dependents: 5
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/embeddings-benchmark/mteb"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Related tools
yannvgn/laserembeddings
LASER multilingual sentence embeddings as a pip package
harmonydata/harmony
The Harmony Python library: a research tool for psychologists to harmonise data and...
embeddings-benchmark/results
Data for the MTEB leaderboard
fresh-stack/freshstack
This repository helps you evaluate your models on the FreshStack benchmark!
Hironsan/awesome-embedding-models
A curated list of awesome embedding models tutorials, projects and communities.