mteb and results
mteb is the benchmark framework and evaluation suite; results is the repository of evaluation outputs that populates the public leaderboard. The two are complements in a producer-consumer relationship: mteb produces scores, and results stores and publishes them.
About mteb
embeddings-benchmark/mteb
MTEB: Massive Text Embedding Benchmark
Provides standardized evaluation across 100+ tasks spanning classification, clustering, retrieval, and semantic textual similarity, for both text and multimodal embeddings. Integrates with the Hugging Face ecosystem (SentenceTransformers, transformers) and offers a unified Python API and a CLI for benchmarking custom or pretrained models against a curated leaderboard. Supports multilingual evaluation with automatic caching, batch processing, and reproducible result tracking across embedding model implementations.
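As a minimal sketch of the Python API, the snippet below evaluates a SentenceTransformers model on a single task. The model name, task name, and output folder are illustrative, and exact function names can vary between mteb versions:

```python
import mteb
from sentence_transformers import SentenceTransformer

# Load any SentenceTransformers-compatible embedding model
# (the model name here is only an example).
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Select one or more benchmark tasks by name; Banking77Classification is
# an illustrative classification task from the benchmark.
tasks = mteb.get_tasks(tasks=["Banking77Classification"])

# Run the evaluation and write per-task result files to a local folder.
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder="results")
```

The per-task files written to the output folder follow the same format that the results repository collects for the leaderboard.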
About results
embeddings-benchmark/results
Data for the MTEB leaderboard
Stores standardized evaluation results produced by the MTEB (Massive Text Embedding Benchmark) package across diverse embedding models and tasks. Results are submitted directly to this repository rather than via Hugging Face model cards, which makes it possible to verify that reported scores correspond to specific model implementations. The leaderboard aggregates these results to provide comparable benchmarks across retrieval, clustering, semantic search, and other embedding-based tasks.
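For illustration only, here is a sketch of reading one stored result file with the Python standard library. The directory layout and JSON field names are assumptions about how per-task scores are organized, not a documented schema:

```python
import json
from pathlib import Path

# Hypothetical path: the repository stores one JSON file per model and task
# (the model folder and file name below are illustrative).
result_path = Path("results/intfloat__e5-large-v2/Banking77Classification.json")

with result_path.open() as f:
    result = json.load(f)

# Field names are assumptions about the stored schema: each file carries the
# task name plus per-split scores that the leaderboard aggregates.
print(result["task_name"])
for split_scores in result["scores"]["test"]:
    print(split_scores["main_score"])
```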