Embedding Model Tuning Tools
Tools, techniques, and frameworks for fine-tuning embedding models on domain-specific data to improve performance on downstream tasks. Does NOT include pre-trained embedding models, embedding inference/serving, or applications built on top of embeddings.
There are 41 embedding model tuning tools tracked. One scores above 50 (the Established tier). The highest-rated is ContextualAI/gritlm at 56/100 with 688 stars and 12,353 monthly downloads.
Get all 41 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=embeddings&subcategory=embedding-model-tuning&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
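The same request can be made from a script. The following is a minimal sketch using only the Python standard library. The function names `build_url` and `fetch_projects` are illustrative and not part of any published client. The query parameters are copied from the curl command above. The structure of the returned JSON is not documented on this page, so the sketch returns the decoded value unchanged. Whether a `limit` above 20 is accepted has not been verified here.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="embeddings", subcategory="embedding-model-tuning", limit=20):
    """Build the dataset URL with the same query parameters as the curl command."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the response body as JSON.

    Assumption: the endpoint returns a JSON body. Its schema is not
    described on this page, so the decoded value is returned as-is.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs one request, which counts against the daily quota described above.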
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | ContextualAI/gritlm | Generative Representational Instruction Tuning | 56/100 | Established |
| 2 | xlang-ai/instructor-embedding | [ACL 2023] One Embedder, Any Task: Instruction-Finetuned Text Embeddings | | Emerging |
| 3 | liuqidong07/LLMEmb | [AAAI'25 Oral] The official implementation code of LLMEmb | | Emerging |
| 4 | ritesh-modi/embedding-hallucinations | This repo shows how foundational model hallucinates and how we can fix such... | | Emerging |
| 5 | hpcaitech/CachedEmbedding | A memory efficient DLRM training solution using ColossalAI | | Emerging |
| 6 | ritesh-modi/fine-tuning-embeddings-template | This repo is a template to fine-tune embedding models using... | | Emerging |
| 7 | shobrook/weightgain | Train an adapter for any embedding model in under a minute | | Emerging |
| 8 | lperezmo/embeddings-extraction | Scripts for reading, extracting, and organizing data from either HTML or PDF... | | Experimental |
| 9 | jjcmoon/DeepSoftLog | Soft-Unification in Deep Probabilistic Logic (NeurIPS 2023) | | Experimental |
| 10 | jina-ai/llm-query-expansion | Query Expansion for Better Query Embedding using LLMs | | Experimental |
| 11 | CodeSoul-co/THETA | LLM-adaptive embeddings (Zero-shot / LoRA) with Generative Topic Modeling &... | | Experimental |
| 12 | Benja1972/topicphrase | Simple project for extraction of key-phrases from single document based on... | | Experimental |
| 13 | IsmaelMekene/meteor-CUTIE | Spatial and Semantic Segmentation | | Experimental |
| 14 | FelipeBenavidesMz/AlphaEarth-Interpretability-Experiments | Binary classification experiments to interpret Google AlphaEarth Foundation... | | Experimental |
| 15 | Jiayu7Yao/llm-classifier | Classify, cluster, and extract data using structured LLM outputs with... | | Experimental |
| 16 | Blue16-WangFudi/DialectSense | Chinese dialect identification using audio embeddings from LLMs. | | Experimental |
| 17 | aws-samples/finetune-bge-embeddings-blog | Code associated with the blog post titled, "Fine-Tuning BGE Embeddings Using... | | Experimental |
| 18 | AnderssonProgramming/llm-embeddings-text-preprocessing | LLM text preprocessing and embedding pipeline implementation for the... | | Experimental |
| 19 | LivingFutureLab/UQABench | [KDD 2025] The source code for UQABench | | Experimental |
| 20 | shimo-lab/modelmap | Embedding language models in probability space via log-likelihood vectors | | Experimental |
| 21 | rag-fish/noesisnoema-pipeline | Modular pipeline for building RAG and LLM workflows in Colab, including... | | Experimental |
| 22 | zh-he/Document-Based-Fine-Tuning-Tool | One-stop pipeline for building IR datasets from PDFs and fine-tuning... | | Experimental |
| 23 | csinva/fmri | Experiments with language fMRI data from Alex Huth lab. More organized repo... | | Experimental |
| 24 | warrofua/n-dimensional-llm | Research exploration of multi‑field information bottlenecks and... | | Experimental |
| 25 | aws-samples/fine-tune-embedding-models-on-sagemaker | This repository contains samples for fine-tuning embedding models using... | | Experimental |
| 26 | csinva/interpretable-embeddings | Interpretable text embeddings by asking LLMs yes/no questions (NeurIPS 2024) | | Experimental |
| 27 | vidhiJain/SpatialEmbeddings | Learning Embeddings that Capture Spatial Semantics for Indoor Navigation,... | | Experimental |
| 28 | quantumxiaol/activation_beacon | fork from... | | Experimental |
| 29 | rubsj/ai-contrastive-embedding-finetuning | Domain-specific embedding fine-tuning with contrastive learning and PEFT/LoRA | | Experimental |
| 30 | ksm26/Embedding-Models-From-Architecture-to-Implementation | Understand and build embedding models, focusing on word and sentence... | | Experimental |
| 31 | meghanmane84/LLM-Manifold-Based-Compression-Techniques | Research code for LLM Compression using Functional Algorithms, exploring... | | Experimental |
| 32 | PetropoulakisPanagiotis/igae | State Representations as Incentives for Reinforcement Learning Agents: A... | | Experimental |
| 33 | NC0DER/LMRank | LMRank: Utilizing Pre-Trained Language Models and Dependency Parsing for... | | Experimental |
| 34 | sine2pi/ASR-model | ASR model | | Experimental |
| 35 | LCEmT/LCEmT | Lossless Compression Techniques for Embedding Tables in Substantial Deep... | | Experimental |
| 36 | AparnaRoy76/Fine-Tune-Embedding-Model | 🚀 Generate high-quality triplet datasets for job titles & skills, and... | | Experimental |
| 37 | IMSUVEN/wubba | Wubba learns layout-invariant embeddings from raw HTML using contrastive... | | Experimental |
| 38 | 1kkiRen/Embeddings-Division | Python script for dividing embedding layer of LLM. | | Experimental |
| 39 | Renatoelho/embeddings-consultas-similaridade | I'll show how to convert plain text into mathematical representations... | | Experimental |
| 40 | kushagraghosh/EuroSAT | Trained a ResNet50 model on the EuroSAT satellite imagery dataset w/... | | Experimental |
| 41 | daniau23/Fine_Tuning_LLMs_and_Embeddings | Exploring the fine tuning of both LLMs and Embedding models. | | Experimental |