UKPLab/gpl
Unsupervised domain adaptation method for dense retrieval. It requires only an unlabeled corpus from the target domain, and the authors report large retrieval gains after adaptation. Paper: "GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval", https://arxiv.org/abs/2112.07577
GPL synthesizes queries from the corpus with T5 models, mines negative examples with dense retrievers, and assigns pseudo-labels with cross-encoders, creating a training signal without manual annotation. Built on Hugging Face Transformers and Sentence-Transformers, it accepts BeIR-format datasets and trains dense retrievers for dot-product similarity using MarginMSE loss.
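The training objective described above can be illustrated with a minimal sketch of a MarginMSE-style loss. It is not the GPL implementation: the function names, the toy embeddings, and the teacher margins are all hypothetical, and plain Python lists stand in for model outputs. For each (query, positive, negative) triple, the student's dot-product margin is compared with the margin a cross-encoder teacher would have assigned.

```python
def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def margin_mse_loss(queries, positives, negatives, teacher_margins):
    """Mean squared error between student and teacher score margins.

    student margin = dot(q, pos) - dot(q, neg)
    teacher margin = cross-encoder score(q, pos) - score(q, neg),
                     supplied here as precomputed pseudo-labels.
    """
    squared_errors = []
    for q, pos, neg, target in zip(queries, positives, negatives, teacher_margins):
        student_margin = dot(q, pos) - dot(q, neg)
        squared_errors.append((student_margin - target) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Toy embeddings (hypothetical values, chosen only for illustration).
queries = [[1.0, 0.0], [0.0, 1.0]]
positives = [[2.0, 0.0], [0.0, 3.0]]
negatives = [[0.5, 1.0], [1.0, 1.0]]
# Hypothetical teacher margins standing in for cross-encoder pseudo-labels.
teacher_margins = [1.5, 4.0]

loss = margin_mse_loss(queries, positives, negatives, teacher_margins)
print(loss)  # 2.0
```

The first triple matches the teacher margin exactly and contributes no error. The second has a student margin of 2.0 against a teacher margin of 4.0, so the mean squared error is 2.0.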
340 stars and 175 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars: 340
Forks: 38
Language: Python
License: Apache-2.0
Category:
Last pushed: Jul 06, 2023
Monthly downloads: 175
Commits (30d): 0
Dependencies: 4
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/UKPLab/gpl"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
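The same request can be sketched in Python using only the standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical. Because the response format is not documented here, the sketch returns the raw response body as text rather than assuming a schema.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Endpoint prefix taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/transformers"


def build_quality_url(owner, repo):
    """Build the endpoint URL for a repository such as UKPLab/gpl."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the raw response body as text (response schema is not assumed)."""
    with urlopen(build_quality_url(owner, repo), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request and counts against the daily limit):
# print(fetch_quality("UKPLab", "gpl"))
```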
Related models
galilai-group/stable-pretraining
Reliable, minimal and scalable library for pretraining foundation and world models
CognitiveAISystems/MAPF-GPT
[AAAI-2025] This repository contains MAPF-GPT, a deep learning-based model for solving MAPF...
larslorch/avici
Amortized Inference for Causal Structure Learning, NeurIPS 2022
svdrecbd/mhc-mlx
MLX + Metal implementation of mHC: Manifold-Constrained Hyper-Connections by DeepSeek-AI.
Cognitive-AI-Systems/MAPF-GPT-DDG
[IROS-2025] MAPF-GPT-DDG is a scalable decentralized multi-agent pathfinding (MAPF) solver based...