JohnGiorgi/DeCLUTR
The corresponding code for our paper "DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations". Do not hesitate to open an issue if you run into any trouble!
Uses contrastive learning with span-based data augmentation to train sentence encoders on unlabeled text, requiring only documents and no labeled data. Implements mean-pooled RoBERTa-based transformers optimized via a contrastive objective, integrated with AllenNLP training infrastructure and compatible with Hugging Face model export. Evaluated extensively on SentEval downstream and probing tasks, achieving competitive performance without supervised pretraining.
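The two core pieces described above can be sketched in plain NumPy: mean pooling over non-padding token embeddings, and an InfoNCE-style contrastive loss that pulls each anchor span toward its positive span while pushing it away from the other spans in the batch. This is a minimal illustration of the general technique, not DeCLUTR's actual implementation; the function names and the temperature value are assumptions.

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token embeddings over the sequence, ignoring padding positions.

    token_embeddings: (batch, seq_len, dim); attention_mask: (batch, seq_len)
    with 1 for real tokens and 0 for padding.
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1.0, None)  # avoid division by zero
    return summed / counts

def info_nce_loss(anchors, positives, temperature=0.05):
    """InfoNCE-style contrastive loss: each anchor should be most similar to
    its own positive, with all other positives in the batch as negatives.
    (temperature=0.05 is an illustrative choice, not a value from the paper.)
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature                 # (batch, batch) cosine sims
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # -log p(positive | anchor)
```

In the real model, `token_embeddings` would come from a RoBERTa encoder and the loss would be backpropagated through it; the unsupervised part is that anchor and positive spans are sampled from the same unlabeled document.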
378 stars. No commits in the last 6 months.
Stars: 378
Forks: 33
Language: Python
License: Apache-2.0
Category:
Last pushed: Apr 21, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/JohnGiorgi/DeCLUTR"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
mims-harvard/ClinVec
ClinVec: Unified Embeddings of Clinical Codes Enable Knowledge-Grounded AI in Medicine
NYUMedML/DeepEHR
Chronic Disease Prediction Using Medical Notes
mims-harvard/SHEPHERD
SHEPHERD: Few shot learning for phenotype-driven diagnosis of patients with rare genetic diseases
nomic-ai/contrastors
Train Models Contrastively in Pytorch
biocentral/biocentral_server
Compute functionality for biocentral.