Protein Language Models ML Frameworks

Tools for training and applying transformer-based language models on protein sequences for tasks like fitness prediction, stability estimation, and property inference. Does NOT include structure prediction, sequence alignment, or general protein embeddings without generative/discriminative language modeling.

55 protein language model frameworks are tracked. Three score above 50 (the Established tier). The highest-rated is DeepRank/deeprank2 at 62/100, with 57 stars and 114 monthly downloads.

Get the projects as JSON (the example request sets limit=20)

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=ml-frameworks&subcategory=protein-language-models&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
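The same request can be issued from a script. The sketch below rebuilds the query string from the curl command using only the Python standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and the response schema is not documented on this page, so the decoded JSON is returned as-is rather than parsed into fields. How an API key is passed is also not stated here, so the sketch covers only the keyless request.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="ml-frameworks",
              subcategory="protein-language-models",
              limit=20):
    """Assemble the same query string the curl command uses."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """GET the URL and decode the body as JSON.

    The response layout is an unknown here, so the decoded value is
    returned untouched for the caller to inspect.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    url = build_url()
    print(url)
    # Uncomment to call the live endpoint (counts against the daily quota):
    # print(json.dumps(fetch_projects(url), indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to inspect the request before spending any of the daily quota.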

| # | Framework | Description | Score | Tier |
|---|---|---|---|---|
| 1 | DeepRank/deeprank2 | An open-source deep learning framework for data mining of protein-protein... | 62 | Established |
| 2 | sacdallago/biotrainer | Biological prediction models made simple. | 58 | Established |
| 3 | jonathanking/sidechainnet | An all-atom protein structure dataset for machine learning. | 51 | Established |
| 4 | a-r-j/ProteinWorkshop | Benchmarking framework for protein representation learning. Includes a large... | 48 | Emerging |
| 5 | BioinfoMachineLearning/DIPS-Plus | The Enhanced Database of Interacting Protein Structures for Interface Prediction | 46 | Emerging |
| 6 | idrblab/AnnoPRO | Feature map and function annotation of Proteins | 45 | Emerging |
| 7 | songlab-cal/tape | Tasks Assessing Protein Embeddings (TAPE), a set of five biologically... | 43 | Emerging |
| 8 | flatironinstitute/DeepFRI | Deep functional residue identification | 42 | Emerging |
| 9 | aqlaboratory/proteinnet | Standardized data set for machine learning of protein structure | 42 | Emerging |
| 10 | jaswindersingh2/SPOT-RNA | RNA Secondary Structure Prediction using an Ensemble of Two-dimensional Deep... | 41 | Emerging |
| 11 | LBM-EPFL/PeSTo | Geometric deep learning method to predict protein binding interfaces from a... | 37 | Emerging |
| 12 | jonathanking/protein-transformer | Predicting protein structure through sequence modeling | 35 | Emerging |
| 13 | michaelhla/pro-1 | reasoning model trained using GRPO towards rosetta REF2015 for protein stability | 35 | Emerging |
| 14 | HannesStark/protein-localization | Using Transformer protein embeddings with a linear attention mechanism to... | 34 | Emerging |
| 15 | anton-bushuiev/PPIformer | Learning to design protein-protein interactions with enhanced generalization... | 33 | Emerging |
| 16 | vsomnath/holoprot | Multi-Scale Representation Learning on Proteins (NeurIPS 2021) | 33 | Emerging |
| 17 | anton-bushuiev/PPIRef | Dataset and package for working with protein-protein interactions in 3D | 31 | Emerging |
| 18 | dohlee/abyssal-pytorch | Implementation of Abyssal, a deep neural network trained with a new "mega"... | 31 | Emerging |
| 19 | adaptyvbio/ProteinFlow | Versatile computational pipeline for processing protein structure data for... | 31 | Emerging |
| 20 | aws-samples/lm-gvp | LM-GVP: A Generalizable Deep Learning Framework for Protein Property... | 30 | Emerging |
| 21 | victor369basu/ProteinStructurePrediction | Protein structure prediction is the task of predicting the 3-dimensional... | 30 | Emerging |
| 22 | vam-sin/CATHe | Deep Learning tool trained on protein sequence embeddings from protein... | 30 | Emerging |
| 23 | OpenProteinAI/PoET | Inference code for PoET: A generative model of protein families as... | 28 | Experimental |
| 24 | conradry/prtm | Deep learning for protein science | 28 | Experimental |
| 25 | lightonai/RITA | RITA is a family of autoregressive protein models, developed by LightOn in... | 28 | Experimental |
| 26 | dohlee/rasp-pytorch | Reimplementation of RaSP, a deep neural network for rapid protein stability... | 28 | Experimental |
| 27 | draeger-lab/TFpredict | Identification and structural characterization of transcription factors... | 27 | Experimental |
| 28 | QizhiPei/BioT5 | BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings) | 27 | Experimental |
| 29 | bioinfodlsu/phage-host-prediction | Published in PLOS ONE. Phage-host interaction prediction tool that uses... | 26 | Experimental |
| 30 | MachineLearningLifeScience/protein_regression | The codebase to replicate the analysis of "A systematic analysis of... | 26 | Experimental |
| 31 | Bitbol-Lab/DiffPALM | Differentiable Pairing using Alignment-based Language Models | 25 | Experimental |
| 32 | milagjurovska/PPI-link-prediction-with-optimized-gcn-and-gan | Comparing different biologically inspired algorithms in hyperparameter... | 24 | Experimental |
| 33 | google-research/slip | SLIP is a sandbox environment for engineering protein sequences with... | 24 | Experimental |
| 34 | NIGMS/Protein-Protein-Interactions-using-ML | In this module, you will harness novel machine learning techniques to... | 24 | Experimental |
| 35 | jiaqingxie/DeepProtein | Deep Learning Library and Benchmark for Protein Sequence Learning... | 23 | Experimental |
| 36 | JieZheng-ShanghaiTech/SL_benchmark | Benchmarking study of machine learning methods for prediction of synthetic lethality | 23 | Experimental |
| 37 | kiwijuice56/protein-visualizer | Visualizing the function of biological proteins through deep learning. MIT... | 22 | Experimental |
| 38 | jsmccabe1/ApiPred | Predict fitness phenotypes and invasion machinery in apicomplexan parasites... | 22 | Experimental |
| 39 | bschilder/VEP_protein | Using Protein Language Models to compute Variant Effect Predictions across... | 22 | Experimental |
| 40 | Ulton321/Protein-Language-Model-Steering | Protein-Language-Model-Steering explores how to guide or "steer" large... | 22 | Experimental |
| 41 | raphamontana/BioNCE | An Intelligent digital system for classification of molecules querying... | 22 | Experimental |
| 42 | jgbrasier/protein-classification | Deep sequence models for protein classification | 21 | Experimental |
| 43 | daisybio/data-leakage-ppi-prediction | Code associated with the paper 'Cracking the blackbox of deep sequence-based... | 20 | Experimental |
| 44 | omarperacha/ps4-dataset | The largest open-source dataset for Protein Single Sequence Secondary... | 20 | Experimental |
| 45 | shruti-sivakumar/MSA-Comparative-Study | Benchmarking 6 MSA tools (Clustal Omega, MUSCLE, MAGUS, M-Coffee, MSA Probs,... | 19 | Experimental |
| 46 | 310-ai/lib310 | lib310 python package | 18 | Experimental |
| 47 | DeepFoldProtein/OTalign | OTalign: Protein sequence alignment for remote homologs using Protein... | 17 | Experimental |
| 48 | claopodium/SLP-for-Bio | Based on Single Layer Perceptron model, the programme intended to locate key... | 16 | Experimental |
| 49 | bioinfodlsu/PHIStruct | Published in Bioinformatics. Phage-host interaction prediction tool that... | 16 | Experimental |
| 50 | allamiro/9mers-structure-prediction | Protein structure prediction for CullPDB 9-mer fragments using multi-input... | 15 | Experimental |
| 51 | MPI-Dortmund/pymissense | PyMissense creates the pathogenicity plot and modified pdb as shown in the... | 15 | Experimental |
| 52 | A-Hareed/BackMapNet | BackMapNet is a deep-learning framework for reconstructing all-atom protein... | 14 | Experimental |
| 53 | mciaravino/yeast-protein-classification | Multiclass classification of yeast protein localization sites using multiple... | 14 | Experimental |
| 54 | kren-ai-lab/RUDEUS | Developing classification models for DNA-Binding proteins through machine... | 12 | Experimental |
| 55 | phenolophthaleinum/phastDNA | Virus-host interaction prediction using local fluctuations of genome... | 12 | Experimental |
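
The tier labels in the listing appear to follow score bands: every listed score above 50 is Established, scores from 30 to 48 are Emerging, and scores of 28 and below are Experimental. The sketch below encodes that reading. The cut-offs at 50 and 30 are assumptions inferred from the listed rows; the exact boundaries are not stated, and no listed project scores 29, 49, or 50, so behaviour at those values is a guess. `tier_for` is a hypothetical helper, not part of the site's API.

```python
def tier_for(score: int) -> str:
    """Map a 0-100 quality score to a tier label.

    Thresholds are inferred from the listing, not documented:
    above 50 -> Established, 30 to 50 -> Emerging, below 30 -> Experimental.
    """
    if score > 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# Spot-check against rows taken from the listing.
listed = [
    ("DeepRank/deeprank2", 62, "Established"),
    ("jonathanking/sidechainnet", 51, "Established"),
    ("a-r-j/ProteinWorkshop", 48, "Emerging"),
    ("vam-sin/CATHe", 30, "Emerging"),
    ("OpenProteinAI/PoET", 28, "Experimental"),
    ("phenolophthaleinum/phastDNA", 12, "Experimental"),
]
mismatches = [name for name, score, tier in listed if tier_for(score) != tier]
print(mismatches)
```

An empty mismatch list only shows that the assumed bands agree with the sampled rows; it does not confirm how the site actually assigns tiers.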