songlab-cal/tape
Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of protein biology.
Provides pretrained protein language models (a BERT-style Transformer and UniRep) integrated with the HuggingFace transformers API for easy model loading and inference. Includes a modular benchmark suite with five downstream tasks (secondary structure prediction, contact prediction, remote homology detection, fluorescence prediction, and stability prediction), plus utilities for generating sequence embeddings via the `tape-embed` command, which distributes work across available GPUs automatically. The codebase was migrated from TensorFlow to PyTorch, and the project recommends external frameworks (PyTorch Lightning, Fairseq) for training rather than its own training utilities.
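As a rough illustration of the `tape-embed` workflow described above, the sketch below assembles a command line for one embedding run. The positional argument order (model type, input FASTA, output file, pretrained model name) and the `--tokenizer` flag follow the project README as best recalled and should be treated as assumptions to check against `tape-embed --help`. The helper names and file names are hypothetical.

```python
import shutil
import subprocess


def build_tape_embed_command(model_type, fasta_path, output_path, pretrained_name, tokenizer):
    """Return the argv list for one tape-embed run.

    The argument order and the --tokenizer flag are assumptions based on
    the project README; verify them with `tape-embed --help`.
    """
    return [
        "tape-embed",
        model_type,
        fasta_path,
        output_path,
        pretrained_name,
        "--tokenizer",
        tokenizer,
    ]


def run_tape_embed(command):
    """Run the command only if the tape-embed executable is on PATH."""
    if shutil.which(command[0]) is None:
        raise FileNotFoundError("tape-embed is not installed or not on PATH")
    subprocess.run(command, check=True)


# Hypothetical file names; "bert-base" and "iupac" are assumed to be the
# pretrained Transformer name and its matching tokenizer vocabulary.
command = build_tape_embed_command(
    "transformer", "proteins.fasta", "embeddings.npz", "bert-base", "iupac"
)
print(" ".join(command))

# run_tape_embed(command)  # uncomment once TAPE is installed and the FASTA file exists
```

Building the argument list separately from running it keeps the command easy to inspect or log before any GPU time is spent.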
733 stars. No commits in the last 6 months.
Stars: 733
Forks: 133
Language: Python
License: BSD-3-Clause
Category:
Last pushed: Dec 11, 2022
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/songlab-cal/tape"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
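The same request can be made from Python with only the standard library. This is a minimal sketch: the URL is taken from the curl command above, but treating its last three path segments as category, owner, and repository is an assumption, as is the expectation that the endpoint returns JSON. The function and parameter names are hypothetical.

```python
import json
import urllib.request

API_ROOT = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the segment meanings are an assumption."""
    return f"{API_ROOT}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the endpoint responds with a JSON body, which the listing
    does not state explicitly.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("ml-frameworks", "songlab-cal", "tape")
print(url)

# data = fetch_quality("ml-frameworks", "songlab-cal", "tape")  # network call; counts against the daily quota
```

The network call is left commented out so that running the snippet does not consume one of the 100 unauthenticated requests allowed per day.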
Higher-rated alternatives
DeepRank/deeprank2
An open-source deep learning framework for data mining of protein-protein interfaces or...
sacdallago/biotrainer
Biological prediction models made simple.
jonathanking/sidechainnet
An all-atom protein structure dataset for machine learning.
a-r-j/ProteinWorkshop
Benchmarking framework for protein representation learning. Includes a large number of...
BioinfoMachineLearning/DIPS-Plus
The Enhanced Database of Interacting Protein Structures for Interface Prediction