BERT Model Implementations (Transformer Models)

PyTorch and other framework-specific implementations of BERT and BERT-variant architectures (RoBERTa, DistilBERT, etc.), including pretraining and fine-tuning libraries and language-specific BERT models. Does NOT include task-specific applications (NER, classification, QA), downstream fine-tuning notebooks, or non-BERT transformer implementations.

There are 68 BERT model implementations tracked. Two score above 50 (the Established tier). The highest-rated is Tongjilibo/bert4torch at 67/100, with 1,335 stars and 180 monthly downloads.

Get all 68 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=bert-model-implementations&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
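
The same request can be issued from Python. This is a minimal sketch using only the standard library: the endpoint and query parameters are copied from the curl command, while the helper names `build_url` and `fetch_quality` are hypothetical. The response is assumed to be JSON, and its shape is not documented here, so the sketch returns the decoded value without inspecting it. Note that the request passes `limit=20`; whether a larger value returns all 68 projects is not confirmed by this page.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers",
              subcategory="bert-model-implementations",
              limit=20):
    """Build the same URL as the curl example (hypothetical helper)."""
    query = urlencode({
        "domain": domain,
        "subcategory": subcategory,
        "limit": limit,
    })
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    Assumption: the API answers with a JSON body. No particular
    structure (list, dict, field names) is assumed.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example (performs a network request, counted against the daily quota):
#   data = fetch_quality(build_url())
```

Keeping URL construction separate from the network call makes it possible to check the query string without spending one of the 100 keyless requests per day.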

#	Model	Score	Tier	Stars	Language
1	Tongjilibo/bert4torch An elegant pytorch implementation of transformers	67	Established	1,335	Python
2	nyu-mll/jiant jiant is an nlp toolkit	56	Established	1,674	Python
3	lonePatient/TorchBlocks A PyTorch-based toolkit for natural language processing	46	Emerging	160	Python
4	grammarly/gector Official implementation of the papers "GECToR – Grammatical Error...	44	Emerging	955	Python
5	monologg/JointBERT Pytorch implementation of JointBERT: "BERT for Joint Intent Classification...	44	Emerging	738	Python
6	backprop-ai/backprop Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.	43	Emerging	241	Python
7	appvision-ai/fast-bert Super easy library for BERT based NLP models	43	Emerging	1,920	Python
8	sagorbrur/bntransformer Bengali transformer using transformers	41	Emerging	22	Python
9	sagorbrur/bangla-bert Bangla-Bert is a pretrained bert model for Bengali language	40	Emerging	83	Jupyter Notebook
10	voidful/TFkit 🤖📇 handling multiple nlp task in one pipeline	39	Emerging	57	Python
11	taishi-i/nagisa_bert A BERT model for nagisa	37	Emerging	5	Jupyter Notebook
12	gitabtion/BertBasedCorrectionModels PyTorch implementations of BERT-based Spelling Error Correction Models. ...	37	Emerging	279	Python
13	dccuchile/beto BETO - Spanish version of the BERT model	37	Emerging	502	—
14	iPieter/RobBERT A Dutch RoBERTa-based language model	36	Emerging	207	Jupyter Notebook
15	gitabtion/SoftMaskedBert-PyTorch 🙈 An unofficial implementation of SoftMaskedBert based on huggingface/transformers.	36	Emerging	97	Python
16	JetRunner/BERT-of-Theseus ⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT...	36	Emerging	315	Python
17	menon92/BangalASR Transformer based Bangla Speech Recognition | Encoder Decoder Architecture	35	Emerging	57	Jupyter Notebook
18	Ethan-yt/guwenbert GuwenBERT: 古文预训练语言模型（古文BERT） A Pre-trained Language Model for Classical...	34	Emerging	555	—
19	ymcui/PERT PERT: Pre-training BERT with Permuted Language Model	34	Emerging	367	—
20	JulesBelveze/bert-squeeze 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡	33	Emerging	85	Python
21	nlpaueb/greek-bert A Greek edition of BERT pre-trained language model	31	Emerging	148	Python
22	dbmdz/berts DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models	31	Emerging	159	—
23	alexa/ramen A software for transferring pre-trained English models to foreign languages	30	Emerging	19	Python
24	rdenadai/BR-BERTo Transformer model for Portuguese language (Brazil pt_BR)	30	Emerging	16	Python
25	retarfi/language-pretraining Pre-training Language Models for Japanese	29	Experimental	50	Python
26	cakshat/AlloyBERT Introducing AlloyBERT: a transformer encoder-based model for predicting...	29	Experimental	12	Python
27	bnosac/golgotha Contextualised Embeddings and Language Modelling using BERT and Friends using R	29	Experimental	47	R
28	TayeeChang/keras_transformers the implementation of transformer family such as bert, albert, roberta, nezha, etc.	28	Experimental	7	Python
29	Beomi/exbert-transformers exBERT on Transformers🤗	28	Experimental	10	Python
30	psychbruce/FMAT 😷 The Fill-Mask Association Test (FMAT): Measuring Propositions in Natural Language.	28	Experimental	16	R
31	shahrukhx01/bert-probe BERT Probe: A python package for probing attention based robustness to...	27	Experimental	18	Jupyter Notebook
32	isaacus-dev/emubert-creator The training code behind EmuBert, the largest open-source masked language...	26	Experimental	3	Python
33	Beomi/KcBERT-Finetune KcBERT/KcELECTRA Fine Tune Benchmarks code (forked from...	26	Experimental	47	Python
34	HeegyuKim/language-model 한국어 언어 모델 학습을 위한 프로젝트(Flax, Pytorch with Huggingface Accelerate)	25	Experimental	32	Jupyter Notebook
35	ant-louis/netbert 📶 NetBERT: a domain-specific BERT model for computer networking.	25	Experimental	5	Jupyter Notebook
36	DomHudson/bert-in-production A collection of resources on using BERT (https://arxiv.org/abs/1810.04805 )...	24	Experimental	96	—
37	AshutoshDongare/softskill-NER Fine tuning 🤗 transformer model for softskill NER task	24	Experimental	3	Jupyter Notebook
38	asiff00/Bengali-Sentence-Error-Correction Fine-tune mBart 50 for Bengali Sentence Error Correction	24	Experimental	4	Jupyter Notebook
39	gitabtion/ConvBert-PyTorch 🤗An unofficial PyTorch implementation of ConvBert based on huggingface/transformers.	23	Experimental	17	Python
40	sagorbrur/fillblank Fill The Blank	23	Experimental	2	Jupyter Notebook
41	PlanTL-GOB-ES/lm-biomedical-clinical-es Official source for Spanish pretrained biomedical and clinical language...	23	Experimental	26	Python
42	YRL-AIDA/RuTaBERT RuTaBERT is a framework for solving column type and property annotation...	22	Experimental	7	Python
43	Thisen-Ekanayake/HelaBERT A compact BERT (6-layer) masked language model trained from scratch on a...	22	Experimental	—	Jupyter Notebook
44	phkhanhtrinh23/spelling_correction_project This spelling correction project helps people fix English spelling mistakes....	22	Experimental	18	Python
45	haozhg/lmd Language Model Decomposition: Quantifying the Dependency and Correlation of...	21	Experimental	10	Python
46	Pchambet/NLP-from-scratch-to-BERT End-to-end NLP in 4 notebooks: text preprocessing, TF-IDF,...	19	Experimental	—	Jupyter Notebook
47	lcl-hse/heptabot A full-text error corrector for English based on transformers and deep learning	19	Experimental	10	Jupyter Notebook
48	Vidhyambika/Next-Word-Prediction-using-BERT-GPT Predicting the next word for a sentence/word given using BERT	19	Experimental	—	Python
49	RichardScottOZ/geoscience-transformers-for-predictive-mapping-of-critical-minerals First pass paper implementation	19	Experimental	—	Python
50	sfp932705/simple_bert A pure pytorch from scratch implementation of BERT	19	Experimental	—	Python
51	shreydan/masked-language-modeling Transformers Pre-Training with MLM objective — implemented encoder-only...	18	Experimental	6	Jupyter Notebook
52	LennartKeller/roberta2longformer Convert pretrained RoBerta models to various long-document transformer models	18	Experimental	11	Python
53	ilanaliouchouche/KANBert Implementation of an Encoder only MoE usable as an Embedding Model,...	17	Experimental	2	Python
54	joshstephenson/MorphemeSegmentation This is a survey of morpheme segmentation techniques including 2 baselines...	16	Experimental	3	Python
55	Vincentiv/BERT_Finetuning_from_scratch Notebook on finetuning BERT	15	Experimental	2	Jupyter Notebook
56	sappho192/ffxiv-ja-ko-translator Japanese→Korean translator model specialized in Final Fantasy XIV based on...	14	Experimental	11	C#
57	Sean652039/Token-Masking Token Masking Regularization	14	Experimental	—	Python
58	tejasvaidhyadev/ALBERT.jl ALBERT(A Lite BERT for Self-Supervised Learning of Language Representations)...	13	Experimental	7	Julia
59	SumitM0432/XLM-RoBERTa-for-Textual-Entailment A multilingual model XLM-RoBERTa for the textual entailment of sequence...	13	Experimental	6	Jupyter Notebook
60	DiFronzo/Multilingual-Models mBERT and XLM-R for encoding of Scandinavian languages	12	Experimental	3	Python
61	teticio/inBERTolate Hit your word count by using BERT to pad out your essays!	12	Experimental	3	Python
62	mhmdsabry/BERT_with_Residual_vs_Highway Comparing between residual stream and highway stream in transformers(BERT) .	12	Experimental	3	Python
63	viktor-shcherb/vive_la_ner The default way to fine-tune BERT is wrong. Here is why	12	Experimental	4	Jupyter Notebook
64	mdmmn378/spell-magic Transformer Based Seq2Seq Model for Bangla Spell Correction	11	Experimental	—	Jupyter Notebook
65	UnkindGoose/MultiTask-NLP-model Multitask model for NER and document-level classification. Project contains...	11	Experimental	—	Jupyter Notebook
66	davydantoniuk/grammarfix-bot Fine-tuned a Hugging Face transformer model for grammar correction.	10	Experimental	1	Jupyter Notebook
67	gaolichen/simplebert A simple implementation of transformer models with tensorflow/keras.	10	Experimental	1	Python
68	cbstanley/dp-bert Differential privacy with BERT model	10	Experimental	1	Python
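
The table above can also be processed directly as text. The following is a rough sketch, assuming each row is tab-separated in the column order of the header (#, Model, Score, Tier, Stars, Language), that the Model cell is the repository name followed by a space and its description, and that "—" marks a missing value. The function name `parse_row` and the dictionary keys are hypothetical, chosen for illustration only.

```python
from collections import Counter

MISSING = "—"  # marker used in the table for an absent value


def parse_row(line):
    """Split one tab-separated table row into a dict (hypothetical helper)."""
    rank, model, score, tier, stars, language = line.rstrip("\n").split("\t")
    # The Model cell holds "owner/repo description..."; split on first space.
    repo, _, description = model.partition(" ")
    return {
        "rank": int(rank),
        "repo": repo,
        "description": description,
        "score": int(score),
        "tier": tier,
        # Star counts use thousands separators, e.g. "1,674".
        "stars": None if stars == MISSING else int(stars.replace(",", "")),
        "language": None if language == MISSING else language,
    }


# Three rows copied from the table, one per tier.
SAMPLE = "\n".join([
    "2\tnyu-mll/jiant jiant is an nlp toolkit\t56\tEstablished\t1,674\tPython",
    "13\tdccuchile/beto BETO - Spanish version of the BERT model\t37\tEmerging\t502\t—",
    "50\tsfp932705/simple_bert A pure pytorch from scratch implementation of BERT\t19\tExperimental\t—\tPython",
])

rows = [parse_row(line) for line in SAMPLE.splitlines()]
tier_counts = Counter(row["tier"] for row in rows)
```

Applied to all 68 rows, the same `Counter` gives the number of projects per tier, and filtering on `row["stars"] is None` lists the entries with no star count.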