BERT Model Implementations Transformer Models

PyTorch and framework-specific implementations of BERT and BERT-variant architectures (RoBERTa, DistilBERT, etc.), including pretraining, finetuning libraries, and language-specific BERT models. Does NOT include task-specific applications (NER, classification, QA), downstream finetuning notebooks, or non-BERT transformer implementations.

There are 68 BERT model implementations tracked. Two score above 50 (Established tier). The highest-rated is Tongjilibo/bert4torch at 67/100, with 1,335 stars and 180 monthly downloads.

Get all 68 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=bert-model-implementations&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
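The same request can be made from a script. The following is a minimal Python sketch that uses only the standard library and assumes nothing beyond the URL and query parameters shown in the curl command above. The helper names `build_quality_url` and `fetch_quality` are illustrative and not part of the API. The shape of the returned JSON is not documented here, so the sketch returns the decoded payload without interpreting it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_quality_url(domain="transformers",
                      subcategory="bert-model-implementations",
                      limit=20):
    """Build the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """Send an unauthenticated GET request and decode the JSON body.

    The response schema is an unknown here, so the payload is returned as-is.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example usage (performs a network request, so it is left commented out):
# payload = fetch_quality(build_quality_url())
# print(json.dumps(payload, indent=2)[:500])
```

The example command uses `limit=20`. Whether a larger value returns all 68 projects in a single response is an assumption that would need to be checked against the API.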

# | Model | Description | Score | Tier
1 | Tongjilibo/bert4torch | An elegant PyTorch implementation of transformers | 67 | Established
2 | nyu-mll/jiant | jiant is an NLP toolkit | 56 | Established
3 | lonePatient/TorchBlocks | A PyTorch-based toolkit for natural language processing | 46 | Emerging
4 | grammarly/gector | Official implementation of the papers "GECToR – Grammatical Error... | 44 | Emerging
5 | monologg/JointBERT | PyTorch implementation of JointBERT: "BERT for Joint Intent Classification... | 44 | Emerging
6 | backprop-ai/backprop | Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models. | 43 | Emerging
7 | appvision-ai/fast-bert | Super easy library for BERT-based NLP models | 43 | Emerging
8 | sagorbrur/bntransformer | Bengali transformer using transformers | 41 | Emerging
9 | sagorbrur/bangla-bert | Bangla-Bert is a pretrained BERT model for the Bengali language | 40 | Emerging
10 | voidful/TFkit | 🤖📇 Handling multiple NLP tasks in one pipeline | 39 | Emerging
11 | taishi-i/nagisa_bert | A BERT model for nagisa | 37 | Emerging
12 | gitabtion/BertBasedCorrectionModels | PyTorch implementations of BERT-based Spelling Error Correction Models. ... | 37 | Emerging
13 | dccuchile/beto | BETO - Spanish version of the BERT model | 37 | Emerging
14 | iPieter/RobBERT | A Dutch RoBERTa-based language model | 36 | Emerging
15 | gitabtion/SoftMaskedBert-PyTorch | 🙈 An unofficial implementation of SoftMaskedBert based on huggingface/transformers. | 36 | Emerging
16 | JetRunner/BERT-of-Theseus | ⛵️ The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT... | 36 | Emerging
17 | menon92/BangalASR | Transformer-based Bangla Speech Recognition, Encoder-Decoder Architecture | 35 | Emerging
18 | Ethan-yt/guwenbert | GuwenBERT: 古文预训练语言模型(古文BERT) A Pre-trained Language Model for Classical... | 34 | Emerging
19 | ymcui/PERT | PERT: Pre-training BERT with Permuted Language Model | 34 | Emerging
20 | JulesBelveze/bert-squeeze | 🛠️ Tools for Transformers compression using PyTorch Lightning ⚡ | 33 | Emerging
21 | nlpaueb/greek-bert | A Greek edition of the BERT pre-trained language model | 31 | Emerging
22 | dbmdz/berts | DBMDZ BERT, DistilBERT, ELECTRA, GPT-2 and ConvBERT models | 31 | Emerging
23 | alexa/ramen | Software for transferring pre-trained English models to foreign languages | 30 | Emerging
24 | rdenadai/BR-BERTo | Transformer model for the Portuguese language (Brazil, pt_BR) | 30 | Emerging
25 | retarfi/language-pretraining | Pre-training Language Models for Japanese | 29 | Experimental
26 | cakshat/AlloyBERT | Introducing AlloyBERT: a transformer encoder-based model for predicting... | 29 | Experimental
27 | bnosac/golgotha | Contextualised Embeddings and Language Modelling using BERT and Friends using R | 29 | Experimental
28 | TayeeChang/keras_transformers | An implementation of the transformer family, such as BERT, ALBERT, RoBERTa, NEZHA, etc. | 28 | Experimental
29 | Beomi/exbert-transformers | exBERT on Transformers🤗 | 28 | Experimental
30 | psychbruce/FMAT | 😷 The Fill-Mask Association Test (FMAT): Measuring Propositions in Natural Language. | 28 | Experimental
31 | shahrukhx01/bert-probe | BERT Probe: A Python package for probing attention-based robustness to... | 27 | Experimental
32 | isaacus-dev/emubert-creator | The training code behind EmuBert, the largest open-source masked language... | 26 | Experimental
33 | Beomi/KcBERT-Finetune | KcBERT/KcELECTRA Fine Tune Benchmarks code (forked from... | 26 | Experimental
34 | HeegyuKim/language-model | A project for training Korean language models (Flax, PyTorch with Hugging Face Accelerate) | 25 | Experimental
35 | ant-louis/netbert | 📶 NetBERT: a domain-specific BERT model for computer networking. | 25 | Experimental
36 | DomHudson/bert-in-production | A collection of resources on using BERT (https://arxiv.org/abs/1810.04805 )... | 24 | Experimental
37 | AshutoshDongare/softskill-NER | Fine-tuning a 🤗 transformer model for the soft-skill NER task | 24 | Experimental
38 | asiff00/Bengali-Sentence-Error-Correction | Fine-tune mBart 50 for Bengali Sentence Error Correction | 24 | Experimental
39 | gitabtion/ConvBert-PyTorch | 🤗 An unofficial PyTorch implementation of ConvBert based on huggingface/transformers. | 23 | Experimental
40 | sagorbrur/fillblank | Fill The Blank | 23 | Experimental
41 | PlanTL-GOB-ES/lm-biomedical-clinical-es | Official source for Spanish pretrained biomedical and clinical language... | 23 | Experimental
42 | YRL-AIDA/RuTaBERT | RuTaBERT is a framework for solving column type and property annotation... | 22 | Experimental
43 | Thisen-Ekanayake/HelaBERT | A compact BERT (6-layer) masked language model trained from scratch on a... | 22 | Experimental
44 | phkhanhtrinh23/spelling_correction_project | This spelling correction project helps people fix English spelling mistakes.... | 22 | Experimental
45 | haozhg/lmd | Language Model Decomposition: Quantifying the Dependency and Correlation of... | 21 | Experimental
46 | Pchambet/NLP-from-scratch-to-BERT | End-to-end NLP in 4 notebooks: text preprocessing, TF-IDF,... | 19 | Experimental
47 | lcl-hse/heptabot | A full-text error corrector for English based on transformers and deep learning | 19 | Experimental
48 | Vidhyambika/Next-Word-Prediction-using-BERT-GPT | Predicting the next word for a given sentence or word using BERT | 19 | Experimental
49 | RichardScottOZ/geoscience-transformers-for-predictive-mapping-of-critical-minerals | First pass paper implementation | 19 | Experimental
50 | sfp932705/simple_bert | A pure PyTorch from-scratch implementation of BERT | 19 | Experimental
51 | shreydan/masked-language-modeling | Transformers Pre-Training with MLM objective: implemented encoder-only... | 18 | Experimental
52 | LennartKeller/roberta2longformer | Convert pretrained RoBERTa models to various long-document transformer models | 18 | Experimental
53 | ilanaliouchouche/KANBert | Implementation of an Encoder only MoE usable as an Embedding Model,... | 17 | Experimental
54 | joshstephenson/MorphemeSegmentation | This is a survey of morpheme segmentation techniques including 2 baselines... | 16 | Experimental
55 | Vincentiv/BERT_Finetuning_from_scratch | Notebook on finetuning BERT | 15 | Experimental
56 | sappho192/ffxiv-ja-ko-translator | Japanese→Korean translator model specialized in Final Fantasy XIV based on... | 14 | Experimental
57 | Sean652039/Token-Masking | Token Masking Regularization | 14 | Experimental
58 | tejasvaidhyadev/ALBERT.jl | ALBERT (A Lite BERT for Self-Supervised Learning of Language Representations)... | 13 | Experimental
59 | SumitM0432/XLM-RoBERTa-for-Textual-Entailment | A multilingual model, XLM-RoBERTa, for the textual entailment of sequence... | 13 | Experimental
60 | DiFronzo/Multilingual-Models | mBERT and XLM-R for encoding of Scandinavian languages | 12 | Experimental
61 | teticio/inBERTolate | Hit your word count by using BERT to pad out your essays! | 12 | Experimental
62 | mhmdsabry/BERT_with_Residual_vs_Highway | Comparing residual stream and highway stream in transformers (BERT). | 12 | Experimental
63 | viktor-shcherb/vive_la_ner | The default way to fine-tune BERT is wrong. Here is why | 12 | Experimental
64 | mdmmn378/spell-magic | Transformer-Based Seq2Seq Model for Bangla Spell Correction | 11 | Experimental
65 | UnkindGoose/MultiTask-NLP-model | Multitask model for NER and document-level classification. Project contains... | 11 | Experimental
66 | davydantoniuk/grammarfix-bot | Fine-tuned a Hugging Face transformer model for grammar correction. | 10 | Experimental
67 | gaolichen/simplebert | A simple implementation of transformer models with tensorflow/keras. | 10 | Experimental
68 | cbstanley/dp-bert | Differential privacy with BERT model | 10 | Experimental
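
The tier labels in the listing appear to follow score bands. The sketch below is one way to express that mapping in Python. The only cutoff this page states is that scores above 50 are Established. The Emerging cutoff of 30 is an assumption inferred from the rows above (the lowest Emerging score is 30, the highest Experimental score is 29), and any tiers that do not appear in this listing are not covered. The function name `tier_for` and its parameters are illustrative.

```python
def tier_for(score, established_above=50, emerging_from=30):
    """Map a 0-100 quality score to a tier label.

    established_above: stated on this page ("above 50").
    emerging_from: ASSUMPTION inferred from the listed rows, not a documented rule.
    """
    if score > established_above:
        return "Established"
    if score >= emerging_from:
        return "Emerging"
    return "Experimental"


# Spot checks against rows from the listing.
for score in (67, 46, 30, 29, 10):
    print(score, tier_for(score))
```

Keeping the cutoffs as parameters makes it easy to adjust them if the actual scoring rules differ from what the listing suggests.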