Korean Language Models

Pretrained transformer models specifically designed for Korean language processing, including BERT, ELECTRA, and specialized variants. Does NOT include general multilingual models, non-Korean language models, or downstream task-specific applications (unless they primarily showcase the Korean model architecture itself).

33 Korean language models are tracked. The highest-rated is SKTBrain/KoBERT at 46/100 with 1,407 stars.

Get all 33 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=korean-language-models&limit=20"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
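The same request can be made from a script. The following is a minimal sketch using only the Python standard library; it reuses the endpoint and query parameters from the curl command above. The helper names `build_url` and `fetch_quality` are illustrative, and the shape of the JSON response is not documented here, so the sketch prints whatever comes back rather than assuming field names. Note that the command above passes `limit=20`; whether a larger value returns all 33 projects in one call is an assumption to verify against the API.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="transformers", subcategory="korean-language-models", limit=20):
    """Return the request URL with the same query parameters as the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_quality(url, timeout=30):
    """GET the URL and decode the JSON body (no response schema is assumed)."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    data = fetch_quality(build_url())
    # Print a preview only; the structure of `data` is whatever the API returns.
    print(json.dumps(data, indent=2, ensure_ascii=False)[:2000])
```

Without a key, the limit of 100 requests/day stated above applies, so it may be worth saving the response to disk instead of re-fetching it on every run.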

# Model Score Tier
1 SKTBrain/KoBERT

Korean BERT pre-trained cased (KoBERT)

46
Emerging
2 monologg/KoELECTRA

Pretrained ELECTRA Model for Korean

44
Emerging
3 monologg/KoBERT-Transformers

KoBERT on 🤗 Huggingface Transformers 🤗 (with Bug Fixed)

40
Emerging
4 VinAIResearch/PhoBERT

PhoBERT: Pre-trained language models for Vietnamese (EMNLP-2020 Findings)

40
Emerging
5 KB-AI-Research/KB-ALBERT

Korean ALBERT model specialized for the economics/finance domain, provided by KB Kookmin Bank

39
Emerging
6 monologg/KoBERT-KorQuAD

Korean MRC (KorQuAD) with KoBERT

37
Emerging
7 ymcui/MacBERT

Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT)

37
Emerging
8 monologg/DistilKoBERT

Distillation of KoBERT from SKTBrain (Lightweight KoBERT)

35
Emerging
9 Beomi/KcELECTRA

🤗 Korean Comments ELECTRA: ELECTRA model trained on Korean comments

34
Emerging
10 thevasudevgupta/bigbird

Google's BigBird (Jax/Flax & PyTorch) @ 🤗Transformers

33
Emerging
11 monologg/korean-hate-speech-koelectra

Bias, Hate classification with KoELECTRA 👿

33
Emerging
12 monologg/KoBigBird

🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)

33
Emerging
13 monologg/KoCharELECTRA

Character-level Korean ELECTRA Model (syllable-level Korean ELECTRA)

33
Emerging
14 toriving/text-classification-transformers

Easy text classification for everyone: BERT-based models via Huggingface...

30
Emerging
15 monologg/KoELECTRA-Pipeline

Transformers Pipeline with KoELECTRA

30
Emerging
16 monologg/HanBert-Transformers

HanBert on 🤗 Huggingface Transformers 🤗

29
Experimental
17 bayartsogt-ya/albert-mongolian

ALBERT trained on Mongolian text corpus

27
Experimental
18 sajjjadayobi/ParsBigBird

Persian Bert For Long-Range Sequences

26
Experimental
19 Anshler/vietnamese-poem-classifier

Classify genre and score Vietnamese poems 📜🔍

26
Experimental
20 SciCrunch/bio_electra

Bio-Electra - Small and efficient discriminatively pre-trained language...

24
Experimental
21 oneonlee/KoAirBERT

🤗 Korean BERT model specialized for the aviation safety domain ✈️

23
Experimental
22 svn05/vietnamese-sentiment-phobert

a fine tuned PhoBERT model to classify product reviews across a range of...

20
Experimental
23 Nikki-oo7/pos-tagger

Part-of-Speech Tagger implemented in PyTorch using BiLSTM and Transformer models.

19
Experimental
24 yejoon-lee/kr3

KR3: Korean Restaurant Review with Ratings / Experiments on...

18
Experimental
25 qanastek/French-Part-Of-Speech-Tagging

Repository for the source code of the HuggingFace Space named...

16
Experimental
26 codegram/calbert

Catalan ALBERT (A Lite BERT for self-supervised learning of language representations)

16
Experimental
27 Pirata-Codex/Tag-Persian-Entities-Using-Bert

Using the fa-bert model to tag persian entities in a sentence

15
Experimental
28 edoost/pert

Persian Ezafe Recognition Using Transformers and Its Role in Part-Of-Speech Tagging

13
Experimental
29 qanastek/ANTILLES

ANTILLES : An Open French Linguistically Enriched Part-of-Speech Corpus

13
Experimental
30 HRSadeghi/Joint_Comma_and_Kasreh_Recognizer

In this repository, we provide a joint neural model based on BERT and two...

12
Experimental
31 ilos-vigil/bigbird-small-indonesian

Lightweight Indonesian language model for long sequences.

12
Experimental
32 phanxuanphucnd/CoBERTa

CoBERTa: pre-trained language models for...

12
Experimental
33 amanaser/BabyLM-ELECTRA-Pre-training

BabyLM ELECTRA Pre-training on NVIDIA L40 GPU Cluster.

11
Experimental