SKTBrain/KoBERT
Korean BERT pre-trained cased (KoBERT)
Pretrained on 5M sentences from Korean Wikipedia using SentencePiece tokenization with a compact 8,002-token vocabulary, yielding a 92M-parameter model (vs. 110M for multilingual BERT). Ships ready-to-use model-loading APIs for PyTorch, ONNX, and MXNet-Gluon. Outperforms Google's multilingual BERT baseline on Korean sentiment analysis and named entity recognition.
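For orientation, here is a minimal PyTorch loading sketch modeled on the helpers the repo's README documents (`get_tokenizer`, `get_pytorch_kobert_model`); the forward-call and tuple-return convention follows the README example and may vary with the installed transformers release:

```python
import torch
from gluonnlp.data import SentencepieceTokenizer
from kobert.utils import get_tokenizer
from kobert.pytorch_kobert import get_pytorch_kobert_model

# SentencePiece tokenizer over the compact 8,002-token vocabulary.
sp = SentencepieceTokenizer(get_tokenizer())
tokens = sp("한국어 모델을 공유합니다.")  # "Sharing a Korean model."

# Pretrained 92M-parameter BERT encoder plus its vocabulary object.
model, vocab = get_pytorch_kobert_model()
input_ids = torch.LongTensor([vocab[tokens]])   # token ids, batch of 1
attention_mask = torch.ones_like(input_ids)     # no padding in this batch
token_type_ids = torch.zeros_like(input_ids)    # single segment

# Per-token hidden states and the pooled [CLS] representation.
sequence_output, pooled_output = model(input_ids, attention_mask, token_type_ids)
print(sequence_output.shape)  # (1, num_tokens, 768)
```

The README documents analogous helpers for the ONNX and MXNet-Gluon checkpoints.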
1,407 stars. No commits in the last 6 months.
Stars
1,407
Forks
380
Language
Python
License
Apache-2.0
Category
transformers
Last pushed
Jun 14, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/transformers/SKTBrain/KoBERT"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
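To consume the endpoint from Python instead of curl, a sketch like the following works on the public tier; the `X-API-Key` header shown for the keyed tier is a hypothetical placeholder, not documented here:

```python
import requests

URL = "https://pt-edge.onrender.com/api/v1/quality/transformers/SKTBrain/KoBERT"

# Public tier: 100 requests/day, no key required.
resp = requests.get(URL, timeout=10)
resp.raise_for_status()
print(resp.json())  # repo quality metrics as JSON

# Keyed tier (1,000 requests/day). "X-API-Key" is a guessed header
# name for illustration; check the API docs for the real auth scheme.
# resp = requests.get(URL, headers={"X-API-Key": "YOUR_KEY"}, timeout=10)
```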
Related models
monologg/KoELECTRA
Pretrained ELECTRA Model for Korean
VinAIResearch/PhoBERT
PhoBERT: Pre-trained language models for Vietnamese (EMNLP-2020 Findings)
monologg/KoBERT-Transformers
KoBERT on 🤗 Huggingface Transformers 🤗 (with bug fixes)
KB-AI-Research/KB-ALBERT
A Korean ALBERT model specialized for the economic/financial domain, provided by KB Kookmin Bank
ymcui/MacBERT
Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT)