allenai/scibert

A BERT model for scientific text.

Quality score: 48 / 100 (Emerging)

Pretrained on 1.14M full-text papers (3.1B tokens) from Semantic Scholar, with a domain-specific vocabulary (`scivocab`) built from that scientific corpus. Weights are available in TensorFlow and PyTorch formats and load through Hugging Face's `transformers` library; both the custom `scivocab` and the standard BERT vocabulary are supported, each in cased and uncased variants. The authors report state-of-the-art results on several scientific NLP tasks, including named entity recognition, relation extraction, citation intent classification, and dependency parsing, on biomedical and computer science benchmarks.
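As a rough illustration of loading the model through `transformers`, here is a minimal sketch. It assumes the `scivocab` checkpoints are published on the Hugging Face Hub under the IDs shown (verify them on the Hub before relying on them) and that `transformers` and `torch` are installed. The helper names `load_scibert` and `embed` are hypothetical and not part of the repository.

```python
# Assumed Hub IDs for the scivocab checkpoints; confirm them on the Hugging Face Hub.
SCIBERT_CHECKPOINTS = {
    "uncased": "allenai/scibert_scivocab_uncased",
    "cased": "allenai/scibert_scivocab_cased",
}


def load_scibert(variant="uncased"):
    """Download (or load from cache) the tokenizer and encoder for one variant."""
    # Imported lazily so this module can be inspected without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    name = SCIBERT_CHECKPOINTS[variant]
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModel.from_pretrained(name)
    return tokenizer, model


def embed(text, tokenizer, model):
    """Return the final-layer [CLS] vector for a single piece of text."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    # Position 0 of the sequence is the [CLS] token.
    return outputs.last_hidden_state[:, 0]


# Example usage (downloads the weights on first call):
# tokenizer, model = load_scibert("uncased")
# vector = embed("The protein binds to the receptor.", tokenizer, model)
```

The [CLS] vector is only one simple way to get a sentence-level representation; for the tasks listed above, the encoder would normally be fine-tuned with a task-specific head.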

1,677 stars. No commits in the last 6 months.

Flags: Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 22 / 25

Stars: 1,677
Forks: 233
Language: Python
License: Apache-2.0
Last pushed: Feb 22, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/allenai/scibert"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
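The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single URL above, the response is assumed to be JSON (its fields are not documented here), and the helper names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is an assumption based on one example."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record and parse it, assuming the body is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request and counts against the daily limit):
# data = fetch_quality("nlp", "allenai", "scibert")
# print(json.dumps(data, indent=2))
```

With the unauthenticated limit of 100 requests/day, it is worth caching the response locally instead of calling the endpoint repeatedly.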