Text-Mining/Persian-NER
A large annotated corpus for Persian named-entity recognition
The corpus contains about 25 million tokens in roughly 1 million sentences extracted from Persian Wikipedia, annotated with five entity classes: Person, Organization, Location, Event, and Time/Date. Annotation is crowdsourced through a web interface (text-mining.ir), where more than 1,000 contributors have refined the tags, so the dataset keeps improving. The data is provided in a standardized IOB-tagged format suitable for training sequence labeling models and NER systems for Persian.
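To make the IOB format concrete, here is a minimal Python sketch that reads IOB-tagged sentences and groups the tags into entity spans. The file layout (one token and its tag per line, separated by whitespace, with a blank line between sentences) and the label names `PER` and `LOC` are assumptions for illustration; check the repository for the actual layout and tag set. The helper names `read_sentences` and `iob_to_spans` are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Span = Tuple[str, int, int]  # (label, start index, end index exclusive)


def read_sentences(lines: Iterable[str]) -> Iterator[Tuple[List[str], List[str]]]:
    """Yield (tokens, tags) per sentence.

    Assumed layout: "token<whitespace>tag" per line, blank line between sentences.
    """
    tokens: List[str] = []
    tags: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            if tokens:
                yield tokens, tags
                tokens, tags = [], []
            continue
        # The tag is the last whitespace-separated field on the line.
        token, tag = line.rsplit(None, 1)
        tokens.append(token)
        tags.append(tag)
    if tokens:
        yield tokens, tags


def iob_to_spans(tags: List[str]) -> List[Span]:
    """Group IOB tags into entity spans.

    A stray I- tag (no matching open span) is treated leniently as a span start.
    """
    spans: List[Span] = []
    start = None
    label = None
    for i, tag in enumerate(tags):
        prefix, _, tag_label = tag.partition("-")
        continues_span = prefix == "I" and tag_label == label
        if start is not None and not continues_span:
            spans.append((label, start, i))
            start, label = None, None
        if prefix == "B" or (prefix == "I" and start is None):
            start, label = i, tag_label
    if start is not None:
        spans.append((label, start, len(tags)))
    return spans


# Hypothetical sample in the assumed layout (transliterated tokens).
sample = (
    "Ali\tB-PER\nDaei\tI-PER\nwas\tO\nborn\tO\nin\tO\nArdabil\tB-LOC\n"
    "\n"
    "Tehran\tB-LOC\n"
)

sentences = list(read_sentences(sample.splitlines()))
for tokens, tags in sentences:
    for label, start, end in iob_to_spans(tags):
        print(label, " ".join(tokens[start:end]))
```

Span-level grouping like this is also what entity-level evaluation tools operate on, which is why it is a common first step after loading the data.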
237 stars. No commits in the last 6 months.
Stars: 237
Forks: 25
Language: —
License: MIT
Category:
Last pushed: Jun 29, 2021
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/Text-Mining/Persian-NER"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
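The same request can be made from Python using only the standard library. This is a sketch under assumptions: the URL pattern is taken from the curl example above, the `nlp` path segment is assumed to be a category slug, and the response is assumed to be JSON (its fields are not documented here, so the code does not rely on any). The function names `build_quality_url` and `fetch_quality` are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner: str, repo: str, category: str = "nlp") -> str:
    """Build the endpoint URL following the pattern of the curl example."""
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(owner: str, repo: str, category: str = "nlp", timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(owner, repo, category)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network request, so it is left commented out):
# data = fetch_quality("Text-Mining", "Persian-NER")
# print(data)
```

With the keyless limit of 100 requests per day, caching the decoded response locally is a reasonable choice if the data is needed repeatedly.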
Higher-rated alternatives
MantisAI/nervaluate
Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13
dice-group/gerbil
GERBIL - General Entity annotatoR Benchmark
syuoni/eznlp
Easy Natural Language Processing
OpenJarbas/simple_NER
Simple rule-based named entity recognition
bltlab/seqscore
SeqScore: Scoring for named entity recognition and other sequence labeling tasks