grammarly/ua-gec
UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language
UA-GEC provides two corpus variants (GEC+Fluency and GEC-only) with fine-grained error annotations covering 15+ grammatical and fluency error categories, alongside source/target plain-text pairs totaling 500K+ tokens across 1,872 documents. The accompanying `ua_gec` Python library offers document iteration, metadata filtering by author demographics and submission type, and `AnnotatedText` utilities for parsing annotations programmatically and selectively removing errors. The corpus includes multi-annotator test sets and tracks native vs. non-native speaker texts across Ukrainian regions, supporting both model training and detailed linguistic error analysis.
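To make the idea of annotation parsing and selective error removal concrete, here is a minimal standalone sketch. It does not call the `ua_gec` library. It assumes an inline markup of the form `{source=>target:::error_type=Tag}`, and the names `Edit`, `list_edits`, and `render`, along with the sample sentence and its tags, are hypothetical and chosen only for illustration.

```python
import re
from typing import List, NamedTuple, Optional, Set

# Assumed inline markup (not verified against the corpus files):
#   {source=>target:::error_type=Tag}
ANNOTATION = re.compile(
    r"\{(?P<src>[^{}]*?)=>(?P<tgt>[^{}]*?):::error_type=(?P<tag>[^{}]*)\}"
)


class Edit(NamedTuple):
    """One annotated correction (hypothetical helper type)."""
    source: str
    target: str
    error_type: str


def list_edits(annotated: str) -> List[Edit]:
    """Return every annotation found in the text, in order of appearance."""
    return [Edit(m["src"], m["tgt"], m["tag"]) for m in ANNOTATION.finditer(annotated)]


def render(annotated: str, apply_types: Optional[Set[str]] = None) -> str:
    """Strip the markup and return plain text.

    An annotation is replaced by its target when its error type is in
    `apply_types` (or when `apply_types` is None, meaning "apply all"),
    and by its original source otherwise.
    """
    def pick(match: "re.Match[str]") -> str:
        if apply_types is None or match["tag"] in apply_types:
            return match["tgt"]
        return match["src"]

    return ANNOTATION.sub(pick, annotated)


# Hypothetical sample; the tags are placeholders, not the corpus tag set.
sample = (
    "I {likes=>like:::error_type=Grammar} "
    "turtles{=>,:::error_type=Punctuation} really."
)

source_text = render(sample, apply_types=set())        # no corrections applied
target_text = render(sample)                           # all corrections applied
grammar_only = render(sample, apply_types={"Grammar"}) # fix grammar, keep the rest
```

Rendering with an empty set recovers the uncorrected source, rendering with no filter gives the fully corrected target, and passing a subset of tags yields a partially corrected text, which is one way to build category-specific training pairs.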
269 stars. No commits in the last 6 months.
Stars
269
Forks
23
Language
Macaulay2
License
CC-BY-4.0
Last pushed
Feb 11, 2024
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/grammarly/ua-gec"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
kanyun-inc/fairseq-gec
Source code for paper: Improving Grammatical Error Correction via Pre-Training a Copy-Augmented...
awasthiabhijeet/PIE
Fast + Non-Autoregressive Grammatical Error Correction using BERT. Code and Pre-trained models...
kakaobrain/helo-word
Team Kakao&Brain's Grammatical Error Correction System for the ACL 2019 BEA Shared Task
CAMeL-Lab/text-editing
Code, models, and data for "Enhancing Text Editing for Grammatical Error Correction: Arabic as a...
CAMeL-Lab/arabic-gec
Code, models, and data for "Advancements in Arabic Grammatical Error Detection and Correction:...