seanpm2001/AI2001_Category-Linguistics-SC-English
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:English category for AI2001, containing English language linguistic datasets
Part of the broader AI2001 multilingual dataset initiative, this collection aggregates English linguistic resources structured for machine learning training pipelines. The project organizes language datasets by category and subcategory within a modular framework, enabling systematic curation of NLP training data. The collection is intended to integrate with AI2001's unified dataset infrastructure, supporting cross-linguistic research and comparative language model development.
No commits in the last 6 months.
Stars: 12
Forks: 1
Language: R
License: GPL-3.0
Category:
Last pushed: Mar 18, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/seanpm2001/AI2001_Category-Linguistics-SC-English"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
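The same request can be made from a script. The sketch below mirrors the curl call above using only the Python standard library. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not stated on this page, so treat the parsing step as a guess to verify against a real response.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters stay valid.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. The page does not document
    the response format, so check this before relying on it.
    """
    request = urllib.request.Request(
        build_quality_url(owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Prints the URL only; call fetch_quality(...) to make a real request.
    print(build_quality_url("seanpm2001", "AI2001_Category-Linguistics-SC-English"))
```

Keeping URL construction separate from the network call makes it easy to stay under the 100 requests/day keyless limit while testing, since the URL logic can be checked without sending anything.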
Higher-rated alternatives
dsfsi/za-mafoko
DSFSI South African Terminology Lists and Lexicon Project
seanpm2001/AI2001_Category-Linguistics-SC-Igbo
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Igbo category for AI2001, containing Igbo language...
seanpm2001/AI2001_Category-Linguistics-SC-Manchu
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Manchu category for AI2001, containing Manchu language...
seanpm2001/AI2001_Category-Linguistics-SC-Tamil
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Tamil category for AI2001, containing Tamil language...
seanpm2001/AI2001_Category-Linguistics-SC-Hungarian
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️🔠️🔢️ The linguistic:Hungarian category for AI2001, containing Hungarian...