SkyworkAI/Skywork
Skywork series models are pre-trained on 3.2TB of high-quality multilingual (mainly Chinese and English) and code data. We have open-sourced the model, training data, evaluation data, evaluation methods, etc.
1,491 stars. No commits in the last 6 months.
Stars: 1,491
Forks: 145
Language: Python
License: not specified
Category:
Last pushed: Mar 07, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/SkyworkAI/Skywork"
Open to everyone: 100 requests/day with no key. A free key raises the limit to 1,000/day.
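The same request can be made from Python. The following is a minimal sketch using only the standard library. It assumes the endpoint returns JSON, which the page does not confirm. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and the URL pattern (`/quality/<category>/<owner>/<repo>`) is inferred from the single curl example above. How an API key is passed is not documented here, so the sketch covers only the keyless request.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; the pattern after it is an assumption.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for a repo given as 'owner/name' (hypothetical helper)."""
    owner, name = repo.split("/", 1)
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(name)}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record and decode it, assuming the body is JSON."""
    url = build_quality_url(category, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "SkyworkAI/Skywork")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, it may be worth caching the result locally rather than calling it in a loop.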
Higher-rated alternatives
mikahama/uralicNLP: An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and so on. Also...
shamspias/lexsublm-lite: A laptop-friendly toolkit for context-aware single-word paraphrasing and lexical-substitution...
jiangnanboy/llm_corpus_quality: Chinese corpus cleaning and quality evaluation for large-model pre-training.