bahaeddinmselmi/tunisian-arabic-ai-dataset
The largest open-source dataset for Tunisian Arabic (Derja) NLP, featuring social media text, transcripts, and e-commerce data for LLM training and fine-tuning.
Stars: 1
Forks: —
Language: —
License: —
Category:
Last pushed: Jan 28, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/bahaeddinmselmi/tunisian-arabic-ai-dataset"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
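For scripted use, the same request can be made from Python. The following is a minimal sketch using only the standard library. The URL pattern is taken from the curl command above. The helper names `build_url` and `fetch_quality` are hypothetical, and the assumption that the endpoint returns JSON is not confirmed by this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL for one repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual names stay valid in a URL.
    parts = [quote(part, safe="") for part in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the record for one repository.

    Assumption: the response body is JSON. If the service returns another
    format, replace json.loads with a suitable parser.
    """
    request = urllib.request.Request(
        build_url(category, owner, repo),
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "bahaeddinmselmi", "tunisian-arabic-ai-dataset")` issues the same GET request as the curl command. With the keyless limit of 100 requests/day, caching the result locally is a reasonable precaution when polling more than one repository.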
Higher-rated alternatives
CAMeL-Lab/camel_tools
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York...
PetrKorab/Arabica
Python package for text mining of time-series data
markuskiller/textblob-de
German language support for TextBlob.
01walid/awesome-arabic
A curated list of awesome projects and dev/design resources for supporting Arabic computational needs.
MagedSaeed/farasapy
A Python implementation of Farasa toolkit