Indonesian NLP Resources (NLP Tools)
Curated collections, datasets, and resource lists specifically for Indonesian/Malay language NLP. Includes benchmark datasets, resource compilations, and toolkit libraries for Bahasa Indonesia. Does NOT include general NLP courses, application-specific projects (like sentiment analysis tools), or non-Indonesian language resources.
There are 25 Indonesian NLP resource tools tracked. One scores above 50 (Established tier). The highest-rated is malaysia-ai/malaya at 64/100 with 521 stars. One of the top 10 is actively maintained.
Get all 25 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=indonesian-nlp-resources&limit=20"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
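
The same request can be made from a script. The following is a minimal sketch in Python using only the standard library. The helper names `build_url` and `fetch_projects` are hypothetical, and the response is assumed to be a JSON body whose schema is not described here. The curl command above passes `limit=20` while the listing has 25 entries, so the default of 25 used below is an assumption that the endpoint honors a larger limit.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=25):
    """Build the query URL with the same parameters as the curl command.

    Only `limit` is varied; whether values above 20 are accepted
    is an assumption, not something the listing confirms.
    """
    params = {
        "domain": "nlp",
        "subcategory": "indonesian-nlp-resources",
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit=25, timeout=10):
    """Fetch the listing and decode it as JSON.

    The structure of the decoded object is not documented here,
    so the caller should inspect it before relying on any field.
    """
    with urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)
```

Each call to `fetch_projects()` performs one HTTP request, so it counts toward the daily quota mentioned above.
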
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | malaysia-ai/malaya | Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/ | 64 | Established |
| 2 | louisowen6/NLP_bahasa_resources | A Curated List of Dataset and Usable Library Resources for NLP in Bahasa Indonesia | | Emerging |
| 3 | IndoNLP/indonlu | The first-ever vast natural language processing benchmark for Indonesian... | | Emerging |
| 4 | kirralabs/indonesian-NLP-resources | Data resources for Indonesian NLP | | Emerging |
| 5 | wongnai/wongnai-corpus | Collection of Wongnai's datasets | | Emerging |
| 6 | rizalespe/Dataset-Sentimen-Analisis-Bahasa-Indonesia | This repository is a collection of datasets related to sentiment analysis... | | Emerging |
| 7 | kmkurn/id-pos-tagging | Indonesian part-of-speech (POS) tagging | | Emerging |
| 8 | kmkurn/id-nlp-resource | A list of Indonesian NLP resources. | | Emerging |
| 9 | IndoNLP/nusa-catalogue | Dataset Catalogue Homepage for Indonesian Languages | | Emerging |
| 10 | IndoNLP/nusax | High-quality parallel resource on sentiment analysis for 10 low-resource... | | Emerging |
| 11 | ariya/tebakmasa | Infer the date and time from the general description in Bahasa Indonesia | | Emerging |
| 12 | yohanesgultom/nlp-experiments | Indonesian NLP experiments | | Experimental |
| 13 | feryandi/Dataset-Artikel | This repository contains a collection of raw data in the form of articles from various... | | Experimental |
| 14 | Wikidepia/indonesian_datasets | NLP Datasets for Indonesian | | Experimental |
| 15 | Hyuto/indo-nlp | A simple Python library with no additional dependencies that aims to... | | Experimental |
| 16 | ailabtelkom/id-NLP-resources | A collection of resources for natural language processing in Bahasa Indonesia. All... | | Experimental |
| 17 | LazarusNLP/indonesian-sentence-embeddings | Embedding Representation for Indonesian Sentences! | | Experimental |
| 18 | datascienceid/nlp-resources | A curated list of natural language processing courses, video lectures,... | | Experimental |
| 19 | danieldanuega/spacyndo | Dependency Parser and NER model for Bahasa Indonesia Spacy 2.1 | | Experimental |
| 20 | rrayhka/indonesian-ner-spacy | Fine-tuning SpaCy for Indonesian Named Entity Recognition (NER) with custom dataset. | | Experimental |
| 21 | irfandythalib/python-indonesia-stopwords-remover | This code is used to remove stopwords using Tala stopwords library for... | | Experimental |
| 22 | nandanovenia/resource-nlp-indonesia | Natural Language Processing Resource for Bahasa Indonesia | | Experimental |
| 23 | matbahasa/MALINDO_BLiMP | MALINDO BLiMP (Malay/Indonesian Benchmark of Linguistic Minimal Pairs) | | Experimental |
| 24 | Cortana-Coders/NutriSense | NutriSense: A Nutrition Measurement Platform with Natural Language Processing | | Experimental |
| 25 | HantuGur/NUSANTAARA-LEARN-LANGUAGE | 🌿 NusaLingua is an educational web platform for Indonesia's regional languages, based on... | | Experimental |
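
As a quick consistency check on the listing above, the tier labels can be tallied and compared with the stated total of 25 tracked projects. This is a small sketch in Python. The rank ranges are transcribed by hand from the listing, and the names `TIER_RANGES` and `tier_counts` are illustrative only.

```python
from collections import Counter

# (first_rank, last_rank, tier) ranges transcribed from the listing above.
TIER_RANGES = [
    (1, 1, "Established"),
    (2, 11, "Emerging"),
    (12, 25, "Experimental"),
]


def tier_counts(ranges):
    """Count how many ranked entries fall into each tier."""
    counts = Counter()
    for first, last, tier in ranges:
        counts[tier] += last - first + 1  # ranges are inclusive
    return counts


counts = tier_counts(TIER_RANGES)
total = sum(counts.values())
```

With the ranges as transcribed, this gives 1 Established, 10 Emerging, and 14 Experimental entries, which sums to the 25 projects stated in the summary.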