ajdavidl/Portuguese-NLP
List of resources and tools developed with a focus on Portuguese.
Curated inventory of 100+ Portuguese-language NLP datasets and tools spanning sentiment analysis, question answering, essay scoring, speech recognition, and fake news detection, with resources hosted across HuggingFace, GitHub, and academic repositories. Includes domain-specific corpora (clinical NER, court decisions, e-commerce reviews) alongside foundational datasets like BrWaC and Carolina for pretraining, enabling end-to-end Portuguese NLP pipeline development from raw text to task-specific fine-tuning.
311 stars. No commits in the last 6 months.
Stars: 311
Forks: 32
Language: not specified
License: not specified
Category:
Last pushed: Jun 26, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/ajdavidl/Portuguese-NLP"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
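The same request can be made from a script. The sketch below is a minimal illustration that uses only the Python standard library and the endpoint shown in the curl command above. The helper names `build_quality_url` and `fetch_quality` are hypothetical, and because the response format is not documented here, the body is returned as raw text (assumed to be UTF-8) rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is owner/repo.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/nlp"


def build_quality_url(owner: str, repo: str) -> str:
    """Build the quality endpoint URL for a repository (hypothetical helper)."""
    # Percent-encode each path segment so unusual characters cannot break the URL.
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body for a repository (hypothetical helper).

    The response schema is an unknown here, so no parsing is attempted.
    UTF-8 decoding is an assumption.
    """
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Example call (performs a network request and counts toward the daily limit):
# print(fetch_quality("ajdavidl", "Portuguese-NLP"))
```

With the keyless limit of 100 requests/day, a script that polls many repositories may want to cache responses locally rather than refetch them.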
Higher-rated alternatives
thalesbertaglia/enelvo
A flexible normalizer for user-generated content
meedan/alegre
A text and media analysis service for Meedan Check, a collaborative media annotation platform
alan-barzilay/NLPortugues
NLPortuguês - Learn NLP in Portuguese! This repository contains the materials and exercises of the...
EticaAI/linguistic-datasets-portuguese
Linguistic Datasets for Portuguese: List of linguistic datasets for the language...
turing-usp/fernando-pessoa
Classifier of Fernando Pessoa's poems according to his heteronyms