yassersouri/classify-text
"20 Newsgroups" text classification with python
Archived
Implements comparative experiments across multiple feature representations (Bag of Words, TF, TF-IDF) and classifiers (Naive Bayes, SVM, k-NN) using scikit-learn, evaluated with train-test splits and stratified k-fold cross-validation. Handles preprocessing quirks such as UTF-8 incompatibility in the source documents, and supports both binary classification (likes vs. dislikes) and the full 20-class multiclass scenario. In the reported results, TF-IDF with a linear SVM reaches ~97% accuracy on the binary task and ~89% on the full 20-class task.
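To make the three feature representations concrete, here is a minimal pure-Python sketch of what each one computes. It is not the repository's code (which uses scikit-learn). The toy corpus, the helper names, and the unsmoothed `log(N / df)` IDF are assumptions chosen for illustration. scikit-learn's `TfidfVectorizer` uses a smoothed IDF and L2 normalization by default, so its numbers will differ.

```python
import math
from collections import Counter

def tokenize(text):
    # Naive whitespace tokenizer (assumption; real pipelines do more cleanup).
    return text.lower().split()

def bag_of_words(docs):
    # Bag of Words: raw term counts per document.
    return [Counter(tokenize(d)) for d in docs]

def term_frequency(counts):
    # TF: each count divided by the total number of tokens in the document.
    result = []
    for c in counts:
        total = sum(c.values())
        result.append({term: n / total for term, n in c.items()})
    return result

def tf_idf(counts):
    # TF-IDF: TF scaled by log(N / df), where df is the number of
    # documents containing the term (unsmoothed textbook variant).
    n_docs = len(counts)
    df = Counter()
    for c in counts:
        df.update(c.keys())
    idf = {term: math.log(n_docs / d) for term, d in df.items()}
    return [{term: tf * idf[term] for term, tf in doc.items()}
            for doc in term_frequency(counts)]

# Hypothetical toy corpus, not taken from 20 Newsgroups.
docs = [
    "the game was great",
    "the election was close",
    "the team played a great game",
]
bow = bag_of_words(docs)
tf = term_frequency(bow)
tfidf = tf_idf(bow)
```

A term that appears in every document ("the" here) gets a TF-IDF weight of zero, while a term unique to one document ("election") is weighted most heavily. That down-weighting of uninformative terms is the usual motivation for using TF-IDF rather than raw counts.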
147 stars. No commits in the last 6 months.
Stars: 147
Forks: 63
Language: Python
License: -
Category:
Last pushed: Nov 30, 2016
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/yassersouri/classify-text"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
angelosalatino/cso-classifier
Python library that classifies content from scientific papers with the topics of the Computer...
giuseppebonaccorso/Reuters-21578-Classification
Text classification with Reuters-21578 datasets using Gensim Word2Vec and Keras LSTM
tblock/10kGNAD
Ten Thousand German News Articles Dataset for Topic Classification
NirantK/Hinglish
Hinglish Text Classification
newsgac/platform
Platform for machine learning experiments developed in the project NEWSGAC