hrwhisper/SpamMessage
Chinese spam SMS detection (hand-written classifiers)
Implements multiple classification algorithms (Perceptron, Logistic Regression, Naive Bayes, SVM) with both custom implementations and scikit-learn wrappers, using jieba for Chinese tokenization and bag-of-words feature representation. The pipeline includes separate training (cross-validation in test.py) and inference phases, with trained models serialized for reuse. Accepts unlabeled SMS files via command-line interface and outputs binary spam/non-spam predictions.
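The pipeline described above (tokenize, build bag-of-words features, train a hand-written classifier, serialize it, then reuse it for inference) can be sketched roughly as follows. This is a minimal illustration and not the repository's code: the per-character `tokenize` is a stand-in for jieba, the four training messages are invented toy data, JSON is an assumed serialization format, and all function names are hypothetical.

```python
import json
from collections import Counter


def tokenize(text):
    # Stand-in for jieba word segmentation (assumption): one token per character.
    return [ch for ch in text if not ch.isspace()]


def build_vocab(texts):
    # Map each distinct token to a feature index.
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab


def vectorize(text, vocab):
    # Sparse bag-of-words: {feature index: count}; unseen tokens are dropped.
    counts = Counter(tokenize(text))
    return {vocab[t]: c for t, c in counts.items() if t in vocab}


def score(model, x):
    return model["b"] + sum(model["w"][i] * v for i, v in x.items())


def predict(model, x):
    # 1 = spam, 0 = not spam.
    return 1 if score(model, x) > 0 else 0


def train_perceptron(xs, ys, n_features, epochs=20):
    model = {"w": [0.0] * n_features, "b": 0.0}
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(xs, ys):
            err = y - predict(model, x)  # -1, 0, or +1
            if err:
                mistakes += 1
                for i, v in x.items():
                    model["w"][i] += err * v
                model["b"] += err
        if mistakes == 0:  # every training example classified correctly
            break
    return model


# Invented toy data: two spam-like and two ordinary messages.
train_texts = ["免费领取大奖", "点击链接领取优惠", "今晚一起吃饭吗", "明天开会别迟到"]
train_labels = [1, 1, 0, 0]

# Training phase.
vocab = build_vocab(train_texts)
xs = [vectorize(t, vocab) for t in train_texts]
model = train_perceptron(xs, train_labels, len(vocab))

# Serialize the vocabulary and weights together, then reload for inference.
saved = json.dumps({"vocab": vocab, "model": model}, ensure_ascii=False)
loaded = json.loads(saved)


def classify(text, bundle):
    return predict(bundle["model"], vectorize(text, bundle["vocab"]))
```

Storing the vocabulary alongside the weights matters because the feature indices only make sense relative to the vocabulary built at training time; the inference phase must reuse both.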
201 stars. No commits in the last 6 months.
Stars: 201
Forks: 61
Language: Python
License: —
Category:
Last pushed: Dec 08, 2016
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/hrwhisper/SpamMessage"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
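For scripted use, the same request can be made from Python's standard library. This is a hedged sketch: the `category/owner/repo` path layout is inferred from the single URL shown above, the helper names are hypothetical, and the response is returned as raw text because its schema is not documented here.

```python
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    # Assumed path layout, inferred from the one example URL on this page.
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    # Returns the response body as text; no response format is assumed.
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (performs a network request, so it is left commented out):
# print(fetch_quality("nlp", "hrwhisper", "SpamMessage"))
```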
Higher-rated alternatives
MAIF/melusine
📧 Melusine: Use python to automatize your email processing workflow
97k/spam-ham-web-app
A web app that classifies text as a spam or ham. I am using my own ML algorithm in the backend,...
30lm32/ml-spam-sms-classification
Naive Bayesian, SVM, Random Forest Classifier, and Deep Learning (LSTM) on top of Keras and...
stdlib-js/datasets-spam-assassin
Spam Assassin public mail corpus.
Zimbra/zimbra-ml
Zimbra Machine Learning GraphQL Server