clips/pattern
Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.
Bundles multilingual NLP components (Brill taggers for English, Dutch, German, Spanish, French, Italian) alongside machine learning classifiers (KNN, SVM via LIBSVM/LIBLINEAR), and integrates with public web APIs (Google, Twitter, Wikipedia) for direct data acquisition. Implements a vector-space pipeline that chains HTML parsing, POS tagging, and feature extraction for end-to-end text classification workflows. Graph analysis and network visualization are handled by the bundled pattern.graph module rather than by NetworkX.
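The pipeline described above can be illustrated with a minimal, standard-library-only sketch. It does not use Pattern's own API: the function names (`strip_html`, `features`, `knn_classify`) and the toy training data are hypothetical, the POS tagging step is omitted for brevity, and a plain bag-of-words count with cosine similarity stands in for whatever weighting the library applies.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML fragment and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(html):
    """Step 1: reduce an HTML fragment to plain text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def features(html):
    """Steps 2-3: tokenize the text and count words (a bag-of-words vector)."""
    tokens = re.findall(r"[a-z']+", strip_html(html).lower())
    return Counter(tokens)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_classify(html, training, k=3):
    """Step 4: majority vote among the k most similar training documents."""
    vector = features(html)
    ranked = sorted(training, key=lambda pair: cosine(vector, pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus: (feature vector, label) pairs.
TRAINING = [
    (features("<p>great film, loved the acting</p>"), "positive"),
    (features("<p>loved it, a great story</p>"), "positive"),
    (features("<p>boring plot and terrible acting</p>"), "negative"),
    (features("<p>terrible, boring, a waste of time</p>"), "negative"),
]

label = knn_classify("<div>what a <b>great</b> story, loved it</div>", TRAINING, k=3)
```

In this sketch the query shares most of its words with the two positive examples, so the vote among the three nearest neighbours goes to that class. A real workflow would swap each step for the corresponding library component.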
8,856 stars. No commits in the last 6 months.
Stars: 8,856
Forks: 1,570
Language: Python
License: BSD-3-Clause
Category:
Last pushed: Jun 10, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/clips/pattern"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
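The same request can be made from Python. The sketch below uses only the standard library and rests on two assumptions: that the URL path segments after `quality/` are category, owner, and repository (inferred from the curl example), and that the response body is JSON. The helper names `build_quality_url` and `fetch_quality` are hypothetical. How an API key is supplied is not stated here, so keyed access is not shown.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Assemble the endpoint URL.

    The meaning of the three path segments is an assumed reading of the
    curl example, not documented behaviour.
    """
    parts = (quote(part, safe="") for part in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10):
    """Send a GET request and decode the body, assuming it is JSON."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "clips", "pattern")` would request the same URL as the curl command. With the keyless limit of 100 requests/day, caching the result locally is a sensible precaution for repeated runs.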
Related tools
maximtrp/bitermplus
Biterm Topic Model (BTM): modeling topics in short texts
stephenhky/PyShortTextCategorization
Various Algorithms for Short Text Mining
Hassaan-Elahi/Writing-Styles-Classification-Using-Stylometric-Analysis
✍️ An intelligent system that takes a document and classifies different writing styles within...
eimg/burmese-text-classifier
A neural network based text classification system for Burmese
cohere-ai/sandbox-topically
Topic modeling helpers using managed language models from Cohere. Name text clusters using large...