Text Mining Fundamentals NLP Tools
Introductory courses, tutorials, and practical guides covering core text mining techniques, workflows, and applications. Includes repositories focused on teaching text processing, analysis methods, and statistical approaches to text data. Does NOT include domain-specific applications (sentiment analysis, fake news detection, etc.) or advanced specialized tools already categorized elsewhere.
This category tracks 65 text mining fundamentals tools. The highest-rated is dipanjanS/text-analytics-with-python, scoring 44/100 with 1,690 stars.
Get all 65 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=text-mining-fundamentals&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
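The same request can be made from Python. The following is a minimal sketch using only the standard library; the endpoint and query parameters are copied from the curl command above, while the helper names `build_url` and `fetch_projects` are hypothetical and the shape of the JSON response is not documented here, so the code returns whatever the server sends without assuming any field names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken verbatim from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="text-mining-fundamentals", limit=20):
    """Assemble the request URL with the same query parameters as the curl call."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Perform the GET request and decode the JSON body.

    The response structure is an unknown here, so the parsed object
    is returned as-is rather than indexed by assumed keys.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` would issue the keyless request, which counts against the 100 requests/day limit. The curl example uses `limit=20`; whether a larger value returns all 65 projects in one response is not stated on this page.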
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | dipanjanS/text-analytics-with-python: Learn how to process, classify, cluster, summarize, understand syntax,... | | Emerging |
| 2 | jonathandunn/text_analytics: Basic text analytics and natural language processing in Python | | Emerging |
| 3 | Clarifai/clarifai-pyspark: Interfaces for Unstructured data and ML pipelines with Databricks and Clarifai | | Emerging |
| 4 | IBM/watson-document-co-relation: Correlate text content across documents using Watson NLU, Python NLTK and... | | Emerging |
| 5 | itrummer/NaturalMiner: Mine data for patterns described in natural language | | Emerging |
| 6 | umer7/Applied-Text-Mining-in-Python: Repo for Applied Text Mining in Python (Coursera) by University of Michigan | | Emerging |
| 7 | EudaLabs/nlp: A repository for Natural Language Processing (NLP) projects, tools, and experiments. | | Emerging |
| 8 | fingeredman/teanaps: An open-source Python library for natural language processing and text analysis. | | Emerging |
| 9 | remrama/krank: Fetch curated dream reports. | | Emerging |
| 10 | mchesterkadwell/intro-to-text-mining-with-python: Cambridge Digital Humanities 'Introduction to Text-Mining with Python'... | | Emerging |
| 11 | algonell/ipo-miner: IPO Investment via Text Mining. | | Emerging |
| 12 | zaratsian/Spark: Apache Spark (Scala, PySpark, SparkR) Code, Tricks, and References | | Emerging |
| 13 | oroszgy/hungarian-text-mining-workshop: Materials for the Text Mining workshop held in the HuNLP meetup, June 2017 | | Emerging |
| 14 | blanchefort/text_mining: A set of notebooks that solve various tasks in the processing of natural... | | Emerging |
| 15 | mchesterkadwell/intro-to-text-mining-with-python-2020: Cambridge Digital Humanities Learning, Methods Workshop: "Introduction to... | | Emerging |
| 16 | malares/STeM-Scientifc-Paper-Mining-Tool: STeM is a text mining tool to help scientists and researchers evaluate new... | | Emerging |
| 17 | QData/TextAttack-WebDemo: TextAttack Web Demo | | Experimental |
| 18 | JohnSnowLabs/spark-nlp-conda: Build and publish Spark NLP to Anaconda Cloud | | Experimental |
| 19 | hhaoyan/awesome-textmining-materials-science: Collection of papers on text mining for materials science | | Experimental |
| 20 | fingeredman/text-mining-for-practice: Covers how to perform text analysis using Python libraries. | | Experimental |
| 21 | argilla-io/biome-text: Custom Natural Language Processing with big and small models 🌲🌱 | | Experimental |
| 22 | lorenzoscottb/DReAMy: DReAMy: a library for dream-reports annotation methods with python, NLP, and LLMs | | Experimental |
| 23 | arshren/MachineLearning: Machine Learning documents | | Experimental |
| 24 | mb010/Text2Tag: Code base for the analysis presented in Bowles et al. 2022: "Radio Galaxy... | | Experimental |
| 25 | DmitrySerg/open-data: Collecting and analysing open data stuff | | Experimental |
| 26 | buomsoo-kim/Introduction-to-text-mining-with-Python: Lectures in Urban Data Science Lab, Seoul | | Experimental |
| 27 | MrpYA45/github-text-mining-tfg: We're aiming to create a tool which lets us experiment with text mining and... | | Experimental |
| 28 | HimanshuMittal01/bagmodels: Various bag-of-words ML algorithms like BM25 | | Experimental |
| 29 | thatguy1104/NLP-Data-Mining-Engine: Our main project goals include trying to achieve a way for all researchers... | | Experimental |
| 30 | SAP-samples/github-pull-analyzer: The GitHub Pull Request Analyzer (with SAP AI Core) automates the task of... | | Experimental |
| 31 | aeleraqi/Text-Mining: Text mining techniques and workflows in Python | | Experimental |
| 32 | ycatsh/connor: Organize and classify files based on their content using NLP | | Experimental |
| 33 | prestondunton/marvel-dialogue-nlp: A machine learning project that will use Natural Language Processing (NLP)... | | Experimental |
| 34 | Vaibhavabhaysharma/Applied-Text-Mining-in-Python: This repository contains solutions of the course-... | | Experimental |
| 35 | SciCrunch/Antibody-Watch: Antibody Watch: Text Mining Antibody Specificity from the Literature | | Experimental |
| 36 | juliasilge/ibm-ai-day: Presentation for IBM Community Day AI | | Experimental |
| 37 | MahsaShk/ApacheSpark: Apache Spark machine learning project using pyspark | | Experimental |
| 38 | StabRise/ScaleDP-Tutorials: Tutorials for ScaleDP library. ScaleDP is an Open-Source Library for... | | Experimental |
| 39 | park1997/Industrial_safety_and_health_law-visualization: Visualization of Occupational Safety and Health Act regulations; builds a similarity network between laws through text mining | | Experimental |
| 40 | cyidhn/texto: 📚 The Python library for textometry. | | Experimental |
| 41 | analyticalmonk/pyspark_nlp_workshop: Instructions and code for the workshop "From Big Data to NLP Insights:... | | Experimental |
| 42 | sudheera96/pyspark-textprocessing: Project on word count using PySpark in the Databricks cloud environment. | | Experimental |
| 43 | Achint08/tech-diffusion: Patents data analysis on PySpark | | Experimental |
| 44 | AsadiAhmad/Edit-Distance-Spark: Calculating Edit Distance with PySpark | | Experimental |
| 45 | AsadiAhmad/Ngram-Spark-Wikipedia: Calculating Ngram with PySpark for Wikipedia text | | Experimental |
| 46 | AsadiAhmad/Word-Counter-Spark: Word counter with Spark | | Experimental |
| 47 | fredriko/draviz: A method for assessing the data readiness of NLP projects, as well as the... | | Experimental |
| 48 | fingeredman/text-mining-for-beginner: Covers everything from basic Python syntax to performing simple text analysis. | | Experimental |
| 49 | mucahidozcelik/NLP: Text Mining and Natural Language Processing | | Experimental |
| 50 | fingeredman/advanced-text-mining: Covers natural language processing and text analysis methodology using the TEANAPS library. | | Experimental |
| 51 | paulbricman/memnav: Expanding propositional memory through text mining. | | Experimental |
| 52 | MuzamilSaiq/toy-to-theory-bag-of-words: Pedagogical walkthrough of Bag of Words | | Experimental |
| 53 | thukg/AMinerOpen: An open source community that focuses on developing and publishing elegant... | | Experimental |
| 54 | frances-ai/frances-api: frances is an advanced cloud-based text mining digital platform that... | | Experimental |
| 55 | peetceenatoo/my-first-keyword-extractor: First steps into natural language processing | | Experimental |
| 56 | tkachuksergiy/aws-spark-nlp: Works related to recent project on the use of Apache Spark and AWS cloud for... | | Experimental |
| 57 | yashmanne/an_analysis_of_nothing: Exploring character occurrences and NLP with Seinfeld scripts. | | Experimental |
| 58 | ReAlex1902/Hawk: German documents analysis | | Experimental |
| 59 | manmeetkaurbaxi/Analyzing-ACL-and-EMNLP-papers: Analyzing paper details of ACL and EMNLP from 2016-2021. | | Experimental |
| 60 | ekardatos/TextAnalysisAndStatisticalTesting: Statistical hypothesis testing applied to linguistic text data. | | Experimental |
| 61 | YukiChen-yuxin/proj_NLPbrl_DATA534: The NLPbrl wrapper API is a package for wrapping The Rosette Text Analytics... | | Experimental |
| 62 | N-y-c-t-o/Gutenberg-scribe-main: A Python-based project that processes and analyzes public-domain books from... | | Experimental |
| 63 | Doubtable-Steves-Linguistics/MinecraftNLP: Natural Language Processing (NLP) project built to predict GitHub repository... | | Experimental |
| 64 | Robin1999Stark/Recipe_Tagger: NLP Project for Auto Labeling Recipes | | Experimental |
| 65 | exaiatech/cymo-tutorial: CYMO is a next-generation text mining and analytics software developed by Exaia | | Experimental |