All NLP Tools
11,856 tools ranked by quality score
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | explosion/spaCy: 💫 Industrial-strength Natural Language Processing (NLP) in Python | | Verified |
| 2 | PyThaiNLP/pythainlp: Thai natural language processing in Python | | Verified |
| 3 | urchade/GLiNER: Generalist and Lightweight Model for Named Entity Recognition (Extract any... | | Verified |
| 4 | sloria/TextBlob: Simple, Pythonic, text processing--Sentiment analysis, part-of-speech... | | Verified |
| 5 | nltk/nltk: NLTK Source | | Verified |
| 6 | chrismattmann/tika-python: Tika-Python is a Python binding to the Apache Tika™ REST services allowing... | | Verified |
| 7 | textlint/textlint: textlint is the pluggable linter for natural language text. | | Verified |
| 8 | deepdoctection/deepdoctection: A Repo For Document AI | | Verified |
| 9 | stanfordnlp/stanza: Stanford NLP Python library for tokenization, sentence segmentation, NER,... | | Verified |
| 10 | google/sentencepiece: Unsupervised text tokenizer for Neural Network-based text generation. | | Verified |
| 11 | miso-belica/sumy: Module for automatic summarization of text documents and HTML pages. | | Verified |
| 12 | robocorp/rpaframework: Collection of open-source libraries and tools for Robotic Process Automation... | | Verified |
| 13 | google/langextract: A Python library for extracting structured information from unstructured... | | Verified |
| 14 | flairNLP/flair: A very simple framework for state-of-the-art Natural Language Processing (NLP) | | Verified |
| 15 | deanmalmgren/textract: extract text from any document. no muss. no fuss. | | Verified |
| 16 | spencermountain/compromise: modest natural-language processing | | Verified |
| 17 | jxmorris12/language_tool_python: a free python grammar checker 📝✅ | | Verified |
| 18 | hankcs/HanLP: Natural Language Processing for the next decade. Tokenization,... | | Verified |
| 19 | CAMeL-Lab/camel_tools: A suite of Arabic natural language processing tools developed by the CAMeL... | | Verified |
| 20 | NPC-Worldwide/npcpy: The python library for research and development in NLP, multimodal LLMs,... | | Verified |
| 21 | unitaryai/detoxify: Trained models & code to predict toxic comments on all 3 Jigsaw Toxic... | | Verified |
| 22 | EmilStenstrom/conllu: A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a... | | Verified |
| 23 | gunthercox/chatterbot-corpus: A multilingual dialog corpus | | Verified |
| 24 | chatopera/Synonyms: 🌿 Chinese synonyms: a toolkit for chatbots and intelligent question answering | | Verified |
| 25 | lovit/soynlp: A Python library for Korean natural language processing. Provides word extraction, tokenization, part-of-speech tagging, and preprocessing. | | Verified |
| 26 | isaacus-dev/semchunk: A fast, lightweight and easy-to-use Python library for splitting text into... | | Verified |
| 27 | huggingface/setfit: Efficient few-shot learning with Sentence Transformers | | Verified |
| 28 | flairNLP/fundus: A very simple news crawler with a funny name | | Verified |
| 29 | vi3k6i5/flashtext: Extract Keywords from sentence or Replace keywords in sentences. | | Verified |
| 30 | estnltk/estnltk: Open source tools for Estonian natural language processing | | Verified |
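| 31 | JoeanAmier/XHS-Downloader: XiaoHongShu (RedNote) link extraction and post collection tool: extracts links to an account's published, favorited, liked, and album posts; extracts post and user links from search results; collects XiaoHongShu posts... | | Verified |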
| 32 | cltk/cltk: The Classical Language Toolkit | | Verified |
| 33 | google/langfun: OO for LLMs | | Verified |
| 34 | kenlimmj/rouge: A Javascript implementation of the Recall-Oriented Understudy for Gisting... | | Verified |
| 35 | hplt-project/sacremoses: Python port of Moses tokenizer, truecaser and normalizer | | Verified |
| 36 | languagetool-org/languagetool: Style and Grammar Checker for 25+ Languages | | Verified |
| 37 | dkpro/dkpro-cassis: UIMA CAS processing library written in Python | | Verified |
| 38 | JohnSnowLabs/spark-nlp: State of the Art Natural Language Processing | | Verified |
| 39 | grobidOrg/grobid: A machine learning software for extracting information from scholarly documents | | Verified |
| 40 | forzagreen/n2words: Convert numerical numbers to written numbers, in 52+ languages. | | Verified |
| 41 | bab2min/kiwipiepy: Python API for Kiwi | | Verified |
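| 42 | dongrixinyu/JioNLP: Chinese NLP preprocessing and parsing toolkit: accurate, efficient, easy to use. A Chinese NLP Preprocessing & Parsing Package... | | Verified |
| 43 | 666ghj/BettaFish: 微舆 (Weiyu): a multi-agent public opinion analysis assistant for everyone that breaks information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Implemented from scratch with no framework dependencies. | | Verified |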
| 44 | HIT-SCIR/ltp: Language Technology Platform | | Established |
| 45 | hellohaptik/chatbot_ner: chatbot_ner: Named Entity Recognition for chatbots. | | Established |
| 46 | aphp/edsnlp: Modular, fast NLP framework, compatible with Pytorch and spaCy, offering... | | Established |
| 47 | ChenghaoMou/text-dedup: All-in-one text de-duplication | | Established |
| 48 | acl-org/acl-anthology: Data and software for building the ACL Anthology. | | Established |
| 49 | chatopera/efaqa-corpus-zh: ❤️ Emotional First Aid Dataset, a corpus for psychological counseling Q&A and chatbots | | Established |
| 50 | zjunlp/DeepKE: [EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction | | Established |
| 51 | discopy/discopy: The Python toolkit for computing with string diagrams. | | Established |
| 52 | MantisAI/nervaluate: Full named-entity (i.e., not tag/token) evaluation metrics based on SemEval’13 | | Established |
| 53 | Alir3z4/python-stop-words: Get list of common stop words in various languages in Python | | Established |
| 54 | hankcs/pyhanlp: Chinese word segmentation | | Established |
| 55 | thisandagain/sentiment: AFINN-based sentiment analysis for Node.js. | | Established |
| 56 | OpenPecha/Botok: 🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python | | Established |
| 57 | goodmami/wn: A modern, interlingual wordnet interface for Python | | Established |
| 58 | adbar/htmldate: Fast and robust date extraction from web pages, with Python or on the command-line | | Established |
| 59 | CUNY-CL/wikipron: Massively multilingual pronunciation mining | | Established |
| 60 | jacksonllee/pycantonese: Cantonese Linguistics and NLP | | Established |
| 61 | huggingface/neuralcoref: ✨Fast Coreference Resolution in spaCy with Neural Networks | | Established |
| 62 | anoopkunchukuttan/indic_nlp_library: Resources and tools for Indian language Natural Language Processing | | Established |
| 63 | allenai/scispacy: A full spaCy pipeline and models for scientific/biomedical documents. | | Established |
| 64 | apache/opennlp: Apache OpenNLP | | Established |
| 65 | MIND-Lab/OCTIS: OCTIS: Comparing Topic Models is Simple! A python package to optimize and... | | Established |
| 66 | gunthercox/mathparse: A Python library for evaluating natural language mathematical equations | | Established |
| 67 | DataFog/datafog-python: Python SDK for PII detection and redaction in text and images, combining... | | Established |
| 68 | i-dot-ai/themefinder: A topic modelling Python package for analysing one-to-many question-answer data. | | Established |
| 69 | undertheseanlp/underthesea: Underthesea - Vietnamese NLP Toolkit | | Established |
| 70 | ziqizhang/jate: JATE - Just Automatic Term Extraction (in Python) | | Established |
| 71 | facebookresearch/stopes: A library for preparing data for machine translation research (monolingual... | | Established |
| 72 | codertimo/BERT-pytorch: Google AI 2018 BERT pytorch implementation | | Established |
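| 73 | blmoistawinde/HarvestText: Text mining and preprocessing tools (text cleaning, new word discovery, sentiment analysis, entity recognition and linking, keyword extraction, knowledge extraction, syntactic parsing, etc.) using unsupervised or weakly supervised methods | | Established |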
| 74 | fastnlp/fastNLP: fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation. | | Established |
| 75 | go-ego/gse: Go efficient multilingual NLP and text segmentation; support English,... | | Established |
| 76 | rmovva/HypotheSAEs: HypotheSAEs: hypothesizing interpretable relationships in text datasets... | | Established |
| 77 | segment-any-text/wtpsplit: Toolkit to segment text into sentences or other semantic units in a robust,... | | Established |
| 78 | baidu/lac: Baidu NLP: word segmentation, part-of-speech tagging, named entity recognition, word importance | | Established |
| 79 | dsfsi/textaugment: TextAugment: Text Augmentation Library | | Established |
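| 80 | ownthink/Jiagu: Jiagu deep learning natural language processing tool: knowledge graph relation extraction, Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, new word discovery, keywords, text summarization, text clustering | | Established |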
| 81 | vmenger/deduce: Deduce: de-identification method for Dutch medical text | | Established |
| 82 | quanteda/quanteda: An R package for the Quantitative Analysis of Textual Data | | Established |
| 83 | angelosalatino/cso-classifier: Python library that classifies content from scientific papers with the... | | Established |
| 84 | Tiiiger/bert_score: BERT score for text generation | | Established |
| 85 | fhamborg/news-please: news-please - an integrated web crawler and information extractor for news... | | Established |
| 86 | NatLibFi/Annif: Annif is a multi-algorithm automated subject indexing tool for libraries,... | | Established |
| 87 | Helsinki-NLP/OpusFilter: OpusFilter - Parallel corpus processing toolkit | | Established |
| 88 | titipata/pubmed_parser: 📋 A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset | | Established |
| 89 | malaysia-ai/malaya: Natural Language Toolkit for Malaysian language, https://malaya.readthedocs.io/ | | Established |
| 90 | MAIF/melusine: 📧 Melusine: Use python to automatize your email processing workflow | | Established |
| 91 | taishi-i/nagisa: A Japanese tokenizer based on recurrent neural networks | | Established |
| 92 | chartbeat-labs/textacy: NLP, before and after spaCy | | Established |
| 93 | wooorm/franc: Natural language detection | | Established |
| 94 | hyunwoongko/kss: KSS: Korean String processing Suite | | Established |
| 95 | princeton-nlp/SimCSE: [EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings... | | Established |
| 96 | stair-lab/kg-gen: [NeurIPS '25] Knowledge Graph Generation from Any Text | | Established |
| 97 | alvations/pywsd: Python Implementations of Word Sense Disambiguation (WSD) Technologies. | | Established |
| 98 | davidsbatista/BREDS: "Bootstrapping Relationship Extractors with Distributional Semantics"... | | Established |
| 99 | OmkarPathak/pyresparser: A simple resume parser used for extracting information from resumes | | Established |
| 100 | hunspell/hunspell: The most popular spellchecking library. | | Established |