All NLP Tools
11,854 tools ranked by quality score · Page 3 of 119
| # | Tool | Score | Tier |
|---|---|---|---|
| 201 | davidemiceli/gender-detection<br>Determine a person's gender based on his/her first name. | | Established |
| 202 | proycon/folia<br>FoLiA: Format for Linguistic Annotation - FoLiA is a rich XML-based... | | Established |
| 203 | soaxelbrooke/python-bpe<br>Byte Pair Encoding for Python! | | Established |
| 204 | BramVanroy/spacy_conll<br>Pipeline component for spaCy (and other spaCy-wrapped parsers such as... | | Established |
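| 205 | NLP-LOVE/Introduction-NLP<br>Detailed notes on "Introduction to Natural Language Processing" (《自然语言处理入门》), the new book by the author of HanLP! A conscientious work for the field: rather than a dry list of formulas, the book explains algorithms and models in plain, accessible language. Starting from basic concepts, it gradually introduces... | | Established |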
| 206 | chakki-works/sumeval<br>Well tested & Multi-language evaluation framework for text summarization. | | Established |
| 207 | nltk/nltk_data<br>NLTK Data | | Established |
| 208 | explosion/spacy-stanza<br>💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy | | Established |
| 209 | microsoft/Recognizers-Text<br>Microsoft.Recognizers.Text provides recognition and resolution of numbers,... | | Established |
| 210 | adbar/simplemma<br>Simple multilingual lemmatizer for Python, especially useful for speed and efficiency | | Established |
| 211 | guillaume-be/rust-tokenizers<br>Rust-tokenizer offers high-performance tokenizers for modern language... | | Established |
| 212 | modelscope/AdaSeq<br>AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence... | | Established |
| 213 | eikek/docspell<br>Assist in organizing your piles of documents, resulting from scanners,... | | Established |
| 214 | msgi/nlp-journey<br>Documents, papers and codes related to Natural Language Processing,... | | Established |
| 215 | vgrabovets/multi_rake<br>Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python | | Established |
| 216 | jerryji1993/DNABERT<br>DNABERT: pre-trained Bidirectional Encoder Representations from Transformers... | | Established |
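| 217 | yongzhuo/Macropodus<br>Macropodus, a natural language processing toolkit based on an Albert+BiLSTM+CRF deep learning architecture: Chinese word segmentation, part-of-speech tagging, named entity recognition, new word discovery, keywords, text summarization... | | Established |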
| 218 | SimGus/Chatette<br>A powerful dataset generator for Rasa NLU, inspired by Chatito | | Established |
| 219 | jbesomi/texthero<br>Text preprocessing, representation and visualization from zero to hero. | | Established |
| 220 | adrien2p/nestjs-dialogflow<br>Dialog flow module that simplify the web hook handling for your NLP... | | Established |
| 221 | natasha/slovnet<br>Deep Learning based NLP modeling for Russian language | | Established |
| 222 | shibing624/dialogbot<br>dialogbot, provide search-based dialogue, task-based dialogue and generative... | | Established |
| 223 | kk7nc/HDLTex<br>HDLTex: Hierarchical Deep Learning for Text Classification | | Established |
| 224 | davidjurgens/potato<br>potato: the portable annotation tool | | Established |
| 225 | chakki-works/seqeval<br>A Python framework for sequence labeling evaluation(named-entity... | | Established |
| 226 | gandersen101/spaczz<br>Fuzzy matching and more functionality for spaCy. | | Established |
| 227 | hb20007/hands-on-nltk-tutorial<br>The hands-on NLTK tutorial for NLP in Python | | Established |
| 228 | textvec/textvec<br>Text vectorization tool to outperform TFIDF for classification tasks | | Established |
| 229 | stanfordnlp/CoreNLP<br>CoreNLP: A Java suite of core NLP tools for tokenization, sentence... | | Established |
| 230 | JayYip/m3tl<br>BERT for Multitask Learning | | Established |
| 231 | wikimedia/sentencex<br>A sentence segmentation library with wide language support optimized for... | | Established |
| 232 | PaddlePaddle/RocketQA<br>🚀 RocketQA, dense retrieval for information retrieval and question... | | Established |
| 233 | shibing624/similarity<br>similarity: Text similarity calculation Toolkit for Java.... | | Established |
| 234 | stanford-oval/genie-toolkit<br>The Genie open source kit for voice assistant (formerly known as Almond) | | Established |
| 235 | asahi417/tner<br>Language model fine-tuning on NER with an easy interface and cross-domain... | | Established |
| 236 | EdinburghNLP/awesome-hallucination-detection<br>List of papers on hallucination detection in LLMs. | | Established |
| 237 | thepushkarp/nalcos<br>Search Git commits in natural language | | Established |
| 238 | davidsbatista/Snowball<br>Implementation with some extensions of the paper "Snowball: Extracting... | | Established |
| 239 | interpretml/interpret-text<br>A library that incorporates state-of-the-art explainers for text-based... | | Established |
| 240 | thalesbertaglia/enelvo<br>A flexible normalizer for user-generated content | | Established |
| 241 | esbatmop/MNBVC<br>MNBVC(Massive Never-ending BT Vast Chinese... | | Established |
| 242 | rudikershaw/whichx<br>A small, no dependencies, Naive Bayes Text Classifier for JavaScript | | Established |
| 243 | techwolf-ai/workrb<br>WorkRB: Work Research Benchmark | | Established |
| 244 | Hironsan/anago<br>Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-Speech... | | Established |
| 245 | natasha/ipymarkup<br>NER, syntax markup visualizations | | Established |
| 246 | PetrKorab/Arabica<br>Python package for text mining of time-series data | | Established |
| 247 | Ars-Linguistica/mlconjug3<br>A Python library to conjugate verbs in French, English, Spanish, Italian,... | | Established |
| 248 | winkjs/wink-pos-tagger<br>English Part-of-speech (POS) tagger | | Established |
| 249 | smart-on-fhir/cumulus-etl<br>Extract FHIR data, Transform with NLP and DEID tools, and then Load FHIR... | | Established |
| 250 | zjunlp/OpenUE<br>[EMNLP 2020] OpenUE: An Open Toolkit of Universal Extraction from Text | | Established |
| 251 | sciknoworg/OntoAligner<br>OntoAligner: A Python Toolkit for Ontology Alignment... | | Established |
| 252 | openeventdata/mordecai<br>Full text geoparsing as a Python library | | Established |
| 253 | fastdatascience/faststylometry<br>Stylometry library for Burrows' Delta method | | Established |
| 254 | stephenhky/PyShortTextCategorization<br>Various Algorithms for Short Text Mining | | Established |
| 255 | ikegami-yukino/oseti<br>Dictionary based Sentiment Analysis for Japanese | | Established |
| 256 | DanielJDufour/date-extractor<br>Extract dates from text | | Established |
| 257 | gagan3012/keytotext<br>Keywords to Sentences | | Established |
| 258 | sagorbrur/bnlp<br>BNLP is a natural language processing toolkit for Bengali Language. | | Established |
| 259 | brightmart/nlp_chinese_corpus<br>Large Scale Chinese Corpus for NLP | | Established |
| 260 | lonePatient/awesome-pretrained-chinese-nlp-models<br>Awesome Pretrained Chinese NLP Models: a collection of high-quality Chinese pretrained models, large models, multimodal models, and large language models | | Established |
| 261 | wi2trier/cbrkit<br>Customizable Case-Based Reasoning (CBR) toolkit for Python with a built-in... | | Established |
| 262 | dnanhkhoa/python-vncorenlp<br>A Python wrapper for VnCoreNLP using a bidirectional communication channel. | | Established |
| 263 | OpenNMT/Tokenizer<br>Fast and customizable text tokenization library with BPE and SentencePiece support | | Established |
| 264 | google-research/turkish-morphology<br>A two-level morphological analyzer for Turkish. | | Established |
| 265 | juliasilge/tidytext<br>Text mining using tidy tools :sparkles::page_facing_up::sparkles: | | Established |
| 266 | lovit/KR-WordRank<br>A library that automatically extracts words/keywords from Korean text using unsupervised learning. | | Established |
| 267 | microsoft/presidio-research<br>This package features data-science related tasks for developing new... | | Established |
| 268 | keon/awesome-nlp<br>:book: A curated list of resources dedicated to Natural Language Processing (NLP) | | Established |
| 269 | yogeshhk/MiningResume<br>Text Mining certain fields from a resume | | Established |
| 270 | jalammar/ecco<br>Explain, analyze, and visualize NLP language models. Ecco creates... | | Established |
| 271 | brucewlee/lftk<br>[BEA @ ACL 2023] General-purpose tool for linguistic features extraction;... | | Established |
| 272 | ikegami-yukino/pymlask<br>Emotion analyzer for Japanese text | | Established |
| 273 | Cyberbolt/Cemotion<br>A Chinese NLP library based on BERT for sentiment analysis and... | | Established |
| 274 | prasanthg3/cleantext<br>An open-source package for python to clean raw text data | | Established |
| 275 | grid-parity-exchange/Egret<br>Tools for building power systems optimization problems | | Established |
| 276 | emres/turkish-deasciifier<br>Turkish deasciifier in Python based on Deniz Yüret's turkish-mode for Emacs | | Established |
| 277 | HzaCode/OneCite<br>📚 An intelligent toolkit to automatically parse, complete, and format... | | Established |
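| 278 | zhang17173/Event-Extraction<br>Event extraction from legal judgment documents and its applications, including word segmentation, part-of-speech tagging, named entity recognition, event element extraction, and judgment prediction | | Established |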
| 279 | dkpro/dkpro-core<br>Collection of software components for natural language processing (NLP)... | | Established |
| 280 | shibing624/pytextclassifier<br>pytextclassifier is a toolkit for text classification.... | | Established |
| 281 | mihail911/fake-news<br>Building a fake news detector from initial ideation to model deployment | | Established |
| 282 | cardiffnlp/tweetnlp<br>TweetNLP for all the NLP enthusiasts working on Twitter! The Python library... | | Established |
| 283 | NorskRegnesentral/skweak<br>skweak: A software toolkit for weak supervision applied to NLP tasks | | Established |
| 284 | dice-group/gerbil<br>GERBIL - General Entity annotatoR Benchmark | | Established |
| 285 | JohnSnowLabs/johnsnowlabs<br>Gateway into the John Snow Labs Ecosystem | | Established |
| 286 | carlosplanchon/betterhtmlchunking<br>BetterHTMLChunking is a Python library for intelligent HTML segmentation. It... | | Established |
| 287 | nlpbook/nlpbook<br>Applied Natural Language Processing in the Enterprise - An O'Reilly Media Publication | | Established |
| 288 | ysenarath/sinling<br>A collection of NLP tools for Sinhalese (සිංහල). | | Established |
| 289 | fbilhaut/gline-rs<br>Inference engine for GLiNER models, in Rust | | Established |
| 290 | apache/opennlp-sandbox<br>Apache OpenNLP Sandbox | | Established |
| 291 | kensk8er/chicksexer<br>A Python package for gender classification. | | Established |
| 292 | apache/ctakes<br>Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text. | | Established |
| 293 | AnasAito/SkillNER<br>A (smart) rule based NLP module to extract job skills from text | | Established |
| 294 | Kyubyong/name2nat<br>name2nat: a Python package for nationality prediction from a name | | Established |
| 295 | Systemcluster/kitoken<br>Fast and versatile tokenizer for language models, compatible with... | | Established |
| 296 | fdalvi/NeuroX<br>A Python library that encapsulates various methods for neuron interpretation... | | Established |
| 297 | n-waves/multifit<br>The code to reproduce results from paper "MultiFiT: Efficient Multi-lingual... | | Established |
| 298 | explosion/spacymoji<br>💙 Emoji handling and meta data for spaCy with custom extension attributes | | Established |
| 299 | mpuig/spacy-lookup<br>Named Entity Recognition based on dictionaries | | Established |
| 300 | stanford-oval/genienlp<br>GenieNLP: A versatile codebase for any NLP task | | Established |