All NLP Tools
11,854 tools ranked by quality score · Page 4 of 119
| # | Tool | Score | Tier |
|---|---|---|---|
| 301 | mikahama/uralicNLP · An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and... | | Established |
| 302 | SergeyShk/ruTS · A library for extracting statistics from Russian-language texts. | | Established |
| 303 | VietHoang1512/khmer-nltk · Khmer language processing toolkit | | Established |
| 304 | obss/jury · Comprehensive NLP Evaluation System | | Established |
| 305 | ropensci/googleLanguageR · R client for the Google Translation API, Google Cloud Natural Language API... | | Established |
| 306 | howl-anderson/seq2annotation · A general-purpose sequence labeling algorithm library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF, Stacked-BiLSTM+CRF... | | Established |
| 307 | graykode/toeicbert · TOEIC (Test of English for International Communication) solving using... | | Established |
| 308 | yohasebe/wp2txt · A command-line tool to extract plain text from Wikipedia dumps with category... | | Established |
| 309 | FreeDiscovery/FreeDiscovery · Web Service for E-Discovery Analytics | | Established |
| 310 | markuskiller/textblob-de · German language support for TextBlob. | | Established |
| 311 | mirkosertic/FXDesktopSearch · A JavaFX based desktop search application. | | Established |
| 312 | CGCL-codes/naturalcc · NaturalCC: An Open-Source Toolkit for Code Intelligence | | Established |
| 313 | messense/jieba-rs · The Jieba Chinese Word Segmentation Implemented in Rust | | Established |
| 314 | gaphex/bert_experimental · Code and supplementary materials for a series of Medium articles about the BERT model | | Established |
| 315 | daac-tools/vibrato · 🎤 vibrato: Viterbi-based accelerated tokenizer | | Established |
| 316 | IlyaGusev/rnnmorph · Morphological analyzer for Russian and English languages based on neural... | | Established |
| 317 | AndersonBY/deepseek-tokenizer · DeepSeek Tokenizer is an efficient and lightweight tokenization library with... | | Established |
| 318 | affjljoo3581/canrevan · A library for collecting large volumes of Naver news articles. | | Established |
| 319 | sammous/spacy-lefff · Custom French POS and lemmatizer based on Lefff for spacy | | Established |
| 320 | houbb/sensitive-word · 👮♂️The sensitive word tool for Java. (Sensitive words / banned words / illegal words / profanity. A DFA-based high-performance Java... | | Established |
| 321 | chrislit/abydos · Abydos NLP/IR library for Python | | Established |
| 322 | paschmann/rasa-ui · Rasa UI is a frontend for the Rasa Framework | | Established |
| 323 | yongzhuo/Pytorch-NLU · Chinese text classification and sequence labeling toolkit (PyTorch); supports multi-class and multi-label classification of long and short Chinese texts, and supports Chinese named entity recognition, part-of-speech tagging, word segmentation, extractive text summarization and other sequence labeling... | | Established |
| 324 | linonetwo/segmentit · A Chinese word segmentation package usable in any JS environment, forked from leizongmin/node-segment | | Established |
| 325 | taishi-i/toiro · A tool for comparing tokenizers | | Established |
| 326 | nlpcloud/nlpcloud-python · NLP Cloud serves high performance pre-trained or custom models for NER,... | | Established |
| 327 | RocketChat/hubot-natural · Natural Language Processing Chatbot for RocketChat | | Established |
| 328 | jalajthanaki/NLPython · This repository contains the code related to Natural Language Processing... | | Established |
| 329 | inception-project/inception-external-recommender · Get annotation suggestions for the INCEpTION text annotation platform from... | | Established |
| 330 | jiaeyan/Jiayan · Jiayan (甲言), an NLP toolkit focused on processing Classical Chinese; supports Classical Chinese lexicon construction, word segmentation, part-of-speech tagging, sentence segmentation, and punctuation. Jiayan, the 1st... | | Established |
| 331 | ssciwr/AMMICO · AI-based Media and Misinformation Content Analysis Tool: Analyze text and images | | Established |
| 332 | textgain/grasp · Essential NLP & ML, short & fast pure Python code | | Established |
| 333 | sushil79g/Nepali_nlp · A python based library for NLP in Nepali language | | Established |
| 334 | sildar/potara · Multi-document summarization tool relying on ILP and sentence fusion | | Established |
| 335 | averbis/averbis-python-api · Conveniently access the REST API of Averbis products using Python | | Established |
| 336 | gerardobort/node-corenlp · CoreNLP @ NodeJS | | Established |
| 337 | mideind/GreynirServer · The greynir.is Icelandic natural language processing API and website. | | Established |
| 338 | batzner/tensorlm · Wrapper library for text generation / language models at character and word... | | Established |
| 339 | PragatiVerma18/MLH-Quizzet · This is a smart Quiz Generator that generates a dynamic quiz from any... | | Established |
| 340 | boat-group/fancy-nlp · NLP for human. A fast and easy-to-use natural language processing (NLP)... | | Established |
| 341 | syuoni/eznlp · Easy Natural Language Processing | | Established |
| 342 | sorenlind/lemmy · 🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪 | | Established |
| 343 | tokestermw/spacy_hunspell · ✏️ Hunspell extension for spaCy 2.0. | | Established |
| 344 | PaddlePaddle/ERNIE · The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade... | | Established |
| 345 | openfactcheck-research/openfactcheck · An Open-source Factuality Evaluation Demo for LLMs | | Established |
| 346 | VinAIResearch/PhoNLP · PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging,... | | Established |
| 347 | fukuball/jieba-php · "Jieba" Chinese word segmentation: aiming to be the best PHP Chinese word segmentation component. / "Jieba" (Chinese for "to stutter") Chinese... | | Established |
| 348 | LanguageMachines/ucto · Unicode tokeniser. Ucto tokenizes text files: it separates words from... | | Established |
| 349 | Planeshifter/text-miner · Text mining utilities for Node.js | | Established |
| 350 | giacbrd/python-dandelion-eu · A python client for connecting to all the services provided by https://dandelion.eu | | Established |
| 351 | pemistahl/lingua-py · The most accurate natural language detection library for Python, suitable... | | Established |
| 352 | opencog/link-grammar · The CMU Link Grammar natural language parser | | Established |
| 353 | bootphon/pygamma-agreement · Gamma Agreement in Python | | Established |
| 354 | bataak/dict-mn · Mongolian spellchecking dictionary | | Established |
| 355 | nicolay-r/AREkit · Document level Attitude and Relation Extraction toolkit (AREkit) for... | | Established |
| 356 | obulat/zeyrek · Python morphological analyzer for Turkish language. Partial port of ZemberekNLP. | | Established |
| 357 | luozhouyang/transformers-keras · Transformer-based models implemented in TensorFlow 2.x (using Keras). | | Established |
| 358 | xv44586/toolkit4nlp · Transformers implementation (architecture, task examples, serving and more) | | Established |
| 359 | tanloong/neosca · L2SCA & LCA fork: cross-platform, GUI, without Java dependency | | Established |
| 360 | vulnerability-lookup/VulnTrain · A tool to generate datasets and models based on vulnerabilities descriptions... | | Established |
| 361 | mirth/chonky · Fully neural approach for text chunking | | Established |
| 362 | avidale/compress-fasttext · Tools for shrinking fastText models (in gensim format) | | Established |
| 363 | azooKey/AzooKeyKanaKanjiConverter · Kana-Kanji Conversion Module written in Swift, supporting Neural Kana-Kanji... | | Established |
| 364 | messense/fasttext-serving · fastText model serving service | | Established |
| 365 | Extralit/extralit · Fast and accurate systemic data extraction with LLM assistance | | Established |
| 366 | zaibacu/rita-dsl · A Domain Specific Language (DSL) for building language patterns. These can... | | Established |
| 367 | johncmunson/react-taggy · A simple zero-dependency React component for tagging user-defined entities... | | Established |
| 368 | monpa-team/monpa · MONPA (罔拍) is a multi-task model providing Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition | | Established |
| 369 | dselivanov/text2vec · Fast vectorization, topic modeling, distances and GloVe word embeddings in R. | | Established |
| 370 | THU-KEG/OmniEvent · A comprehensive, unified and modular event extraction toolkit. | | Established |
| 371 | quickwit-oss/whichlang · A blazingly fast and lightweight language detection library for Rust | | Established |
| 372 | pd3f/pd3f · 🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based | | Established |
| 373 | panggi/pujangga · Pujangga - Indonesian Natural Language Processing Tool with REST API, an... | | Established |
| 374 | SWHL/AI-Competition-Collections · A collection of AI competition experience posts and training/testing tips posts (gathering and organizing experience posts from various AI competitions) | | Established |
| 375 | Yale-LILY/SummerTime · An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo | | Established |
| 376 | guillaume-be/rust-bert · Rust native ready-to-use NLP pipelines and transformer-based models (BERT,... | | Established |
| 377 | KoichiYasuoka/esupar · Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa/GPT... | | Established |
| 378 | natasha/nerus · Large silver standard Russian corpus with NER, morphology and syntax markup | | Established |
| 379 | giacbrd/ShallowLearn · An experiment about re-implementing supervised learning models based on... | | Established |
| 380 | OpenJarbas/simple_NER · Simple rule-based named entity recognition | | Established |
| 381 | wroberts/pygermanet · GermaNet API for Python | | Established |
| 382 | web-arena-x/webarena · Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents" | | Established |
| 383 | AstraZeneca/KAZU · Fast, world class biomedical NER | | Established |
| 384 | lunarwhite/tan-division · Chinese corpus sentiment analysis. Chinese text sentiment analysis on Tan Songbo's hotel review corpus | | Established |
| 385 | SlapBot/sounder · An intent recognizing algorithm to predict the intent of a given text. | | Established |
| 386 | bnosac/udpipe · R package for Tokenization, Parts of Speech Tagging, Lemmatization and... | | Established |
| 387 | strangetom/ingredient-parser · A tool to parse recipe ingredients into structured data | | Established |
| 388 | theeluwin/lexrankr · LexRank for Korean. | | Established |
| 389 | daac-tools/vaporetto · 🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer | | Established |
| 390 | ivan-bilan/The-NLP-Pandect · A comprehensive reference for all topics related to Natural Language Processing | | Established |
| 391 | indix/whatthelang · Lightning Fast Language Prediction 🚀 | | Established |
| 392 | nickdavidhaynes/spacy-cld · Language detection extension for spaCy 2.0+ | | Established |
| 393 | andreekeberg/ml-classify-text-js · Machine learning based text classification in JavaScript using n-grams and... | | Established |
| 394 | Ali-Alameer/NLP · This repository offers NLP resources & tutorials using keras/tensorflow.... | | Established |
| 395 | neomatrix369/nlp_profiler · A simple NLP library that allows profiling datasets with one or more text... | | Established |
| 396 | sugarme/tokenizer · NLP tokenizers written in Go language | | Established |
| 397 | nicolay-r/ARElight · Granular Viewer of Sentiments Between Entities in Massively Large Documents... | | Established |
| 398 | textlint-rule/sentence-splitter · Split {Japanese, English} text into sentences. | | Established |
| 399 | bltlab/seqscore · SeqScore: Scoring for named entity recognition and other sequence labeling tasks | | Established |
| 400 | n3integration/classifier · A general purpose text classifier | | Established |