All NLP Tools

11,854 tools ranked by quality score · Page 4 of 119

Showing 301–400 of 11,854
# Tool Score Tier
301 mikahama/uralicNLP

An NLP library for Uralic languages such as Finnish, Skolt Sami, Moksha and...

57
Established
302 SergeyShk/ruTS

A library for extracting statistics from Russian-language texts.

57
Established
303 VietHoang1512/khmer-nltk

Khmer language processing toolkit

57
Established
304 obss/jury

Comprehensive NLP Evaluation System

57
Established
305 ropensci/googleLanguageR

R client for the Google Translation API, Google Cloud Natural Language API...

57
Established
306 howl-anderson/seq2annotation

A general-purpose sequence labeling library based on TensorFlow & PaddlePaddle (currently includes BiLSTM+CRF, Stacked-BiLSTM+CRF...

57
Established
307 graykode/toeicbert

TOEIC (Test of English for International Communication) solving using...

57
Established
308 yohasebe/wp2txt

A command-line tool to extract plain text from Wikipedia dumps with category...

57
Established
309 FreeDiscovery/FreeDiscovery

Web Service for E-Discovery Analytics

57
Established
310 markuskiller/textblob-de

German language support for TextBlob.

57
Established
311 mirkosertic/FXDesktopSearch

A JavaFX based desktop search application.

57
Established
312 CGCL-codes/naturalcc

NaturalCC: An Open-Source Toolkit for Code Intelligence

57
Established
313 messense/jieba-rs

The Jieba Chinese Word Segmentation Implemented in Rust

57
Established
314 gaphex/bert_experimental

code and supplementary materials for a series of Medium articles about the BERT model

57
Established
315 daac-tools/vibrato

🎤 vibrato: Viterbi-based accelerated tokenizer

57
Established
316 IlyaGusev/rnnmorph

Morphological analyzer for Russian and English languages based on neural...

57
Established
317 AndersonBY/deepseek-tokenizer

DeepSeek Tokenizer is an efficient and lightweight tokenization library with...

57
Established
318 affjljoo3581/canrevan

A library for collecting Naver news articles in bulk.

57
Established
319 sammous/spacy-lefff

Custom French POS tagger and lemmatizer based on Lefff for spaCy

57
Established
320 houbb/sensitive-word

👮‍♂️ The sensitive word tool for Java. (Sensitive words / banned words / illegal words / profanity. A high-performance, DFA-based Java...

57
Established
321 chrislit/abydos

Abydos NLP/IR library for Python

57
Established
322 paschmann/rasa-ui

Rasa UI is a frontend for the Rasa Framework

57
Established
323 yongzhuo/Pytorch-NLU

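Chinese text classification and sequence labeling toolkit (PyTorch). Supports multi-class and multi-label classification of long and short Chinese texts, as well as Chinese named entity recognition, part-of-speech tagging, word segmentation, extractive text summarization and other sequence labeling...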

57
Established
324 linonetwo/segmentit

A Chinese word segmentation package usable in any JS environment, forked from leizongmin/node-segment

57
Established
325 taishi-i/toiro

A tool for comparing tokenizers

57
Established
326 nlpcloud/nlpcloud-python

NLP Cloud serves high performance pre-trained or custom models for NER,...

57
Established
327 RocketChat/hubot-natural

Natural Language Processing Chatbot for RocketChat

57
Established
328 jalajthanaki/NLPython

This repository contains the code related to Natural Language Processing...

57
Established
329 inception-project/inception-external-recommender

Get annotation suggestions for the INCEpTION text annotation platform from...

57
Established
330 jiaeyan/Jiayan

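Jiayan (甲言), an NLP toolkit focused on processing Classical Chinese (also called Ancient or Literary Chinese). Supports lexicon construction, word segmentation, part-of-speech tagging, sentence segmentation and punctuation. Jiayan, the 1st...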

57
Established
331 ssciwr/AMMICO

AI-based Media and Misinformation Content Analysis Tool: Analyze text and images

57
Established
332 textgain/grasp

Essential NLP & ML, short & fast pure Python code

57
Established
333 sushil79g/Nepali_nlp

A Python-based library for NLP in the Nepali language

57
Established
334 sildar/potara

Multi-document summarization tool relying on ILP and sentence fusion

57
Established
335 averbis/averbis-python-api

Conveniently access the REST API of Averbis products using Python

57
Established
336 gerardobort/node-corenlp

CoreNLP @ NodeJS

57
Established
337 mideind/GreynirServer

The greynir.is Icelandic natural language processing API and website.

56
Established
338 batzner/tensorlm

Wrapper library for text generation / language models at character and word...

56
Established
339 PragatiVerma18/MLH-Quizzet

This is a smart Quiz Generator that generates a dynamic quiz from any...

56
Established
340 boat-group/fancy-nlp

NLP for human. A fast and easy-to-use natural language processing (NLP)...

56
Established
341 syuoni/eznlp

Easy Natural Language Processing

56
Established
342 sorenlind/lemmy

🤘Lemmy is a lemmatizer for Danish 🇩🇰 and Swedish 🇸🇪

56
Established
343 tokestermw/spacy_hunspell

✏️ Hunspell extension for spaCy 2.0.

56
Established
344 PaddlePaddle/ERNIE

The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade...

56
Established
345 openfactcheck-research/openfactcheck

An Open-source Factuality Evaluation Demo for LLMs

56
Established
346 VinAIResearch/PhoNLP

PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging,...

56
Established
347 fukuball/jieba-php

"Jieba" Chinese word segmentation: built to be the best PHP component for Chinese word segmentation. / "Jieba" (Chinese for "to stutter") Chinese...

56
Established
348 LanguageMachines/ucto

Unicode tokeniser. Ucto tokenizes text files: it separates words from...

56
Established
349 Planeshifter/text-miner

text mining utilities for Node.js

56
Established
350 giacbrd/python-dandelion-eu

A Python client for connecting to all the services provided by https://dandelion.eu

56
Established
351 pemistahl/lingua-py

The most accurate natural language detection library for Python, suitable...

56
Established
352 opencog/link-grammar

The CMU Link Grammar natural language parser

56
Established
353 bootphon/pygamma-agreement

Gamma Agreement in Python

56
Established
354 bataak/dict-mn

Mongolian spellchecking dictionary

56
Established
355 nicolay-r/AREkit

Document level Attitude and Relation Extraction toolkit (AREkit) for...

56
Established
356 obulat/zeyrek

Python morphological analyzer for the Turkish language. Partial port of ZemberekNLP.

56
Established
357 luozhouyang/transformers-keras

Transformer-based models implemented in TensorFlow 2.x (using Keras).

56
Established
358 xv44586/toolkit4nlp

Transformer implementations (architecture, task examples, serving and more)

56
Established
359 tanloong/neosca

L2SCA & LCA fork: cross-platform, GUI, without Java dependency

56
Established
360 vulnerability-lookup/VulnTrain

A tool to generate datasets and models based on vulnerabilities descriptions...

56
Established
361 mirth/chonky

Fully neural approach for text chunking

56
Established
362 avidale/compress-fasttext

Tools for shrinking fastText models (in gensim format)

56
Established
363 azooKey/AzooKeyKanaKanjiConverter

Kana-Kanji Conversion Module written in Swift, supporting Neural Kana-Kanji...

56
Established
364 messense/fasttext-serving

fastText model serving service

56
Established
365 Extralit/extralit

Fast and accurate systemic data extraction with LLM assistance

55
Established
366 zaibacu/rita-dsl

A Domain Specific Language (DSL) for building language patterns. These can...

55
Established
367 johncmunson/react-taggy

A simple zero-dependency React component for tagging user-defined entities...

55
Established
368 monpa-team/monpa

MONPA (罔拍) is a multi-task model providing Traditional Chinese word segmentation, part-of-speech tagging and named entity recognition

55
Established
369 dselivanov/text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.

55
Established
370 THU-KEG/OmniEvent

A comprehensive, unified and modular event extraction toolkit.

55
Established
371 quickwit-oss/whichlang

A blazingly fast and lightweight language detection library for Rust

55
Established
372 pd3f/pd3f

🏭 PDF text extraction pipeline: self-hosted, local-first, Docker-based

55
Established
373 panggi/pujangga

Pujangga - Indonesian Natural Language Processing Tool with REST API, an...

55
Established
374 SWHL/AI-Competition-Collections

A collection of AI competition experience posts and training and testing tips (gathering write-ups from a wide range of AI competitions)

55
Established
375 Yale-LILY/SummerTime

An open-source text summarization toolkit for non-experts. EMNLP'2021 Demo

55
Established
376 guillaume-be/rust-bert

Rust native ready-to-use NLP pipelines and transformer-based models (BERT,...

55
Established
377 KoichiYasuoka/esupar

Tokenizer POS-Tagger and Dependency-parser with BERT/RoBERTa/DeBERTa/GPT...

55
Established
378 natasha/nerus

Large silver standard Russian corpus with NER, morphology and syntax markup

55
Established
379 giacbrd/ShallowLearn

An experiment about re-implementing supervised learning models based on...

55
Established
380 OpenJarbas/simple_NER

simple rule based named entity recognition

55
Established
381 wroberts/pygermanet

GermaNet API for Python

55
Established
382 web-arena-x/webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

55
Established
383 AstraZeneca/KAZU

Fast, world class biomedical NER

55
Established
384 lunarwhite/tan-division

Chinese corpus sentiment analysis. Sentiment analysis of Chinese text on the Tan Songbo hotel review corpus

55
Established
385 SlapBot/sounder

An intent recognizing algorithm to predict the intent of a given text.

55
Established
386 bnosac/udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and...

55
Established
387 strangetom/ingredient-parser

A tool to parse recipe ingredients into structured data

55
Established
388 theeluwin/lexrankr

LexRank for Korean.

55
Established
389 daac-tools/vaporetto

🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer

54
Established
390 ivan-bilan/The-NLP-Pandect

A comprehensive reference for all topics related to Natural Language Processing

54
Established
391 indix/whatthelang

Lightning Fast Language Prediction 🚀

54
Established
392 nickdavidhaynes/spacy-cld

Language detection extension for spaCy 2.0+

54
Established
393 andreekeberg/ml-classify-text-js

Machine learning based text classification in JavaScript using n-grams and...

54
Established
394 Ali-Alameer/NLP

This repository offers NLP resources & tutorials using keras/tensorflow....

54
Established
395 neomatrix369/nlp_profiler

A simple NLP library that allows profiling datasets with one or more text...

54
Established
396 sugarme/tokenizer

NLP tokenizers written in Go language

54
Established
397 nicolay-r/ARElight

Granular Viewer of Sentiments Between Entities in Massively Large Documents...

54
Established
398 textlint-rule/sentence-splitter

Split {Japanese, English} text into sentences.

54
Established
399 bltlab/seqscore

SeqScore: Scoring for named entity recognition and other sequence labeling tasks

54
Established
400 n3integration/classifier

A general purpose text classifier

54
Established