All NLP Tools

11,854 tools ranked by quality score · Page 5 of 119

Showing 401–500 of 11,854
# Tool Score Tier
401 yxuansu/SimCTG

[NeurIPS'22 Spotlight] A Contrastive Framework for Neural Text Generation

54
Established
402 brightertiger/pygarble

Python Package to detect garbled, gibberish text for EN

54
Established
403 explosion/spacy-experimental

🧪 Cutting-edge experimental spaCy components and features

54
Established
404 jenojp/extractacy

Spacy pipeline object for extracting values that correspond to a named...

54
Established
405 yongzhuo/Macadam

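Macadam is a natural language processing toolkit built on Tensorflow (Keras) and bert4keras, focused on text classification, sequence labeling, and relation extraction. Supports RAND...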

54
Established
406 winkjs/wink-lemmatizer

English lemmatizer

54
Established
407 jaguarliuu/rookie_text2data

Dify plugin - retrieve database data using natural language

54
Established
408 Blake-Madden/OleanderStemmingLibrary

Porter stemming library (C++)

54
Established
409 AnthonyMRios/pymetamap

Python wrapper for MetaMap

54
Established
410 GateNLP/python-gatenlp

Python text processing, pattern matching, and NLP framework

54
Established
411 explosion/spacy-loggers

📟 Logging utilities for spaCy

54
Established
412 OpenPecha/pybo

🦜 NLP for Tibetan, in Python.

54
Established
413 naver/claf

CLaF: Open-Source Clova Language Framework

54
Established
414 htaghizadeh/PersianStemmer-Python

PersianStemmer-Python

54
Established
415 UlugbekSalaev/UzTransliterator

UzTransliterator | State-of-the-art machine transliteration tool for Uzbek language

54
Established
416 guotong1988/BERT-pre-training

multi-gpu pre-training in one machine for BERT without horovod (Data Parallelism)

54
Established
417 Shark-NLP/OpenICL

OpenICL is an open-source framework to facilitate research, development, and...

53
Established
418 Wluper/edm

Python package for understanding the difficulty of text classification...

53
Established
419 DataScienceUIBK/HintEval

HintEval💡: A Comprehensive Framework for Hint Generation and Evaluation for Questions

53
Established
420 bjascob/pyInflect

A python module for word inflections designed for use with spaCy.

53
Established
421 microsoft/LMChallenge

A library & tools to evaluate predictive language models.

53
Established
422 gentaiscool/code-switching-papers

A curated list of research papers and resources on code-switching

53
Established
423 preligens-lab/textnoisr

Adding random noise to a text dataset, and controlling very accurately the...

53
Established
424 PKSHATechnology-Research/tdmelodic

A Japanese accent dictionary generator

53
Established
425 CLUEbenchmark/CLUECorpus2020

Large-scale Pre-training Corpus for Chinese, 100G

53
Established
426 proycon/colibri-core

Colibri core is an NLP tool as well as a C++ and Python library for working...

53
Established
427 hamelsmu/ktext

Utilities for preprocessing text for deep learning with Keras

53
Established
428 nlpaueb/gr-nlp-toolkit

The Greek NLP toolkit for Python. Supports NER/DP/POS...

53
Established
429 jfilter/clean-text

🧹 Python package for text cleaning

53
Established
430 vzhong/embeddings

Fast, DB Backed pretrained word embeddings for natural language processing.

53
Established
431 SkBlaz/rakun2

RaKUn 2.0 - A fast keyword detection algorithm

53
Established
432 explosion/spacy-transformers

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

53
Established
433 winkjs/wink-porter2-stemmer

Javascript Implementation of Porter Stemmer Algorithm V2 by Dr Martin F Porter

53
Established
434 palewire/storysniffer

Inspect a URL and estimate if it contains a news story

53
Established
435 dccuchile/wefe

WEFE: The Word Embeddings Fairness Evaluation Framework. WEFE is a framework...

53
Established
436 bab2min/tomotopy

Python package of Tomoto, the Topic Modeling Tool

53
Established
437 UBC-NLP/turjuman

TURJUMAN, a neural toolkit for translating from 20 languages into Modern...

53
Established
438 proycon/python-ucto

This is a Python binding to the tokenizer Ucto. Tokenisation is one of the...

53
Established
439 sileod/tasksource

Datasets collection and preprocessings framework for NLP extreme multitask learning

53
Established
440 SudhirGadhvi/open-vernacular-ai-kit

Clean Indian code-mixed text before it reaches your LLM.

53
Established
441 Multiverse-of-Projects/NewsAI

A dynamic NewsAI dashboard that uses NLP to analyze news articles, visualize...

53
Established
442 keyATM/keyATM

An R package for Keyword Assisted Topic Models

53
Established
443 yagays/ja-timex

A rule-based analyzer that extracts and normalizes temporal expressions written in natural language

53
Established
444 amirshnll/Persian-Swear-Words

Persian Swear Dataset - you can use in your production to filter unwanted...

53
Established
445 WZBSocialScienceCenter/germalemma

A lemmatizer for German language text

53
Established
446 lefterisloukas/edgar-crawler

The only open-source toolkit that can download SEC EDGAR financial reports...

53
Established
447 shibing624/nerpy

🌈 NERpy: Implementation of Named Entity Recognition using Python....

53
Established
448 andifunke/topic-labeling

The project proposes a framework to apply topic models on a text-corpus and...

53
Established
449 mbanon/fastspell

Targeted language identifier, based on FastText and Hunspell.

52
Established
450 tmalsburg/txl.el

Emacs extension providing direct access to DeepL's machine translation API.

52
Established
451 KRLabsOrg/rulechef

Learn rule-based models from examples using LLM-powered synthesis. Replace...

52
Established
452 mbejda/Node-OpenNLP

Apache OpenNLP wrapper for Nodejs

52
Established
453 codewithzichao/DeepClassifier

DeepClassifier is aimed at building general text classification model...

52
Established
454 demidko/aot

Russian morphology analyzer for Java | Morphological dictionary of the Russian...

52
Established
455 LHNCBC/metamaplite

A near real-time named-entity recognizer

52
Established
456 nabeelxy/syara

SYARA: Super YARA Rules for GenAI Era

52
Established
457 JulesBelveze/concepcy

💫 SpaCy wrapper for ConceptNet 💫

52
Established
458 darija-open-dataset/dataset

darija <-> english dataset

52
Established
459 alibaba-damo-academy/SpokenNLP

A wide variety of research projects developed by the SpokenNLP team of...

52
Established
460 kororo/excelcy

Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX,...

52
Established
461 nitotm/efficient-language-detector-js

Fast and accurate natural language detection. Detector written in...

52
Established
462 Ricardokevins/Kevinpro-NLP-demo

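All NLP you Need Here. Currently includes PyTorch implementations of 15 NLP demos (much of the code is borrowed from other open-source projects; it started as a personal project and was later open-sourced)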

52
Established
463 amaiya/causalnlp

CausalNLP is a practical toolkit for causal inference with text as...

52
Established
464 seanghay/awesome-khmer-language

A large collection of Khmer language resources. Khmer is a language used by Cambodia.

52
Established
465 labteral/ernie

Simple State-of-the-Art BERT-Based Sentence Classification with Keras /...

52
Established
466 MartinoMensio/spacy-dbpedia-spotlight

A spaCy wrapper for DBpedia Spotlight

52
Established
467 uoneway/KoBertSum

KoBertSum is a Korean summarization model that adapts the BertSum model so that it can be applied to Korean data.

52
Established
468 openvenues/libpostal

A C library for parsing/normalizing street addresses around the world....

52
Established
469 laugustyniak/awesome-sentiment-analysis

Repository with everything necessary for sentiment analysis and related areas

52
Established
470 RTIInternational/gobbli

Deep learning with text doesn't have to be scary.

52
Established
471 vrasneur/pyfasttext

Yet another Python binding for fastText

52
Established
472 changwookjun/nlp-paper

NLP Paper

52
Established
473 mmmaurer/elfen

A python package to efficiently extract linguistic features for text/NLP datasets

52
Established
474 jiesutd/NCRFpp

NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence...

51
Established
475 crownpku/Awesome-Chinese-NLP

A curated list of resources for Chinese NLP

51
Established
476 linuxscout/mishkal

Mishkal is an arabic text vocalization software

51
Established
477 brightmart/text_classification

all kinds of text classification models and more with deep learning

51
Established
478 jamesmullenbach/caml-mimic

multilabel classification of EHR notes

51
Established
479 qq547276542/Agriculture_KnowledgeGraph

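Agricultural knowledge graph (AgriKG): information retrieval, named entity recognition, relation extraction, intelligent question answering, and decision support for the agricultural domain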

51
Established
480 fido-ai/ua-datasets

A collection of datasets for Ukrainian language

51
Established
481 lonePatient/albert_pytorch

A Lite Bert For Self-Supervised Learning Language Representations

51
Established
482 ymcui/Chinese-XLNet

Pre-Trained Chinese XLNet

51
Established
483 apachecn/nlp-pytorch-zh

Chinese translation of "Natural Language Processing with PyTorch"

51
Established
484 carpedm20/lstm-char-cnn-tensorflow

in progress

51
Established
485 IndoNLP/indonlu

The first-ever vast natural language processing benchmark for Indonesian...

51
Established
486 lionsoul2014/jcseg

Jcseg is a light weight NLP framework developed with Java. Provide CJK and...

51
Established
487 openvenues/pypostal

Python bindings to libpostal for fast international address parsing/normalization

51
Established
488 ShawnyXiao/TextClassification-Keras

Text classification models implemented in Keras, including: FastText,...

51
Established
489 atulkum/pointer_summarizer

pytorch implementation of "Get To The Point: Summarization with...

51
Established
490 ChenglongChen/kaggle-HomeDepot

3rd Place Solution for HomeDepot Product Search Results Relevance...

51
Established
491 ChenRocks/fast_abs_rl

Code for ACL 2018 paper: "Fast Abstractive Summarization with...

51
Established
492 vncorenlp/VnCoreNLP

A Vietnamese natural language processing toolkit (NAACL 2018)

51
Established
493 akoumjian/datefinder

Find dates inside text using Python and get back datetime objects

51
Established
494 soumyadip007/Microsoft-Student-Partner-Workshop-Learning-Materials-AI-NLP

This repository contains all codes and materials of the current session. It...

51
Established
495 carpedm20/MemN2N-tensorflow

"End-To-End Memory Networks" in Tensorflow

51
Established
496 fossology/safaa

Agent to complement FOSSology's copyright scanner and find false positive findings.

51
Established
497 NirantK/NLP_Quickbook

NLP in Python with Deep Learning

51
Established
498 chakki-works/chariot

Deliver the ready-to-train data to your NLP model.

51
Established
499 FreedomIntelligence/TextClassificationBenchmark

A Benchmark of Text Classification in PyTorch

51
Established
500 guillaumegenthial/tf_ner

Simple and Efficient Tensorflow implementations of NER models with...

51
Established