Word Stemming Stemmers NLP Tools
Tools and libraries for reducing words to their root or base form through stemming algorithms across various languages. Includes language-specific stemmers, Porter stemming implementations, and multilingual stemming frameworks. Does NOT include lemmatization, morphological analysis beyond stemming, or general text normalization.
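To make the scope concrete, here is a minimal sketch of what "reducing words to their root or base form" can look like. It is a toy suffix stripper, not the Porter algorithm and not the method of any tool listed here; the `SUFFIXES` rule list and the `naive_stem` helper are hypothetical names introduced only for illustration.

```python
# Toy suffix-stripping stemmer: illustrative only, NOT the Porter algorithm.
# SUFFIXES is a hypothetical rule list; rules are tried in the order given.
SUFFIXES = ("ing", "ed", "ly", "es", "s")


def naive_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word  # no rule applied; return the word unchanged


for w in ["jumping", "jumped", "jumps", "quickly", "was"]:
    print(w, "->", naive_stem(w))
```

Real stemmers such as the Porter and Snowball implementations in this list apply many more rules and conditions. A lemmatizer, by contrast, maps a word to its dictionary form, which is why lemmatization is excluded from the scope above.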
53 word-stemming tools are tracked. One scores above 70 (Verified tier). The highest-rated is hplt-project/sacremoses at 71/100, with 495 stars and 2,424,232 monthly downloads.
Get all 53 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=word-stemming-stemmers&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
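The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the endpoint and query parameters are copied from the curl command above, the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented here, so the code only decodes it without assuming any field names. The sample URL uses `limit=20`; whether a larger value returns all 53 projects is not confirmed by this page.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="word-stemming-stemmers", limit=20):
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urllib.parse.urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and decode the JSON response (no API key; response schema not assumed)."""
    with urllib.request.urlopen(build_url(limit=limit), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_projects()` returns the decoded JSON as ordinary Python objects. Inspect the result before relying on any particular keys, and keep the 100 requests/day limit in mind when running it in a loop.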
| # | Tool | Description | Score | Tier |
|---|---|---|---|---|
| 1 | hplt-project/sacremoses | Python port of Moses tokenizer, truecaser and normalizer | 71 | Verified |
| 2 | adbar/simplemma | Simple multilingual lemmatizer for Python, especially useful for speed and efficiency | | Established |
| 3 | winkjs/wink-porter2-stemmer | JavaScript implementation of Porter Stemmer Algorithm V2 by Dr Martin F Porter | | Established |
| 4 | sorenlind/lemmy | Lemmy is a lemmatizer for Danish and Swedish | | Emerging |
| 5 | htaghizadeh/PersianStemmer-Python | PersianStemmer-Python | | Emerging |
| 6 | Blake-Madden/OleanderStemmingLibrary | Porter stemming library (C++) | | Emerging |
| 7 | winkjs/wink-lemmatizer | English lemmatizer | | Emerging |
| 8 | WZBSocialScienceCenter/germalemma | A lemmatizer for German language text | | Emerging |
| 9 | michmech/lemmatization-lists | Machine-readable lists of lemma-token pairs in 23 languages. | | Emerging |
| 10 | bastienbot/nlp-js-tools-french | POS tagger, lemmatizer and stemmer for the French language in JavaScript | | Emerging |
| 11 | damzaky/sastrawijs | Indonesian language stemmer. JavaScript port of PHP Sastrawi project. | | Emerging |
| 12 | donderom/stemerge | A collection of stemmers in Erlang | | Emerging |
| 13 | deniskyashif/ssfst | Rewrite text in linear time. | | Emerging |
| 14 | master/spark-stemming | Spark MLlib wrapper for the Snowball framework | | Emerging |
| 15 | LeonieWeissweiler/CISTEM | Stemmer for German | | Emerging |
| 16 | yohasebe/lemmatizer | Lemmatizer for text in English. Inspired by Python's... | | Emerging |
| 17 | xiamx/gen_fst | Elixir module that implements a generic finite state transducer with... | | Emerging |
| 18 | xiamx/lemma | A Morphological Parser (Analyser) / Lemmatizer written in Elixir. | | Emerging |
| 19 | putuwaw/linggapy | Library for stemming Balinese text | | Emerging |
| 20 | luridarmawan/StemmingWord | Web-based StemmingWord tool, written in Pascal using the FastPlaz framework | | Emerging |
| 21 | zentrum-lexikographie/sfst-transduce | Python bindings for SFST focusing on transducer usage | | Emerging |
| 22 | writecrow/lemmatizer | A PHP library for getting a lemma from a given word, and getting a list of... | | Emerging |
| 23 | htaghizadeh/JPersianStemmer | Persian stemmer | | Emerging |
| 24 | FinNLP/lemmatizer | English word lemmatizer | | Experimental |
| 25 | dzieciou/pystempel | Python port of Stempel, an algorithmic stemmer for Polish language. | | Experimental |
| 26 | tokenmill/snowball | Snowball version of the Porter stemmer for the Lithuanian language. | | Experimental |
| 27 | Cirice/Ereina | Language rules for Persian texts | | Experimental |
| 28 | SeekStorm/snowball-stemmers-rs | snowball_stemmers_rs: a Snowball stemmer in 38 languages, in Rust | | Experimental |
| 29 | andrianllmm/tagalog-stemmer | A Python library for Tagalog word stemming | | Experimental |
| 30 | andrianllmm/aklanon-stemmer | A Python library for Aklanon word stemming | | Experimental |
| 31 | naomilago/pt_lemmatizer | This repo aims to store code for a Portuguese Lemmatizer, a PyPI package. | | Experimental |
| 32 | anishLearnsToCode/porter-stemmer | Python Implementation of the Famous Porter Stemmer Algorithm used in... | | Experimental |
| 33 | kampsy/gwizo | Simple Go implementation of the Porter Stemmer algorithm with powerful features. | | Experimental |
| 34 | htaghizadeh/PersianStemmer | A New Rule-Based Persian Stemmer Using Regular Expression | | Experimental |
| 35 | stdlib-js/nlp-porter-stemmer | Extract the stem of a given word. | | Experimental |
| 36 | greenat92/arabicstemmer_frontend | Frontend web app for Snowball Arabic stemmer algorithm | | Experimental |
| 37 | mshka/farsi_processor | Farsi processor is a Ruby gem to process (stem and normalize) Persian/Farsi text | | Experimental |
| 38 | domPatera/stemmer-bundle | This bundle integrates the dompat/stemmer library into Symfony. It provides... | | Experimental |
| 39 | domPatera/stemmer | PHP Library for word stemming. This library helps reduce words to their base... | | Experimental |
| 40 | joom/Divan.hs | Ottoman Divan poetry vezin checker in Haskell! | | Experimental |
| 41 | golang-nlp/stopwords | Stopwords module for golang | | Experimental |
| 42 | Flight-School/lemma | A command-line utility that lemmatizes words in natural language text. | | Experimental |
| 43 | rojvv/rustress | JavaScript library to mark stresses in Russian text. | | Experimental |
| 44 | openderocknlp/extract-lemmatized-nonstop-words | Extracts a pure list of stemmed words of a text filtered by stop words | | Experimental |
| 45 | antonbaumann/german-go-stemmer | An efficient implementation of the German porter-stemming algorithm in Golang. | | Experimental |
| 46 | N8Brooks/snowball | Snowball stemmers for Deno. | | Experimental |
| 47 | ancatmara/early-irish-lemmatizer | A DIL-based lemmatizer for Early Irish data. | | Experimental |
| 48 | FinNLP/en-stemmer | Porter stemmer implementation | | Experimental |
| 49 | renan823/portuguese-stemmer | Go implementation of Snowball Portuguese Stemmer | | Experimental |
| 50 | crlwingen/TagalogWordStemmer | Tagalog Word Stemmer made with Java. | | Experimental |
| 51 | ontypehq/libfst | Finite-state transducer library for text normalization | | Experimental |
| 52 | olga-black/truecase_german | A program for truecasing German text with incorrect capitalization | | Experimental |
| 53 | Wollaston/ArabicStemmer | A small web app that uses NLTK's Arabic stemming algorithms to identify the... | | Experimental |