Word Stemming and Stemmer NLP Tools

Tools and libraries for reducing words to their root or base form through stemming algorithms across various languages. Includes language-specific stemmers, Porter stemming implementations, and multilingual stemming frameworks. Does NOT include lemmatization, morphological analysis beyond stemming, or general text normalization.
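As a rough illustration of what "reducing words to their root or base form" means, here is a minimal suffix-stripping sketch. It is a toy, not the Porter algorithm and not the method of any project listed here; the suffix list, the `toy_stem` name, and the `min_stem` parameter are assumptions made for the example.

```python
# Toy suffix-stripping stemmer (illustrative only; NOT the Porter algorithm).
# The suffix list and minimum stem length are assumptions for this sketch.
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order


def toy_stem(word: str, min_stem: int = 3) -> str:
    """Strip the first matching suffix, keeping at least `min_stem` characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


if __name__ == "__main__":
    for w in ("jumping", "jumped", "jumps", "boxes", "sing"):
        print(w, "->", toy_stem(w))
```

Real stemmers apply many more rules and conditions. Even so, a stem does not have to be a dictionary word, which is one way stemming differs from lemmatization.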

There are 53 word stemming and stemmer tools tracked. 1 scores above 70 (verified tier). The highest-rated is hplt-project/sacremoses at 71/100 with 495 stars and 2,424,232 monthly downloads.

Get all 53 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=word-stemming-stemmers&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
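The same request can be made from Python using only the standard library. This is a sketch under stated assumptions: the helper names `build_url` and `fetch_projects` are hypothetical, the response schema is not documented on this page (so the parsed JSON is returned as-is), and whether the endpoint accepts a `limit` above 20 is an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain: str = "nlp",
              subcategory: str = "word-stemming-stemmers",
              limit: int = 20) -> str:
    """Build the dataset URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(limit: int = 20, timeout: float = 10.0):
    """Fetch and parse the JSON response (hypothetical helper).

    The response structure is not documented here, so nothing is
    assumed about its keys; the parsed JSON is returned unchanged.
    """
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    print(build_url())
```

Calling `fetch_projects()` performs a network request and counts against the daily request limit mentioned above.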

# Tool Score Tier
1 hplt-project/sacremoses

Python port of Moses tokenizer, truecaser and normalizer

71
Verified
2 adbar/simplemma

Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

54
Established
3 winkjs/wink-porter2-stemmer

Javascript Implementation of Porter Stemmer Algorithm V2 by Dr Martin F Porter

53
Established
4 sorenlind/lemmy

🀘Lemmy is a lemmatizer for Danish πŸ‡©πŸ‡° and Swedish πŸ‡ΈπŸ‡ͺ

49
Emerging
5 htaghizadeh/PersianStemmer-Python

PersianStemmer-Python

47
Emerging
6 Blake-Madden/OleanderStemmingLibrary

Porter stemming library (C++)

47
Emerging
7 winkjs/wink-lemmatizer

English lemmatizer

47
Emerging
8 WZBSocialScienceCenter/germalemma

A lemmatizer for German language text

46
Emerging
9 michmech/lemmatization-lists

Machine-readable lists of lemma-token pairs in 23 languages.

43
Emerging
10 bastienbot/nlp-js-tools-french

POS Tagger, lemmatizer and stemmer for french language in javascript

42
Emerging
11 damzaky/sastrawijs

Indonesian language stemmer. Javascript port of PHP Sastrawi project.

42
Emerging
12 donderom/stemerge

A collection of stemmers in Erlang 🌱

39
Emerging
13 deniskyashif/ssfst

Rewrite text in linear time.

37
Emerging
14 master/spark-stemming

Spark MLlib wrapper for the Snowball framework

35
Emerging
15 LeonieWeissweiler/CISTEM

Stemmer for German

34
Emerging
16 yohasebe/lemmatizer

Lemmatizer for text in English. Inspired by Python's...

33
Emerging
17 xiamx/gen_fst

Elixir module that implements a generic finite state transducer with...

32
Emerging
18 xiamx/lemma

A Morphological Parser (Analyser) / Lemmatizer written in Elixir.

32
Emerging
19 putuwaw/linggapy

Library for Stemming Balinese Text Language

32
Emerging
20 luridarmawan/StemmingWord

Web-based StemmingWord tool, written in Pascal with the FastPlaz framework

31
Emerging
21 zentrum-lexikographie/sfst-transduce

Python bindings for SFST focusing on transducer usage

30
Emerging
22 writecrow/lemmatizer

A PHP library for getting a lemma from a given word, and getting a list of...

30
Emerging
23 htaghizadeh/JPersianStemmer

Persian stemmer

30
Emerging
24 FinNLP/lemmatizer

📦 English word lemmatizer

29
Experimental
25 dzieciou/pystempel

Python port of Stempel, an algorithmic stemmer for Polish language.

27
Experimental
26 tokenmill/snowball

Snowball version of the Porter stemmer for the Lithuanian language.

27
Experimental
27 Cirice/Ereina

Language rules for Persian texts

26
Experimental
28 SeekStorm/snowball-stemmers-rs

snowball_stemmers_rs: a Snowball stemmer in 38 languages, in Rust

26
Experimental
29 andrianllmm/tagalog-stemmer

A Python library for Tagalog word stemming

24
Experimental
30 andrianllmm/aklanon-stemmer

A Python library for Aklanon word stemming

24
Experimental
31 naomilago/pt_lemmatizer

This repo aims to store code for a Portuguese Lemmatizer, a PyPI package.

23
Experimental
32 anishLearnsToCode/porter-stemmer

Python Implementation of the Famous Porter Stemmer Algorithm used in...

23
Experimental
33 kampsy/gwizo

Simple Go implementation of the Porter Stemmer algorithm with powerful features.

23
Experimental
34 htaghizadeh/PersianStemmer

A New Rule-Based Persian Stemmer Using Regular Expression

23
Experimental
35 stdlib-js/nlp-porter-stemmer

Extract the stem of a given word.

22
Experimental
36 greenat92/arabicstemmer_frontend

frontend web app for snowball arabic stemmer algorithm

22
Experimental
37 mshka/farsi_processor

Farsi processor is a Ruby gem to process (stem and normalize) Persian/Farsi text

21
Experimental
38 domPatera/stemmer-bundle

This bundle integrates the dompat/stemmer library into Symfony. It provides...

20
Experimental
39 domPatera/stemmer

PHP Library for word stemming. This library helps reduce words to their base...

20
Experimental
40 joom/Divan.hs

Ottoman Divan poetry vezin checker in Haskell!

19
Experimental
41 golang-nlp/stopwords

Stopwords module for golang

17
Experimental
42 Flight-School/lemma

A command-line utility that lemmatizes words in natural language text.

16
Experimental
43 rojvv/rustress

JavaScript library to mark stresses in Russian text.

15
Experimental
44 openderocknlp/extract-lemmatized-nonstop-words

Extracts a pure list of stemmed words of a text filtered by stop words

13
Experimental
45 antonbaumann/german-go-stemmer

An efficient implementation of the German porter-stemming algorithm in Golang.

13
Experimental
46 N8Brooks/snowball

⛄ Snowball stemmers for Deno.

13
Experimental
47 ancatmara/early-irish-lemmatizer

A DIL-based lemmatizer for Early Irish data.

12
Experimental
48 FinNLP/en-stemmer

📦 Porter stemmer implementation

11
Experimental
49 renan823/portuguese-stemmer

Go implementation of Snowball Portuguese Stemmer

11
Experimental
50 crlwingen/TagalogWordStemmer

Tagalog Word Stemmer made with Java.

11
Experimental
51 ontypehq/libfst

Finite-state transducer library for text normalization

11
Experimental
52 olga-black/truecase_german

A program for truecasing German text with incorrect capitalization

10
Experimental
53 Wollaston/ArabicStemmer

A small web app that uses NLTK's Arabic stemming algorithms to identify the...

10
Experimental