adbar/simplemma

Simple multilingual lemmatizer for Python, especially useful for speed and efficiency

Score: 54 / 100 (Established)

Operates on pre-built morphological dictionaries covering 49 languages, so it lemmatizes without morphosyntactic tagging or external dependencies. Optional greedy decomposition handles affixes and compound words, and languages can be chained as fallbacks to improve coverage. It also ships a built-in tokenizer and a `lang_detector()` function for automatic language identification. The pure-Python implementation has a small footprint, which makes it suitable for low-resource environments, educational use, or baseline NLP systems.
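To make the dictionary-lookup idea concrete, here is a toy sketch of lookup with a language fallback chain and an optional greedy step for compounds. Everything in it is hypothetical: the `TOY_DICTS` tables, the `toy_lemmatize` name, and the "longest known suffix" rule are illustrative assumptions and are far simpler than the data and logic simplemma actually ships.

```python
# Toy illustration only: tiny hand-made tables, not simplemma's data or code.
TOY_DICTS = {
    "en": {"mice": "mouse", "ran": "run", "houses": "house"},
    "de": {"häuser": "haus", "ging": "gehen"},
}


def toy_lemmatize(token, langs, greedy=False):
    """Return a lemma for `token`, trying each language in `langs` in order."""
    word = token.lower()

    # Fallback chain: the first language whose table knows the form wins.
    for lang in langs:
        lemma = TOY_DICTS.get(lang, {}).get(word)
        if lemma is not None:
            return lemma

    if greedy:
        # Hypothetical greedy step: find the longest suffix that is a known
        # form, lemmatize it, and keep the remaining prefix unchanged.
        for lang in langs:
            table = TOY_DICTS.get(lang, {})
            for i in range(1, len(word)):
                tail = word[i:]
                if tail in table:
                    return word[:i] + table[tail]

    # Unknown forms are returned unchanged.
    return token
```

With these toy tables, `toy_lemmatize("Häuser", ("en", "de"))` falls through the English table and is resolved by the German one, while `toy_lemmatize("fieldmice", ("en",), greedy=True)` is resolved through its known suffix `mice`.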

188 stars and 59,856 monthly downloads. Used by 2 other packages. No commits in the last 6 months. Available on PyPI.

Stale 6m
Maintenance 2 / 25
Adoption 22 / 25
Maturity 18 / 25
Community 12 / 25

Stars: 188
Forks: 15
Language: Python
License: MIT
Last pushed: Jun 06, 2025
Monthly downloads: 59,856
Commits (30d): 0
Reverse dependents: 2

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/adbar/simplemma"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
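The same request can be made from Python with only the standard library. This is a minimal sketch under stated assumptions: the `category/owner/repo` path layout is inferred from the single URL above, the helper names `quality_url` and `fetch_quality` are made up for illustration, and the response is returned as raw text because its format is not documented here.

```python
import urllib.request

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path layout is an assumption from one example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record and return the response body as text.

    Decoding as UTF-8 is an assumption; the body is not parsed further.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_quality("nlp", "adbar", "simplemma")` issues the same GET request as the curl command and counts against the same daily limit.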