miso-belica/sumy
Module for automatic summarization of text documents and HTML pages.
Implements multiple extractive summarization algorithms (LSA, LexRank, Luhn, Edmundson) with multilingual tokenizer support across 50+ languages. Provides both a Python API and a CLI, plus a built-in evaluation framework for comparing summaries against reference texts. Accepts HTML pages, plain text, and URLs as input, with a configurable summary length.
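To make "extractive summarization with a configurable output length" concrete, here is a minimal, dependency-free sketch. It is not sumy's code and does not use sumy's API: it swaps in plain word-frequency scoring (loosely in the spirit of Luhn's method) rather than the LSA, LexRank, Luhn, or Edmundson implementations the package ships. The function names, the tiny stop-word list, and the regex-based sentence splitter are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real tools use per-language lists).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it",
              "that", "for", "on", "with", "as", "are", "was", "by"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def summarize(text, sentence_count=2):
    """Return the top-scoring sentences, kept in original document order."""
    sentences = split_sentences(text)
    # Document-wide word frequencies drive the sentence scores.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score; ties keep document order (stable sort).
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:sentence_count])
    return [sentences[i] for i in chosen]
```

Calling `summarize(text, sentence_count=3)` returns three sentences copied verbatim from the input, which is what distinguishes extractive from abstractive summarization. A production summarizer would add proper sentence tokenization, stemming, and language-specific stop words.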
3,665 stars and 165,186 monthly downloads. Used by 3 other packages. Actively maintained with 1 commit in the last 30 days. Available on PyPI.
Stars: 3,665
Forks: 541
Language: Python
License: Apache-2.0
Category:
Last pushed: Feb 14, 2026
Monthly downloads: 165,186
Commits (30d): 1
Dependencies: 7
Reverse dependents: 3
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/miso-belica/sumy"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
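The same request can be made from Python with only the standard library. This is a rough sketch under stated assumptions: the helper names are hypothetical, the parameter name `category` for the `nlp` path segment is a guess based on the URL pattern above, and the response is assumed (not confirmed by this page) to be JSON.

```python
import json
import urllib.request

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL following the pattern of the curl example.

    'category' is a hypothetical name for the 'nlp' path segment.
    """
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10):
    """Fetch the record for one repository.

    Assumes, without confirmation from the page, that the body is JSON.
    """
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (performs a network request, counted against the daily limit):
# data = fetch_quality("nlp", "miso-belica", "sumy")
```

With the keyless limit of 100 requests/day, caching the result locally is a sensible precaution when polling more than a handful of repositories.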
Related tools
theeluwin/lexrankr
LexRank for Korean.
summanlp/textrank
TextRank implementation for Python 3.
Wordcab/wordcab-python
📖 Transcribe and Summarize any business communication at scale with Wordcab's API
ArtistScript/FastTextRank
Chinese text summarization and keyword extraction
ebenso/TextSummarizer
TextRank implementation for C#