pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
Combines rule-based and statistical Naive Bayes classification, without neural networks or external dictionaries, to detect languages offline across 75 languages, from single words to full sentences. The models are trained on Leipzig University news corpora with separate train/test splits, and the README reports higher accuracy on short text than competing libraries such as CLD2 and Whatlang. It needs minimal configuration and ships with bundled language models, so it works immediately with no API dependencies.
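To make the "statistical Naive Bayes" part concrete, here is a minimal, standard-library-only sketch of character-trigram Naive Bayes language scoring. It is a toy illustration of the general technique, not lingua-rs's actual implementation or API: the `Model` struct, the `train`, `log_likelihood`, and `detect` functions, the add-one smoothing, and the uniform prior are all assumptions made for this example.

```rust
use std::collections::{HashMap, HashSet};

/// Toy per-language model (hypothetical): trigram counts from one sample text.
struct Model {
    name: &'static str,
    counts: HashMap<String, u32>,
    total: u32,
}

/// Lowercase the text and split it into overlapping character trigrams.
fn trigrams(text: &str) -> Vec<String> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    chars
        .windows(3)
        .map(|w| w.iter().collect::<String>())
        .collect()
}

/// Count every trigram in the sample to build a model for one language.
fn train(name: &'static str, sample: &str) -> Model {
    let mut counts = HashMap::new();
    let mut total = 0;
    for gram in trigrams(sample) {
        *counts.entry(gram).or_insert(0) += 1;
        total += 1;
    }
    Model { name, counts, total }
}

/// Sum of log P(trigram | language), using add-one smoothing so that
/// unseen trigrams get a small non-zero probability.
fn log_likelihood(model: &Model, text: &str, vocab_size: f64) -> f64 {
    trigrams(text)
        .iter()
        .map(|gram| {
            let count = *model.counts.get(gram).unwrap_or(&0) as f64;
            ((count + 1.0) / (model.total as f64 + vocab_size)).ln()
        })
        .sum()
}

/// Return the name of the highest-scoring model (uniform prior assumed),
/// or None when the text is too short to yield a single trigram.
fn detect(models: &[Model], text: &str) -> Option<&'static str> {
    if trigrams(text).is_empty() {
        return None;
    }
    // Vocabulary size = number of distinct trigrams seen across all models.
    let vocab: HashSet<&String> = models.iter().flat_map(|m| m.counts.keys()).collect();
    let vocab_size = vocab.len() as f64;
    models
        .iter()
        .map(|m| (m.name, log_likelihood(m, text, vocab_size)))
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
        .map(|(name, _)| name)
}
```

To try it, build one model per language with `train` (for example from a sentence of English and a sentence of German) and pass the slice of models to `detect`. A real detector would train on large corpora and, as the summary notes for this library, combine the statistical scores with rule-based checks.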
1,067 stars and 235,303 monthly downloads. Actively maintained with 2 commits in the last 30 days.
Stars: 1,067
Forks: 53
Language: Rust
License: Apache-2.0
Category:
Last pushed: Mar 09, 2026
Monthly downloads: 235,303
Commits (30d): 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/nlp/pemistahl/lingua-rs"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Related tools
forzagreen/n2words
Convert numerical numbers to written numbers, in 52+ languages.
PyThaiNLP/nlpo3
Thai natural language processing library in Rust, with Python and Node bindings.
greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
wikimedia/sentencex
A sentence segmentation library with wide language support optimized for speed and utility.
fbilhaut/gline-rs
Inference engine for GLiNER models, in Rust