lingua-rs and whatlang-rs
Both are standalone natural language detection libraries that compete for the same use case, though lingua-rs claims superior accuracy while whatlang-rs emphasizes speed and minimal dependencies.
About lingua-rs
pemistahl/lingua-rs
The most accurate natural language detection library for Rust, suitable for short text and mixed-language text
Combines rule-based and statistical Naive Bayes classification without neural networks or external dictionaries, enabling offline language detection from single words to full sentences across 75 languages. Trained on Leipzig University corpora with separate train/test splits from news data, and reports higher accuracy on short text than competing libraries such as CLD2 and Whatlang. Requires minimal configuration and ships with bundled language models, so it works immediately without calling any external API.
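To make the statistical half of that description concrete, here is a minimal sketch of Naive Bayes scoring over character n-grams, using only the Rust standard library. It is not lingua-rs's API or its internal implementation: `NgramModel`, `train`, `score`, `detect`, and the `UNSEEN_LOG_PROB` floor are hypothetical names and values chosen for illustration, and the rule-based stage is left out entirely.

```rust
use std::collections::HashMap;

/// Toy per-language model: log relative frequency of each character n-gram.
/// Hypothetical structure for illustration, not a lingua-rs type.
struct NgramModel {
    language: &'static str,
    log_probs: HashMap<String, f64>,
}

/// Split lowercased text into overlapping character n-grams.
fn char_ngrams(text: &str, n: usize) -> Vec<String> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    if n == 0 || chars.len() < n {
        return Vec::new();
    }
    chars.windows(n).map(|w| w.iter().collect()).collect()
}

/// Count n-grams in a training corpus and store their log relative frequencies.
fn train(language: &'static str, corpus: &str, n: usize) -> NgramModel {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for gram in char_ngrams(corpus, n) {
        *counts.entry(gram).or_insert(0) += 1;
    }
    let total: usize = counts.values().sum();
    let log_probs = counts
        .into_iter()
        .map(|(gram, c)| (gram, (c as f64 / total as f64).ln()))
        .collect();
    NgramModel { language, log_probs }
}

/// Naive Bayes log-likelihood: sum the log probability of every n-gram,
/// treating n-grams as independent. Unseen n-grams get a fixed floor
/// (an assumed, crude form of smoothing).
fn score(model: &NgramModel, text: &str, n: usize) -> f64 {
    const UNSEEN_LOG_PROB: f64 = -12.0;
    char_ngrams(text, n)
        .iter()
        .map(|g| *model.log_probs.get(g).unwrap_or(&UNSEEN_LOG_PROB))
        .sum()
}

/// Return the language whose model gives the text the highest score,
/// or None if the text is too short to produce any n-gram.
fn detect(models: &[NgramModel], text: &str, n: usize) -> Option<&'static str> {
    if char_ngrams(text, n).is_empty() {
        return None;
    }
    models
        .iter()
        .map(|m| (m.language, score(m, text, n)))
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(language, _)| language)
}
```

With one model trained per language on a small sample sentence, `detect(&models, "the dog", 3)` picks whichever model has seen the most of the input's trigrams. A production library would use much larger corpora and principled smoothing, and may combine several n-gram lengths.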
About whatlang-rs
greyblake/whatlang-rs
Natural language detection library for Rust. Try demo online: https://whatlang.org/
Detects 70 languages using trigram-based language models, returning the identified language and script (Latin, Cyrillic, etc.) together with a confidence value. Built entirely in Rust with zero external dependencies, it calculates detection reliability using a hyperbolic threshold function that considers the number of unique trigrams in the input and the score difference between the top candidate languages. FFI bindings are available, optional serde and enum-map support is provided for ecosystem compatibility, and the library is used by search engines such as Meilisearch and Sonic.
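The reliability idea can be sketched as follows: the fewer unique trigrams the input contains, the larger the lead the best language must have over the runner-up before the result counts as reliable. This is a standard-library-only illustration, not whatlang-rs's actual code. The function names and the constants `A` and `B` are assumptions made for the example.

```rust
use std::collections::HashSet;

/// Count the distinct character trigrams in a text.
fn unique_trigram_count(text: &str) -> usize {
    let chars: Vec<char> = text.chars().collect();
    chars
        .windows(3)
        .map(|w| w.iter().collect::<String>())
        .collect::<HashSet<String>>()
        .len()
}

/// Hyperbolic threshold: the relative lead required of the best language
/// falls off as 1/x with the number of unique trigrams x.
/// A and B are illustrative constants, not values taken from the library.
fn required_lead(unique_trigrams: usize) -> f64 {
    const A: f64 = 3.0;
    const B: f64 = 0.015;
    A / (unique_trigrams.max(1) as f64) + B
}

/// Confidence in [0, 1]: the observed relative lead of the best score over
/// the runner-up, divided by the required lead and clamped.
fn confidence(best: f64, runner_up: f64, unique_trigrams: usize) -> f64 {
    if best <= 0.0 {
        return 0.0; // no evidence for any language
    }
    if runner_up <= 0.0 {
        return 1.0; // no competing language scored at all
    }
    let lead = (best - runner_up) / runner_up;
    (lead / required_lead(unique_trigrams)).clamp(0.0, 1.0)
}

/// In this sketch a detection is reliable only when the lead meets the threshold.
fn is_reliable(confidence: f64) -> bool {
    confidence >= 1.0
}
```

Under these assumed constants, a 20% lead is enough for a text with 300 unique trigrams (required lead about 0.025) but far from enough for a text with only 3 (required lead about 1.015), which matches the intuition that short inputs need a much clearer winner.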