Language Identification NLP Tools
Tools for automatically detecting and classifying the language of input text. Does NOT include language-specific NLP processing, multilingual models for downstream tasks, or code-switching analysis beyond language identification.
There are 38 language identification tools tracked. The highest-rated is pemistahl/lingua-py at 49/100 with 1,659 stars. Only 1 of the top 10 is actively maintained.
Get all 38 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=language-identification&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
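The same request can be made from a script. The following is a minimal Python sketch that reuses the endpoint and query parameters from the curl command above. The helper names (`build_url`, `fetch_projects`, `main`) are hypothetical, and the shape of the returned JSON is not documented on this page, so the code only decodes the payload and prints the start of it. The `limit` value is copied from the command; whether larger values are accepted is not stated here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="language-identification", limit=20):
    """Assemble the query URL with the same parameters as the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=10):
    """Download and decode the JSON payload.

    The payload structure is an unknown here, so it is returned as-is.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def main():
    """Fetch the dataset once and print the first part of the decoded JSON."""
    data = fetch_projects(build_url())
    print(json.dumps(data, indent=2)[:500])
```

Calling `main()` performs one network request, which counts against the daily quota mentioned above.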
| # | Tool | Score | Tier |
|---|---|---|---|
| 1 | pemistahl/lingua-py: The most accurate natural language detection library for Python, suitable... | | Emerging |
| 2 | nickdavidhaynes/spacy-cld: Language detection extension for spaCy 2.0+ | | Emerging |
| 3 | indix/whatthelang: Lightning Fast Language Prediction | | Emerging |
| 4 | mbanon/fastspell: Targeted language identifier, based on FastText and Hunspell. | | Emerging |
| 5 | nitotm/efficient-language-detector-js: Fast and accurate natural language detection. Detector written in... | | Emerging |
| 6 | nitotm/efficient-language-detector: Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD. | | Emerging |
| 7 | patrickschur/language-detection: A language detection library for PHP. Detects the language from a given text string. | | Emerging |
| 8 | Ankush-Chander/messandei: A simple stopword based language detector | | Emerging |
| 9 | searchpioneer/lingua-dotnet: Natural language detection library for .NET, suitable for long and short text alike | | Experimental |
| 10 | zamgi/lingvo--LanguageDetector: Implementation of detection for a few languages | | Experimental |
| 11 | honeybhardwaj/Language_Identification: A language identifier that detects different languages. | | Experimental |
| 12 | fedelopez77/langdetect: A language detection software | | Experimental |
| 13 | kaiidams/LanguageDetection: C# port of https://github.com/shuyo/language-detection | | Experimental |
| 14 | aravind-selvam/language_identification-using-cnn-and-audio-processing: A web application for language identification that uses PyTorch and... | | Experimental |
| 15 | nitotm/efficient-language-detector-py: Fast and accurate natural language detection. Detector written in Python.... | | Experimental |
| 16 | loonghuey/native-language-cnn: Speech subtask of the 2017 NLI Shared Task | | Experimental |
| 17 | tomelf/CNIT623-Native-Language-Identification-On-English-Learner-Dataset: Exploring how to identify the nationality of authors who answered exam... | | Experimental |
| 18 | lkevers/ldig-models-TAL62-3: Language identification models for 17 European official languages and... | | Experimental |
| 19 | floydhub/language-identification-template: Detect the languages from short pieces of text | | Experimental |
| 20 | ilinguistics/geoLid: Geographically-informed language identification | | Experimental |
| 21 | SomeAB/somelang: Natural Language Detection | | Experimental |
| 22 | Al00X/LanguageDetector: Detect language from a text string in Swift! | | Experimental |
| 23 | ffreemt/fast-langid: Detect language of a given text, fast | | Experimental |
| 24 | andrianllmm/tagLID: A word-level Language Identification (LID) tool for Tagalog-English (Taglish) text | | Experimental |
| 25 | Jason-Oleana/fasttext-language-detection: fastText language detection wrapped in FastAPI + Docker | | Experimental |
| 26 | PhilWicke/Language_Identifier: Language Identification classification using XGBoost | | Experimental |
| 27 | javadr/Language_Detection: Detection of the language of a text with Multinomial Naive Bayes method and... | | Experimental |
| 28 | br-pki/detectLanguage: To 1) create train/test samples of Tatoeba sentences for NLP-related tasks &... | | Experimental |
| 29 | Lidan0241/language-detection: A language detection model for code-switched texts in es/en/zh | | Experimental |
| 30 | masalha-alaa/native-language-recognition: Mother tongue prediction from Reddit posts (Deep Learning vs. Regular... | | Experimental |
| 31 | aparnadutta/code-mixed-lid: Word-level language identification for Bangla-English code-mixed social... | | Experimental |
| 32 | javadr/PyTorch-Detect-Code-Switching: Implementation of a deep learning model (BiLSTM) to detect code-switching | | Experimental |
| 33 | Interaction-Bot/LanguageDetection: Experimental language detector used by Interaction Bot. | | Experimental |
| 34 | Ehsan-Tavan/Language_Identification: Automatic detection of languages in text utilizing machine learning and Deep... | | Experimental |
| 35 | Aayushinit/LanguageDetectorApp: Real-time background subtraction using OpenCV + Flask with switchable... | | Experimental |
| 36 | giacomolat/MuseumLangID---Model-for-Identifying-the-Language-of-Texts-for-a-Museum: This repository contains a Language Identification project to classify... | | Experimental |
| 37 | xzhren/PreferenceAwareLID: Unsupervised Preference-Aware Language Identification | | Experimental |
| 38 | AigozhiyevB/kazakh-russian-classification: A small classification model for the Kazakh and Russian languages | | Experimental |