Language Identification NLP Tools

Tools for automatically detecting and classifying the language of input text. Does NOT include language-specific NLP processing, multilingual models for downstream tasks, or code-switching analysis beyond language identification.

38 language identification tools are tracked. The highest-rated is pemistahl/lingua-py at 49/100 with 1,659 stars. Only 1 of the top 10 is actively maintained.

Get all 38 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=nlp&subcategory=language-identification&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
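For scripted access, the same request can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the helper names `build_url` and `fetch_projects` are hypothetical, the response is assumed to be JSON (as the heading above suggests), and its schema is not documented here, so the code returns the parsed payload without interpreting it. The query parameters mirror the curl command, including `limit=20`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="nlp", subcategory="language-identification", limit=20):
    """Build the request URL with the same query parameters as the curl call."""
    query = urlencode(
        {"domain": domain, "subcategory": subcategory, "limit": limit}
    )
    return f"{BASE_URL}?{query}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and parse the JSON payload (hypothetical helper).

    The structure of the returned object is an assumption left to the
    caller; nothing here relies on specific field names.
    """
    with urlopen(build_url(limit=limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects()` performs one unauthenticated request, which counts against the 100 requests/day allowance. Whether the endpoint accepts a larger `limit` (for example 38, to cover every tracked project) is an assumption that should be checked against the API itself.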

# Tool Score Tier
1 pemistahl/lingua-py

The most accurate natural language detection library for Python, suitable...

49
Emerging
2 nickdavidhaynes/spacy-cld

Language detection extension for spaCy 2.0+

47
Emerging
3 indix/whatthelang

Lightning Fast Language Prediction 🚀

47
Emerging
4 mbanon/fastspell

Targeted language identifier, based on FastText and Hunspell.

45
Emerging
5 nitotm/efficient-language-detector-js

Fast and accurate natural language detection. Detector written in...

45
Emerging
6 nitotm/efficient-language-detector

Fast and accurate natural language detection. Detector written in PHP. Nito-ELD, ELD.

39
Emerging
7 patrickschur/language-detection

A language detection library for PHP. Detects the language from a given text string.

37
Emerging
8 Ankush-Chander/messandei

A simple stopword based language detector

35
Emerging
9 searchpioneer/lingua-dotnet

Natural language detection library for .NET, suitable for long and short text alike

29
Experimental
10 zamgi/lingvo--LanguageDetector

Implementation of detection for a few languages

29
Experimental
11 honeybhardwaj/Language_Identification

A language identifier that detects different languages.

27
Experimental
12 fedelopez77/langdetect

A language detection software

27
Experimental
13 kaiidams/LanguageDetection

C# port of https://github.com/shuyo/language-detection

26
Experimental
14 aravind-selvam/language_identification-using-cnn-and-audio-processing

A web application for language identification that uses PyTorch and...

26
Experimental
15 nitotm/efficient-language-detector-py

Fast and accurate natural language detection. Detector written in Python....

26
Experimental
16 loonghuey/native-language-cnn

Speech subtask of the 2017 NLI Shared Task

24
Experimental
17 tomelf/CNIT623-Native-Language-Identification-On-English-Learner-Dataset

Exploring how to identify the nationality of authors who answered exam...

24
Experimental
18 lkevers/ldig-models-TAL62-3

Language identification models for 17 European official languages and...

23
Experimental
19 floydhub/language-identification-template

Detect the languages from short pieces of text

23
Experimental
20 ilinguistics/geoLid

Geographically-informed language identification

22
Experimental
21 SomeAB/somelang

Natural Language Detection

21
Experimental
22 Al00X/LanguageDetector

Detect language from a text string in Swift!

20
Experimental
23 ffreemt/fast-langid

Detect language of a given text, fast

19
Experimental
24 andrianllmm/tagLID

A word-level Language Identification (LID) tool for Tagalog-English (Taglish) text

17
Experimental
25 Jason-Oleana/fasttext-language-detection

Fasttext language detection wrapped in Fastapi + Docker 🐋

16
Experimental
26 PhilWicke/Language_Identifier

Language Identification classification using XGBoost

16
Experimental
27 javadr/Language_Detection

Detection of the language of a text with Multinomial Naive Bayes method and...

15
Experimental
28 br-pki/detectLanguage

To 1) create train/test samples of Tatoeba sentences for NLP-related tasks &...

14
Experimental
29 Lidan0241/language-detection

A language detection model for code-switched texts in es/en/zh

14
Experimental
30 masalha-alaa/native-language-recognition

Mother tongue prediction from Reddit posts (Deep Learning vs. Regular...

13
Experimental
31 aparnadutta/code-mixed-lid

Word-level language identification for Bangla-English code-mixed social...

13
Experimental
32 javadr/PyTorch-Detect-Code-Switching

Implementation of a deep learning model (BiLSTM) to detect code-switching

13
Experimental
33 Interaction-Bot/LanguageDetection

Experimental language detector used by Interaction Bot.

12
Experimental
34 Ehsan-Tavan/Language_Identification

Automatic detection of languages in text utilizing machine learning and Deep...

12
Experimental
35 Aayushinit/LanguageDetectorApp

Real-time background subtraction using OpenCV + Flask with switchable...

11
Experimental
36 giacomolat/MuseumLangID---Model-for-Identifying-the-Language-of-Texts-for-a-Museum

This repository contains a Language Identification project to classify...

11
Experimental
37 xzhren/PreferenceAwareLID

Unsupervised Preference-Aware Language Identification

11
Experimental
38 AigozhiyevB/kazakh-russian-classification

A small model for classifying the Kazakh and Russian languages

10
Experimental