Hate Speech Detection Transformer Models

Tools and models for identifying, classifying, and mitigating hate speech, offensive language, and toxic content in text. Does NOT include general sentiment analysis, stance detection, or content moderation for non-hateful policy violations.

There are 52 hate speech detection models tracked. The highest-rated is viddexa/moderators at 37/100 with 5 stars and 129 monthly downloads.

Get all 52 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=hate-speech-detection&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
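The same request can be made from a script. The following is a minimal Python sketch using only the standard library; it rebuilds the URL from the curl example above and decodes the response as JSON. The function names `build_url` and `fetch_models` are hypothetical helpers introduced for illustration, and the shape of the returned JSON is not documented here, so the sketch returns it undecoded beyond `json.load`. Note that the example URL uses `limit=20`; whether a larger `limit` returns all 52 entries in one call is an assumption to verify against the API.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20, domain="transformers", subcategory="hate-speech-detection"):
    """Return the request URL with the same query parameters as the curl example."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_models(limit=20, timeout=30):
    """Fetch and decode the JSON response (hypothetical helper; schema not assumed).

    No API key is sent, so the keyless daily request limit applies.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_models()` performs one network request and returns the decoded payload; inspect it before relying on any particular field names.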

| # | Model | Description | Score | Tier |
|---|-------|-------------|-------|------|
| 1 | viddexa/moderators | One package to moderate them all | 37 | Emerging |
| 2 | StyrbjornKall/TRIDENT | A collection of transformer-based models and developmental scripts presented... | 32 | Emerging |
| 3 | Nithin-Holla/meme_challenge | Repository containing code from team Kingsterdam for the Hateful Memes Challenge | 32 | Emerging |
| 4 | jaygala24/fed-hate-speech | The official code repository for the paper titled "A Federated Approach for... | 27 | Experimental |
| 5 | richouzo/hate-speech-detection-survey | Trained Neural Networks (LSTM, HybridCNN/LSTM, PyramidCNN, Transformers,... | 26 | Experimental |
| 6 | MusadiqPasha/Turkish-Hate-Speech-Classification-Explanation | Classify, explain, and rewrite Turkish hate speech tweets using BERT, SHAP,... | 26 | Experimental |
| 7 | muhammadadyl/SarcasmDetection | This project used in my dissertation for detecting sarcasm in twitter... | 26 | Experimental |
| 8 | MatteoFasulo/Sexism-detection | Natural Language Processing for Sexism Detection | 24 | Experimental |
| 9 | ilias-ant/toxic-spans-detection | An attempt at SemEval 2021 Task 5: Toxic Spans Detection. | 24 | Experimental |
| 10 | iqbal-sk/Detecting-Persuasion-Techniques-in-Memes | Hierarchical, multilingual, multimodal detection of persuasion techniques in... | 23 | Experimental |
| 11 | leorrose/ChatGPT-Hate-Speech | Ben Gurion University "Natural Language Processing (372.2.5702)" course project | 23 | Experimental |
| 12 | TimeLordRaps/satisfiable-ai | Verified training data for frontier AI. Every sample passes a SAT gate.... | 23 | Experimental |
| 13 | avrtt/telegram-content-moderator | NLP/ViT-driven bot for detection & moderation of inappropriate content in... | 23 | Experimental |
| 14 | GU-DataLab/stance-detection-KE-MLM | Official resource of the paper "Knowledge Enhanced Masked Language Model for... | 22 | Experimental |
| 15 | premiouhxu4525/tinysafe-2 | Classify text as safe or unsafe using a 141M parameter DeBERTa-v3 model with... | 22 | Experimental |
| 16 | jdleo/tinysafe-1 | 71M parameter safety classifier (DeBERTa-v3-xsmall). Dual-head: binary... | 21 | Experimental |
| 17 | jdleo/tinysafe-2 | 141M param safety model (not much better than v1, but a great learning) | 20 | Experimental |
| 18 | nikhil6041/OLI-and-Meme-Classification | Author's implementation of the paper... | 20 | Experimental |
| 19 | cvcio/rtaa-classifier | Comments & Twitter accounts gRPC classification service. | 19 | Experimental |
| 20 | KvaytG/ru-toxicity-detector | A simple toxicity detector. | 19 | Experimental |
| 21 | shruti-sivakumar/Multimodal-Hateful-Memes-Detection | Multimodal deep learning pipeline for hateful meme detection using ResNet50... | 19 | Experimental |
| 22 | nayanpreet/AI-Powered-Toxic-Comment-Detection-and-Moderation-System | Transformer-based multi-label toxicity classifier with GenAI-assisted... | 19 | Experimental |
| 23 | YukiFujimatsu/Personalized-Flaming-Prediction | Implementation of personalized real-time flaming risk prediction model (IIAI... | 19 | Experimental |
| 24 | ArunavaKumar/offenseval-nlp | Transformer-based offensive language detection using DistilBERT embeddings,... | 19 | Experimental |
| 25 | pkdubey/content_moderation | An AI-powered content moderation system using Python and Hugging Face... | 16 | Experimental |
| 26 | eftekhar-hossain/CUET_NLP-EACL_2021 | This repository contains the system description and the codes that we... | 16 | Experimental |
| 27 | gerzin/irony-and-sarcasm-detector-italian | NLP project that analyses Italian tweets and finds out if they are ironic or... | 15 | Experimental |
| 28 | HumasFurquan/Hate-Speech-Detection-2.0 | End-to-end hate speech detection system using Transformer-based NLP models,... | 15 | Experimental |
| 29 | chuachinhon/transformers_state_trolls_cch | Detect state trolls on Twitter using Transformers + Comparison of results... | 15 | Experimental |
| 30 | Fabio295/tinysafe-1 | Detect harmful content with a 71M-parameter safety classifier using... | 14 | Experimental |
| 31 | Brahmendra-Ramoju/TrustLayer_AI | AI-powered content moderation API with toxicity detection and trust scoring... | 14 | Experimental |
| 32 | jaychampaneri14/content-moderator | Multi-label content moderation for text and images | 14 | Experimental |
| 33 | mathildeoutters/Detect-patronizing-language | Participation to SemEval-2022 Task4 - Patronizing and Condescending Language... | 14 | Experimental |
| 34 | AditiBagora/Hasoc2021CodeMix | HASOC2021: Subtask 2 a) Codemix Challenge; Contains baselines and... | 14 | Experimental |
| 35 | minuva/fast-nlp-text-toxicity | Fast text toxicity classification model | 12 | Experimental |
| 36 | Damarcreative/secure-upload | Remove adult content in discord channels better with Artificial Intelligence. | 12 | Experimental |
| 37 | hiyouga/Toxic_Detection | BUAA SCSE Autumn 2021 Machine Learning Group Homework | 12 | Experimental |
| 38 | training-datalab/gold-standard-toxicity | Gold Standard for Toxicity and Incivility Project | 12 | Experimental |
| 39 | muzmax/MSTAR_feature_extraction | General Feature Extraction in SAR Target Classification: A Contrastive... | 11 | Experimental |
| 40 | CoGian/Detecting-toxic-comments-and-minimizing-of-unintetional-prejudice-using-neural-networks | This is my repository and all the code needed to complete my Bachelor thesis... | 11 | Experimental |
| 41 | atrip0305/hate-speech-detection | A machine learning–based hate speech detection system that classifies text... | 11 | Experimental |
| 42 | yellatp/detoxify-telugu | A Fine-Tuned BERT-Based Language Model for Hate Speech Detection in Telugu & Tenglish | 11 | Experimental |
| 43 | Kirti-Vatsh/NLP---Toxic-Comment-Classification | Classifying toxic comments using NLP, machine learning, and deep learning.... | 11 | Experimental |
| 44 | Sayan-Mondal2022/comment_toxicity_classifier | This is an NLP project that detects and labels toxic or harmful content in... | 11 | Experimental |
| 45 | lopezrbn/kaggle_toxicity_challenge | Multi-label toxic-comment classifier (DeBERTa v3, Kaggle Jigsaw Challenge) —... | 11 | Experimental |
| 46 | Cisem-Cy/distilbert-hate-speech-detection | This repository contains a neural NLP project focused on classifying social... | 11 | Experimental |
| 47 | Bhawnakapri/DeepSignal-AI-Safety-Engine | Transformer-based AI Safety Intelligence System for multi-label... | 11 | Experimental |
| 48 | francescobaio/NLP_Assignments | Assignments in the realm of Natural Language Processing for Sexism... | 11 | Experimental |
| 49 | kanincityy/misogyny_detection_transformers | Building an Effective Misogyny Detection Classifier for Low-Resource Languages | 11 | Experimental |
| 50 | Het-Somaiya/cyber-bullying-detection | Comparative NLP framework evaluating DistilBERT vs. dual-layer LSTMs.... | 11 | Experimental |
| 51 | devroopsaha744/HateSpeechDetect-text | In this project, I focused on benchmarking various machine learning models,... | 10 | Experimental |
| 52 | shivanshka/Multilingual-Toxic-Comment-Classifier | Created a system which will detect whether any text (comment) is toxic or... | 10 | Experimental |