rishabhmisra/News-Headlines-Dataset-For-Sarcasm-Detection

High-quality dataset for the task of sarcasm detection

Emerging

Contains 28,619 professionally written news headlines: 13,635 sarcastic headlines from *The Onion* and 14,984 non-sarcastic headlines from *HuffPost*. The labels are self-contained and noise-free, and 23.35% of the words are out of vocabulary for word2vec embeddings. Unlike Twitter-based datasets, the formal news text has no spelling errors and does not depend on surrounding context, which makes training sarcasm detection models more reliable. The data is distributed as JSONL; each record holds the headline text, a sarcasm label, and a link to the source article for collecting supplementary data.
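As a rough illustration of the record layout described above, the following minimal sketch parses JSONL text and counts the labels. The field names `is_sarcastic`, `headline`, and `article_link`, the sample headlines, and the example.com links are assumptions made for illustration; check them against the actual file before relying on them.

```python
import json

# Two made-up records in the assumed JSONL layout (one JSON object per line).
# Field names and values are illustrative assumptions, not taken from the dataset.
SAMPLE = """\
{"is_sarcastic": 1, "headline": "local man wins argument in shower", "article_link": "https://example.com/a"}
{"is_sarcastic": 0, "headline": "city council approves new budget", "article_link": "https://example.com/b"}
"""


def parse_jsonl(text):
    """Parse JSONL text into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def label_counts(records, label_key="is_sarcastic"):
    """Count how many records carry each label value."""
    counts = {}
    for record in records:
        label = record[label_key]
        counts[label] = counts.get(label, 0) + 1
    return counts


records = parse_jsonl(SAMPLE)
counts = label_counts(records)
print(counts)
```

To apply the same sketch to the real data, read the downloaded file's contents as UTF-8 text and pass them to `parse_jsonl` in place of `SAMPLE`.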

No commits in the last 6 months.

No License, Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 9 / 25

Maturity 1 / 25

Community 22 / 25

Higher-rated alternatives

Hironsan/HateSonar

Hate Speech Detection Library for Python.

t-davidson/hate-speech-and-offensive-language

Repository for the paper "Automated Hate Speech Detection and the Problem of Offensive...

franciellevargas/HateBR

HateBR is the first large-scale expert annotated dataset of Brazilian Instagram comments for...

b4k0/CBDA

Cyber Bullying Detection Application (CBDA)

raklugrin01/Disaster-Tweets-Analysis-and-Classification

Analysing Disaster related tweets dataset and build a classifier using deep learning and deploy...
