lgalke/text-clf-baselines

WideMLP for Text Classification

32 / 100 (Emerging)

Benchmarks Bag-of-Words MLPs against graph neural networks (TextGCN, HeteGCN) and Transformer models (BERT, DistilBERT) on standard text classification datasets. The wide MLP approach uses GloVe embeddings with a simple dense architecture and performs competitively with, or better than, graph-based methods at significantly lower computational cost: it avoids both the O(N²) graph construction of text-graph models and the O(L²) attention computation (in sequence length L) of Transformers. Includes modular implementations of tokenization, data loading for five benchmark datasets (20ng, R8, R52, OHSUMED, MR), and reproducible experiment scripts from the ACL 2022 paper.
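To make the "simple dense architecture" concrete, here is a minimal, untrained sketch of a bag-of-words MLP forward pass in plain Python. It is not the repository's code: the toy documents, the hidden size of 16, the two output classes, and every function name are assumptions chosen for illustration, and raw term counts stand in for whatever input features the actual models use.

```python
import math
import random
from collections import Counter


def build_vocab(docs):
    """Map each distinct lowercase token to a column index."""
    vocab = {}
    for doc in docs:
        for tok in doc.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def bow_vector(doc, vocab):
    """Term-count vector; out-of-vocabulary tokens are ignored."""
    vec = [0.0] * len(vocab)
    for tok, count in Counter(doc.lower().split()).items():
        if tok in vocab:
            vec[vocab[tok]] = float(count)
    return vec


def init_layer(n_in, n_out, rng):
    """Random weights (n_out rows of n_in values) and zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, layer):
    """Affine map: one dot product per output unit, plus bias."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, hidden, output):
    """One ReLU hidden layer followed by a softmax output layer."""
    h = [max(0.0, v) for v in linear(x, hidden)]
    return softmax(linear(h, output))


# Toy corpus and layer sizes are hypothetical, for illustration only.
docs = ["oil prices rise", "team wins the match", "oil exports fall"]
vocab = build_vocab(docs)
rng = random.Random(0)
hidden = init_layer(len(vocab), 16, rng)
output = init_layer(16, 2, rng)
probs = forward(bow_vector("oil prices fall", vocab), hidden, output)
```

Each document is handled independently with a fixed number of multiply-adds (vocabulary size times hidden width), so nothing in this sketch requires a corpus-level graph or pairwise token attention.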

No commits in the last 6 months.

Stale 6m · No Package · No Dependents
Maintenance 2 / 25
Adoption 7 / 25
Maturity 9 / 25
Community 14 / 25

Stars: 29
Forks: 5
Language: Python
License: MIT
Last pushed: Aug 10, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/lgalke/text-clf-baselines"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
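The same request can be made from Python with only the standard library. This is a sketch under assumptions: the response format is not documented here, so parsing it as JSON is a guess, and the helper names `quality_url` and `fetch_quality`, as well as the reading of the URL path as category/owner/repo, are hypothetical.

```python
import json
import urllib.request

# Base path taken from the curl example; the segment layout is inferred from it.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (hypothetical helper)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """GET the endpoint and parse the body as JSON (assumed format)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("nlp", "lgalke", "text-clf-baselines")` would issue the same GET as the curl command; no field names are assumed, so inspect the returned object before relying on any particular key.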