jinhangjiang/textregress

TextRegress is a Python package designed to help researchers perform advanced regression analysis on long-form text data.

Score: 40 / 100 (Emerging)

Researchers often need to predict numerical outcomes from long text documents, such as sentiment scores from reviews or risk levels from reports. TextRegress takes your text data plus any additional numerical features, processes them, and outputs numerical predictions along with explanations of which parts of the text or which features contributed most. It is aimed at quantitative researchers, data scientists, and analysts working with rich, unstructured text.
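To make the workflow concrete, here is a minimal, standard-library-only sketch of regression on long text combined with numeric features. It is not TextRegress's API: the helper names (`chunk_words`, `embed`, `featurize`, `fit`), the chunking scheme, the hashed bag-of-words features, and the toy data are all assumptions chosen for illustration. The actual package presumably uses far stronger text encoders.

```python
import zlib

def chunk_words(text, size=50, overlap=10):
    """Split a document into overlapping word windows (requires size > overlap)."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [words[i:i + size] for i in range(0, stop, step)]

def embed(text, dim=16):
    """Hash each chunk's words into `dim` buckets, then mean-pool over chunks."""
    chunks = chunk_words(text)
    pooled = [0.0] * dim
    for chunk in chunks:
        scale = 1.0 / (max(len(chunk), 1) * len(chunks))
        for word in chunk:
            # crc32 is stable across runs, unlike the built-in hash() for str
            pooled[zlib.crc32(word.lower().encode("utf-8")) % dim] += scale
    return pooled

def featurize(text, numeric):
    """Concatenate text features, extra numeric features, and a bias term."""
    return embed(text) + list(numeric) + [1.0]

def predict(weights, x):
    return sum(w * v for w, v in zip(weights, x))

def fit(X, y, lr=0.05, epochs=500):
    """Fit a linear model with plain stochastic gradient descent on squared error."""
    weights = [0.0] * len(X[0])
    for _ in range(epochs):
        for x, target in zip(X, y):
            err = predict(weights, x) - target
            weights = [w - lr * err * v for w, v in zip(weights, x)]
    return weights

def mse(weights, X, y):
    return sum((predict(weights, x) - t) ** 2 for x, t in zip(X, y)) / len(y)

# Made-up toy data: (document, extra numeric features, target score)
rows = [
    ("great product works well " * 40, [0.9], 4.5),
    ("terrible support slow refund " * 40, [0.1], 1.0),
    ("average quality fair price " * 40, [0.5], 3.0),
]
X = [featurize(text, nums) for text, nums, _ in rows]
y = [target for _, _, target in rows]

weights = fit(X, y)
baseline = mse([0.0] * len(X[0]), X, y)  # error of an untrained (all-zero) model
trained = mse(weights, X, y)

# Per-feature contribution (weight * value): a crude explanation for the first row
contributions = [w * v for w, v in zip(weights, X[0])]
```

The `contributions` list sums to the model's prediction for the first document, which is the simplest form of the "which features contributed most" explanation described above. A linear model makes that attribution trivial; more expressive models need dedicated attribution methods.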

No commits in the last 6 months. Available on PyPI.

Use this if you need to build robust regression models that can accurately predict continuous values from extensive text documents, potentially combined with other structured data.

Not ideal if your primary goal is text classification (categorizing text) rather than predicting a numerical outcome, or if you only have short, simple text snippets.

quantitative-research text-analytics predictive-modeling data-science sentiment-analysis
Stale 6m
Maintenance 2 / 25
Adoption 4 / 25
Maturity 25 / 25
Community 9 / 25


Stars: 7
Forks: 1
Language: Python
License: Apache-2.0
Last pushed: Jul 06, 2025
Commits (30d): 0
Dependencies: 9

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/nlp/jinhangjiang/textregress"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
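The same request can be made from Python with only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is assumed to be JSON, which the page does not confirm.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, repo):
    """Build the endpoint URL, e.g. category 'nlp' and repo 'owner/name'."""
    return f"{BASE_URL}/{category}/{repo}"

def fetch_quality(category, repo, timeout=10):
    """Fetch the quality record; assumes the endpoint returns a JSON body."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example call (performs a network request, counted against the daily limit):
# data = fetch_quality("nlp", "jinhangjiang/textregress")
```

Keeping URL construction separate from the network call makes it easy to check the URL without spending any of the daily request quota.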