caojie54/OTSeq2Set

OTSeq2Set, XMTC

/ 100

Experimental

This project addresses Extreme Multi-label Text Classification (XMTC): assigning each text document its relevant labels from a very large label vocabulary. You provide a collection of text documents and the set of possible labels, and it outputs the most appropriate labels for each document. This is useful for anyone who needs to organize or tag large text datasets automatically, such as legal professionals classifying documents, e-commerce managers tagging product descriptions, or content curators categorizing articles.
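To make the input/output shape of the task concrete, here is a minimal sketch in plain Python. It is not OTSeq2Set's method or API; it is a toy baseline that assumes whitespace tokenization and scores each label by cosine similarity between the document and the pooled word counts of training documents carrying that label. All function names and the sample data are hypothetical.

```python
from collections import Counter, defaultdict
import math


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a toy example.
    return text.lower().split()


def fit_label_profiles(documents, label_sets):
    """Pool the token counts of every training document that carries each label."""
    profiles = defaultdict(Counter)
    for text, labels in zip(documents, label_sets):
        tokens = Counter(tokenize(text))
        for label in labels:
            profiles[label].update(tokens)
    return dict(profiles)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def predict_top_k(text, profiles, k=3):
    """Return up to k labels with non-zero similarity, best first."""
    query = Counter(tokenize(text))
    scored = [(cosine(query, profile), label) for label, profile in profiles.items()]
    scored = [(score, label) for score, label in scored if score > 0]
    # Sort by descending score; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [label for _, label in scored[:k]]


# Hypothetical training data: each document has a set of labels.
train_docs = [
    "wireless bluetooth headphones with noise cancelling",
    "leather office chair with lumbar support",
    "bluetooth speaker waterproof portable",
]
train_labels = [
    {"electronics", "audio", "headphones"},
    {"furniture", "office"},
    {"electronics", "audio", "speakers"},
]

profiles = fit_label_profiles(train_docs, train_labels)
predicted = predict_top_k("portable bluetooth headphones", profiles, k=3)
print(predicted)
```

In a real XMTC setting the label vocabulary can be far too large to score every label this way, which is the difficulty that dedicated models are designed to handle.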

No commits in the last 6 months.

Use this if you need to assign multiple specific tags or categories from an extremely large list to individual text documents.

Not ideal if you're dealing with a small, fixed number of categories or if your text classification needs are simple.

text-classification document-tagging information-retrieval content-categorization large-scale-labeling

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 8 / 25

Community 0 / 25

Stars

Forks

—

Language

Python

License

—

Higher-rated alternatives

ryanjgallagher/shifterator

Interpretable data visualizations for understanding how texts differ at the word level

HLasse/TextDescriptives

A Python library for calculating a large variety of metrics from text

jboynyc/textnets

Text analysis with networks.

DemetersSon83/Quantitative-Discursive-Analysis

A tool for quantitatively measuring discursive similarity between bodies of text.

sciknoworg/tib-sid

TIB-SID: A bilingual (English/German) dataset of library catalog records with GND subject...
