NoYo25/ClusteringTableHeaders

This project creates an RDF schema from the list of column headers of a tabular dataset. It first transforms the headers into meaningful vectors, then applies a distance-based clustering algorithm that maximizes the similarity among the headers within each cluster. The user can move items from one cluster to another and merge clusters. The system suggests a name for each cluster based on what its members have in common; if no common word is found, it uses the name Unknown. The user can then rename the automatically generated names. Finally, the resulting clusters can be exported in an RDF format.
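The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the token-set vectors, the Jaccard distance, the single-linkage merging, the 0.7 threshold, the `http://example.org/schema#` namespace, the choice of `rdfs:Class` and `rdfs:domain`, and all function names are assumptions made for the example.

```python
import re
from itertools import combinations


def tokenize(header):
    """Split a header such as 'customerName' or 'customer_id' into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", header)
    return {t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t}


def jaccard_distance(a, b):
    """Distance between two token sets: 0.0 for identical sets, 1.0 for disjoint ones."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def cluster_headers(headers, threshold=0.7):
    """Single-linkage clustering: merge two clusters while any pair of their
    members is closer than the (assumed) threshold. Headers must be unique."""
    tokens = {h: tokenize(h) for h in headers}
    clusters = [[h] for h in headers]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if any(jaccard_distance(tokens[a], tokens[b]) < threshold
                   for a in clusters[i] for b in clusters[j]):
                clusters[i] += clusters[j]
                del clusters[j]
                merged = True
                break  # indices changed, restart the scan
    return clusters


def suggest_name(cluster):
    """Name a cluster after a token shared by all members, else 'Unknown'."""
    common = set.intersection(*(tokenize(h) for h in cluster))
    return sorted(common)[0] if common else "Unknown"


def to_turtle(named_clusters, base="http://example.org/schema#"):
    """Serialize (name, members) pairs as Turtle; the vocabulary is an assumption.
    Clusters that share a name (e.g. two 'Unknown' ones) map to the same class."""
    lines = [
        f"@prefix ex: <{base}> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
    ]
    for name, members in named_clusters:
        lines.append(f"ex:{name} a rdfs:Class .")
        for member in members:
            local = re.sub(r"\W", "_", member)  # keep the local name simple
            lines.append(f"ex:{local} a rdf:Property ; rdfs:domain ex:{name} .")
    return "\n".join(lines)


headers = ["customer_name", "customer_id", "order_date", "order_total", "zip"]
clusters = cluster_headers(headers)
named = [(suggest_name(c), c) for c in clusters]
turtle = to_turtle(named)
print(turtle)
```

Moving a header between clusters, merging clusters, or renaming one would then amount to editing the `named` list before calling `to_turtle`.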

Score: 15 / 100 (Experimental)

No commits in the last 6 months.

No License | Stale 6m | No Package | No Dependents
Maintenance 0 / 25
Adoption 2 / 25
Maturity 1 / 25
Community 12 / 25

Stars: 2
Forks: 1
Language: Python
License: none
Last pushed: Feb 23, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/NoYo25/ClusteringTableHeaders"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
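The same request can be made from Python using only the standard library. This is a sketch: the helper names `quality_url` and `fetch_quality` are hypothetical, and because the response format is not documented here, the body is returned as raw text rather than parsed.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example above.
BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def quality_url(owner, repo):
    """Build the endpoint URL for a given owner/repository pair."""
    return f"{BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch the raw response body as text (format not assumed)."""
    with urlopen(quality_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = quality_url("NoYo25", "ClusteringTableHeaders")
print(url)
```

Calling `fetch_quality("NoYo25", "ClusteringTableHeaders")` performs the network request and counts against the daily request limit.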