uhh-lt/Taxonomy_Refinement_Embeddings
Taxonomy refinement method to improve domain-specific taxonomy systems.
Uses Poincaré embeddings in hyperbolic space to detect and relocate misplaced hyponyms and to reattach orphaned terms in pre-induced taxonomies, a setting where they outperform Euclidean embeddings on hierarchical lexical relations. The pipeline refines partial taxonomies produced by existing systems (TAXI, USAAR, JUNLP) across multiple languages and domains, and compares word2vec with hyperbolic embeddings trained on both WordNet and noisy corpus-extracted relations. Achieves state-of-the-art results on the SemEval-2016 Task 13 taxonomy extraction benchmarks.
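To make the hyperbolic-space idea concrete, here is a minimal sketch of the standard Poincaré ball distance and of ranking candidate parents for a term by that distance. It is not taken from the repository: the function names `poincare_distance` and `rank_parents`, the toy 2-D vectors, and the nearest-candidate rule are all illustrative assumptions, and the project's actual refinement procedure may differ.

```python
import math


def poincare_distance(u, v):
    """Distance between two points strictly inside the unit ball (Poincaré ball model)."""
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


def rank_parents(term, candidates, vectors):
    """Hypothetical helper: sort candidate parents from nearest to farthest."""
    return sorted(
        candidates,
        key=lambda c: poincare_distance(vectors[term], vectors[c]),
    )


# Toy, hand-picked vectors (assumption, not trained embeddings).
# General terms sit near the origin, specific terms near the boundary.
vectors = {
    "science": (0.05, 0.0),
    "physics": (0.5, 0.1),
    "optics": (0.8, 0.2),
    "cuisine": (-0.6, 0.3),
}

ranking = rank_parents("optics", ["science", "physics", "cuisine"], vectors)
```

In this toy setup, an orphaned term such as "optics" would be attached to the first entry of `ranking`, and an existing parent that ranks far down the list could be flagged as a possible misplacement.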
No commits in the last 6 months.
Stars: 29
Forks: 1
Language: Python
License: GPL-3.0
Category:
Last pushed: Jun 17, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/uhh-lt/Taxonomy_Refinement_Embeddings"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
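The same request can be made from Python with only the standard library. This is a sketch under assumptions: the path layout (quality / category / owner / repository) is inferred from the single curl example above, the helper names `quality_url` and `fetch_quality` are hypothetical, and the response is returned as raw text because its format is not documented here.

```python
import urllib.request
from urllib.parse import quote

# Base path copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the segment order is an assumption from one example."""
    parts = [quote(category, safe=""), quote(owner, safe=""), quote(repo, safe="")]
    return "/".join([BASE_URL] + parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record and return the undecoded-format body as text."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_quality("embeddings", "uhh-lt", "Taxonomy_Refinement_Embeddings")` issues the same GET request as the curl command and counts against the same daily request limit.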
Higher-rated alternatives
TorchDR/TorchDR
TorchDR - PyTorch Dimensionality Reduction
derrickburns/generalized-kmeans-clustering
Production-ready K-Means clustering for Apache Spark with pluggable Bregman divergences (KL,...
abhilash1910/ClusterTransformer
Topic clustering library built on Transformer embeddings and cosine similarity...
md-experiments/picture_text
Interactive tree-maps with SBERT & Hierarchical Clustering (HAC)
nlpub/watset-java
An implementation of the Watset clustering algorithm in Java.