abhilash1910/ClusterTransformer
Topic clustering library built on Transformer embeddings and cosine similarity metrics. Compatible with all BERT-based Transformer models from Hugging Face.
Supports both k-means and agglomerative clustering, with configurable hyperparameters for the similarity threshold and cluster count. Built on PyTorch, it works with Hugging Face's Transformers ecosystem and supports batched inference and optional embedding normalization for any pretrained BERT-compatible model (ALBERT, RoBERTa, DistilBERT, etc.).
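To make the description above concrete, here is a minimal sketch of threshold-based agglomerative clustering over L2-normalized embeddings using cosine similarity. It is not ClusterTransformer's own API: the function names (`l2_normalize`, `cosine_similarity`, `agglomerative_clusters`), the average-linkage rule, and the toy vectors are all assumptions made for illustration, and real embeddings would come from a Transformer model rather than hand-written lists.

```python
import math
from itertools import combinations


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def cosine_similarity(a, b):
    """Cosine similarity: the dot product of the two unit-length vectors."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def agglomerative_clusters(embeddings, threshold=0.8):
    """Average-linkage agglomerative clustering with a cosine-similarity cutoff.

    Hypothetical helper: starts with one cluster per embedding and keeps
    merging the most similar pair of clusters until no pair reaches
    `threshold`. Returns clusters as sorted lists of input indices.
    """
    vectors = [l2_normalize(v) for v in embeddings]
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Mean pairwise cosine similarity between members of two clusters.
        sims = [
            sum(x * y for x, y in zip(vectors[i], vectors[j]))
            for i in c1
            for j in c2
        ]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        (i, j), best = max(
            (
                ((i, j), linkage(clusters[i], clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < threshold:
            break  # No remaining pair is similar enough to merge.
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)


# Toy 3-dimensional "embeddings" standing in for real Transformer outputs.
toy_embeddings = [
    [1.0, 0.1, 0.0],
    [0.9, 0.2, 0.0],
    [0.0, 1.0, 0.1],
    [0.1, 0.9, 0.0],
]
clusters = agglomerative_clusters(toy_embeddings, threshold=0.8)
```

With these toy vectors the first two and the last two inputs end up in separate clusters. A k-means strategy would differ mainly in taking a fixed cluster count instead of a similarity threshold.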
No commits in the last 6 months. Available on PyPI.
Stars: 44
Forks: 15
Language: Python
License: —
Category:
Last pushed: Jun 11, 2021
Monthly downloads: 14
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/abhilash1910/ClusterTransformer"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
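The same request can be sketched in Python with only the standard library. The helper names (`build_quality_url`, `fetch_quality`) are hypothetical. Two things here are assumptions rather than documented behavior: that the `embeddings` path segment is a category that varies per project, and that the endpoint returns JSON. Only the URL shown in the curl command above comes from the page.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assembled here.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(owner, repo, category="embeddings"):
    """Build the request URL; treating `category` as a parameter is an assumption."""
    return f"{BASE_URL}/{quote(category)}/{quote(owner)}/{quote(repo)}"


def fetch_quality(owner, repo, category="embeddings", timeout=10):
    """Fetch the record for one repository, assuming the response body is JSON."""
    url = build_quality_url(owner, repo, category)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("abhilash1910", "ClusterTransformer")` would issue the same GET request as the curl command. The response fields are not described on this page, so inspect the returned object before relying on any key.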
Related tools
TorchDR/TorchDR
TorchDR - PyTorch Dimensionality Reduction
derrickburns/generalized-kmeans-clustering
Production-ready K-Means clustering for Apache Spark with pluggable Bregman divergences (KL,...
md-experiments/picture_text
Interactive tree-maps with SBERT & Hierarchical Clustering (HAC)
nlpub/watset-java
An implementation of the Watset clustering algorithm in Java.
mainlp/semantic_components
Finding semantic components in your neural representations.