ddangelov/Top2Vec
Top2Vec learns jointly embedded topic, document and word vectors.
Top2Vec combines Doc2Vec, BERT Sentence Transformers, or Universal Sentence Encoder embeddings with UMAP dimensionality reduction and HDBSCAN clustering to discover topics automatically, without a predefined number of topics or stop word lists. The contextual variant uses token-level embeddings to identify multiple topics per document and topic spans within a document, and exposes the results through methods for topic distribution, relevance scoring, and token-level topic assignments.
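To make the idea of jointly embedded vectors concrete, here is a toy sketch in plain Python. It is not the library's code: it assumes the document vectors have already been embedded and clustered (the labels below are made up, with -1 standing in for HDBSCAN-style noise), takes each topic vector to be the centroid of its cluster, and ranks words by cosine similarity to that centroid. The helper names `centroid`, `cosine`, and `topic_words` are hypothetical, as are the two-dimensional vectors.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def topic_words(doc_vectors, labels, word_vectors, top_n=2):
    """Map each cluster label to the top_n words closest to its centroid."""
    topics = {}
    for label in sorted(set(labels)):
        if label == -1:  # assumed noise label; such documents form no topic
            continue
        members = [v for v, l in zip(doc_vectors, labels) if l == label]
        topic_vec = centroid(members)
        ranked = sorted(
            word_vectors,
            key=lambda word: cosine(word_vectors[word], topic_vec),
            reverse=True,
        )
        topics[label] = ranked[:top_n]
    return topics


# Made-up 2-D vectors: documents and words live in the same space.
doc_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.5, 0.5]]
labels = [0, 0, 1, 1, -1]  # assumed output of a clustering step
word_vectors = {
    "goal": [1.0, 0.0],
    "match": [0.9, 0.3],
    "vote": [0.0, 1.0],
    "senate": [0.2, 0.9],
}

topics = topic_words(doc_vectors, labels, word_vectors)
print(topics)
```

Because documents, words, and topics share one vector space, the same similarity function can rank words for a topic or documents for a topic. In the real pipeline the clustering runs on UMAP-reduced vectors rather than on hand-assigned labels.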
3,109 stars and 5,399 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars: 3,109
Forks: 377
Language: Python
License: BSD-3-Clause
Category
Last pushed: Nov 14, 2024
Monthly downloads: 5,399
Commits (30d): 0
Dependencies: 9
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/ddangelov/Top2Vec"
Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
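The same request can be made from Python with only the standard library. This is a minimal sketch under stated assumptions: the URL pattern is inferred from the single curl example above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are hypothetical. The response schema is not documented here, so the sketch returns the decoded payload without interpreting it.

```python
import json
import urllib.request

# Base path copied from the curl example; the "embeddings" segment is
# assumed to be a fixed part of the route.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/embeddings"


def build_quality_url(owner, repo):
    """Build the endpoint URL for an owner/repo pair (pattern assumed)."""
    return f"{BASE_URL}/{owner}/{repo}"


def fetch_quality(owner, repo, timeout=10):
    """Fetch and decode the payload, assuming the endpoint returns JSON."""
    url = build_quality_url(owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


url = build_quality_url("ddangelov", "Top2Vec")
print(url)
```

Calling `fetch_quality("ddangelov", "Top2Vec")` performs the actual network request, which counts against the daily request limit.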
Related tools
shibing624/text2vec
text2vec, text to vector....
predict-idlab/pyRDF2Vec
Python Implementation and Extension of RDF2Vec
IITH-Compilers/IR2Vec
Implementation of IR2Vec, LLVM IR Based Scalable Program Embeddings
IntuitionEngineeringTeam/chars2vec
Character-based word embeddings model based on RNN for handling real world texts
stephantul/reach
Load embeddings and featurize your sentences.