explosion/sense2vec
🦆 Contextually-keyed word vectors
Keys word vectors by part-of-speech tag or entity label, so that different senses of a word, as well as multi-word phrases, get their own vectors (e.g., "natural_language_processing|NOUN" vs. "natural|ADJ"). Integrates as a spaCy v3 pipeline component with extension attributes for vector lookup and nearest-neighbor queries, plus optional neighbor caching for performance. Supports training custom vectors from raw text using pretrained spaCy models combined with GloVe or fastText embeddings.
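The keying idea can be illustrated with a minimal sketch in plain Python. This is a toy illustration, not sense2vec's implementation or API: the helpers `make_key`, `split_key`, `cosine`, and `most_similar` are hypothetical names defined here, and the 3-dimensional vectors are invented for the example.

```python
import math

def make_key(phrase, sense):
    """Join a phrase and its POS tag or entity label into one lookup key."""
    return phrase.replace(" ", "_") + "|" + sense

def split_key(key):
    """Split a key back into (phrase, sense)."""
    phrase, sense = key.rsplit("|", 1)
    return phrase.replace("_", " "), sense

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy vectors, invented for illustration only. The same surface word
# ("duck") gets one vector per sense because the sense is part of the key.
VECTORS = {
    make_key("duck", "NOUN"): [0.9, 0.1, 0.0],
    make_key("duck", "VERB"): [0.0, 0.2, 0.9],
    make_key("goose", "NOUN"): [0.8, 0.2, 0.1],
    make_key("dodge", "VERB"): [0.1, 0.1, 0.9],
}

def most_similar(key, n=1):
    """Return the n nearest keys to `key` by cosine similarity."""
    query = VECTORS[key]
    scored = [(other, cosine(query, vec))
              for other, vec in VECTORS.items() if other != key]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]

print(make_key("natural language processing", "NOUN"))
print(most_similar("duck|NOUN"))
print(most_similar("duck|VERB"))
```

With these toy vectors, the noun sense of "duck" lands nearest to "goose|NOUN" and the verb sense nearest to "dodge|VERB", which is the kind of disambiguation that sense-keyed vectors are meant to provide.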
1,672 stars and 4,793 monthly downloads. No commits in the last 6 months. Available on PyPI.
Stars: 1,672
Forks: 239
Language: Python
License: MIT
Category:
Last pushed: Apr 23, 2025
Monthly downloads: 4,793
Commits (30d): 0
Dependencies: 6
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/explosion/sense2vec"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
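For scripted access, the same request might be made from Python with only the standard library. This is a sketch under assumptions: the `category/owner/repo` path layout is inferred from the single URL above, the endpoint is assumed to return JSON, and `quality_url` and `fetch_quality` are hypothetical helper names. The response schema is not documented here, so the result is returned unparsed beyond JSON decoding.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"

def quality_url(category, owner, repo):
    """Build the endpoint URL (path layout inferred from the curl example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"

def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record; assumes the endpoint responds with JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Example call (performs a network request and counts against the daily limit):
# data = fetch_quality("embeddings", "explosion", "sense2vec")
```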
Related tools
shibing624/similarities
Similarities: a toolkit for similarity calculation and semantic search....
chakki-works/chakin
Simple downloader for pre-trained word vectors
pdrm83/sent2vec
How to encode sentences in a high-dimensional vector space, a.k.a., sentence embedding.
sebischair/Lbl2Vec
Lbl2Vec learns jointly embedded label, document and word vectors to retrieve documents with...
code-kern-ai/embedders
With embedders, you can easily convert your texts into sentence- or token-level embeddings...