DiceTechJobs/VectorsInSearch

Dice.com repo to accompany the Dice.com 'Vectors in Search' talk by Simon Hughes, from the Activate 2018 search conference, and the 'Searching with Vectors' talk from Haystack 2019 (US). Builds upon my conceptual search and semantic search work from 2015.

42 / 100 (Emerging)

Implements three approximate nearest neighbor search algorithms (LSH with Sim Hash, K-Means Tree, and Vector Thresholding) that encode dense vectors as inverted index queries for efficient large-scale similarity search. The approach generates boolean OR queries optimized by Lucene's Block Max WAND algorithm, with Python utilities for vector indexing and a custom Solr similarity plugin to score results based on vector proximity. Targets Solr 7.5+ and provides both the algorithmic implementations and index configuration needed to integrate semantic search into existing search infrastructure.
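To make the first of these techniques concrete, here is a minimal sketch of sign-based LSH (Sim Hash) turning a dense vector into tokens that an inverted index can match with a boolean OR query. It is an illustration under stated assumptions, not the repository's code: the function names, the `b{i}_{bit}` token format, and the `vector_lsh` field name are all hypothetical, and the hyperplanes are drawn at random with NumPy.

```python
import numpy as np


def simhash_tokens(vec, hyperplanes):
    """Encode a dense vector as one token per random hyperplane.

    Each token records which side of the hyperplane the vector falls on,
    so vectors separated by a small angle share most of their tokens.
    """
    bits = (hyperplanes @ vec) >= 0
    return [f"b{i}_{int(b)}" for i, b in enumerate(bits)]


def or_query(tokens, field="vector_lsh"):
    """Build a boolean OR query string; the field name is hypothetical."""
    return f"{field}:(" + " OR ".join(tokens) + ")"


def shared(a, b):
    """Count the tokens two encodings have in common."""
    return len(set(a) & set(b))


rng = np.random.default_rng(0)
dim, n_bits = 16, 8
hyperplanes = rng.standard_normal((n_bits, dim))  # one row per hash bit

doc = rng.standard_normal(dim)
doc_tokens = simhash_tokens(doc, hyperplanes)        # indexed with the document
query = or_query(simhash_tokens(doc, hyperplanes))   # issued at search time
opposite_tokens = simhash_tokens(-doc, hyperplanes)  # vector pointing the other way

print(query)
print(shared(doc_tokens, doc_tokens), shared(doc_tokens, opposite_tokens))
```

Documents that share more tokens with the query match more OR clauses, which is what lets an inverted index rank candidates by approximate angular similarity. In a setup like the one described above, a similarity plugin would then score the matches by vector proximity.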

No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 17 / 25

Stars: 86
Forks: 15
Language: Python
License: Apache-2.0
Last pushed: May 12, 2021
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/DiceTechJobs/VectorsInSearch"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.