DiceTechJobs/ConceptualSearch
Train a Word2Vec or LSA model and implement conceptual (semantic) search in Solr/Lucene. By Simon Hughes, Dice.com (Dice Tech Jobs).
Trains Word2Vec or LSA models on domain-specific documents using gensim, then generates either Solr synonym files (with optional payload weighting) or vector clusters, so that queries can match on meaning rather than on exact keywords. The pipeline covers preprocessing, optional keyphrase extraction, model training, and multiple output formats: synonyms for query expansion, or clustered vectors for field-based conceptual search that needs no custom plugins. The synonym files plug into Apache Solr through its synonym filters, with optional custom plugins (PayloadEdismax, PayloadAwareDefaultSimilarity) for payload-weighted retrieval, and they also work with any search engine that supports synonym files.
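The synonym-generation step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the toy vectors stand in for a trained model (with gensim, the neighbours would typically come from something like `model.wv.most_similar(term, topn=n)`), the helper names `nearest_terms` and `synonym_lines` are hypothetical, and the `term => synonym|weight` layout assumes the `|` payload delimiter that Solr's delimited payload filter uses by default. The exact file format the project emits is not shown here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_terms(term, vectors, top_n=2, min_sim=0.5):
    """Return up to top_n (term, similarity) pairs closest to `term`.

    Hypothetical helper: plays the role of a trained model's
    nearest-neighbour lookup.
    """
    target = vectors[term]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != term
    ]
    # Drop weak neighbours so unrelated terms are not used for expansion.
    scored = [(other, sim) for other, sim in scored if sim >= min_sim]
    # Highest similarity first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_n]


def synonym_lines(vectors, top_n=2, min_sim=0.5, with_payloads=True):
    """Build explicit-mapping synonym lines, one per term.

    Assumed layout: "term => syn1|0.98, syn2|0.71" when payloads are on,
    "term => syn1, syn2" otherwise.
    """
    lines = []
    for term in sorted(vectors):
        neighbours = nearest_terms(term, vectors, top_n, min_sim)
        if not neighbours:
            continue
        if with_payloads:
            rhs = ", ".join(f"{other}|{sim:.2f}" for other, sim in neighbours)
        else:
            rhs = ", ".join(other for other, _ in neighbours)
        lines.append(f"{term} => {rhs}")
    return lines


# Toy 3-dimensional vectors (illustrative values, not real model output).
TOY_VECTORS = {
    "java": [0.9, 0.1, 0.0],
    "j2ee": [0.8, 0.2, 0.1],
    "python": [0.1, 0.9, 0.1],
    "django": [0.2, 0.8, 0.2],
}

if __name__ == "__main__":
    for line in synonym_lines(TOY_VECTORS):
        print(line)
```

The similarity threshold and the neighbour count are the two knobs that trade recall against noise in the expanded query. Writing the similarity as a payload lets a payload-aware scorer weight an expanded term below the term the user actually typed.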
259 stars. No commits in the last 6 months.
Stars: 259
Forks: 54
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Apr 26, 2019
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/DiceTechJobs/ConceptualSearch"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
openviglet/turing
:sparkles: :dna: Turing ES - Enterprise Search, Semantic Navigation, Chatbot using Search Engine...
nixiesearch/nixiesearch
Hybrid search engine, combining best features of text and semantic search worlds
DiceTechJobs/SolrConfigExamples
Examples of Solr configuration entries for Solr plugins and Conceptual Search\Semantic Search...
DiceTechJobs/RelevancyFeedback
Dice.com's relevancy feedback solr plugin created by Simon Hughes (Dice). Contains request...
openviglet/turing-java-sdk
:sparkles: :dna: :coffee: Java Library to access Turing AI