rustyneuron01/Conversation-Genome-Project

Structured data & semantic tagging pipeline. Turns raw text (conversations, web pages, surveys) into tagged data for AI and search. Coordinators set ground truth; workers run LLM inference on windows. Scoring via cosine similarity. Python, FastAPI, OpenAI/Anthropic/OpenRouter, embeddings, Docker.

Emerging

Implements a distributed coordinator-worker architecture in which coordinators establish semantic ground truth from full-document embeddings, then score worker outputs using multi-method cosine similarity (weighted mean/median/max of the top-3 matches, with tag-overlap penalties). Supports pluggable LLM backends (OpenAI, Anthropic, OpenRouter, Groq, Chutes) with PyTorch embeddings, integrates with Weights & Biases for experiment tracking, and works with custom FastAPI conversation servers for private data pipelines.
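The scoring idea described above can be illustrated with a minimal sketch. This is not the project's actual code: the function names (`cosine`, `score_worker`), the weights `(0.5, 0.3, 0.2)`, the `overlap_penalty` value, and the reading of "tag overlap penalty" as a deduction for worker tags missing from the ground-truth tag set are all assumptions made for illustration.

```python
import math
from statistics import mean, median


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_worker(worker_vecs, truth_vecs, worker_tags, truth_tags,
                 weights=(0.5, 0.3, 0.2), overlap_penalty=0.1):
    """Hypothetical scorer: blend mean, median, and max of the top-3 matches.

    worker_vecs / truth_vecs: embeddings of worker tags and ground-truth tags.
    weights and overlap_penalty are illustrative values, not the project's.
    """
    if not worker_vecs or not truth_vecs:
        return 0.0

    # For each worker tag, keep its best cosine match against the ground truth.
    best = sorted(
        (max(cosine(w, t) for t in truth_vecs) for w in worker_vecs),
        reverse=True,
    )
    top3 = best[:3]

    # Weighted blend of three summary statistics over the top-3 matches.
    w_mean, w_median, w_max = weights
    combined = w_mean * mean(top3) + w_median * median(top3) + w_max * max(top3)

    # Assumed penalty: share of worker tags that never appear in the ground truth.
    truth_set = set(truth_tags)
    missing = sum(1 for tag in worker_tags if tag not in truth_set)
    penalty = overlap_penalty * (missing / len(worker_tags)) if worker_tags else 0.0

    # Clamp so the final score stays in [0, 1].
    return max(0.0, min(1.0, combined - penalty))


if __name__ == "__main__":
    score = score_worker(
        worker_vecs=[[1.0, 0.0], [0.0, 1.0]],
        truth_vecs=[[1.0, 0.0]],
        worker_tags=["pricing", "weather"],
        truth_tags=["pricing"],
    )
    print(round(score, 3))
```

Blending the mean, median, and max rewards a worker for both a strong best match and consistent quality across its top tags, while the penalty term discourages padding the output with unrelated tags.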

No Package, No Dependents

Maintenance 13 / 25

Adoption 6 / 25

Maturity 9 / 25

Community 17 / 25

Language: Python

License: MIT

Higher-rated alternatives

ob-labs/ChatBot

ChatBot, show how to implement a RAG based on OceanBase or OceanBase seekdb AI capabilities...

pmbstyle/Alice

Alice is a voice-first desktop AI assistant application built with Vue.js, Vite, and Electron....

stackitcloud/rag-template

Template for AI chatbots & document management using Retrieval-Augmented Generation with vector...

GGyll/condo_gpt

An intelligent assistant for querying and analyzing real estate condo data in Miami.

zaldivards/ContextQA

ContextQA - The open-source tool for data-driven conversations
