rustyneuron01/Conversation-Genome-Project

Structured data & semantic tagging pipeline. Turns raw text (conversations, web pages, surveys) into tagged data for AI and search. Coordinators set ground truth; workers run LLM inference on windows. Scoring via cosine similarity. Python, FastAPI, OpenAI/Anthropic/OpenRouter, embeddings, Docker.

Score: 45 / 100 (Emerging)

Implements a distributed coordinator-worker architecture: coordinators establish semantic ground truth via full-document embeddings, then score worker outputs using multi-method cosine similarity (a weighted mean/median/max of the top-3 matches, with tag overlap penalties). It supports pluggable LLM backends (OpenAI, Anthropic, OpenRouter, Groq, Chutes) with PyTorch embeddings, integrates with Weights & Biases for experiment tracking, and works with custom FastAPI conversation servers for private data pipelines.
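The scoring step described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the weights `(0.5, 0.3, 0.2)`, the `overlap_penalty` value, and the reading of "tag overlap penalty" as a deduction for worker tags whose names are absent from the ground-truth set are all assumptions made for the example.

```python
import math
from statistics import mean, median


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_score(tag_vec, truth_vecs, k=3, weights=(0.5, 0.3, 0.2)):
    """Blend mean, median, and max of the k best cosine matches.

    The weights are hypothetical; the source only says the blend is weighted.
    """
    sims = sorted((cosine(tag_vec, t) for t in truth_vecs), reverse=True)[:k]
    if not sims:
        return 0.0
    w_mean, w_median, w_max = weights
    return w_mean * mean(sims) + w_median * median(sims) + w_max * max(sims)


def score_worker(worker_tags, truth_tags, overlap_penalty=0.1):
    """Score a worker's tags against coordinator ground truth.

    worker_tags / truth_tags: dict mapping tag name -> embedding vector.
    Assumed penalty: the score shrinks in proportion to the share of
    worker tag names that do not appear in the ground-truth tag set.
    """
    if not worker_tags:
        return 0.0
    truth_vecs = list(truth_tags.values())
    base = mean(top_k_score(v, truth_vecs) for v in worker_tags.values())
    missing = sum(1 for name in worker_tags if name not in truth_tags)
    missing_share = missing / len(worker_tags)
    return base * (1.0 - overlap_penalty * missing_share)


# Toy 2-dimensional embeddings, purely for illustration.
truth = {"billing": [1.0, 0.0], "refund": [0.0, 1.0]}
worker = {"billing": [1.0, 0.0], "refund": [0.0, 1.0]}
score = score_worker(worker, truth)
```

In a real pipeline the vectors would come from an embedding model rather than hand-written lists, and the weights and penalty would be tuned by whoever operates the coordinators.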

No Package · No Dependents
Maintenance 13 / 25
Adoption 6 / 25
Maturity 9 / 25
Community 17 / 25

Stars: 23
Forks: 8
Language: Python
License: MIT
Last pushed: Mar 13, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/vector-db/rustyneuron01/Conversation-Genome-Project"

Open to everyone: 100 requests/day, no key needed. Get a free key for 1,000/day.
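For scripted access, the same request can be made from Python with only the standard library. This is a sketch under assumptions: the `category/owner/repo` path layout is inferred from the single curl example above, the response is assumed to be JSON, and the helper names `build_quality_url` and `fetch_quality` are made up for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything after it is inferred.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL, percent-encoding each path segment."""
    parts = (quote(p, safe="") for p in (category, owner, repo))
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch and decode the quality record (assumes a JSON response body)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


url = build_quality_url("vector-db", "rustyneuron01", "Conversation-Genome-Project")
# To perform the request (counts against the daily quota):
# data = fetch_quality("vector-db", "rustyneuron01", "Conversation-Genome-Project")
```

Because unauthenticated access is limited to 100 requests per day, it is worth caching responses locally rather than re-fetching on every run.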