unmonoqueteclea/voilib
🎧 Podcast Search Engine. Try it now for free or run your own instance.
Archived
Implements semantic search over podcast transcripts by dividing episodes into ~40-word fragments and storing their embeddings (384-dimensional vectors) in Qdrant. The pipeline chains OpenAI's Whisper for transcription, embedding generation for semantic indexing, and vector similarity search, and it supports both RSS-sourced podcasts and custom audio files. It can be deployed entirely self-hosted via Docker Compose, with no external paid dependencies.
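The fragmenting and similarity-search steps in the description above can be sketched roughly as follows. This is a minimal illustration, not Voilib's actual code: the function names (`split_into_fragments`, `cosine_similarity`, `rank_fragments`) are hypothetical, the vectors are toy 2-dimensional stand-ins for the 384-dimensional embeddings, and a plain in-memory list stands in for Qdrant. Cosine similarity is assumed as the distance metric.

```python
import math
from typing import List, Sequence, Tuple


def split_into_fragments(transcript: str, words_per_fragment: int = 40) -> List[str]:
    """Split a transcript into consecutive fragments of at most N words."""
    if words_per_fragment <= 0:
        raise ValueError("words_per_fragment must be positive")
    words = transcript.split()
    return [
        " ".join(words[i:i + words_per_fragment])
        for i in range(0, len(words), words_per_fragment)
    ]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_fragments(
    query_vector: Sequence[float],
    fragment_vectors: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (fragment index, score) pairs for the top_k most similar fragments."""
    scored = [
        (index, cosine_similarity(query_vector, vector))
        for index, vector in enumerate(fragment_vectors)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # A 100-word dummy transcript; real input would come from a transcription step.
    transcript = " ".join(f"word{i}" for i in range(100))
    fragments = split_into_fragments(transcript)
    print(len(fragments), "fragments")

    # Toy vectors standing in for real fragment embeddings (assumption).
    toy_vectors = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    print(rank_fragments([1.0, 0.0], toy_vectors, top_k=2))
```

In a real deployment the toy vectors would be replaced by the output of an embedding model, and the ranking would be delegated to the vector database rather than computed in a Python loop.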
Stars
75
Forks
6
Language
Python
License
GPL-3.0
Category
Last pushed
Oct 11, 2025
Commits (30d)
0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/unmonoqueteclea/voilib"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
DiceTechJobs/VectorsInSearch
Dice.com repo to accompany the dice.com 'Vectors in Search' talk by Simon Hughes, from the...
damiandelmas/flexvec
Composable vector search with SQL
IuriiD/pinecone-faiss-pgvector
Comparing vector DBs Pinecone, FAISS & pgvector in combination with OpenAI Embeddings for semantic search
omni-front/ConstructIQ
Semantic search API for building permits using vector embeddings, FastAPI & Pinecone
RubenGarrod/ClinicCloud
Advanced semantic search system for medical and scientific documentation using BioBERT and pgvector.