unmonoqueteclea/voilib

🎧 Podcast Search Engine. Try it now for free or run your own instance.

Archived
35 / 100 (Emerging)

Implements semantic search over podcast transcripts by splitting each episode into fragments of about 40 words and storing their embeddings (384-dimensional vectors) in Qdrant. The pipeline chains OpenAI's Whisper for transcription, embedding generation for semantic indexing, and vector similarity search for retrieval, and it supports both RSS-sourced podcasts and custom audio files. It can be deployed fully self-hosted via Docker Compose with no external paid dependencies.
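The fragment-and-rank idea described above can be illustrated with a minimal, dependency-free sketch. It is not the project's code: the helper names (`split_into_fragments`, `cosine_similarity`, `rank_fragments`) are hypothetical, whitespace tokenization is an assumption, and cosine similarity is assumed as the metric because the listing does not say which distance the Qdrant collection uses. Transcription, the embedding model, and the Qdrant client calls are left out.

```python
import math
from typing import List, Sequence, Tuple


def split_into_fragments(transcript: str, words_per_fragment: int = 40) -> List[str]:
    """Split a transcript into consecutive fragments of about N words.

    Hypothetical helper: splitting on whitespace is an assumption.
    """
    if words_per_fragment <= 0:
        raise ValueError("words_per_fragment must be positive")
    words = transcript.split()
    return [
        " ".join(words[i:i + words_per_fragment])
        for i in range(0, len(words), words_per_fragment)
    ]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_fragments(
    query_vec: Sequence[float],
    fragment_vecs: Sequence[Sequence[float]],
    top_k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (fragment_index, score) pairs, best match first.

    Stand-in for the vector database lookup; a real deployment would
    delegate this search to Qdrant rather than scan every vector.
    """
    scored = [
        (i, cosine_similarity(query_vec, vec))
        for i, vec in enumerate(fragment_vecs)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

In a real pipeline each fragment would be embedded into a 384-dimensional vector before ranking; the toy functions here accept vectors of any length so the logic can be checked by hand.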

Archived, No Package, No Dependents
Maintenance 0 / 25
Adoption 9 / 25
Maturity 16 / 25
Community 10 / 25

Stars: 75
Forks: 6
Language: Python
License: GPL-3.0
Last pushed: Oct 11, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/unmonoqueteclea/voilib"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
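For scripted use, the same request can be sketched in Python with only the standard library. The function names are hypothetical, and reading the path as category/owner/repository is an assumption inferred from the single URL shown above. The response format is not documented on this page, so the sketch returns the raw body without parsing it.

```python
import urllib.parse
import urllib.request

# Base URL copied from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL.

    Assumption: the path segments are category/owner/repo, inferred
    from the one example URL on this page.
    """
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return f"{BASE_URL}/{'/'.join(parts)}"


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text (format not assumed)."""
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_quality("embeddings", "unmonoqueteclea", "voilib"))
```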