hayabhay/frogbase
Transform audio-visual content into navigable knowledge.
Archived. Orchestrates a complete pipeline linking media sources (yt_dlp, local files) through OpenAI Whisper transcription, SentenceTransformers embeddings, and hnswlib vector search, enabling semantic queries across multi-modal content. Includes both a Python API and a Streamlit UI, so developers can build search applications and non-technical users can run it locally without coding.
778 stars. No commits in the last 6 months.
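To make the pipeline concrete, here is a minimal sketch of the kind of flow FrogBase orchestrates, written directly against the underlying libraries named above (yt_dlp, openai-whisper, sentence-transformers, hnswlib). It is not the FrogBase API itself; the URL, model names, and index parameters are illustrative assumptions.

import yt_dlp
import whisper
import hnswlib
from sentence_transformers import SentenceTransformer

# 1. Download audio from a media source (hypothetical URL).
url = "https://www.youtube.com/watch?v=example"
with yt_dlp.YoutubeDL({"format": "bestaudio/best", "outtmpl": "audio.%(ext)s"}) as ydl:
    ydl.download([url])

# 2. Transcribe with Whisper; segments carry text plus start/end timestamps.
asr = whisper.load_model("base")
result = asr.transcribe("audio.webm")  # actual extension depends on the source
segments = [(s["text"].strip(), s["start"], s["end"]) for s in result["segments"]]

# 3. Embed each segment with a SentenceTransformers model.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode([text for text, _, _ in segments])

# 4. Index the embeddings in hnswlib for approximate nearest-neighbour search.
index = hnswlib.Index(space="cosine", dim=embeddings.shape[1])
index.init_index(max_elements=len(segments), ef_construction=200, M=16)
index.add_items(embeddings, list(range(len(segments))))

# 5. Semantic query: embed the query and return the closest segments with timestamps.
query_vec = encoder.encode(["where do they explain vector search?"])
labels, distances = index.knn_query(query_vec, k=3)
for idx in labels[0]:
    text, start, end = segments[idx]
    print(f"[{start:.1f}s - {end:.1f}s] {text}")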
Stars: 778
Forks: 92
Language: Python
License: MIT
Category:
Last pushed: Oct 25, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/hayabhay/frogbase"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
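The same request in Python, for convenience (a minimal sketch; the response schema is not documented on this page, so the snippet simply prints the raw JSON):

import requests

resp = requests.get(
    "https://pt-edge.onrender.com/api/v1/quality/embeddings/hayabhay/frogbase",
    timeout=10,
)
resp.raise_for_status()
print(resp.json())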
Higher-rated alternatives
ssrajadh/sentrysearch
Semantic search over videos using Gemini Embedding 2.
zilliz-bootcamp/audio_search
This project uses PANNs for audio tagging and sound event detection, and finally gets audio...
kyegomez/Pegasus
PegasusX: The Future of Multimodal Embeddings 🦄 🦄
ashvardanian/SwiftSemanticSearch
Real-time on-device text-to-image and image-to-image Semantic Search with video stream camera...
tomfalainen/word_spotting
Semantic and Verbatim Word Spotting in Torch