GURPREETKAURJETHRA/Image-to-Speech-GenAI-Tool-Using-LLM
AI tool that generates an audio short story from the context of an uploaded image by prompting a GenAI LLM, combining Hugging Face models with OpenAI and LangChain.
Implements a three-stage pipeline: Salesforce's BLIP image-captioning model extracts visual context from the image, OpenAI's GPT-3.5-turbo, prompted via LangChain, turns that context into a short narrative, and ESPnet's VITS text-to-speech model converts the narrative to audio. Built with Streamlit, it runs locally and is also published on Streamlit Cloud and Hugging Face Spaces. API tokens are configured through environment variables.
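The three stages described above can be sketched as a small composition of pluggable functions. This is an illustrative outline, not the repository's actual code: the `Pipeline` class, `build_story_prompt`, `missing_tokens`, the environment variable names, and the specific BLIP checkpoint id are all assumptions made for the example.

```python
import os
from dataclasses import dataclass
from typing import Callable, List, Mapping

# Assumed variable names; the project may read different ones.
TOKEN_VARS = ("OPENAI_API_KEY", "HUGGINGFACE_API_TOKEN")


def missing_tokens(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of token variables that are unset or empty."""
    return [name for name in TOKEN_VARS if not env.get(name)]


def build_story_prompt(caption: str, max_words: int = 50) -> str:
    """Turn an image caption into an instruction for the language model."""
    return (
        f"Write a short story of at most {max_words} words "
        f"based on this scene: {caption.strip()}"
    )


@dataclass
class Pipeline:
    caption_image: Callable[[str], str]  # stage 1: image path -> caption
    write_story: Callable[[str], str]    # stage 2: prompt -> story text
    synthesize: Callable[[str], bytes]   # stage 3: story text -> audio bytes

    def run(self, image_path: str) -> bytes:
        """Run the three stages in order and return the audio bytes."""
        caption = self.caption_image(image_path)
        story = self.write_story(build_story_prompt(caption))
        return self.synthesize(story)


def blip_captioner(image_path: str) -> str:
    """One possible stage 1, using the transformers image-to-text pipeline.

    Requires `transformers` to be installed; the checkpoint id is an assumption.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
    return captioner(image_path)[0]["generated_text"]
```

Keeping each stage behind a plain callable means the LLM and text-to-speech calls can be swapped for stubs during testing, for example `Pipeline(lambda p: "a dog on a beach", str.upper, str.encode).run("photo.jpg")`.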
No commits in the last 6 months.
Stars: 52
Forks: 24
Language: Python
License: MIT
Category:
Last pushed: Jan 11, 2024
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/GURPREETKAURJETHRA/Image-to-Speech-GenAI-Tool-Using-LLM"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
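The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names `quality_url` and `fetch_quality` are made up for the example, and it assumes the endpoint returns a JSON body, which the page does not state.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, repo: str) -> str:
    """Build the endpoint URL for one repository given as owner/name."""
    return f"{BASE_URL}/{quote(category)}/{quote(repo, safe='/')}"


def fetch_quality(category: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(quality_url(category, repo), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("prompt-engineering", "GURPREETKAURJETHRA/Image-to-Speech-GenAI-Tool-Using-LLM")` issues the same GET as the curl command above and counts against the daily request limit.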
Higher-rated alternatives
Addy-shetty/Vibe-Prompting
🎨 AI-Powered Prompt Generator | Transform ideas into powerful AI prompts instantly with smart...
OCEANOFANYTHINGOFFICIAL/AI-Blog-Article-Generator
The AI Blog Article Generator is a Python-based tool that utilizes the Cohere API to generate...
pH-7/youtube-to-medium-blog-posts-automation
Turn any YouTube videos into well-written Medium blog posts with this automation script (made...
Pro-GenAI/Auto-Trendy-Keywords
Real-time AI-driven Trending keyword generation for SEO
CogitoNTNU/MarketingAI
A software application that allows users to input specific themes or topics for a meme or...