GURPREETKAURJETHRA/Image-to-Speech-GenAI-Tool-Using-LLM

AI tool that generates a short audio story from the context of an uploaded image, using Hugging Face models together with an OpenAI LLM and LangChain

/ 100

Emerging

Implements a three-stage pipeline: Salesforce's BLIP image-captioning model extracts visual context, OpenAI's GPT-3.5-turbo crafts narrative prompts via LangChain, and ESPnet's VITS text-to-speech model generates audio output. Built with Streamlit for local deployment and published on both Streamlit Cloud and Hugging Face Spaces, supporting direct API token configuration via environment variables.
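The three stages described above can be sketched as a chain of callables. This is a minimal illustration and not the repository's actual code. The names `build_story_prompt`, `read_token`, `caption_with_blip`, and `image_to_speech` are hypothetical, as are the prompt wording and the `Salesforce/blip-image-captioning-base` checkpoint. The LLM and text-to-speech stages are passed in as plain functions, so the sketch does not depend on any particular LangChain or ESPnet API.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical prompt wording; the project's real template is not shown in the source.
STORY_TEMPLATE = (
    "You are a storyteller. Write a short story of at most {max_words} words "
    "based on this scene: {scenario}"
)


def build_story_prompt(scenario: str, max_words: int = 50) -> str:
    """Turn an image caption into the text prompt sent to the LLM stage."""
    return STORY_TEMPLATE.format(scenario=scenario.strip(), max_words=max_words)


def read_token(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Read an API token from environment variables, failing loudly if it is unset."""
    env = os.environ if env is None else env
    token = env.get(name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {name} environment variable")
    return token


def caption_with_blip(image_path: str) -> str:
    """Stage 1 (assumed setup): caption an image with a BLIP checkpoint.

    The import is deferred so the rest of the sketch runs without transformers.
    """
    from transformers import pipeline

    captioner = pipeline("image-to-text", model="Salesforce/blip-image-captioning-base")
    return captioner(image_path)[0]["generated_text"]


def image_to_speech(
    image_path: str,
    caption: Callable[[str], str],       # stage 1: image path -> caption
    tell_story: Callable[[str], str],    # stage 2: prompt -> short story (e.g. an LLM call)
    synthesize: Callable[[str], bytes],  # stage 3: story -> audio bytes (e.g. a TTS call)
) -> bytes:
    """Run the three stages in order and return the audio bytes."""
    scenario = caption(image_path)
    story = tell_story(build_story_prompt(scenario))
    return synthesize(story)
```

Because each stage is injected, the flow can be exercised with stand-in functions before any model or API token is available, and a real captioner such as `caption_with_blip` can be swapped in later.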

No commits in the last 6 months.

Stale 6m No Package No Dependents

Maintenance 0 / 25

Adoption 8 / 25

Maturity 16 / 25

Community 20 / 25

Language

Python

License

MIT

Higher-rated alternatives

Addy-shetty/Vibe-Prompting

🎨 AI-Powered Prompt Generator | Transform ideas into powerful AI prompts instantly with smart...

OCEANOFANYTHINGOFFICIAL/AI-Blog-Article-Generator

The AI Blog Article Generator is a Python-based tool that utilizes the Cohere API to generate...

pH-7/youtube-to-medium-blog-posts-automation

Turn any YouTube video into a well-written Medium blog post with this automation script (made...

Pro-GenAI/Auto-Trendy-Keywords

Real-time AI-driven Trending keyword generation for SEO

CogitoNTNU/MarketingAI

A software application that allows users to input specific themes or topics for a meme or...
