GURPREETKAURJETHRA/Image-to-Speech-GenAI-Tool-Using-LLM

AI tool that generates a short audio story from the context of an uploaded image, combining Hugging Face models with an OpenAI LLM orchestrated through LangChain

44 / 100 (Emerging)

Implements a three-stage pipeline: Salesforce's BLIP image-captioning model extracts visual context, OpenAI's GPT-3.5-turbo crafts narrative prompts via LangChain, and ESPnet's VITS text-to-speech model generates audio output. Built with Streamlit for local deployment and published on both Streamlit Cloud and Hugging Face Spaces, supporting direct API token configuration via environment variables.
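The three stages described above can be sketched as a chain of pluggable callables. This is a minimal illustration, not the repository's actual code: the `StoryPipeline` class, the `load_tokens` helper, and the environment variable names `OPENAI_API_KEY` and `HUGGINGFACE_API_TOKEN` are assumptions, and the stand-in stages replace the real BLIP, GPT-3.5-turbo, and VITS calls so the sketch runs without network access.

```python
import os
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass
class StoryPipeline:
    """Hypothetical wrapper: image path -> caption -> story -> audio bytes."""

    caption_image: Callable[[str], str]        # stage 1, e.g. a BLIP captioning call
    write_story: Callable[[str], str]          # stage 2, e.g. GPT-3.5-turbo via LangChain
    synthesize_speech: Callable[[str], bytes]  # stage 3, e.g. an ESPnet VITS TTS call

    def run(self, image_path: str) -> bytes:
        caption = self.caption_image(image_path)
        story = self.write_story(caption)
        return self.synthesize_speech(story)


def load_tokens(env: Mapping[str, str] = os.environ) -> dict:
    """Read API tokens from the environment. The variable names are assumed."""
    names = ("OPENAI_API_KEY", "HUGGINGFACE_API_TOKEN")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Stand-in stages so the sketch runs offline; real model calls would go here.
demo = StoryPipeline(
    caption_image=lambda path: f"a photo loaded from {path}",
    write_story=lambda caption: f"Once upon a time, there was {caption}.",
    synthesize_speech=lambda story: story.encode("utf-8"),
)
audio = demo.run("beach.jpg")
```

Keeping each stage behind a plain callable means any one model can be swapped or mocked without touching the other two, and a missing token fails early rather than midway through a request.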

No commits in the last 6 months.

Stale 6m · No Package · No Dependents
Maintenance 0 / 25
Adoption 8 / 25
Maturity 16 / 25
Community 20 / 25

Stars: 52
Forks: 24
Language: Python
License: MIT
Last pushed: Jan 11, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/GURPREETKAURJETHRA/Image-to-Speech-GenAI-Tool-Using-LLM"

Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
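The same request can be made from Python with only the standard library. The sketch below is hedged: the `quality_url` and `fetch_quality` helpers are hypothetical, the category/owner/repo path layout is inferred from the single URL shown above, and the response format is not documented here, so the body is returned as raw text (UTF-8 is assumed).

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is assumed
# to follow a category/owner/repo layout.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL, percent-encoding each path segment."""
    segments = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(s, safe="") for s in segments)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0) -> str:
    """Fetch the raw response body as text (UTF-8 assumed, no parsing)."""
    with urlopen(quality_url(category, owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = quality_url(
    "prompt-engineering",
    "GURPREETKAURJETHRA",
    "Image-to-Speech-GenAI-Tool-Using-LLM",
)
```

Calling `fetch_quality(...)` with the same three arguments performs the network request; with the keyless limit of 100 requests/day, caching the result locally is worth considering.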