Zahidaslam786/MultiModal-Creative-AI-Agent

A multi-modal AI agent that connects text and vision. The project integrates open-source Stable Diffusion and vision models to generate high-fidelity images from text prompts and to analyze images, and it is optimized for local or cloud-based T4 GPU environments.

/ 100

Experimental

No License · No Package · No Dependents

Maintenance 10 / 25

Adoption 0 / 25

Maturity 1 / 25

Community 0 / 25


Stars: —
Forks: —
Language: Jupyter Notebook
License: —
Category: video-synthesis-generation
Last pushed: Jan 19, 2026

Commits (30d)

GitHub

Video Synthesis Generation · 34 tools

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/generative-ai/Zahidaslam786/MultiModal-Creative-AI-Agent"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
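
The curl request above can also be made from Python. The following is a minimal sketch using only the standard library. It assumes the path layout seen in the single example URL (a domain segment, then repository owner, then repository name) and that the endpoint returns JSON. The names `build_quality_url` and `fetch_quality` are hypothetical helpers for illustration, not part of any published client, and the response fields are not documented here, so the sketch returns the decoded body as-is.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Base path copied from the curl example; everything after it is assumed
# to follow the pattern <domain>/<owner>/<repo>.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(domain: str, owner: str, repo: str) -> str:
    """Build the quality-endpoint URL for one repository (assumed layout)."""
    # Percent-encode each segment so unusual characters cannot break the path.
    parts = [quote(part, safe="") for part in (domain, owner, repo)]
    return BASE_URL + "/" + "/".join(parts)


def fetch_quality(domain: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch and decode the response, assuming the body is JSON."""
    url = build_quality_url(domain, owner, repo)
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_quality("generative-ai", "Zahidaslam786", "MultiModal-Creative-AI-Agent")` issues the same request as the curl command. Keyless use is limited to 100 requests/day, so a script that polls many repositories may want to cache results.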

Higher-rated alternatives

open-mmlab/mmagic

OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄:...

jdh-algo/JoyVASA

Diffusion-based Portrait and Animal Animation

haidog-yaqub/EzAudio

High-quality Text-to-Audio Generation with Efficient Diffusion Transformer

CMLab-Korea/Awesome-Video-Frame-Interpolation

[IEEE TCSVT'26] 🂡 AceVFI: A Comprehensive Survey of Advances in Video Frame Interpolation

linzhiqiu/t2v_metrics

Evaluating text-to-image/video/3D models with VQAScore
