nerdyrodent/VQGAN-CLIP

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Score: 49 / 100 (Emerging)

Combines OpenAI's CLIP vision-language model with CompVis's VQGAN vector-quantized autoencoder to generate images from text: VQGAN latent codes are refined by iterative gradient-based optimization against CLIP's similarity score, with support for weighted multi-prompt guidance. Supports both CUDA and ROCm backends with configurable resolution (380x380 to 900x900) and offers advanced capabilities such as story-mode prompt sequencing, style-transfer effects, and video generation through iterative feedback loops. Integrates PyTorch Lightning for training infrastructure and includes specialized tooling for batch processing, frame-by-frame video styling, and dynamic zoom effects.
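
To make that loop concrete, here is a minimal sketch of CLIP-guided optimization. It is illustrative only, not the repo's code: it optimizes raw pixels as a stand-in for the VQGAN latent codes the project actually refines, it assumes the openai/CLIP package is installed, and the prompt text is a made-up example.

# Minimal sketch of the CLIP-guided loop (illustrative, not the repo's code).
# Assumption: openai/CLIP is installed
#   (pip install git+https://github.com/openai/CLIP.git)
# Simplification: we optimize raw pixels; the real project optimizes VQGAN
# latent codes and decodes them to an image on every step.
import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

# Encode the target text prompt once (the prompt is a made-up example).
tokens = clip.tokenize(["a watercolor painting of a rodent"]).to(device)
with torch.no_grad():
    text_feat = model.encode_text(tokens)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)

# The tensor being optimized; a stand-in for the VQGAN latent z.
image = torch.rand(1, 3, 224, 224, device=device, requires_grad=True)
opt = torch.optim.Adam([image], lr=0.05)

for step in range(200):
    opt.zero_grad()
    img_feat = model.encode_image(image.clamp(0, 1))
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    loss = 1 - (img_feat * text_feat).sum()  # cosine distance to the prompt
    loss.backward()
    opt.step()
    if step % 50 == 0:
        print(f"step {step}: loss {loss.item():.4f}")

Weighted multi-prompt support then amounts to summing one such cosine loss per prompt, each scaled by its weight.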

2,653 stars. No commits in the last 6 months.

Badges: Stale (6m) · No Package · No Dependents

Maintenance: 0 / 25
Adoption: 10 / 25
Maturity: 16 / 25
Community: 23 / 25


Stars: 2,653
Forks: 426
Language: Python
License: (not listed)
Last pushed: Oct 02, 2022
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/nerdyrodent/VQGAN-CLIP"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
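
For scripted use, a Python equivalent of the curl call above might look like the sketch below. The requests library is assumed; the response fields are not documented here, so it just prints the raw JSON. How an API key is passed (header or query parameter) is also not documented, so that part is omitted.

# Hypothetical Python equivalent of the curl call above.
import requests

url = ("https://pt-edge.onrender.com/api/v1/quality/"
       "diffusion/nerdyrodent/VQGAN-CLIP")
resp = requests.get(url, timeout=10)
resp.raise_for_status()  # fail loudly on rate-limit or server errors
print(resp.json())       # the repo's quality metrics as JSON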