eps696/aphantasia

CLIP + FFT/DWT/RGB = text to image/video

47 / 100 (Emerging)

Parameterizes image generation with FFT, DWT (wavelets), or direct RGB pixel optimization instead of a GAN, which enables stable, controllable synthesis at high resolutions (fullHD, 4K and above). Supports multi-modal queries that combine text prompts, image references, style descriptions, and negative prompts using a weighted syntax. Generates continuous video via frame interpolation, with optional depth-based 3D effects. Integrates with multiple CLIP vision models (ViT and ResNet variants) and includes an experimental aesthetic loss and progressive learning rate strategies for compositional control.
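
As a rough illustration of what an FFT parameterization means here, the sketch below stores an image as frequency-domain coefficients and decodes it to pixels with an inverse FFT. It is not taken from the repository: the function names `init_spectrum` and `fft_image`, the sigmoid squashing, and the initialization scale are all assumptions made for the example. In an actual text-to-image setup the coefficients would be a trainable tensor updated by gradient descent against a CLIP similarity loss; only the decoding step is shown.

```python
import numpy as np


def init_spectrum(height, width, channels=3, scale=0.01, seed=0):
    """Create small random frequency coefficients (hypothetical helper).

    A real-input 2D FFT only needs width // 2 + 1 columns, because the
    remaining columns are determined by conjugate symmetry.
    """
    rng = np.random.default_rng(seed)
    shape = (channels, height, width // 2 + 1)
    real = scale * rng.standard_normal(shape)
    imag = scale * rng.standard_normal(shape)
    return real, imag


def fft_image(real, imag, height, width):
    """Decode frequency coefficients into an image with values in (0, 1)."""
    spectrum = real + 1j * imag
    # Inverse real FFT over the two spatial axes gives per-channel pixels.
    pixels = np.fft.irfft2(spectrum, s=(height, width), axes=(-2, -1))
    # Assumption: a sigmoid keeps pixel values in a displayable range.
    return 1.0 / (1.0 + np.exp(-pixels))


real, imag = init_spectrum(64, 96)
image = fft_image(real, imag, 64, 96)
print(image.shape)
```

Because the optimizer would act on the spectrum rather than on pixels, the output resolution is set by the size of the coefficient array, with no generator network involved.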

789 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 21 / 25


Stars 789
Forks 104
Language Python
License MIT
Last pushed Feb 13, 2025
Commits (30d) 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/eps696/aphantasia"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
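
The same request can be made from Python using only the standard library. This is a minimal sketch under two assumptions that the page does not confirm: that the endpoint returns JSON, and that the URL follows a category/owner/repo pattern (inferred from the single curl example above). The helper names `quality_url` and `fetch_quality` are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL (path pattern inferred from the curl example)."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality data and parse it, assuming a JSON response body."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_quality("diffusion", "eps696", "aphantasia")` would request the same URL as the curl command; unauthenticated use counts against the 100 requests/day limit.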