google-research/pix2seq
Pix2Seq codebase: multi-tasks with generative modeling (autoregressive and diffusion)
Archived
Unifies vision tasks (detection, segmentation, captioning, keypoint detection) through a single sequence-generation framework built on encoder-decoder transformers with pluggable diffusion or autoregressive decoders. Implements FitTransformer as an optional backbone and includes TPU/GPU optimization via TensorFlow 2, with pretrained checkpoints across ResNet and ViT architectures available on Google Cloud Storage.
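To make the "single sequence-generation framework" idea concrete, the sketch below shows one way a bounding box can be turned into discrete tokens that a decoder could generate. It is an illustration only, not code from the repository: the function names, the 1000-bin default, and the choice to place class tokens after the coordinate bins are all assumptions made for this example.

```python
# Illustrative sketch only: names, bin count, and vocabulary layout are
# assumptions for this example, not the repository's actual API.

def quantize(coord, size, num_bins=1000):
    """Map a pixel coordinate in [0, size] to an integer bin in [0, num_bins - 1]."""
    frac = min(max(coord / size, 0.0), 1.0)   # normalize and clamp to [0, 1]
    return int(frac * (num_bins - 1) + 0.5)   # round to the nearest bin


def dequantize(token, size, num_bins=1000):
    """Map a coordinate bin back to an approximate pixel coordinate."""
    return token / (num_bins - 1) * size


def box_to_tokens(box, class_id, image_hw, num_bins=1000):
    """Encode (ymin, xmin, ymax, xmax) plus a class id as five tokens.

    Class tokens are offset by num_bins so they do not collide with
    coordinate tokens (a hypothetical vocabulary layout).
    """
    ymin, xmin, ymax, xmax = box
    height, width = image_hw
    return [
        quantize(ymin, height, num_bins),
        quantize(xmin, width, num_bins),
        quantize(ymax, height, num_bins),
        quantize(xmax, width, num_bins),
        num_bins + class_id,
    ]


def tokens_to_box(tokens, image_hw, num_bins=1000):
    """Decode five tokens back into an approximate box and a class id."""
    height, width = image_hw
    box = (
        dequantize(tokens[0], height, num_bins),
        dequantize(tokens[1], width, num_bins),
        dequantize(tokens[2], height, num_bins),
        dequantize(tokens[3], width, num_bins),
    )
    return box, tokens[4] - num_bins
```

With this encoding, several objects in one image become a flat token sequence (five tokens per object), so detection can be posed as the same kind of generation problem as captioning. The round trip is lossy: each decoded coordinate can differ from the original by up to half a bin width.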
939 stars. No commits in the last 6 months.
Stars: 939
Forks: 73
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: Nov 07, 2023
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/google-research/pix2seq"
Open to everyone: 100 requests/day with no key needed, or 1,000/day with a free key.
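The same request can be made from a script. The sketch below is a minimal Python equivalent of the curl call, using only the standard library. The helper names are hypothetical, and it assumes the endpoint returns a JSON body; the response fields are not documented on this page, so the code does not rely on any of them.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(category, owner, repo):
    """Build the endpoint URL for one repository (hypothetical helper)."""
    parts = [urllib.parse.quote(p, safe="") for p in (category, owner, repo)]
    return API_BASE + "/" + "/".join(parts)


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the record for one repository.

    Assumes the response body is JSON; the schema is not specified here,
    so the parsed object is returned as-is.
    """
    url = build_quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# data = fetch_quality("diffusion", "google-research", "pix2seq")
```

Keeping URL construction separate from the network call makes the path logic easy to check without spending any of the daily request allowance.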
Higher-rated alternatives
NVIDIA/pix2pixHD
Synthesizing and manipulating 2048x1024 images with conditional GANs
GaParmar/clean-fid
PyTorch - FID calculation with proper image resizing and quantization steps [CVPR 2022]
albertpumarola/GANimation
GANimation: Anatomically-aware Facial Animation from a Single Image (ECCV'18 Oral) [PyTorch]
yuanming-hu/exposure
Learning infinite-resolution image processing with GAN and RL from unpaired image datasets,...
yiranran/APDrawingGAN
Code for APDrawingGAN: Generating Artistic Portrait Drawings from Face Photos with Hierarchical...