davidelobba/TEMU-VTOFF

[ICLR 2026] "Inverse Virtual Try-On: Generating Multi-Category Product-Style Images from Clothed Individuals"

43 / 100 (Emerging)

Implements a dual Diffusion Transformer (dual-DiT) architecture that pairs a pretrained feature extractor with a text-enhanced generator. A multimodal hybrid attention mechanism integrates garment descriptions with person features to synthesize occluded garment regions. A lightweight DINOv2-based garment aligner module conditions generation on target in-shop images rather than on a traditional denoising objective. The model handles multiple garment categories (upper-body, lower-body, full-body) on the Dress Code and VITON-HD datasets, uses pre-extracted features from CLIP, OpenCLIP, and T5 encoders, and requires Stable Diffusion 3 Medium from HuggingFace.

No Package, No Dependents
Maintenance 10 / 25
Adoption 7 / 25
Maturity 15 / 25
Community 11 / 25
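
Each of the four category scores above has a maximum of 25, and together they add up to the overall 43 / 100. A quick check in Python, assuming (the page does not state this) that the overall score is the plain unweighted sum of the four categories:

```python
# Category scores as listed on this page (each out of 25).
category_scores = {
    "Maintenance": 10,
    "Adoption": 7,
    "Maturity": 15,
    "Community": 11,
}

# Assumption: the overall score is the unweighted sum of the categories.
overall = sum(category_scores.values())
max_overall = 25 * len(category_scores)

print(f"{overall} / {max_overall}")  # prints "43 / 100"
```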


Stars: 35
Forks: 4
Language: Python
License: (none listed)
Category: virtual-try-on
Last pushed: Mar 06, 2026
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/davidelobba/TEMU-VTOFF"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
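
The same request can be made from Python. The sketch below uses only the standard library and assumes the endpoint returns a JSON body, which this page does not confirm. `build_quality_url` and `fetch_quality` are hypothetical helper names, and the `domain` parameter is just a label for the `diffusion` path segment copied from the curl example above.

```python
import json
import urllib.parse
import urllib.request

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def build_quality_url(domain: str, owner: str, repo: str) -> str:
    """Build the quality endpoint URL for one repository (hypothetical helper)."""
    parts = [urllib.parse.quote(p, safe="") for p in (domain, owner, repo)]
    return f"{BASE_URL}/" + "/".join(parts)


def fetch_quality(domain: str, owner: str, repo: str, timeout: float = 10.0):
    """Fetch the quality record. Assumes the response is JSON (not confirmed)."""
    url = build_quality_url(domain, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only builds and prints the URL; call fetch_quality(...) to hit the network.
    print(build_quality_url("diffusion", "davidelobba", "TEMU-VTOFF"))
```

Keeping URL construction separate from the network call makes the first part easy to check offline, and it keeps unkeyed usage within the 100 requests/day limit while experimenting.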