Zheng-Chong/CatVTON
[ICLR 2025] CatVTON is a simple and efficient virtual try-on diffusion model with 1) a Lightweight Network (899.06M parameters in total), 2) Parameter-Efficient Training (49.57M trainable parameters) and 3) Simplified Inference (< 8 GB VRAM at 1024×768 resolution).
CatVTON employs a concatenation-based architecture that feeds garment and person features directly into the diffusion model's latent space, eliminating complex alignment modules and enabling mask-free inference variants. It is built on Stable Diffusion v1.5 with localized DensePose and SCHP pose/parsing extractors, and supports deployment across multiple frameworks including ComfyUI, Gradio, and HuggingFace Spaces. Emerging DiT-based variants (CatV2TON) extend the approach to video try-on.
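The figures in the description can be sanity-checked with a little shape arithmetic. The sketch below is illustrative only and is not taken from the CatVTON code: it assumes the usual Stable Diffusion v1.5 VAE layout (4 latent channels, 8x spatial downsampling), reads "1024×768" as height 1024 and width 768, and leaves the concatenation axis as a parameter because the description does not say which spatial axis is used. The function names are hypothetical.

```python
# Illustrative shape arithmetic only; not code from the CatVTON repository.
# Assumptions: Stable Diffusion v1.5 VAE with 4 latent channels and 8x
# spatial downsampling; "1024x768" read as height=1024, width=768.

VAE_DOWNSAMPLE = 8
LATENT_CHANNELS = 4


def latent_shape(height, width):
    """Return the (channels, height, width) latent shape for an input image."""
    if height % VAE_DOWNSAMPLE or width % VAE_DOWNSAMPLE:
        raise ValueError("image sides must be multiples of the VAE downsample factor")
    return (LATENT_CHANNELS, height // VAE_DOWNSAMPLE, width // VAE_DOWNSAMPLE)


def concat_shape(person, garment, axis):
    """Shape obtained by concatenating two latents along one axis.

    All other axes must match, which is why no separate alignment
    module is needed to make the two inputs compatible.
    """
    if len(person) != len(garment):
        raise ValueError("latents must have the same number of dimensions")
    for i, (p, g) in enumerate(zip(person, garment)):
        if i != axis and p != g:
            raise ValueError(f"axis {i} differs: {p} vs {g}")
    out = list(person)
    out[axis] += garment[axis]
    return tuple(out)


def trainable_fraction(trainable_m, total_m):
    """Fraction of parameters that are trainable (both given in millions)."""
    return trainable_m / total_m


if __name__ == "__main__":
    person = latent_shape(1024, 768)
    garment = latent_shape(1024, 768)
    print(person)                                  # (4, 128, 96)
    print(concat_shape(person, garment, axis=1))   # (4, 256, 96)
    print(f"{trainable_fraction(49.57, 899.06):.2%}")  # 5.51%
```

Under these assumptions, the denoiser sees a single latent twice the size of one image along the chosen axis, and roughly 5.5% of the network's parameters are updated during training.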
Stars: 1,615
Forks: 207
Language: Python
License: —
Category:
Last pushed: Dec 16, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/diffusion/Zheng-Chong/CatVTON"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
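The same request can be made from Python using only the standard library. This is a minimal sketch: the URL pattern is copied from the curl command above, while the helper names `build_url` and `fetch_quality` are hypothetical. The response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example; everything else is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/diffusion"


def build_url(owner, repo):
    """Build the endpoint URL for a repository such as Zheng-Chong/CatVTON."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the raw response body as text.

    Assumes the body is UTF-8 encoded; no key is sent, so the request
    counts against the keyless daily quota.
    """
    with urllib.request.urlopen(build_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(fetch_quality("Zheng-Chong", "CatVTON"))
```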
Related models
rizavelioglu/tryoffdiff
[CVPR'25-Demo] Official repository of "TryOffDiff: Virtual-Try-Off via High-Fidelity Garment...
muzishen/IMAGGarment
[TVCG 2026] 🎨 IMAGGarment🎨 : Fine-Grained Garment Generation with Controllable Structure,...
muzishen/IMAGDressing
[AAAI 2025]👔IMAGDressing👔: Interactive Modular Apparel Generation for Virtual Dressing. It...
nxnai/Voost
[SIGGRAPH Asia 25] Voost: A Unified and Scalable Diffusion Transformer for Bidirectional Virtual...
miccunifi/ladi-vton
[ACM MM 2023] - LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On