knightyxp/VideoCoF

[CVPR 2026] VideoCoF: Unified Video Editing with Temporal Reasoner

/ 100

Emerging

Employs a "See → Reason → Edit" pipeline in which a temporal reasoner predicts spatial reasoning tokens before the edited video frames are generated, enabling accurate multi-task editing (removal, addition, swap, style transfer) with just 50k training pairs. Although trained only on 33-frame clips, it achieves 16× length extrapolation on single-shot videos (512 frames) and 14× on multi-shot sequences. FlashAttention-3 optimization and a DMD LoRA bring inference to ~10s on H100 GPUs. Built on the Wan-2.1-T2V-14B foundation model, with modular LoRA acceleration adapters available on Hugging Face.
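The length-extrapolation figure can be checked with simple arithmetic. The sketch below is only an illustration of how the quoted "16×" relates to the 33-frame training clips and the 512-frame single-shot videos; the helper name `extrapolation_factor` is hypothetical and is not part of the VideoCoF codebase.

```python
def extrapolation_factor(inference_frames: int, training_frames: int) -> float:
    """Ratio of inference clip length to training clip length."""
    if training_frames <= 0:
        raise ValueError("training_frames must be positive")
    return inference_frames / training_frames


# Figures taken from the description above.
TRAIN_FRAMES = 33
SINGLE_SHOT_FRAMES = 512

factor = extrapolation_factor(SINGLE_SHOT_FRAMES, TRAIN_FRAMES)
print(f"{factor:.1f}x")  # 15.5x, which rounds to the quoted 16x
```

The exact ratio is slightly below 16, so the "16×" in the description is best read as a rounded figure.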

No Package No Dependents

Maintenance 10 / 25

Adoption 10 / 25

Maturity 13 / 25

Community 10 / 25

Stars

159

Forks

Language

Python

License

Apache-2.0

Higher-rated alternatives

hao-ai-lab/FastVideo

A unified inference and post-training framework for accelerated video generation.

thu-ml/TurboDiffusion

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

ModelTC/LightX2V

Light Image Video Generation Inference Framework

PKU-YuanGroup/Helios

Helios: Real Real-Time Long Video Generation Model

PKU-YuanGroup/MagicTime

[TPAMI 2025🔥] MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators
