VBench and ChronoMagic-Bench
These are complementary evaluation frameworks for video generation. VBench provides general-purpose video quality metrics across multiple dimensions, while ChronoMagic-Bench specializes in evaluating temporal coherence and metamorphic transformations in time-lapse video generation.
About VBench
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
Provides hierarchical evaluation across 16+ dimensions (temporal consistency, motion smoothness, dynamic degree, etc.) with dimension-specific metrics and a curated prompt suite, enabling fine-grained assessment of video generation quality. Implements custom evaluation pipelines combining vision models (CLIP, optical flow, scene detection) with automatic metrics aligned to human preferences. Extends to image-to-video and long-form video evaluation while assessing trustworthiness dimensions like fairness and safety.
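To make the idea of a dimension-specific temporal metric concrete, here is a minimal sketch of one way a frame-to-frame consistency score could be computed from per-frame feature vectors (for example, embeddings from an image encoder such as CLIP). This is an illustrative assumption, not VBench's actual implementation: the function name `adjacent_frame_consistency` and the choice of mean cosine similarity between adjacent frames are hypothetical.

```python
import numpy as np


def adjacent_frame_consistency(features: np.ndarray) -> float:
    """Mean cosine similarity between consecutive per-frame feature vectors.

    Hypothetical helper for illustration only (not part of VBench).
    `features` has shape (num_frames, dim), one embedding per frame.
    """
    if features.ndim != 2 or features.shape[0] < 2:
        raise ValueError("expected a (num_frames, dim) array with at least 2 frames")

    # Normalize each frame embedding to unit length, guarding against zero vectors.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    # Cosine similarity of frame t with frame t + 1, for every adjacent pair.
    sims = np.sum(unit[:-1] * unit[1:], axis=1)
    return float(sims.mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(1, 512))
    # Simulated embeddings: a stable subject with small per-frame noise.
    stable = base + 0.01 * rng.normal(size=(16, 512))
    print(f"consistency (stable clip): {adjacent_frame_consistency(stable):.4f}")
```

Under this sketch, a score near 1 would indicate that adjacent frames have nearly identical embeddings, while lower scores would suggest flicker or abrupt content changes. A real benchmark would typically combine several such signals and calibrate them against human preference data.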
About ChronoMagic-Bench
PKU-YuanGroup/ChronoMagic-Bench
[NeurIPS 2024 D&B Spotlight🔥] ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation
Provides metamorphic evaluation of text-to-video models through time-lapse generation tasks grounded in physics, biology, and chemistry priors, with curated ChronoMagic-Pro datasets containing 460K+ video-text pairs. Introduces CHScore, a robust temporal coherence metric for assessing physics-aware transformations, and hosts an open leaderboard for benchmarking diverse text-to-video models including proprietary systems like Sora.