VBench and ViStoryBench
These are complementary evaluation frameworks that address different aspects of generative visual-content assessment: VBench provides general-purpose video-quality metrics (temporal consistency, motion, aesthetics), while ViStoryBench evaluates narrative coherence, character fidelity, and story comprehension in AI-generated story visualization sequences.
About VBench
Vchitect/VBench
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
Provides hierarchical evaluation across 16+ dimensions (temporal consistency, motion smoothness, dynamic degree, etc.) with dimension-specific metrics and a curated prompt suite, enabling fine-grained assessment of video generation quality. Implements custom evaluation pipelines combining vision models (CLIP, optical flow, scene detection) with automatic metrics aligned to human preferences. Extends to image-to-video and long-form video evaluation while assessing trustworthiness dimensions like fairness and safety.
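Several of VBench's consistency dimensions boil down to comparing per-frame embeddings across time. The sketch below shows the general pattern with a stdlib-only cosine-similarity metric; the actual VBench pipeline extracts CLIP or DINO features per frame, which is omitted here, so `frame_features` is simply any list of equal-length vectors.

```python
from math import sqrt

def temporal_consistency(frame_features):
    """Mean cosine similarity between consecutive frame embeddings.

    Illustrative sketch of the frame-to-frame comparison pattern
    behind VBench-style consistency dimensions; the feature
    extractor (e.g. CLIP) is assumed to have run already.
    """
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sqrt(sum(x * x for x in a))
        nb = sqrt(sum(x * x for x in b))
        return dot / (na * nb)

    # Compare each frame's embedding with the next frame's.
    sims = [cos(a, b) for a, b in zip(frame_features, frame_features[1:])]
    return sum(sims) / len(sims)

# A perfectly static "video" (identical embeddings) scores 1.0.
static = [[1.0, 0.0, 0.0]] * 8
print(temporal_consistency(static))  # -> 1.0
```

Higher scores mean smoother, more consistent content over time; a real pipeline would aggregate this per-dimension score across the prompt suite.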
About vistorybench
ViStoryBench/vistorybench
[CVPR 2026] ViStoryBench: AI Story Visualization Benchmark
Provides a modular evaluation framework built on a `BaseEvaluator` abstract class that supports pluggable metrics for assessing narrative consistency, character fidelity, and visual coherence across 80 diverse stories in Chinese and English. The benchmark includes standardized dataset adapters for major story visualization methods (StoryDiffusion, UNO, StoryGen, etc.) and handles long-text prompts via SD embeddings to overcome token limitations. Published results and an active leaderboard are maintained on HuggingFace and a dedicated web portal for continuous community evaluation.
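The pluggable-metric design described above can be sketched as an abstract base class plus a registry of concrete evaluators. This is a minimal illustration only: the class name `BaseEvaluator` comes from the description above, but the method names, story schema, and the toy character-fidelity metric are assumptions, not ViStoryBench's actual API.

```python
from abc import ABC, abstractmethod

class BaseEvaluator(ABC):
    """Pluggable metric interface (illustrative, not the real API)."""
    name: str = "base"

    @abstractmethod
    def evaluate(self, story: dict) -> dict:
        """Return a dict of metric scores for one generated story."""

class CharacterFidelityEvaluator(BaseEvaluator):
    name = "character_fidelity"

    def evaluate(self, story: dict) -> dict:
        # Toy placeholder metric: fraction of panels whose annotation
        # tags include the story's expected character identifier.
        panels = story["panels"]
        hits = sum(story["character"] in p["tags"] for p in panels)
        return {self.name: hits / len(panels)}

# Registry pattern: run every plugged-in metric over a story.
evaluators = [CharacterFidelityEvaluator()]
story = {
    "character": "alice",
    "panels": [{"tags": ["alice"]}, {"tags": ["bob"]}],
}
scores = {}
for ev in evaluators:
    scores.update(ev.evaluate(story))
print(scores)  # -> {'character_fidelity': 0.5}
```

The appeal of this shape is that new metrics (narrative consistency, visual coherence, etc.) plug in by subclassing, and dataset adapters for different methods only need to produce the common story dict.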