A Very Big Video Reasoning Suite

We are betting on a future in which video reasoning becomes the next fundamental intelligence paradigm after language reasoning, one in which spatiotemporal, embodied experience of the world can be captured more naturally.

Data Engines

identify_one_and_nine

Knowledge training set

Prompt

The image shows a subset of digits chosen from 1 to 9 placed in different positions. Find digit 1 and digit 9. Only circle digits '1' and '9'. Do not circle other digits. Draw a red circle around each target digit.

First Frame

Last Frame

Video

control_panel

Abstraction out-of-domain testset

Prompt

The image shows a control panel with six identical control units. Each unit has a colored indicator light at the top and a control lever at the bottom that can be moved to three positions (left, middle, or right). Observe the current control panel to infer the relationship between lever positions and light colors. Based on this inferred relationship, adjust the levers that need to be changed to make all indicator lights show yellow color.

First Frame

Last Frame

Video

select_leftmost_shape

Spatiality out-of-domain testset

Prompt

Multiple shapes are shown. Circle the leftmost one. Do not change anything else.

First Frame

Last Frame

Video
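As a rough illustration of how a reference answer for this prompt could be derived, the sketch below picks the shape whose smallest x-coordinate is lowest. Representing each shape as a list of (x, y) points, reading "leftmost" as "leftmost extreme point", and the function name `leftmost_shape_index` are all assumptions made for this example; they are not taken from the released data engine.

```python
def leftmost_shape_index(shapes):
    """Return the index of the shape that reaches furthest to the left.

    `shapes` is a list of shapes, each a non-empty list of (x, y) points.
    Assumption: "leftmost" means the smallest minimum x over a shape's
    points. Using each shape's centroid instead could give another answer.
    """
    if not shapes:
        raise ValueError("at least one shape is required")
    return min(range(len(shapes)),
               key=lambda i: min(x for x, _ in shapes[i]))


# Hypothetical scene: a triangle, a square, and a thin bar.
scene = [
    [(40, 10), (60, 10), (50, 30)],           # triangle
    [(15, 50), (35, 50), (35, 70), (15, 70)], # square
    [(80, 20), (120, 20), (120, 25), (80, 25)],
]
target = leftmost_shape_index(scene)  # index of the square
```

Ties are resolved in favor of the earlier shape in the list, which is a side effect of `min` rather than a deliberate rule.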

2d_object_rotation

Transformation out-of-domain testset

Prompt

The scene contains 4 2D object(s). Show them rotating clockwise by 53 degrees around their respective centroids.

First Frame

Last Frame

Video
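The geometry behind this prompt can be sketched in a few lines. The example below rotates a polygon clockwise about its own centroid, under two assumptions that the page does not state: the centroid is taken as the mean of the vertices, and coordinates follow the image convention in which y grows downward. The function name `rotate_clockwise` is hypothetical.

```python
import math


def rotate_clockwise(points, degrees):
    """Rotate 2D points clockwise (as seen on screen) about their vertex mean.

    Assumes image coordinates: x grows to the right, y grows downward.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    theta = math.radians(degrees)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    rotated = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        # With y pointing down, this matrix turns points clockwise on screen.
        rotated.append((cx + dx * cos_t - dy * sin_t,
                        cy + dx * sin_t + dy * cos_t))
    return rotated


# Hypothetical object: a square whose vertex mean is (1, 1).
square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
rotated_53 = rotate_clockwise(square, 53)
```

Each object in a scene would be passed through the function separately, since the prompt asks for rotation around the objects' respective centroids rather than a shared pivot.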

select_longest_polygon_side

Perception out-of-domain testset

Prompt

The image shows an irregular polygon with 6 sides. First compare the lengths of all polygon edges, then mark the single longest side by drawing a small circle at its midpoint. Do not change anything else. Show the complete solution step by step.

First Frame

Last Frame

Video
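The comparison this prompt asks for can be written down directly. The sketch below measures every edge of a closed polygon, keeps the longest, and returns the midpoint where the circle would be drawn. The vertex-list representation and the name `longest_side_midpoint` are assumptions made for illustration.

```python
import math


def longest_side_midpoint(vertices):
    """Return (midpoint, length) of the longest edge of a closed polygon.

    `vertices` lists the corners in order; the last vertex connects back
    to the first. If several edges tie, the first one found is returned.
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    best_length, best_mid = -1.0, None
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        length = math.hypot(x2 - x1, y2 - y1)
        if length > best_length:
            best_length = length
            best_mid = ((x1 + x2) / 2, (y1 + y2) / 2)
    return best_mid, best_length


# Hypothetical irregular hexagon.
hexagon = [(0, 0), (10, 0), (12, 3), (8, 6), (3, 7), (-1, 3)]
midpoint, length = longest_side_midpoint(hexagon)
```

For this hexagon the bottom edge from (0, 0) to (10, 0) is the longest, so the circle would be centered at (5, 0).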

Inference Results

Traffic Light - Samples

Task Domains

Traffic Light

Knowledge in-domain testset

Shape Outline Then Move

Abstraction in-domain testset

LEGO Construction

Spatiality in-domain testset

Shape Sorter

Transformation out-of-domain testset

Mark Second Largest Shape

Perception out-of-domain testset

Prompt

Ground Truth

First

Final

Model Outputs

VBVR-Wan2.2

CogVideoX 1.5

Kling 2.6

LTX-2

Runway Gen-4

Sora 2

Veo 3

Wan 2.2 I2V

Hunyuan I2V

Seedance 2.0

Prompt

Ground Truth

Model Outputs

VBVR-BAGEL

BAGEL

SenseNova-U1

VBVR-ThinkMorph

ThinkMorph

GPT Image 2

Nano Banana

Leaderboard
