A Very Big Video Reasoning Suite

We bet on a future in which video reasoning is the next fundamental intelligence paradigm after language reasoning, one in which spatiotemporal, embodied experience of the world can be captured more naturally.

Data Engines

circle_central_dot
Knowledge out-of-domain testset
A row of dots is shown. Circle the dot that is in the middle by count (the one with an equal number of dots on each side).
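As a rough illustration of what "middle by count" means in this prompt, the sketch below picks the dot with equally many dots on either side. The function name `central_dot_index` and the 0-based, left-to-right indexing are assumptions made for this example, not part of the benchmark; it also assumes an odd number of dots, since an even count has no such dot.

```python
def central_dot_index(num_dots: int) -> int:
    """Return the 0-based index (from the left) of the middle dot.

    Hypothetical helper: assumes an odd count, so exactly one dot has the
    same number of dots on its left and on its right.
    """
    if num_dots < 1 or num_dots % 2 == 0:
        raise ValueError("a unique middle dot requires an odd, positive count")
    # With n dots there are n // 2 dots on each side of index n // 2.
    return num_dots // 2


if __name__ == "__main__":
    print(central_dot_index(7))
```

Note that the answer depends only on the count, not on the spacing between dots, which is presumably why the prompt says "by count".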
light_sequence
Abstraction in-domain testset
The scene shows 6 circular lights in a horizontal row on a white background. Lights on are orange with glow; lights off are gray. Initially, some lights are on and some are off. Your task: Modify the light states so that all lights at even positions (counting from left to right: 1st is odd, 2nd is even, 3rd is odd, and so on) are on (orange with glow), and all lights at odd positions are off (gray). Turn lights on/off as needed. Lights change from gray to orange (with glow) when turned on, and from orange to gray (glow disappears) when turned off. Lights stay in fixed positions; only their states change.
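The goal state in this prompt can be written down directly. The following is a minimal sketch, assuming lights are represented as booleans (True for on) and positions are counted from 1 on the left, as the prompt does; the function names are hypothetical and not taken from the benchmark's code.

```python
def target_states(num_lights: int = 6) -> list[bool]:
    """Goal state: lights at even 1-based positions are on, odd ones are off."""
    return [position % 2 == 0 for position in range(1, num_lights + 1)]


def lights_to_toggle(initial: list[bool]) -> list[int]:
    """Return the 1-based positions whose state differs from the goal state."""
    target = target_states(len(initial))
    return [
        position
        for position, (current, wanted) in enumerate(zip(initial, target), start=1)
        if current != wanted
    ]


if __name__ == "__main__":
    # Example initial frame: lights 1, 2 and 5 are on.
    print(lights_to_toggle([True, True, False, False, True, False]))
```

Only the lights returned by `lights_to_toggle` need to change between the first and last frame; every other light should look identical in both.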
grid_number_sequence
Spatiality in-domain testset
The scene shows a 10x10 grid with a green start point, a red end point, and yellow cells marked with numbers 1, 2, and 3. An orange circular agent is positioned at the green start point. The agent can move to adjacent cells (up, down, left, right). Starting from the green start point, the agent must visit the numbered yellow cells in numerical order (1, then 2, then 3), taking the shortest path between each consecutive pair of numbered cells. The agent is allowed to pass through the red end point when visiting the numbered cells if needed. After visiting all numbered cells in sequence, the agent must reach the red end point, also following the shortest path.
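The route described above can be sketched as a chain of shortest legs. This example assumes the grid has no obstacles (the prompt mentions none), so any path that only moves toward the target is a shortest path and each leg has Manhattan length. The `(row, col)` coordinates and the function names are illustrative assumptions, not the benchmark's own representation.

```python
Cell = tuple[int, int]  # (row, col), hypothetical coordinate convention


def shortest_leg(src: Cell, dst: Cell) -> list[Cell]:
    """Cells visited after src on one shortest path to dst (rows, then columns)."""
    row, col = src
    steps = []
    while row != dst[0]:
        row += 1 if dst[0] > row else -1
        steps.append((row, col))
    while col != dst[1]:
        col += 1 if dst[1] > col else -1
        steps.append((row, col))
    return steps


def plan_route(start: Cell, numbered_cells: list[Cell], end: Cell) -> list[Cell]:
    """Visit the numbered cells in order, then the end cell, leg by leg."""
    path = [start]
    for target in [*numbered_cells, end]:
        path.extend(shortest_leg(path[-1], target))
    return path


if __name__ == "__main__":
    route = plan_route((0, 0), [(0, 2), (3, 2), (3, 5)], (9, 9))
    print(len(route) - 1, "moves")
```

Because a leg may cross any cell, the route can pass over the red end point before the numbered cells are finished, which the prompt explicitly allows.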
multiple_occlusions_vertical
Transformation in-domain testset
The scene shows 3 objects arranged in a horizontal line in the center of the frame, with a dark rectangular mask initially positioned above them. Move the mask vertically downward in a continuous motion until it leaves the frame. As it moves, the mask passes in front of the objects, temporarily blocking them from view.
arrange_circles_by_circumference
Perception out-of-domain testset
The scene shows 5 circles of different sizes and colors arranged randomly. Keep every circle unchanged in size and color. Only rearrange their positions. Align all circles on a single horizontal line. Center the entire row of circles in the image. Sort them from left to right by circumference, from largest to smallest.
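One way to compute a final layout that satisfies this prompt is sketched below. Since circumference is 2πr, sorting by circumference is the same as sorting by radius. The fixed `gap` between neighbouring circles, the vertical centering, and the function name are assumptions made for illustration; the prompt itself only asks for a single horizontal line that is centered in the image.

```python
import math


def arrange_circles(radii, image_width, image_height, gap=20.0):
    """Return (original_index, center_x, center_y) per circle, left to right.

    Hypothetical layout: circles sorted by circumference (largest first),
    separated by a fixed gap, with the whole row centered in the image.
    """
    order = sorted(
        range(len(radii)),
        key=lambda i: 2 * math.pi * radii[i],  # circumference of circle i
        reverse=True,
    )
    row_width = sum(2 * radii[i] for i in order) + gap * (len(order) - 1)
    x = (image_width - row_width) / 2  # left edge of the centered row
    y = image_height / 2
    placed = []
    for i in order:
        radius = radii[i]
        placed.append((i, x + radius, y))
        x += 2 * radius + gap
    return placed


if __name__ == "__main__":
    for index, cx, cy in arrange_circles([10, 30, 20], 400, 200):
        print(index, cx, cy)
```

Radii are left untouched and only centers are recomputed, matching the requirement that size and color stay unchanged.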

Inference Results

Task Domains
Ball Bounces
Knowledge in-domain testset
Shape Color Then Scale
Abstraction in-domain testset
Multiple Keys One Door
Spatiality out-of-domain testset
2D Geometric Transform
Transformation out-of-domain testset
Counting Objects
Perception in-domain testset
Model Outputs
VBVR-Wan2.2
CogVideoX 1.5
Kling 2.6
LTX-2
Runway Gen-4
Sora 2
Veo 3
Wan 2.2 I2V
Hunyuan I2V
Seedance 2.0

Leaderboard

2026-04-28