A Very Big Video Reasoning Suite
We are betting that video reasoning will be the next fundamental intelligence paradigm after language reasoning, one in which spatiotemporal, embodied experience of the world can be captured more naturally.
Data Engines
hit_target_after_bounce
GitHub
Prompt
The scene shows a ball with an arrow indicating its initial direction, and several empty target positions (hollow circles) on the right side. Simulate the ball moving along this direction and bouncing off walls following the law of reflection (the angle of reflection equals the angle of incidence). The ball will follow a complete trajectory and eventually align exactly with and completely overlap one of the target positions.
First Frame
Last Frame
Video
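The hit_target_after_bounce prompt relies on the law of reflection, which is easy to check numerically. The following is a minimal sketch, not the data engine's actual code: it assumes a point-like ball inside an axis-aligned rectangle, and the names `simulate_bounces`, `point_segment_distance`, and `path_hits_target` are hypothetical helpers introduced only for illustration.

```python
import math


def simulate_bounces(pos, direction, width, height, max_bounces):
    """Trace a point-like ball in the box [0, width] x [0, height].

    Returns the start point followed by each wall contact point.
    Assumption: walls are axis-aligned, so reflecting means flipping
    the velocity component normal to the wall that was hit.
    """
    x, y = pos
    dx, dy = direction
    path = [(x, y)]
    for _ in range(max_bounces):
        # Travel time to the vertical and horizontal wall ahead of the ball.
        tx = (width - x) / dx if dx > 0 else (x / -dx if dx < 0 else math.inf)
        ty = (height - y) / dy if dy > 0 else (y / -dy if dy < 0 else math.inf)
        t = min(tx, ty)
        if math.isinf(t):
            break  # the ball is not moving
        x, y = x + dx * t, y + dy * t
        # Angle of reflection equals angle of incidence: flip the normal component.
        if tx <= ty:
            dx = -dx
        if ty <= tx:
            dy = -dy
        path.append((x, y))
    return path


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    vx, vy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = vx * vx + vy * vy
    if seg_len_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * vx + (p[1] - a[1]) * vy) / seg_len_sq
    t = max(0.0, min(1.0, t))  # clamp to the segment
    return math.hypot(p[0] - (a[0] + t * vx), p[1] - (a[1] + t * vy))


def path_hits_target(path, target, tol):
    """True if the traced path passes within tol of the target centre."""
    return any(
        point_segment_distance(target, a, b) <= tol
        for a, b in zip(path, path[1:])
    )


path = simulate_bounces((1, 1), (1, 1), width=4, height=3, max_bounces=3)
print(path)
print(path_hits_target(path, (3.5, 2.5), tol=0.1))
```

A checker along these lines could compare each candidate target position against the traced path; a ball with nonzero radius would need the walls inset by that radius, which this sketch leaves out.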
object_subtraction
GitHub
Prompt
Remove all green objects from the scene. Keep all other objects unchanged.
First Frame
Last Frame
Video
locate_topmost_unobscured_figure
GitHub
Prompt
Multiple shapes partially overlap. Outline the topmost (unobscured) shape.
First Frame
Last Frame
Video
object_packing
GitHub
Prompt
The scene shows objects on the left side and a container on the right side. Place the objects into the container one by one in the color order: pink - green - orange - purple. Each object must be placed individually in the exact order specified, and all objects must end up inside the container.
First Frame
Last Frame
Video
locate_line_intersections
GitHub
Prompt
Circle all intersection points of the line segments with red circles.
First Frame
Last Frame
Video
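For locate_line_intersections, the ground-truth points can be computed with a standard parametric segment test. The sketch below is illustrative only and is not taken from the data engine: `segment_intersection` and `all_intersections` are hypothetical names, and parallel or collinear pairs are deliberately reported as having no single intersection point.

```python
from itertools import combinations


def segment_intersection(a, b, c, d, eps=1e-9):
    """Return the intersection point of segments a-b and c-d, or None.

    Solves a + t*r = c + u*s with r = b - a and s = d - c, and accepts
    the solution only if both t and u lie in [0, 1].
    """
    r = (b[0] - a[0], b[1] - a[1])
    s = (d[0] - c[0], d[1] - c[1])
    denom = r[0] * s[1] - r[1] * s[0]  # 2D cross product r x s
    if abs(denom) < eps:
        return None  # parallel or collinear: no unique crossing point
    qp = (c[0] - a[0], c[1] - a[1])
    t = (qp[0] * s[1] - qp[1] * s[0]) / denom
    u = (qp[0] * r[1] - qp[1] * r[0]) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (a[0] + t * r[0], a[1] + t * r[1])
    return None


def all_intersections(segments):
    """Check every pair of segments and collect the crossing points."""
    points = []
    for (a, b), (c, d) in combinations(segments, 2):
        p = segment_intersection(a, b, c, d)
        if p is not None:
            points.append(p)
    return points


segments = [((0, 0), (2, 2)), ((0, 2), (2, 0)), ((3, 0), (3, 1))]
print(all_intersections(segments))
```

The pairwise loop is quadratic in the number of segments, which is fine for the handful of lines in a single scene; a sweep-line method would only matter for much larger inputs.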
Inference Results
High Density Liquid - Samples
Prompt
Ground Truth
First
Final
Model Outputs
VBVR-Wan2.2
CogVideoX 1.5
Kling 2.6
LTX-2
Runway Gen-4
Sora 2
Veo 3
Wan 2.2 I2V
Hunyuan I2V
Leaderboard
Reference
Strong Baseline
Proprietary
Open-source