Can today’s AI video models accurately model how the real world works?
SMRTR summary
Researchers tested Google's Veo 3 AI video model on 62 tasks to see whether it accurately simulates real-world physics and behavior, and found highly inconsistent performance from task to task. The model succeeded on some tasks but failed most trials on more complex ones, such as solving mazes (10 of 12 trials failed) and sorting numbers (11 of 12 trials failed). Even so, the researchers counted any success rate above zero as evidence of capability, which is a different standard from practical reliability.
SMRTR provides this summary for quick context. The original article belongs to Ars Technica.