DeepMind thinks its new Genie 3 world model presents a stepping stone toward AGI
SMRTR summary
A glowing screen flickers with simulated skiers barreling down mountains and forklifts navigating warehouses, all created from simple text prompts. This is Genie 3, Google DeepMind's latest AI breakthrough that generates interactive 3D environments that remember what they've previously created.
"Genie 3 is the first real-time interactive general-purpose world model," explains Shlomi Fruchter, research director at DeepMind. "It can generate both photo-realistic and imaginary worlds, and everything in between."
Unlike its predecessor, which could produce only 20-second clips, Genie 3 generates minutes of interactive environment at 720p resolution and 24 frames per second, with remarkable physical consistency over time.
Most significantly, researchers believe this technology represents a crucial stepping stone toward artificial general intelligence because it provides environments where AI agents can learn through trial and error, much as humans do.
In a test set in a warehouse, DeepMind's generalist agent SIMA successfully completed tasks like "approach the bright green trash compactor" by interacting with Genie 3's consistent world.
Limitations remain: Genie 3 can't yet sustain the hours-long simulations needed for comprehensive training. Even so, researchers believe it could usher in a new era of embodied AI learning.
SMRTR provides this summary for quick context. The original article belongs to TechCrunch.