Google DeepMind introduced an AI model capable of creating "playable" 3D worlds from a single prompt image, with potential applications in gaming and in training and testing AI agents.

Called a foundation world model, Genie 2 can generate an “endless variety of action-controllable, playable 3D environments for training and evaluating embodied agents,” said Google DeepMind.

The company shared clips showing players using keyboard and mouse controls to move through both realistic and fantastical landscapes. The clips demonstrated different player perspectives, including first-person, third-person, and aerial views, as well as the ability to explore from various angles and return to objects that had already gone out of sight.

Google DeepMind noted that Genie 2 could generate "consistent" worlds for up to 60 seconds, though most of the examples it showed lasted 10-20 seconds. Many of the recorded scenes appeared soft and slightly blurred, especially during motion.

“Genie 2 is a world model, meaning it can simulate virtual worlds, including the consequences of taking any action (e.g. jump, swim, etc.). It was trained on a large-scale video dataset and, like other generative models, demonstrates various emergent capabilities at scale, such as object interactions, complex character animation, physics, and the ability to model and thus predict the behavior of other agents,” said Google DeepMind in a post.

Earlier in the week, the AI company World Labs, led by computer scientist Fei-Fei Li, introduced its own AI media generator that turns classical paintings, still images, and photos into interactive 3D scenes.

World Labs also let users try out the generated 3D scenes, in which they could take a limited 360-degree look around and move a few steps.

Published - December 05, 2024 11:40 am IST