Apple's New AI “Depth Pro” Revolutionizes AR - Capturing 3D Space from a Single Image in Just Seconds

Not a week goes by without something new in AI development that moves the technology forward, and this week the news came from a small Cupertino tech company.

While all eyes were on Apple Intelligence and its eventual release, the company also unveiled a new AI model called Depth Pro.

As the name suggests, this new artificial intelligence model maps the depth of an image in real time. More excitingly, it does not require an Nvidia H100 and can run on standard home computing hardware.

Depth Pro is a research model and not something Apple is necessarily looking to commercialize. Still, if Apple Glass ever reaches our hands, the model would certainly help Apple make augmented reality work better, and it could no doubt improve the AR capabilities of Vision Pro.

Apple's new model estimates relative and absolute depth and uses them to generate “metric depth”, that is, depth expressed in real-world units. This data can be used in a variety of ways alongside the image itself.

When a user takes a picture, Depth Pro produces accurate measurements of the distances between items in the image. Apple's model should also avoid inconsistencies such as treating the sky as part of the background or misjudging the foreground and background of a shot.
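To make the idea of measuring between items concrete, here is a minimal sketch of how a metric depth map could be turned into a real-world distance. It is not Apple's code: it assumes a simple pinhole camera, a known focal length in pixels, a principal point at the image centre, and depth values measured along the camera's viewing axis. The function names and parameters are hypothetical.

```python
import math

def backproject(u, v, depth_m, focal_px, cx, cy):
    """Map pixel (u, v) with metric depth (metres) to a 3D camera-space point."""
    x = (u - cx) * depth_m / focal_px
    y = (v - cy) * depth_m / focal_px
    return (x, y, depth_m)

def distance_between_pixels(p1, p2, depth_map, focal_px):
    """Euclidean distance in metres between two pixels of a metric depth map.

    depth_map is a list of rows, indexed as depth_map[v][u].
    p1 and p2 are (u, v) pixel coordinates.
    """
    height = len(depth_map)
    width = len(depth_map[0])
    # Assumption: the principal point sits at the centre of the image.
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    a = backproject(p1[0], p1[1], depth_map[p1[1]][p1[0]], focal_px, cx, cy)
    b = backproject(p2[0], p2[1], depth_map[p2[1]][p2[0]], focal_px, cx, cy)
    return math.dist(a, b)
```

Given a depth map and two pixel positions, `distance_between_pixels` returns the straight-line distance between the corresponding points in the scene. The result is only as good as the depth values and the focal length fed into it.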

Terminator 2 aside, the possibilities are almost endless. Autonomous vehicles (which, ironically, Apple seems to have given up on), drones, and robotic vacuum cleaners could use accurate depth sensing to improve object avoidance, while augmented reality apps and online furniture stores could place items in a room, whether real or virtual, more accurately.

Medical technology could also benefit from better depth perception, with improved reconstruction of anatomical structures and mapping of internal organs.

Generative AI video tools, such as the Luma Dream Machine, could also use depth data to convert still images into moving images more accurately. This works by passing depth data along with the image to the video model, giving it a better understanding of how to handle the placement and movement of objects in that space.
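As a rough illustration of passing depth data along with the image, the sketch below appends a normalised depth value to every RGB pixel, producing what is commonly called an RGB-D image. How Luma's Dream Machine or any particular video model actually ingests depth is not stated here, so the function name, the `max_depth_m` clipping range, and the four-channel layout are all assumptions.

```python
def to_rgbd(rgb, depth_m, max_depth_m=10.0):
    """Append a normalised depth channel to each RGB pixel.

    rgb: rows of (r, g, b) tuples with values in 0..255.
    depth_m: rows of depths in metres, same shape as rgb.
    Returns rows of (r, g, b, d) tuples with every channel scaled to 0..1.
    """
    same_rows = len(rgb) == len(depth_m)
    if not same_rows or any(len(r) != len(d) for r, d in zip(rgb, depth_m)):
        raise ValueError("rgb and depth_m must have the same shape")
    out = []
    for rgb_row, depth_row in zip(rgb, depth_m):
        row = []
        for (r, g, b), d in zip(rgb_row, depth_row):
            # Clip depth to [0, max_depth_m], then scale it to [0, 1].
            d_norm = min(max(d / max_depth_m, 0.0), 1.0)
            row.append((r / 255.0, g / 255.0, b / 255.0, d_norm))
        out.append(row)
    return out
```

The point of the extra channel is that the model no longer has to guess which pixels are near and which are far, since that information arrives with the image.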
