Beyond improvements in quality and resolution, photography has seen little room for innovation over the last century. For this reason, a technology firm set out to shake up the imaging market with an artificial intelligence (AI) process that converts any two-dimensional photo into 3D.
And although it is not especially difficult today to capture a 3D photo with a specialized camera – such as those used to create virtual-reality environments – the results leave much to be desired without professional equipment.
When the first instant snapshot was taken with a Polaroid camera 75 years ago, it managed to “capture the 3D world in a realistic 2D image,” as NVIDIA recalled in a statement. Now the company boasts of achieving just the opposite: “turn a collection of photographs into a 3D scene in seconds.”
It was thus that the graphics-card specialist presented a new technology called Instant NeRF (Neural Radiance Fields), which trains AI algorithms to create 3D objects from two-dimensional photos.
Known as “inverse rendering,” the process uses AI to “approximate the way light behaves in the real world,” allowing researchers to reconstruct a 3D scene from a handful of 2D images taken from different angles.
The neural network fills in the gaps of the 360-degree view, predicting the color of light radiating in any direction from any point in 3D space, for more realistic results. NVIDIA says the technique can even work around occlusions, where objects visible in some images are blocked in others.
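The core idea can be sketched in a few lines: a learned function maps a 3D point and a viewing direction to a color and a volume density, and each pixel is rendered by compositing samples along a camera ray. The sketch below, a minimal assumption-laden illustration, substitutes a toy closed-form “sphere” for the trained network; names like `radiance_field` and `render_ray` are illustrative and not part of NVIDIA's actual API.

```python
import numpy as np

def radiance_field(point, view_dir):
    """Stand-in for the trained network: maps a 3D point and a viewing
    direction to an RGB color and a volume density (toy function, not a
    real MLP)."""
    density = 5.0 if np.linalg.norm(point) < 1.0 else 0.0  # opaque unit sphere
    color = np.clip(0.5 + 0.5 * view_dir, 0.0, 1.0)        # view-dependent tint
    return color, density

def render_ray(origin, direction, t_near=0.0, t_far=4.0, n_samples=64):
    """NeRF-style volume rendering: sample points along the ray, query the
    field, and alpha-composite the colors front to back."""
    ts = np.linspace(t_near, t_far, n_samples)
    dt = ts[1] - ts[0]
    color_acc = np.zeros(3)
    transmittance = 1.0  # fraction of light not yet absorbed
    for t in ts:
        rgb, sigma = radiance_field(origin + t * direction, direction)
        alpha = 1.0 - np.exp(-sigma * dt)   # opacity of this ray segment
        color_acc += transmittance * alpha * rgb
        transmittance *= 1.0 - alpha
    return color_acc

# A ray through the sphere accumulates its color; a ray that
# misses it returns black (empty space contributes nothing).
hit = render_ray(np.array([0.0, 0.0, -3.0]), np.array([0.0, 0.0, 1.0]))
miss = render_ray(np.array([0.0, 3.0, -3.0]), np.array([0.0, 0.0, 1.0]))
```

Training a real NeRF amounts to adjusting the network inside `radiance_field` until rays rendered this way reproduce the input photos from their known camera angles.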
In the video provided by the company, it can be seen how four photos taken from different points of view merge into a three-dimensional image.
Instant NeRF uses deep learning techniques and opens the door for anyone with relatively affordable equipment to produce a 3D image almost instantly, making it easy to create content for virtual environments – although that is only one of the applications this technology enables.
Possible uses of NeRF
Instant NeRF could be used to train robots and autonomous cars to understand the size and shape of real-world objects, using captured 2D images or video sequences. It could also be harnessed in architecture and entertainment to rapidly generate digital representations of real environments that creators can modify and build on.
According to the company, it is “the fastest to date”: “The model requires just a few seconds to train on a few dozen still photos, plus data about the camera angles they were taken from, and can then render the resulting 3D scene in tens of milliseconds,” speeding up the process by a factor of 1,000.
About this process, the company explains that “in a scene that includes people or other moving elements, the faster the photos, the better.” If there is too much movement during the 2D capture, the 3D scene generated by the AI “will be blurry.”