Netflix presents VOID, an open source AI to remove objects from videos and modify interactions with them

Netflix has presented VOID, an artificial intelligence (AI) model capable of removing moving objects from videos and of reconstructing the scene so that the interactions those objects had with their surroundings are modified as well.

Current methods for removing objects from videos focus on filling in the content behind the object and correcting its shadows and reflections. The task becomes harder when the removed object interacts with others, and the results are often implausible.

To address this problem, Netflix has developed a model based on the CogVideoX architecture and optimized for image processing in videos using interaction-sensitive quad mask conditioning, as explained in the Hugging Face repository.

Specifically, VOID works with a four-value mask that encodes the main object to remove, the overlapping areas, the regions the object interacts with, and the background to preserve.
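To make the idea of a four-value mask concrete, here is a minimal sketch in plain Python. It is not taken from the VOID code: the function name `build_quad_mask`, the two input masks, and the numeric labels are assumptions chosen for illustration, and the encoding VOID actually uses may differ.

```python
# Illustrative label values (assumed for this sketch, not confirmed by the article).
REMOVE, OVERLAP, AFFECTED, KEEP = 0, 63, 127, 255


def build_quad_mask(object_mask, affected_mask):
    """Combine two binary masks (lists of rows) into one four-value mask.

    object_mask   marks pixels of the object to remove.
    affected_mask marks pixels of regions the object interacts with.
    Pixels set in both masks are labeled as overlap; all others are background.
    """
    if len(object_mask) != len(affected_mask):
        raise ValueError("masks must have the same number of rows")

    quad = []
    for obj_row, aff_row in zip(object_mask, affected_mask):
        if len(obj_row) != len(aff_row):
            raise ValueError("mask rows must have the same length")
        row = []
        for is_obj, is_aff in zip(obj_row, aff_row):
            if is_obj and is_aff:
                row.append(OVERLAP)
            elif is_obj:
                row.append(REMOVE)
            elif is_aff:
                row.append(AFFECTED)
            else:
                row.append(KEEP)
        quad.append(row)
    return quad


# One-row example: the object covers the first two pixels,
# the affected region covers the second and third.
frame_mask = build_quad_mask([[1, 1, 0, 0]], [[0, 1, 1, 0]])
```

In this sketch every pixel receives exactly one of the four labels, so a single-channel mask per frame is enough to tell a model what to delete, what to re-simulate, and what to leave alone.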

With this mask, VOID performs a first pass in which the object and its interactions are removed. If an error is detected, it runs a second pass that aims to stabilize the shape of the object along the analyzed trajectory.
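The control flow of this two-pass scheme can be sketched as follows. Everything here is hypothetical: `remove_with_refinement` and its three callable parameters are placeholders invented for this example and do not correspond to any function in the VOID repository.

```python
from typing import Any, Callable


def remove_with_refinement(
    video: Any,
    quad_mask: Any,
    first_pass: Callable[[Any, Any], Any],
    has_error: Callable[[Any], bool],
    second_pass: Callable[[Any, Any, Any], Any],
) -> Any:
    """Run a removal pass and refine the result only if a problem is detected.

    first_pass  produces a draft with the object and its interactions removed.
    has_error   decides whether the draft needs correction.
    second_pass receives the draft and returns a stabilized result.
    """
    draft = first_pass(video, quad_mask)
    if has_error(draft):
        return second_pass(video, quad_mask, draft)
    return draft


# Toy usage with stand-in callables instead of real models.
result = remove_with_refinement(
    "clip",
    "mask",
    first_pass=lambda v, m: "draft",
    has_error=lambda r: r == "draft",
    second_pass=lambda v, m, r: r + "+refined",
)
```

Passing the stages in as callables keeps the sketch independent of any particular model; the second stage is skipped entirely when the first result is judged acceptable.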

To train the model, the Netflix team, together with the University of Sofia (Bulgaria), relied on two sources: HUMOTO, for interactions between humans and objects, rendered in Blender with physics simulation; and Kubric, for interactions only between objects, using ‘Google Scanned Objects’.

VOID, which stands for Video Object and Interaction Deletion, is an open source model available in repositories such as GitHub and Hugging Face, so creators and researchers alike can try it out and experiment with it for free.

With the VOID model, users can remove people from videos or modify footage as they wish. The official VOID website includes demonstrations of different modes of use that show how the model performs compared to others.

One of the clearest examples shows VOID removing a press that crushes a rubber duck. While the other models remove the press but leave the rubber duck crushed, VOID keeps the duck intact, thus editing the object with which the removed one interacted.

Although it is an advance, this technology also presents potential risks. Misuse could help generate manipulated content and promote misinformation, further blurring the line between reality and fiction.

By Editor
