DeepMind presents Gato, its new AI system, which successfully performs more than 600 tasks, such as chatting and playing Atari

Gato is the name of the new artificial intelligence (AI) system from DeepMind, which can complete more than 600 different tasks, from writing image descriptions to controlling a robotic arm.

The system is a single generalist agent whose design follows the approach used in large-scale language modeling. It has been trained on 604 different tasks across different modalities. All of this data is serialized into a flat sequence of tokens and processed by a single neural network that uses the same weights for every task.
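To illustrate what serializing different modalities into one flat token sequence can look like, here is a minimal sketch in Python. It is not DeepMind's tokenizer: the byte-level text vocabulary, the 64 uniform bins for continuous values, and every function name are assumptions chosen only to keep the example small.

```python
TEXT_VOCAB = 256   # assumption: one token per UTF-8 byte
NUM_BINS = 64      # assumption: uniform bins for continuous values in [-1, 1]


def tokenize_text(text):
    """Map text to byte-level tokens in the range [0, TEXT_VOCAB)."""
    return list(text.encode("utf-8"))


def tokenize_continuous(values):
    """Clip each value to [-1, 1], bin it, and shift it past the text vocabulary."""
    tokens = []
    for v in values:
        clipped = max(-1.0, min(1.0, v))
        bin_index = min(NUM_BINS - 1, int((clipped + 1.0) / 2.0 * NUM_BINS))
        tokens.append(TEXT_VOCAB + bin_index)
    return tokens


def serialize(example):
    """Flatten a list of (modality, data) pairs into one token sequence."""
    flat = []
    for modality, data in example:
        if modality == "text":
            flat.extend(tokenize_text(data))
        elif modality == "continuous":
            flat.extend(tokenize_continuous(data))
        else:
            raise ValueError(f"unknown modality: {modality}")
    return flat
```

For example, `serialize([("text", "hi"), ("continuous", [-1.0, 0.0, 1.0])])` returns `[104, 105, 256, 288, 319]`: text and sensor readings end up in a single list that one network can consume, with each modality occupying its own range of token ids.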

“Gato autoregressively samples the action vector one token at a time. Once all the tokens that make up the action vector (as determined by the environment’s action specification) have been sampled, the action is decoded by reversing the tokenization procedure. This action is sent to the environment, which is stepped and produces a new observation. The procedure is repeated,” the researchers explain in the DeepMind paper.
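The control loop described in that passage can be sketched roughly as follows. This is a toy illustration, not Gato's implementation: `toy_model` is a random stand-in for the trained network, `ToyEnv` is a hypothetical environment, and the vocabulary size, action dimensions, and uniform binning are all assumptions.

```python
import random

VOCAB_SIZE = 16   # assumption: number of discrete bins per action dimension
ACTION_DIMS = 3   # assumption: the action specification asks for 3 values in [-1, 1]


def toy_model(context, rng):
    """Stand-in for the network: picks the next token given the tokens so far."""
    return rng.randrange(VOCAB_SIZE)


def detokenize(tokens):
    """Reverse the (assumed) uniform binning: token -> centre of its bin in [-1, 1]."""
    return [-1.0 + (t + 0.5) * 2.0 / VOCAB_SIZE for t in tokens]


class ToyEnv:
    """Hypothetical environment whose observation is a single token."""

    def __init__(self):
        self.steps = 0

    def step(self, action):
        self.steps += 1
        return [self.steps % VOCAB_SIZE]


def run_episode(env, n_steps, seed=0):
    rng = random.Random(seed)
    context = [0]  # initial observation token
    actions = []
    for _ in range(n_steps):
        # Sample the action vector one token at a time, conditioning on
        # everything seen so far plus the action tokens already sampled.
        action_tokens = []
        for _ in range(ACTION_DIMS):
            action_tokens.append(toy_model(context + action_tokens, rng))
        # Decode the complete action and send it to the environment.
        action = detokenize(action_tokens)
        observation = env.step(action)
        # The new observation joins the context and the procedure repeats.
        context = context + action_tokens + observation
        actions.append(action)
    return actions, context
```

Calling `run_episode(ToyEnv(), 4)` yields four decoded actions of three values each. The context grows by the action tokens plus the new observation at every step, which is what lets the same sequence model drive the next decision.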

As a single generalist agent, Gato can act as a vision and language model and also perform actions in the real world. Based on the context it receives, it decides whether to generate text, press a button, or rotate a joint.

The result is an AI system capable of successfully performing tasks as varied as holding a chat conversation, controlling a robotic arm to stack blocks, writing image descriptions, and playing Atari video games.

With Gato, DeepMind researchers have raised the possibility of training an agent on a large number of tasks, and suggest that this general agent “can be adapted with little additional data to make it successful in an even greater number of tasks”.

By Editor