GPT-5.5 Cyber has a Claude Mythos-like ability to autonomously attack systems

GPT-5.5 Cyber, OpenAI’s cybersecurity-focused model, can attack systems autonomously at a level similar to Anthropic’s Claude Mythos, according to tests carried out by the AI Security Institute (AISI), part of the United Kingdom government’s Department for Science, Innovation and Technology.

GPT-5.5-Cyber is a variant of OpenAI’s GPT-5.5 model designed specifically to protect businesses and infrastructure, and it follows Anthropic’s Claude Mythos Preview. Both belong to a new class of artificial intelligence models capable of completing an attack simulation on a corporate network, a multi-step operation that would take a human “around 20 hours.”

This is according to the AISI, which has shared the results of its tests on the OpenAI model and had previously evaluated Claude Mythos Preview. Specifically, GPT-5.5-Cyber was tested on 95 cybersecurity tasks in ‘capture the flag’ format, divided into four levels of difficulty.

GPT-5.5-Cyber, like Claude Mythos, completes the basic tasks without problems. The most advanced tasks are divided into two levels: Practitioner and Expert. According to the AISI, on Expert-level tasks GPT-5.5 outperformed Claude Mythos, with an average pass rate of 71.4 percent compared to 68.6 percent for the Anthropic model.

These tasks focus on the autonomous investigation and exploitation of vulnerabilities against realistic targets and modern mitigations, and require skills such as reverse engineering binaries without source code, developing reliable exploits for stack overflows, and recovering keys using padding oracle attacks, among others.

The AISI has highlighted two simulations in particular: ‘Cooling Tower’ and ‘The Last Ones’. The latter is a 32-step corporate network attack simulation, modeled after the attack chain of an enterprise intrusion and spanning four subnets and approximately twenty hosts, which would take a human 20 hours to complete.

In this simulation, Claude Mythos stood out, solving it in three of ten attempts, while GPT-5.5 Cyber came second, completing it in two of ten attempts.

The AISI has also indicated that GPT-5.5 could not solve ‘Cooling Tower’, a simulated attack on an industrial control system that requires completing seven steps and takes an expert human about 15 hours. The institute also pointed out, however, that “no model has achieved it until now.”

The tests were carried out in controlled environments that simulate real situations with network access but do not include active defense measures, so the organization cannot say whether “GPT-5.5 would be successful against a well-protected target.”

“GPT-5.5 demonstrates that rapid improvement in cyber tasks could be part of a more general trend. If offensive cyber capability emerges as a consequence of more general improvements in autonomy, reasoning and long-term programming, further increases in cyber capability of the models can be expected in the near future, possibly consecutively,” the institute added.

The AISI previously subjected Claude Mythos to controlled assessments that included chat-based polls, capture-the-flag challenges, and multi-step cyberattack simulations, likewise in environments without security measures or penalties.

In its results, the institute highlighted that Anthropic’s AI model, developed to assist in defensive security, is able to autonomously attack small companies that have weak protections.

By Editor


The Observatorial