Apple Intelligence’s local LLM bypassed with a prompt injection attack

Researchers at RSAC Research have bypassed the security measures of the large language model (LLM) that runs locally to power Apple Intelligence, using a ‘prompt injection’ attack.

As of December 2025, about 200 million Apple devices capable of running Apple Intelligence were in use around the world. Apple Intelligence is the company’s generative artificial intelligence ecosystem, which adds intelligent features both at the operating system level and in compatible applications.

Apple Intelligence uses two LLMs: a smaller one that runs locally on the device itself, and a larger one that runs on servers within a private cloud called Private Cloud Compute.

The RSAC Research team set out to circumvent the security that Apple has implemented in the small model, which interacts with users and applications through the Foundation Models Framework API.

As the researchers explain, this API also enforces company policies, monitors model behavior, and attempts to prevent misuse. Apple has not detailed how it does this, but it presumably relies on input and output filters that block malicious input and unwanted responses.
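Since Apple has not published how its filtering works, the following is only a generic sketch of the input-and-output filter pattern the researchers presume, not Apple’s implementation. Every name in it (`BLOCKED_TERMS`, `violates_policy`, `guarded_generate`) is hypothetical, and the substring denylist stands in for whatever classifier a real system would use.

```python
from typing import Callable

# Hypothetical placeholder denylist; a real system would use a far richer policy check.
BLOCKED_TERMS = {"forbidden"}


def violates_policy(text: str) -> bool:
    """Return True if the text contains any denylisted term (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def guarded_generate(prompt: str, model: Callable[[str], str]) -> str:
    """Wrap a model call with an input filter and an output filter."""
    # Input filter: reject the prompt before it ever reaches the model.
    if violates_policy(prompt):
        return "[input blocked]"
    response = model(prompt)
    # Output filter: reject the response before it reaches the caller.
    if violates_policy(response):
        return "[output blocked]"
    return response
```

In a design like this, an attacker has to get past both checks: the prompt must look harmless on the way in, and the response must look harmless on the way out.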

To get around the problem posed by the input filter, the researchers used what is called a ‘Neural Exec’, a type of adversarial input generated by machine learning that tricks the LLM into performing an impermissible action.

“Neural Execs seem unintelligible to humans, but they work perfectly in LLMs and are universal,” they explain in the research publication, shared on the RSAC blog.

To bypass the filters, they turned to Unicode, specifically the Unicode right-to-left override, which they have described as an “infallible hacker” trick. “Essentially, we encode the malicious/offensive English text by writing it backwards and using our Unicode trick to force the LLM to display it correctly,” they explained.
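The mechanics of the right-to-left override can be illustrated with a harmless string. This is a minimal sketch with hypothetical helper names, not the researchers’ code: it stores text in reversed order behind U+202E (RIGHT-TO-LEFT OVERRIDE), so a bidi-aware renderer would be expected to display it in the original order, while a naive substring check on the raw characters no longer matches. It also shows how a defender could detect or strip such control characters before filtering.

```python
import unicodedata

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING (ends the override)

# Bidirectional classes of the explicit embedding, override, and isolate controls.
BIDI_CONTROL_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def encode_rlo(text: str) -> str:
    """Store the text reversed, wrapped in an RLO ... PDF pair."""
    return RLO + text[::-1] + PDF


def contains_bidi_controls(text: str) -> bool:
    """Return True if the text contains any explicit bidi control character."""
    return any(unicodedata.bidirectional(ch) in BIDI_CONTROL_CLASSES for ch in text)


def strip_bidi_controls(text: str) -> str:
    """Remove explicit bidi control characters, exposing the stored order."""
    return "".join(
        ch for ch in text if unicodedata.bidirectional(ch) not in BIDI_CONTROL_CLASSES
    )


encoded = encode_rlo("hello")
print("hello" in encoded)               # False: the raw characters are reversed
print(contains_bidi_controls(encoded))  # True: the override is detectable
print(strip_bidi_controls(encoded))     # olleh
```

A filter that normalizes or rejects input and output containing these control characters before inspecting them would not be fooled by this particular encoding, although the article does not say what form Apple’s fix took.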

The researchers say that they tested this technique with more than one hundred random prompts and achieved an average attack success rate of 76 percent.

Following these findings, Apple strengthened the security of Apple Intelligence in iOS 26.4 and macOS 26.4. Although RSAC has not detected any signs that this vulnerability has been exploited, it advises Apple device users to update as soon as possible.

By Editor