Hackers used 100,000 prompts to try to copy Google Gemini

Researchers discovered hackers trying to copy Gemini through knowledge distillation attacks, including one campaign that used more than 100,000 prompts to probe how Gemini works.

In a security report published on February 13, the Google Threat Intelligence Group (GTIG) warned that large language models (LLMs) can be attacked. Beyond disrupting operations, one of the risks facing LLMs is the model extraction attack (MEA), whose main method is knowledge distillation.

 

Illustration of Google's Gemini AI displayed on a smartphone. Image: Bao Lam

In such attacks, the attacker uses prompts to “lure” the model into answering, thereby probing its inference process. In one campaign the researchers discovered, the attacker sent more than 100,000 prompts for this purpose.

“The wide range of questions indicates an attempt to replicate Gemini’s reasoning capabilities,” the security team said. It did not reveal details about the perpetrators, but hinted they could be private companies or researchers seeking a competitive advantage. The analysis also shows that the attackers want to replicate Gemini’s capabilities across many different types of tasks, and that they are targeting a language other than English.

Knowledge distillation is a popular machine learning technique for training new models from a mature existing model, with the two commonly referred to as the “student” and “teacher” models. The student model queries the teacher about problems in a specific domain, then performs supervised fine-tuning on the teacher’s answers, or uses those answers in other training processes, to create a new model.
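To make the student-teacher step concrete, here is a minimal Python sketch of the supervised fine-tuning stage, assuming the operator has already collected a file of prompt/response pairs from the teacher model. The model name (gpt2), the file path and the hyperparameters are illustrative placeholders, not details from Google's report.

```python
import json

from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments)


class DistillationDataset(Dataset):
    """Prompt/response pairs from the teacher, formatted for student fine-tuning."""

    def __init__(self, path, tokenizer, max_len=512):
        with open(path, encoding="utf-8") as f:
            self.examples = [json.loads(line) for line in f]
        self.tokenizer = tokenizer
        self.max_len = max_len

    def __len__(self):
        return len(self.examples)

    def __getitem__(self, idx):
        ex = self.examples[idx]
        # The student is trained to reproduce the teacher's answer given the prompt.
        text = ex["prompt"] + "\n" + ex["teacher_response"] + self.tokenizer.eos_token
        enc = self.tokenizer(text, truncation=True, max_length=self.max_len,
                             padding="max_length", return_tensors="pt")
        input_ids = enc["input_ids"].squeeze(0)
        attention_mask = enc["attention_mask"].squeeze(0)
        labels = input_ids.clone()
        labels[attention_mask == 0] = -100  # ignore padding in the loss
        return {"input_ids": input_ids, "attention_mask": attention_mask,
                "labels": labels}


tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder student model
tokenizer.pad_token = tokenizer.eos_token
student = AutoModelForCausalLM.from_pretrained("gpt2")

dataset = DistillationDataset("teacher_pairs.jsonl", tokenizer)
args = TrainingArguments(output_dir="student-distilled",
                         per_device_train_batch_size=4,
                         num_train_epochs=1,
                         logging_steps=50)
Trainer(model=student, args=args, train_dataset=dataset).train()
```

In a real case the student would be a much larger model and the dataset would cover the full range of tasks and languages the operator wants to copy.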

 

How knowledge distillation works. Image: Google, Vietnamese localization by Gemini

Unlike other types of attacks, knowledge distillation is carried out through legitimate access. For example, attackers can ask the Gemini chatbot thousands of questions, then use the answers Gemini returns to harvest data, infer how the chatbot works, and apply that to their own model. Last year, OpenAI likewise accused DeepSeek of using distillation attacks to improve its models.
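To illustrate the “legitimate access” pattern described above, the sketch below shows in general terms how prompt/response pairs could be collected through a chatbot's ordinary interface and saved in the format used by the fine-tuning sketch earlier. The query_chatbot function and the prompts are hypothetical placeholders, not real Gemini API calls or data from the campaign.

```python
import json
import time

# Hypothetical placeholder: a real campaign would call the target chatbot's
# public API or web interface here. This is NOT a real Gemini API call.
def query_chatbot(prompt: str) -> str:
    raise NotImplementedError("plug in the client for the model being queried")

# Invented example prompts; the campaign described in the report reportedly
# used more than 100,000, spread across many task types.
domain_prompts = [
    "Explain step by step how to solve 2x + 3 = 11.",
    "Summarize the main causes of inflation in two sentences.",
]

with open("teacher_pairs.jsonl", "w", encoding="utf-8") as out:
    for prompt in domain_prompts:
        response = query_chatbot(prompt)
        record = {"prompt": prompt, "teacher_response": response}
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
        time.sleep(1)  # pace requests; each call is an ordinary, authorized query
```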

According to Google, this type of model extraction attack poses no risk to end users. It is, however, a risk for model developers and service providers: model extraction lets attackers accelerate AI model development at a significantly lower cost, but it amounts to intellectual property theft and violates the platform’s terms of service.

According to NBC, technology companies have spent billions of dollars racing to develop AI chatbots and large language models, so the inner workings of their flagship models are extremely valuable proprietary information. The outlet also quoted John Hultquist of GTIG, who warned that knowledge distillation activity is expected to intensify in the near future.

By Editor