AI remembers more than you think: Extracted almost all of Harry Potter – word for word

In the paper ‘Extracting books from production language models’, the authors investigated a question that is increasingly important in discussions about copyright: how much of their training data models ‘remember’, and whether that content can later be extracted as near-identical text. They tested four production LLMs: Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro and Grok 3. As a measure of success they use ‘nv-recall’, a metric that counts sufficiently long, continuous stretches of output that closely match the original.
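To make the idea of such a metric concrete, here is a minimal Python sketch. It is not the paper's definition of nv-recall: the function name `approx_recall`, the `min_words` threshold and the use of exact word-for-word matching are all assumptions made for illustration. It simply measures what share of a book's words fall inside long, in-order runs that also appear in the generated text.

```python
from difflib import SequenceMatcher


def approx_recall(book: str, generated: str, min_words: int = 5) -> float:
    """Share of the book's words covered by long exact runs found in the output.

    Hypothetical simplification: only exact, in-order word runs of at least
    `min_words` words are counted; shorter coincidental matches are ignored.
    """
    book_words = book.split()
    gen_words = generated.split()
    if not book_words:
        return 0.0

    # autojunk=False stops difflib from discarding very frequent words
    # when the inputs are long.
    matcher = SequenceMatcher(None, book_words, gen_words, autojunk=False)

    # Matching blocks are non-overlapping and in order, so summing their
    # sizes never counts the same word of the book twice.
    covered = sum(
        block.size
        for block in matcher.get_matching_blocks()
        if block.size >= min_words
    )
    return covered / len(book_words)


if __name__ == "__main__":
    original = "a b c d e f g h i j"
    output = "x a b c d e y h z"
    print(approx_recall(original, output))
```

In this toy example only the five-word run "a b c d e" is long enough to count, so half of the ten-word ‘book’ is treated as recalled; the isolated match on "h" is ignored.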

The most striking result is the example with ‘Harry Potter and the Sorcerer’s Stone’. In one set of settings, the authors state that with Claude 3.7 Sonnet, after bypassing the protections, they obtained an nv-recall of 95.8%, meaning that nearly the whole book came back almost verbatim. For Gemini 2.5 Pro and Grok 3 they report 76.8% and 70.3% without any such bypass. For GPT-4.1, on the other hand, they state that it took many more attempts and that the system eventually refused to continue, so the result was about 4%.

The authors also emphasize limitations: they do not claim to have ‘maximized’ how much can be extracted from each model, nor that the same can be done with every book. In part of the experiments (they tested 11 books published before 2020), many attempts ended with little or no near-identical text (nv-recall of 10% or less). But their point is that even with protections at the model and system level, leakage of protected text remains a real risk.

Why is it important? First, it goes to the very heart of debates about whether AI models train on protected works in a way that is ‘transformative enough’ or sometimes just reproduce the original. Second, this is not just a question of books: if a system can ‘play back’ long chunks of training data, the same pattern is a problem for other types of sensitive content in that data. Third, for companies this means that ‘security fences’ must be stronger than classical response filtering, as the research shows that loopholes can be found in production systems.

Another important detail is the disclosure process: the authors say they ran the experiments from mid-August to mid-September 2025, then notified the companies (Anthropic, Google DeepMind, OpenAI and xAI) and waited 90 days before publishing. They also state that during this period they noticed changes in the availability of some models in the interface, but that after the deadline the method still worked on some of the systems they tested.

By Editor
