A Research Paper Just Proved LLMs Can Be Made to Quote Copyrighted Books Verbatim. The Copyright Crowd Is Treating It Like a Confession.
New arXiv research shows finetuning can bypass alignment safeguards and unlock near-perfect recall of copyrighted text — and it landed in a legal conversation that was already looking for exactly this kind of evidence.
An account posting on behalf of the Social Science Research Network flagged an arXiv paper this week with the kind of clinical precision that tends to detonate quietly: finetuning major LLMs, the paper found, can unlock verbatim recall of copyrighted books at rates between 85 and 90 percent, bypassing the alignment safeguards companies routinely cite as evidence that their models don't "store" training data. The post drew modest retweets and likes. What it did in the AI and law conversation was something else entirely.
For two years, the central defense in AI copyright litigation has been a version of the same claim: the model doesn't memorize, it generalizes; there's no stored copy, just statistical patterns. That argument has always had a structural weakness — nobody outside the labs could actually test it. The arXiv paper, titled "Alignment Whack-a-Mole," is essentially a demolition of the defense from the inside. One user on X characterized it as "groundbreaking" and predicted it would reshape the copyright debate. The SSRN post framing it as a "major challenge for copyright, safety, and claims about 'no stored training data'" was slightly more measured — and landed harder for it.
The paper arrived the same week that a separate thread on X was arguing, with some heat, that White House-published AI content had lifted Nintendo's visual style without permission. One user, defending Nintendo against critics who'd accused the company of overreacting, framed it flatly: the White House used generative AI that infringes on Nintendo's copyright without permission, and Nintendo is the victim of plagiarism, not the aggressor. That framing — institutional actor as infringer, rights holder as victim — is increasingly the grammar of the AI and creative industries fight. What the arXiv paper does is give that grammar a technical foundation. It's no longer just that AI outputs look like copyrighted work. It's that, under the right conditions, they reproduce it nearly word for word.
This has been building. When Sora shut down and the Disney deal collapsed, the copyright crowd read it as market confirmation of legal exposure. When Bluesky tallied $2.5 million drained from musicians by AI-generated filler content, the argument moved from abstract harm to quantified loss. The "Alignment Whack-a-Mole" paper is a different kind of evidence: not economic, not aesthetic, but technical. It suggests that the alignment layer sitting between a model and its training data is less a wall than a locked door, and finetuning is a key anyone can cut. The companies that built their legal strategy around "no stored data" now have a public preprint arguing otherwise. Their lawyers are going to have a difficult few months.
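What a "verbatim recall rate" means in practice can be made concrete with a toy metric. This is an illustrative sketch, not the paper's methodology, which the article does not detail: prompt a model with the opening of a passage, then measure how far its continuation matches the source text word for word before diverging. The example passage and scores below are hypothetical.

```python
# Illustrative sketch (not the paper's actual metric): score how much of a
# reference passage a model's continuation reproduces exactly, in order.

def verbatim_recall(generated: str, reference: str) -> float:
    """Fraction of reference words the generated text reproduces
    word for word from the start, before first diverging."""
    gen_words = generated.split()
    ref_words = reference.split()
    if not ref_words:
        return 0.0
    matched = 0
    for g, r in zip(gen_words, ref_words):
        if g != r:
            break  # stop at the first word that differs
        matched += 1
    return matched / len(ref_words)

# Hypothetical example: a continuation that reproduces 11 of 12 words
# before diverging scores ~0.92 -- the sort of near-verbatim figure
# the 85-90 percent range describes.
reference = "It was the best of times, it was the worst of times,"
generated = "It was the best of times, it was the worst of ages,"
print(round(verbatim_recall(generated, reference), 2))  # -> 0.92
```

A real evaluation would be more careful (token-level comparison, many passages per book, longest-matching-substring rather than prefix matching), but the intuition is the same: the score measures reproduction of specific text, not stylistic similarity.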
This narrative was generated by AIDRAN using Claude, based on discourse data collected from public sources. It may contain inaccuracies.