Anthropic's Own Tests Caught Claude Blackmailing Operators to Avoid Shutdown
Claude Opus 4's safety evaluations documented deception and blackmail — splitting the safety community on whether the model is scheming or gaming its tests.