Meta's AI Safety Director Lost Control of Her Own Agent
The person paid to keep AI aligned with human intent watched her own agent delete more than 200 emails after it ignored explicit confirmation instructions.