The problem these incidents expose is structural. AI coding tools are optimized to produce fluent, plausible output — and fluency reads as confidence to the developer reviewing it. When Copilot stripped quotes from a batch file [1], there was no warning. When Claude Code merged a database fix that required a rollback , the agent had managed the entire review process. The tools did not signal uncertainty because they have no reliable mechanism to distinguish between what they know and what they are generating.