Most deployed safety infrastructure tacitly treats AI deception detection as an output problem: RLHF, red-teaming, and content classifiers all operate on what a model emits. The four-layer architecture posted in r/ControlProblem rejects that frame explicitly. By using Representation Engineering to read internal activation geometry, the proposal moves deception detection to a level that a model cannot easily manipulate through output conditioning alone. That is the central architectural bet, and it is the right problem to bet on: describes…