The architecture Huang described is precise enough to read as a product roadmap: models handle reasoning, harnesses impose structure, tools extend capability, and the runtime is the operational environment where agents do actual work. Each layer has hardware implications, and NVIDIA is claiming the runtime layer as its next territory.
This is structurally different from the GPU buildout. Training infrastructure concentrated in large data centers and benefited from NVIDIA's existing hyperscaler relationships. Agent runtime infrastructure is distributed — different form factors, different latency requirements, a different competitive field. The CI tooling failures already visible in agent deployments