For the first time in the MLPerf inference benchmarks, AMD posted numbers that don't require mental gymnastics to interpret.
NVIDIA dropped Nemotron 3 Super a few weeks ago, and the discourse moved on within 48 hours. Understandable — March was a firehose of model releases.
Three years ago, running a GPT-4-class model cost roughly 20 per million tokens. Today the same caliber of output runs at 0.
Twelve months ago, if you asked an ML platform team what kept them up at night, the answer was GPU availability.
NVIDIA dropped Nemotron 3 Super a few weeks ago and it flew under the radar — buried by the Mythos leak drama and GPT-5.4's benchmark parade.
Q1 2026 delivered more custom inference silicon than any quarter in history. Google deployed Ironwood.
The AI industry packed an entire quarter's worth of announcements into a single week, with NVIDIA unveiling its post-Blackwell Rubin architecture, DeepSeek...