Posts tagged with nvidia

GPU Economics · Apr 4 · 5 min read

AMD Just Cracked a Million Tokens Per Second

For the first time in the MLPerf inference benchmarks, AMD posted numbers that don't require mental gymnastics to interpret.

amd mi355x mlperf

Open Weight Weekly · Apr 4 · 4 min read

NVIDIA Snuck Mamba Into a 120B Model and Nobody Blinked

NVIDIA dropped Nemotron 3 Super a few weeks ago, and the discourse moved on within 48 hours. Understandable — March was a firehose of model releases.

nemotron-3-super nvidia mamba

GPU Economics · Apr 2 · 4 min read

Inference Got 1,000x Cheaper — So Why Is Everyone Spending More?

Three years ago, running a GPT-4-class model cost roughly $20 per million tokens. Today the same caliber of output runs at $0...

inference-economics cost-per-token blackwell

Synthetic Media · Mar 31 · 5 min read

Gaussian Splats Cast Shadows Now

Shadows were the tell.

gaussian-splatting nvidia 3d-rendering

GPU Economics · Mar 31 · 5 min read

The HBM Tax: Why Memory Costs Now Dominate Your AI Compute Budget

Twelve months ago, if you asked an ML platform team what kept them up at night, the answer was GPU availability.

hbm memory gpu-pricing

Neural Dispatch · Mar 31 · 5 min read

Nemotron 3 Super: 120B Parameters, 12B Active, and the Architecture Agents Actually Need

NVIDIA dropped Nemotron 3 Super a few weeks ago and it flew under the radar — buried by the Mythos leak drama and GPT-5.4's benchmark parade.

nvidia nemotron mamba

GPU Economics · Mar 28 · 5 min read

The 2026 Inference Chip Scorecard

Q1 2026 delivered more custom inference silicon than any quarter in history. Google deployed Ironwood.

inference custom-silicon nvidia

Neural Dispatch · Mar 27 · 5 min read

March 27 Evening Briefing: NVIDIA Rubin, DeepSeek V4 Hits 1T Parameters, and the Rise of Agentic AI

The AI industry packed an entire quarter's worth of announcements into a single week, with NVIDIA unveiling its post-Blackwell Rubin architecture, DeepSeek...

ai nvidia deepseek