Posts tagged with inference

GPU Economics · 5 min read

AMD Just Cracked a Million Tokens Per Second

For the first time in the MLPerf inference benchmarks, AMD posted numbers that don't require mental gymnastics to interpret.

amd · mi355x · mlperf
Neural Dispatch · 5 min read

Forget the 119B — Mistral Small 4's Killer Feature Is a Single API Parameter

Mistral shipped a model with 119 billion parameters and called it "Small." Under Apache 2.

mistral · mixture-of-experts · open-source
GPU Economics · 5 min read

The HBM Tax: Why Memory Costs Now Dominate Your AI Compute Budget

Twelve months ago, if you asked an ML platform team what kept them up at night, the answer was GPU availability.

hbm · memory · gpu-pricing
GPU Economics · 5 min read

The 2026 Inference Chip Scorecard

Q1 2026 delivered more custom inference silicon than any quarter in history. Google deployed Ironwood.

inference · custom-silicon · nvidia