Posts tagged with inference-economics

GPU Economics · 4 min read

Blackwell Got 5x Cheaper Without Changing a Transistor

NVIDIA's Blackwell B200 debuted at $0.11 per million tokens on SemiAnalysis's InferenceMAX benchmarks.

inference-economics · nvidia · blackwell

GPU Economics · 4 min read

Same Silicon, Twelve-to-One

An H100 SXM5 costs $1.03 per hour or $12.

gpu-pricing · cloud-compute · h100

GPU Economics · 5 min read

AMD Hit a Million Tokens Per Second. Now What?

AMD just crossed a threshold that matters more than any spec sheet: one million tokens per second from a single cluster, verified by MLPerf.

inference-economics · amd · mi355x

GPU Economics · 4 min read

Token Prices Crashed 99%. GPU Prices Didn't.

Three years ago, OpenAI charged $30 per million input tokens for GPT-4. Today, budget-tier models go for $0.

inference-economics · token-pricing · openai

GPU Economics · 4 min read

AMD Raised GPU Prices 67% Because They Finally Can

A chip vendor hiking prices 67% overnight would normally send customers scrambling.

amd · mi355x · inference-economics

GPU Economics · 5 min read

Google Is a Chip Vendor Now

Google buried the announcement inside a Q1 earnings call that had plenty of other headline-worthy numbers — 109.

google · tpu · nvidia

GPU Economics · 5 min read

What Microsoft's 750-Watt Chip Actually Saves

Scott Guthrie called Maia 200 "30 percent cheaper than any other AI silicon on the market" when he unveiled it in January.

microsoft · maia-200 · custom-silicon

GPU Economics · 5 min read

The CPU Line Item Just Got 20% More Expensive

Somewhere between your third GPU cluster purchase order and your fifth HBM allocation call, Intel quietly raised server CPU prices again.

intel · xeon · inference-economics

GPU Economics · 5 min read

What $5 Billion Buys in the Chiplet Economy

NVIDIA doesn't do charity, so when Jensen Huang wrote a $5 billion check for 214 million Intel shares at $23.

nvidia · intel · nvlink

GPU Economics · 4 min read

Three Factories Control Half the Cost of Every AI Chip

Epoch AI published a manufacturing teardown of NVIDIA's B200 last month.

hbm · memory-shortage · supply-chain

GPU Economics · 6 min read

Every Chip Startup's Exit Strategy Is NVIDIA

Scroll through the investor list on SiFive's freshly closed $400 million round and you hit a name that shouldn't be there: NVIDIA.

ai-chips · venture-capital · nvidia

GPU Economics · 4 min read

Why Google Needs Four Chip Vendors to Beat One

When Bloomberg reported Sunday that Google is in active talks with Marvell Technology to co-develop two new custom AI chips, Marvell stock popped and Broadcom...

google · custom-silicon · marvell

GPU Economics · 4 min read

H100s Are Appreciating Assets Now

Hardware depreciates.

gpu-rental · h100 · semianalysis

GPU Economics · 5 min read

3.5 Gigawatts of Not-Nvidia

Anthropic just told us where the inference money is going, and it's not where most people expected.

tpu · google-ironwood · anthropic

GPU Economics · 4 min read

Inference Got 1,000x Cheaper — So Why Is Everyone Spending More?

Three years ago, running a GPT-4-class model cost roughly $20 per million tokens. Today the same caliber of output runs at $0.

inference-economics · cost-per-token · blackwell