NVIDIA's Blackwell B200 debuted at 0.11 per million tokens on SemiAnalysis's InferenceMAX benchmarks.
AMD just crossed a threshold that matters more than any spec sheet: one million tokens per second from a single cluster, verified by MLPerf.
Three years ago, OpenAI charged 30 per million input tokens for GPT-4. Today, budget-tier models go for 0.
A chip vendor hiking prices 67% overnight would normally send customers scrambling.
Google buried the announcement inside a Q1 earnings call that had plenty of other headline-worthy numbers — 109.
Scott Guthrie called Maia 200 "30 percent cheaper than any other AI silicon on the market" when he unveiled it in January.
Somewhere between your third GPU cluster purchase order and your fifth HBM allocation call, Intel quietly raised server CPU prices again.
NVIDIA doesn't do charity — so when Jensen Huang wrote a 5 billion check for 214 million Intel shares at 23.
Epoch AI published a manufacturing teardown of NVIDIA's B200 last month.
Scroll through the investor list on SiFive's freshly closed 400 million round and you hit a name that shouldn't be there: NVIDIA.
When Bloomberg reported Sunday that Google is in active talks with Marvell Technology to co-develop two new custom AI chips, Marvell stock popped and Broadcom...
Anthropic just told us where the inference money is going, and it's not where most people expected.
Three years ago, running a GPT-4-class model cost roughly 20 per million tokens. Today the same caliber of output runs at 0.