Every mainstream diffusion model follows the same three-part recipe: a text encoder embeds your prompt, a diffusion backbone denoises in latent space, and a...
Two days ago, Cursor shipped Composer 2.5.
Mistral just did something no other open-weight lab has pulled off: they shipped a 128B model that scores 77.
Pull up the LLM-Stats video arena today and the number one slot belongs to an open-weights model. Not Runway.
DeepSeek V4-Pro grabbed headlines three weeks ago with an 80.
Most open models fight for the same throne: highest MMLU score, best SWE-Bench pass rate, flashiest reasoning demo.
Mistral dropped Medium 3.
Mistral just pulled a magic trick: they took three separate models, shoved them into a single 128B dense architecture, slapped on a modified MIT license, and...
On May 3rd, Moonshot AI's Kimi K2.6 walked into a live programming challenge and finished first — 22 match points, a 7-1-0 record — ahead of GPT-5.
Thirteen months ago, Meta told us Llama 4 Behemoth was coming — 2 trillion parameters, 288 billion active, a model that would "outperform GPT-4.
Mistral didn't build a text-to-speech model from scratch.
754 billion parameters. 40 billion active.
DeepSeek dropped V4 Pro and V4 Flash on Wednesday, and the numbers shut up most of the skeptics before they could finish typing. V4 Pro — 1.
Llama 4 Scout hit 1.2 million downloads in its first two weeks on Hugging Face.
The most important thing Google shipped with Gemma 4 isn't a model. It's a license.
MiniMax just dropped the weights for M2.
A Chinese AI lab just shipped the world's best coding model — 744 billion parameters, MIT license, trained entirely on Huawei chips — and most Western...
Ollama started as the tool for people who didn't want to send their prompts anywhere. Pull a model, run it on your own hardware, keep everything local.