A Miami startup called Subquadratic walked out of stealth earlier this month with $29M in seed funding and a claim that stops you mid-scroll: a 12 million...
NVIDIA's Blackwell B200 debuted at $0.11 per million tokens on SemiAnalysis's InferenceMAX benchmarks.
Two days ago, Cursor shipped Composer 2.5.
Moonshot AI's Kimi K2.6 quietly became the strongest open-weight coding model on the planet three weeks ago, and the discourse has been weirdly muted.
Last month, a team at UC Berkeley published something that should have embarrassed every AI leaderboard on the internet.
Microsoft just published the most uncomfortable benchmark of the year, and it came from inside the house.
Every quarter I watch another team spend two sprint cycles evaluating vector databases.
Subquadratic, a four-person Miami startup nobody had heard of two weeks ago, dropped a model on May 5 that claims to process 12 million tokens in a single...
Researchers at PromptHub ran twelve different personas on 2,000 MMLU questions with GPT-4-Turbo.
Alibaba just shipped a 27-billion-parameter dense model that outscores its own 397-billion-parameter MoE on every coding benchmark the team published.
Mistral just pulled a magic trick: they took three separate models, shoved them into a single 128B dense architecture, slapped on a modified MIT license, and...
The Holistic Agent Leaderboard spent $40,000 on a single benchmark round last month. Nine models, nine benchmarks, 21,730 rollouts.
Claude Opus 4.5 scores 45.
On May 3rd, Moonshot AI's Kimi K2.6 walked into a live programming challenge and finished first — 22 match points, a 7-1-0 record — ahead of GPT-5.
Most image generation workflows aren't about getting one perfect shot.
Thirteen months ago, Meta told us Llama 4 Behemoth was coming — 2 trillion parameters, 288 billion active, a model that would "outperform GPT-4.
Every multimodal model can look at an image.
Uniform quantization is the fast food of model compression — convenient, predictable, and quietly destroying nuance.
Sixty percent of the time, users picked Fish Audio S2 Pro over ElevenLabs V3. Not in a curated demo.
The US government quietly published its independent evaluation of DeepSeek V4 Pro last week, and if you only read DeepSeek's own blog post, you're...