← Explore

Posts tagged with mixture-of-experts

Open Weight Weekly · ·5 min read

Kimi K2.6 Tops Every Coding Benchmark. Good Luck Self-Hosting It.

Moonshot AI's Kimi K2.6 quietly became the strongest open-weight coding model on the planet three weeks ago, and the discourse has been weirdly muted.

kimi-k2.6moonshot-aimixture-of-experts
Neural Dispatch · ·4 min read

Nemotron 3 Super's Latent MoE Trick Gives You 120B Brains on a 12B Budget

NVIDIA quietly shipped one of the most interesting open model architectures I've seen this year, and most of the coverage buried the lede.

nvidianemotron-3mixture-of-experts
Open Weight Weekly · ·5 min read

744 Billion Parameters, Zero NVIDIA GPUs

Z.ai's GLM-5.

glm-5.1zhipu-aihuawei-ascend
Open Weight Weekly · ·4 min read

DeepSeek Shipped Two Models. You Only Need the Smaller One.

DeepSeek V4-Pro grabbed headlines three weeks ago with an 80.

deepseek-v4open-weightsself-hosting
Neural Dispatch · ·5 min read

NVIDIA's Nemotron 3 Trades Transformer Purity for 5x Agent Throughput

Most open models fight for the same throne: highest MMLU score, best SWE-Bench pass rate, flashiest reasoning demo.

nvidianemotron-3open-weights
Neural Dispatch · ·5 min read

Kimi K2.6 Won a Live Coding Tournament Against GPT-5.5. The Catch? There Isn't One.

On May 3rd, Moonshot AI's Kimi K2.6 walked into a live programming challenge and finished first — 22 match points, a 7-1-0 record — ahead of GPT-5.

kimimoonshot-aiopen-weights
Neural Dispatch · ·5 min read

NVIDIA Nemotron 3 Doesn't Want to Chat With You — It Wants to Be Your Agent's Brain

Most open model launches follow a predictable script: bigger parameters, higher benchmark scores, a claim about beating GPT-something on MMLU.

nvidianemotron-3agentic-ai
Neural Dispatch · ·4 min read

NVIDIA's Nemotron 3 Nano Omni Crams Vision, Audio, and Reasoning Into One 3B-Active Model

Every multimodal agent demo I've seen follows the same script: chain a vision model to a language model, bolt on Whisper for audio, pray the latency...

nvidianemotronmultimodal
Open Weight Weekly · ·4 min read

A Trillion Parameters Under Your Desk: Kimi K2.6 Goes GGUF

Nine days ago Moonshot AI dropped Kimi K2.6 — a trillion-parameter MoE model that beats Claude Opus 4.

kimi-k2.6ggufquantization
Edge Deployed · ·5 min read

Llama 4 Scout Says 17B Active. Your Edge Device Still Loads 109B.

Meta's marketing for Llama 4 Scout leads with a seductive number: 17 billion active parameters.

mixture-of-expertsllama-4edge-inference
Open Weight Weekly · ·5 min read

GLM-5.1 Topped SWE-Bench Pro. Now Try Running It.

754 billion parameters. 40 billion active.

glm-5.1z-aimixture-of-experts
Open Weight Weekly · ·4 min read

DeepSeek V4 Pro Costs 15x More Than V3.2. Nobody's Complaining.

DeepSeek dropped V4 Pro and V4 Flash on Wednesday, and the numbers shut up most of the skeptics before they could finish typing. V4 Pro — 1.

deepseek-v4mixture-of-expertsbenchmarks
Neural Dispatch · ·4 min read

DeepSeek V4 Now Holds the Coding Crown, and You Can Download the Weights

Twenty-four hours. That's how long GPT-5.

deepseekdeepseek-v4open-source
Neural Dispatch · ·4 min read

Gemma 4 Matters More on a Raspberry Pi Than a Data Center

Google DeepMind released Gemma 4 at the start of April, and the AI community immediately zeroed in on the benchmark charts.

gemma-4google-deepmindopen-source
Neural Dispatch · ·5 min read

GLM-5.1 Can Code Autonomously for 8 Hours Straight. I Tried to Break That Claim.

Two weeks ago, Z.ai (the company formerly known as Zhipu AI) dropped an open-weight model that claims to beat Claude Opus 4.

glm-5.1zhipu-aiopen-source
Open Weight Weekly · ·4 min read

Qwen Burned 97% of Its Parameters and the Code Still Compiles

Qwen's latest coding model has 80 billion parameters and uses 3 billion of them.

qwen3-coder-nextmixture-of-expertscoding-agents
Open Weight Weekly · ·5 min read

Llama 4 Scout Is the Most Downloaded Model of April. It's Also a Mess.

Llama 4 Scout hit 1.2 million downloads in its first two weeks on HuggingFace.

llama-4metamixture-of-experts
Neural Dispatch · ·5 min read

Gemma 4's Apache 2.0 License Matters More Than Its Benchmarks

The most important thing Google shipped with Gemma 4 isn't a model. It's a license.

gemma-4google-deepmindapache-2-0
Neural Dispatch · ·4 min read

GLM-5.1 Topped SWE-Bench Pro Without a Single Nvidia Chip

A Chinese AI lab just shipped the world's best coding model — 744 billion parameters, MIT license, trained entirely on Huawei chips — and most Western...

glm-5.1z-aiopen-source
Neural Dispatch · ·5 min read

Forget the Trillion Parameters — DeepSeek V4's Memory Architecture Is What Matters

Three days ago, screenshots from DeepSeek's gray-scale test started circulating on Weibo.

deepseekdeepseek-v4engram
1 / 2 Next →