Everyone obsesses over model weight quantization — Q4_K_M this, GPTQ that — while the actual memory hog during inference quietly eats your VRAM alive.
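The hog in question is presumably the KV cache, which grows linearly with context length regardless of how aggressively the weights are quantized. As a rough back-of-envelope (using a Llama-3-8B-style config purely as an assumed illustration, not a model named in this piece):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch=1, dtype_bytes=2):
    # K and V each store one head_dim vector per layer, per KV head, per token
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * batch * dtype_bytes

# Assumed config: 32 layers, 8 KV heads (GQA), head_dim 128, fp16 cache
gib = kv_cache_bytes(32, 8, 128, seq_len=8192) / 2**30
print(f"{gib:.1f} GiB")  # → 1.0 GiB
```

Even with GQA cutting the KV head count to 8, an 8K-token context costs a full gibibyte of VRAM at fp16, and it scales linearly from there: 32K of context is 4 GiB before a single weight is loaded.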
Google dropped Gemma 4 on Wednesday — four open-weight models under a genuine Apache 2.0 license, built from the same research behind Gemini 3.
A HuggingFace user named Jackrong quietly uploaded a set of models last week that deserve way more attention than they're getting. The pitch: take Claude 4.