Gemma 4: small open models that punch far above their parameter count
Google's Gemma 4 ships in four sizes: an effective 2B for edge devices, an effective 4B for mobile, a 26B mixture-of-experts, and a 31B dense model for maximum quality. The claim worth checking is intelligence per parameter: Google says the 31B dense model ranks #3 among open models on the Arena text leaderboard and the 26B sits at #6, both ahead of considerably larger models.
The feature list targets agent builders rather than chat: native function calling and structured JSON output, multi-step reasoning, code generation, context windows from 128K to 256K tokens, multimodal input (including video and audio on the smaller models), and training across 140-plus languages. It ships under the commercially permissive Apache 2.0 license, which is the part that decides whether teams can deploy it without a legal review. Read the announcement on the Google blog.
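Native function calling typically means the model emits a structured JSON tool call that the host application parses and dispatches, rather than free-form text the host must scrape. A minimal sketch of that loop with the model response stubbed; the tool name, schema, and response format here are illustrative assumptions, not Gemma 4's documented wire format:

```python
import json

# Hypothetical tool the agent exposes to the model.
def get_weather(city: str) -> dict:
    # Stand-in for a real API call.
    return {"city": city, "temp_c": 21}

TOOLS = {"get_weather": get_weather}

# Stubbed model output: with native function calling, the model emits
# structured JSON like this. The exact schema is an assumption.
model_output = '{"tool": "get_weather", "arguments": {"city": "Berlin"}}'

def dispatch(raw: str) -> dict:
    """Parse a structured tool call and run the matching function."""
    call = json.loads(raw)
    fn = TOOLS[call["tool"]]        # KeyError surfaces unknown tools
    return fn(**call["arguments"])  # TypeError surfaces bad arguments

print(dispatch(model_output))  # {'city': 'Berlin', 'temp_c': 21}
```

The point of structured output is exactly this dispatch step: `json.loads` either yields a well-formed call or raises, so malformed model output fails loudly instead of being half-parsed from prose.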
Why it matters
If you run models on your own hardware, the combination to verify is small size, high Arena placement, and the Apache 2.0 license together. If those rankings hold on your tasks, a 26B or 31B open model changes what you can self-host instead of renting from an API.
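A practical first check before self-hosting is whether the weights fit in memory at all. This is a weights-only back-of-the-envelope estimate using the model sizes from the article and standard bytes-per-parameter for each precision; real deployments also need room for the KV cache, activations, and runtime overhead, and an MoE model must keep all experts resident even though only some are active per token:

```python
# Weights-only memory footprint: parameters x bytes per parameter.
def weight_gb(params_billion: float, bytes_per_param: float) -> float:
    return params_billion * 1e9 * bytes_per_param / 1024**3

for name, params in [("26B MoE", 26), ("31B dense", 31)]:
    for precision, bpp in [("fp16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
        print(f"{name} @ {precision}: ~{weight_gb(params, bpp):.1f} GB")
```

At int4 the 26B lands around 12 GB and the 31B around 14.5 GB of weights, which is why quantization is usually what moves these models from datacenter GPUs onto a single consumer card.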