
Gemma 4 (Google)

A Tier · 8.3/10

Google DeepMind's open-weights model family -- multimodal, 256K context, runs on edge devices

Last updated: 2026-04-08 · Free tier available

Score Breakdown

Ease of Use: 7.0
Output Quality: 8.0
Value: 10.0
Features: 8.0

The Good and the Bad

What we like

  • Apache 2.0 license -- truly permissive, you can use it commercially without strings attached
  • Multimodal: handles text + image input (audio on smaller models), generates text output
  • 256K token context window -- larger than most open models
  • 140+ language support -- one of the strongest multilingual open models available
  • Four sizes (E2B, E4B, 26B MoE, 31B Dense) cover edge devices to data centers
  • 31B Dense scores 89% on AIME 2026 and 84% on GPQA Diamond -- competitive with frontier closed models
  • 26B MoE activates only 3.8B params during inference for fast tokens-per-second

What could be better

  • Requires technical setup unless you use a hosted API provider
  • Quality still trails the very best closed models (GPT-5.4 Pro, Claude Mythos 5, Gemini 3.1 Ultra) on hardest reasoning tasks
  • No native chat UI from Google -- you're either coding against an API or using a third-party frontend
  • Smaller community than Llama's -- fewer fine-tunes and tooling integrations available
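The "coding against an API" path is low-friction if you've used any OpenAI-compatible endpoint before. Here's a minimal sketch of assembling a chat request to Gemma 4 via OpenRouter -- the model slug `google/gemma-4-31b` is an assumption for illustration; check OpenRouter's model catalog for the real id:

```python
import json

# Hypothetical model slug -- verify the real id in OpenRouter's catalog.
MODEL = "google/gemma-4-31b"
API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_chat_request(prompt: str, api_key: str, max_tokens: int = 512) -> dict:
    """Assemble an OpenAI-compatible chat completion request for OpenRouter."""
    return {
        "url": API_URL,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": {
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
    }

req = build_chat_request("Summarize Gemma 4's license terms.", api_key="sk-...")
print(json.dumps(req["body"], indent=2))
# Send with: requests.post(req["url"], headers=req["headers"], json=req["body"])
```

Because the endpoint speaks the OpenAI wire format, most existing client libraries and third-party frontends can point at it with just a base-URL change.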

Pricing

Self-hosted

$0
  • Apache 2.0 license
  • Free download from Hugging Face/Kaggle/Ollama
  • Run on your own hardware

API (OpenRouter, Gemma 4 31B)

$0.14-$0.40 per 1M tokens
  • Hosted inference
  • $0.14 input / $0.40 output
  • No infrastructure setup
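At those rates, per-request costs are easy to estimate. A quick sketch using the listed OpenRouter prices ($0.14 per 1M input tokens, $0.40 per 1M output tokens):

```python
# OpenRouter prices for Gemma 4 31B as listed above, in USD per 1M tokens.
INPUT_PRICE = 0.14
OUTPUT_PRICE = 0.40

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-million-token rates."""
    return (input_tokens * INPUT_PRICE + output_tokens * OUTPUT_PRICE) / 1_000_000

# Example: a 10K-token prompt with a 1K-token reply.
print(f"${estimate_cost(10_000, 1_000):.4f}")  # → $0.0018
```

Even a long 10K-token prompt costs a fraction of a cent, which is why the Value score above sits at 10.0.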

Google AI Studio

$0
  • Free tier for testing
  • Web playground access

Known Issues

  • Gemma 4 launched April 2, 2026 with improved licensing -- earlier Gemma versions had restrictive use clauses that confused developers. Source: The Register, Hugging Face · 2026-04
  • Function calling support is new -- some users report inconsistent tool-use behavior compared to Llama 3 or Mistral. Source: Hugging Face discussions · 2026-04

Best for

Developers and businesses who need a permissively licensed multimodal LLM they can self-host or fine-tune. Especially good for multilingual use cases and on-device deployment.

Not for

Non-technical users who just want to chat with an AI -- there's no consumer-facing app. Use Gemini if you want a polished chat experience.

Our Verdict

Gemma 4 is Google's answer to the open-weights race against Meta's Llama and the wave of strong Chinese open models. The Apache 2.0 license is a big deal -- it removes the legal friction that made earlier Gemma adoption awkward. The 31B Dense model is genuinely competitive with frontier closed models on benchmarks while costing $0.14/M input via API. If you're building a product on open-weights LLMs and you need multimodal + multilingual + permissive licensing, Gemma 4 is now a top choice.

Sources

  • Google DeepMind Gemma 4 page (accessed 2026-04-08)
  • Google blog: Gemma 4 launch (accessed 2026-04-08)
  • Artificial Analysis benchmarks (accessed 2026-04-08)
  • OpenRouter pricing (accessed 2026-04-08)
  • The Register coverage (accessed 2026-04-08)