DiffusionGemma (Google) Pricing
All plans and pricing as of 2026-06-10
Self-hosted (open weights)
- Apache 2.0 license -- commercial use permitted
- Available as weights on Hugging Face, as an NVIDIA NIM container, and in Gemini Enterprise Model Garden
- ~18 GB VRAM when quantized -- runs on RTX 5090-class consumer hardware (700+ tok/s)
- Works with vLLM, MLX, and llama.cpp
Is DiffusionGemma (Google) Worth the Price?
Value Score: 9/10
Overall Score: 6.8/10 · Best for developers who need fast local text generation -- autocomplete, drafting, high-volume agent inner-loops -- on a single GPU, and for researchers who want a production-grade open diffusion LLM to build on.
DiffusionGemma is the most interesting open-weights release of June 2026 not because it's the best model -- Google openly says Gemma 4 beats it on quality -- but because it's the first serious, deployable text-diffusion LLM from a frontier lab. The 4x generation-speed claim holds on the right hardware (H100/RTX-class GPUs), which makes it a genuine option for latency-sensitive local work like autocomplete and agent inner-loops. Treat it as a speed specialist and an architecture preview: if diffusion decoding keeps scaling, this is what the next generation of local models may look like. For now, pair it with a quality model rather than replacing one.
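To put the speed claim in latency terms, here is a rough back-of-the-envelope sketch in Python. It takes the 700+ tok/s figure quoted for RTX 5090-class hardware and the 4x claim at face value, and derives a 175 tok/s baseline from them. That baseline is our assumption for illustration, not a measured number, and the token counts for each workload are hypothetical. Prompt processing time is ignored.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady throughput (prompt processing ignored)."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


DIFFUSION_TOK_S = 700.0  # "700+ tok/s" figure quoted for RTX 5090-class hardware
SPEEDUP = 4.0            # the claimed 4x generation-speed advantage
# Assumption: the baseline implied by dividing the quoted throughput by the claimed speedup.
BASELINE_TOK_S = DIFFUSION_TOK_S / SPEEDUP

# Hypothetical output sizes for two latency-sensitive workloads.
workloads = [("autocomplete suggestion", 64), ("agent inner-loop step", 512)]

for label, num_tokens in workloads:
    fast_ms = generation_seconds(num_tokens, DIFFUSION_TOK_S) * 1000
    slow_ms = generation_seconds(num_tokens, BASELINE_TOK_S) * 1000
    print(f"{label}: ~{fast_ms:.0f} ms vs ~{slow_ms:.0f} ms at the assumed baseline")
```

Under these assumptions the gap matters most for short, frequent generations, where the difference decides whether a suggestion feels instant. Real numbers will vary with hardware, quantization, and batch size.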
How DiffusionGemma (Google) Pricing Compares
| Tool | Free Tier | Starting Price | Value Score | Overall |
|---|---|---|---|---|
| DiffusionGemma (Google) (this tool) | Yes | $0 | 9/10 | 6.8 |
| Qwen (Alibaba) | Yes | $0 | 10/10 | 8.8 |
| MiniMax M3 | Yes | $0 | 9.5/10 | 8.4 |
| Gemma 4 (Google) | Yes | $0 | 10/10 | 8.3 |
| IBM Granite 4.0 | Yes | $0 | 9.5/10 | 8.2 |
| Kimi K2.6 (Moonshot) | Yes | $0 | 8.5/10 | 8.1 |