Gemini (Google) logo
A

Gemini (Google)

A Tier · 8.3/10

Google's LLM with deep Google Workspace integration, 2M token context window, and native code execution

Last updated: 2026-05-07Free tier available

Score Breakdown

8.0
Ease of Use
8.0
Output Quality
9.0
Value
8.0
Features

Benchmark Scores

Benchmarks for Gemini 3.1 Ultra

Chatbot Arena ELOHuman preference rating1500
BenchmarkScore
MMLU90.5%
GPQA Diamond94.3%
HumanEval93.5%
SWE-bench80.6%
ARC-AGI77.1%

Last updated: 2026-04-13

Personality & Tone

The Google research assistant

Tone: Neutral, thorough, and slightly corporate. Gemini leans academic, cites sources readily in Deep Research mode, and keeps its tone even across topics -- rarely funny, rarely snarky.

Quirks: Tightly integrated with Google products -- pulls from Search and Workspace by default, which is useful for grounded answers but means you hear Google's worldview. Can feel evasive or overly safe on opinionated or politically charged questions.

The Good and the Bad

What we like

  • +2 million token context window is the largest available -- can process entire books and full codebases in one prompt
  • +Best Google Workspace integration (Gmail, Docs, Drive, Calendar)
  • +Free tier is more generous than Claude's
  • +Gemini Advanced includes 2TB Google One storage -- real added value
  • +API pricing is very competitive, especially for Flash model

What could be better

  • Output quality for creative writing is the weakest of the big three (GPT-4, Claude, Gemini)
  • Hallucination rate is higher than Claude in our testing
  • Google's track record of killing products makes long-term commitment feel risky
  • The Gemini app UI feels like Google slapped AI onto an existing product

Pricing

Free

$0
  • Gemini 3.1 Flash
  • Basic features
  • Google integration

Google AI Pro

$19.99/month
  • Gemini 3.1 Ultra
  • 2M token context
  • Code Execution sandbox
  • 2TB Google storage
  • Workspace integration
  • Lyria 3 access

Google AI Ultra

$249.99/month
  • Gemini 3.1 Ultra (max usage)
  • Gemini 3.1 Flash Live audio
  • Lyria 3 Pro full access
  • Highest API priority
  • 30TB Google storage

API

$0.075-5/per 1M tokens
  • All models
  • 2M context
  • Flash-Lite at $0.25/M input
  • Grounding with Google Search
  • Code Execution
  • Mandatory spend caps (April 2026)

Known Issues

  • GEMINI 3.1 FLASH-LITE GA (2026-05-07, TODAY): Generally available on the Gemini Enterprise Agent Platform. Fastest + most cost-efficient Gemini 3 series model. **2.5x faster Time-to-First-Answer-Token vs Gemini 2.5 Flash; +45% output speed**. Pricing per third-party reference: $0.25/M input, $1.50/M output (vendor blog itself omits direct pricing -- check ai.google.dev/pricing for canonical). Customer signals at GA: Gladly reports ~60% lower cost vs thinking-tier models; OffDeal cites sub-second p95 for classifiers. **This is a pre-Google-I/O staging signal** -- I/O 2026 keynote is 2026-05-19 (T-12 from this entry). Expect more pre-keynote drops over the next two weeksSource: Google Cloud blog (cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-lite-is-now-generally-available), blog.google · 2026-05-07
  • Gemini 2.5 family retirement dates EXTENDED (ai.google.dev deprecations page, checked 2026-04-24): Gemini 2.5 Pro, 2.5 Flash, AND 2.5 Flash-Lite now all retire 2026-10-16 (pushed out from original 2026-06-17 / 2026-07-22 dates). Gives ~6 more months to migrate to gemini-3.1-pro + gemini-3-flash. Production code still calling 2.5 model names continues to work through Oct 16, but do not ship new code on retiring endpointsSource: ai.google.dev/gemini-api/docs/deprecations (verified 2026-04-24) · 2026-04-24
  • Gemini 3.1 Flash TTS launched 2026-04-15 as a preview on Gemini API, AI Studio, Vertex AI, and Google Vids. 70+ languages, audio tags for vocal style/pace/delivery embedded in the text prompt, Elo 1,211 on Artificial Analysis TTS leaderboard. Positions Google as a direct competitor to ElevenLabs v3 on the TTS stackSource: blog.google Gemini 3.1 Flash TTS, MarkTechPost · 2026-04
  • Image generation of people was temporarily disabled after generating historically inaccurate results, partially restored but still limitedSource: The Verge, Google Blog · 2026-01
  • Gemini Pro model access removed from free API tier on April 1, 2026 -- mandatory spend caps and prepaid billing now required for new accountsSource: Google AI for Developers, FindSkill.ai · 2026-04
  • Google AI Ultra at $249.99/mo is hard to justify against Claude Max ($200) and ChatGPT Pro ($200) unless you specifically need Lyria 3 ProSource: Reddit r/Bard · 2026-04

Best for

Google Workspace power users. If you live in Gmail, Docs, and Drive, Gemini Advanced integrates directly into your workflow. Also great for developers who need the cheapest API with the longest context window.

Not for

Anyone who needs the best raw output quality. Claude and GPT-4 both write better. Also not for anyone spooked by Google's history of abandoning products.

Our Verdict

Gemini's strength is the ecosystem play. The 1M context window is genuinely useful for long documents, and the Google Workspace integration is something neither OpenAI nor Anthropic can match. But purely as an LLM, the output quality is a step behind Claude and GPT-4. Pick Gemini if you're deep in Google's ecosystem. Otherwise, the other two are better standalone.

Sources

  • Google AI for Developers: deprecations (accessed 2026-04-21)
  • Google Blog: Gemini 3.1 Flash TTS (accessed 2026-04-21)
  • Google Gemini official site (accessed 2026-04-21)
  • LMSYS Chatbot Arena rankings (accessed 2026-04-13)
  • Reddit r/Bard (accessed 2026-04-13)

The Tier List Tuesday

Weekly newsletter: tier movers, new entrants, and the VS of the week. Built from our daily AI-tool sweeps. No spam, unsubscribe anytime.

Alternatives to Gemini (Google)

Claude (Anthropic) logo

Claude (Anthropic)

Anthropic's flagship LLM -- Opus 4.7 (launched April 16, 2026) with 1M-token context, high-res vision, new xhigh reasoning level, and the most natural conversational style. Note: 2026-04-04 policy excluded third-party agent harnesses (OpenClaw etc.) from Pro/Max flat-rate, and 2026-04-16 Enterprise pricing dropped bundled tokens

A
8.5/10
Free tierFrom $0
Best writing quality of any LLM -- Opus ...1M token context window for enterprise A...
Updated 2026-05-06
Claude Mythos Preview logo

Claude Mythos Preview

Anthropic's most capable model -- a gated research preview via Project Glasswing, cybersecurity-specialized. 73% success on expert CTF tasks, 32-step autonomous network attacks. Not generally available.

C
6.5/10
From Invite only
The most capable Anthropic model availab...73% success rate on expert-level Capture...
Updated 2026-04-20
Grok logo

Grok

xAI's irreverent chatbot with a direct line to X/Twitter -- real-time data meets unfiltered personality. Grok 4.3 production launched 2026-05-02 with Custom Voices cloning + Imagine Agent Mode + ~40% API price cut to $1.25/$2.50 per 1M tokens

B
7.5/10
Free tierFrom $0
Real-time access to X/Twitter data is ge...Grok 3 benchmarks are competitive with G...
Updated 2026-05-05
Muse Spark (Meta) logo

Muse Spark (Meta)

Meta's first model from its Superintelligence Lab -- natively multimodal with Contemplating mode for multi-agent reasoning

A
8.8/10
Free tierFrom $0
Completely free to use via Meta AI app a...Natively multimodal: handles text, image...
Updated 2026-04-19
GPT-Rosalind (OpenAI) logo

GPT-Rosalind (OpenAI)

OpenAI's first domain-specific model -- life sciences, drug discovery, translational medicine. Launched 2026-04-16 as a Trusted Access research preview. Launch partners: Amgen, Moderna, Allen Institute, Thermo Fisher. Paired with a Life Sciences Codex plugin (50+ scientific tool integrations)

C
6.8/10
From Invite only
OpenAI's first named vertical/domain-spe...Launch partners Amgen, Moderna, Allen In...
Updated 2026-04-17
GPT-5.4-Cyber (OpenAI) logo

GPT-5.4-Cyber (OpenAI)

OpenAI's defensive-cybersecurity variant of GPT-5.4, launched 2026-04-16. Lowered refusal boundary for security-research tasks and native binary reverse-engineering. Access gated via Trusted Access for Cyber (TAC) program -- thousands of verified defenders, hundreds of teams, no public pricing

B
7.2/10
From Not publicly disclosed
Directly competes with Claude Mythos Pre...Lowered refusal boundary on defensive-se...
Updated 2026-04-19
Hunyuan 3 (Tencent Hy3) logo

Hunyuan 3 (Tencent Hy3)

Tencent's Hy3 Preview launched 2026-04-23 -- 295B total / 21B active MoE, 256K context, open-sourced on HuggingFace under tencent/Hy3-preview. Cheapest frontier-class API at ~1.2 RMB per million input tokens. Integrated into Yuanbao, WeChat, QQ

A
8.1/10
Free tierFrom $0
Open weights from a top-3 Chinese tech c...Pricing is aggressive. ~1.2 RMB per mill...
Updated 2026-04-25
MiMo (Xiaomi) logo

MiMo (Xiaomi)

Xiaomi's MiMo-V2.5 family launched 2026-04-22 -- Pro (1T total / 42B active MoE, 1M context, native vision+audio reasoning), Multimodal base, TTS (3 sub-models: base, VoiceDesign, VoiceClone), and ASR (open-source, English + Chinese + major dialects). Full voice pipeline for the agent era. Extra-charge 1M-context tier removed at launch

A
8.3/10
Free tierFrom $0
Full voice pipeline shipped together: a ...Native multimodal in MiMo-V2.5-Pro is th...
Updated 2026-04-25