Gemini (Google)
A Tier · 8.3/10
Google's LLM with deep Google Workspace integration, 2M token context window, and native code execution
Score Breakdown
Benchmark Scores
Benchmarks for Gemini 3.1 Ultra
| Benchmark | Description | Score |
|---|---|---|
| MMLU | Knowledge across 57 subjects | 90.5% |
| GPQA Diamond | Graduate-level science questions | 94.3% |
| HumanEval | Python code generation | 93.5% |
| SWE-bench | Real GitHub issue fixing | 80.6% |
| ARC-AGI | Abstract reasoning puzzles | 77.1% |
Last updated: 2026-04-13
Personality & Tone
The Google research assistant
Tone: Neutral, thorough, and slightly corporate. Gemini leans academic, cites sources readily in Deep Research mode, and keeps its tone even across topics -- rarely funny, rarely snarky.
Quirks: Tightly integrated with Google products -- pulls from Search and Workspace by default, which is useful for grounded answers but means you hear Google's worldview. Can feel evasive or overly safe on opinionated or politically charged questions.
The Good and the Bad
What we like
- 2 million token context window is the largest available -- can process entire books and full codebases in one prompt
- Best Google Workspace integration (Gmail, Docs, Drive, Calendar)
- Free tier is more generous than Claude's
- Gemini Advanced includes 2TB of Google One storage -- real added value
- API pricing is very competitive, especially for the Flash model
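The 2-million-token claim above is easy to sanity-check against a real codebase. A minimal sketch using the common ~4 characters-per-token heuristic (an assumption -- the actual ratio varies by tokenizer and file type, so treat the result as a rough estimate):

```python
import os

CONTEXT_TOKENS = 2_000_000   # Gemini's advertised context window
CHARS_PER_TOKEN = 4          # rough heuristic; varies by tokenizer

def estimated_tokens(root: str, exts=(".py", ".md")) -> int:
    """Sum file sizes under root and convert characters to tokens."""
    total_chars = 0
    for dirpath, _, files in os.walk(root):
        for name in files:
            if name.endswith(exts):
                total_chars += os.path.getsize(os.path.join(dirpath, name))
    return total_chars // CHARS_PER_TOKEN

def fits_in_context(root: str) -> bool:
    """True if the estimated token count fits the 2M window."""
    return estimated_tokens(root) <= CONTEXT_TOKENS
```

In practice you would also want to subtract room for the prompt and the model's response, since the window covers input and output together.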
What could be better
- Output quality for creative writing is the weakest of the big three (GPT-4, Claude, Gemini)
- Hallucination rate is higher than Claude's in our testing
- Google's track record of killing products makes long-term commitment feel risky
- The Gemini app UI feels like Google slapped AI onto an existing product
Pricing
Free
- ✓Gemini 3.1 Flash
- ✓Basic features
- ✓Google integration
Google AI Pro
- ✓Gemini 3.1 Ultra
- ✓2M token context
- ✓Code Execution sandbox
- ✓2TB Google storage
- ✓Workspace integration
- ✓Lyria 3 access
Google AI Ultra
- ✓Gemini 3.1 Ultra (max usage)
- ✓Gemini 3.1 Flash Live audio
- ✓Lyria 3 Pro full access
- ✓Highest API priority
- ✓30TB Google storage
API
- ✓All models
- ✓2M context
- ✓Flash-Lite at $0.25/M input
- ✓Grounding with Google Search
- ✓Code Execution
- ✓Mandatory spend caps (April 2026)
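At the Flash-Lite rates listed above ($0.25/M input, $1.50/M output), per-request cost is simple to estimate. A minimal sketch -- the rates come from the pricing notes in this review and the token counts are illustrative, so verify against ai.google.dev/pricing before budgeting:

```python
# Estimate per-request Gemini Flash-Lite API cost from token counts.
# Rates per the pricing section above; confirm current figures
# on ai.google.dev/pricing before relying on them.
INPUT_RATE_PER_M = 0.25    # USD per 1M input tokens
OUTPUT_RATE_PER_M = 1.50   # USD per 1M output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one API request."""
    return (input_tokens * INPUT_RATE_PER_M
            + output_tokens * OUTPUT_RATE_PER_M) / 1_000_000

# Example: a full 2M-token prompt with a short 1,000-token answer.
print(f"${request_cost(2_000_000, 1_000):.4f}")  # → $0.5015
```

The asymmetry matters: at these rates a maxed-out input context costs about 50 cents, while output tokens are six times as expensive per token.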
Known Issues
- GEMINI 3.1 FLASH-LITE GA (2026-05-07, TODAY): Generally available on the Gemini Enterprise Agent Platform. Fastest + most cost-efficient Gemini 3 series model. **2.5x faster Time-to-First-Answer-Token vs Gemini 2.5 Flash; +45% output speed**. Pricing per third-party reference: $0.25/M input, $1.50/M output (vendor blog itself omits direct pricing -- check ai.google.dev/pricing for canonical). Customer signals at GA: Gladly reports ~60% lower cost vs thinking-tier models; OffDeal cites sub-second p95 for classifiers. **This is a pre-Google-I/O staging signal** -- I/O 2026 keynote is 2026-05-19 (T-12 from this entry). Expect more pre-keynote drops over the next two weeks. Source: Google Cloud blog (cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-lite-is-now-generally-available), blog.google · 2026-05-07
- Gemini 2.5 family retirement dates EXTENDED (ai.google.dev deprecations page, checked 2026-04-24): Gemini 2.5 Pro, 2.5 Flash, AND 2.5 Flash-Lite now all retire 2026-10-16 (pushed out from the original 2026-06-17 / 2026-07-22 dates). Gives ~6 more months to migrate to gemini-3.1-pro + gemini-3-flash. Production code still calling 2.5 model names continues to work through Oct 16, but do not ship new code on retiring endpoints. Source: ai.google.dev/gemini-api/docs/deprecations (verified 2026-04-24) · 2026-04-24
- Gemini 3.1 Flash TTS launched 2026-04-15 as a preview on Gemini API, AI Studio, Vertex AI, and Google Vids. 70+ languages, audio tags for vocal style/pace/delivery embedded in the text prompt, Elo 1,211 on the Artificial Analysis TTS leaderboard. Positions Google as a direct competitor to ElevenLabs v3 on the TTS stack. Source: blog.google Gemini 3.1 Flash TTS, MarkTechPost · 2026-04
- Image generation of people was temporarily disabled after generating historically inaccurate results; it has been partially restored but is still limited. Source: The Verge, Google Blog · 2026-01
- Gemini Pro model access was removed from the free API tier on April 1, 2026 -- mandatory spend caps and prepaid billing are now required for new accounts. Source: Google AI for Developers, FindSkill.ai · 2026-04
- Google AI Ultra at $249.99/mo is hard to justify against Claude Max ($200) and ChatGPT Pro ($200) unless you specifically need Lyria 3 Pro. Source: Reddit r/Bard · 2026-04
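Given the 2.5-family retirement note above, teams can put a simple guard in front of model selection so new code never ships on a retiring endpoint. A sketch under stated assumptions: the replacement mapping follows the migration guidance quoted above (gemini-3.1-pro and gemini-3-flash), and the exact model ID strings are assumed forms -- confirm both against ai.google.dev/gemini-api/docs/deprecations:

```python
from datetime import date

# Gemini 2.5 Pro / Flash / Flash-Lite all retire 2026-10-16 per
# the deprecations note above. Mapping targets are assumptions
# based on the migration guidance; verify on ai.google.dev.
RETIREMENT = date(2026, 10, 16)
MIGRATION = {
    "gemini-2.5-pro": "gemini-3.1-pro",
    "gemini-2.5-flash": "gemini-3-flash",
    "gemini-2.5-flash-lite": "gemini-3-flash",
}

def check_model(name: str, today: date) -> str:
    """Swap retiring model names for replacements; hard-fail after retirement."""
    if name not in MIGRATION:
        return name  # not a retiring endpoint, pass through
    if today >= RETIREMENT:
        raise ValueError(f"{name} retired {RETIREMENT}; use {MIGRATION[name]}")
    return MIGRATION[name]  # still works, but migrate proactively

print(check_model("gemini-2.5-flash", date(2026, 6, 1)))  # → gemini-3-flash
```

Returning the replacement (rather than the old name) before the cutoff matches the note's advice: existing calls keep working through Oct 16, but nothing new should target the 2.5 endpoints.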
Best for
Google Workspace power users. If you live in Gmail, Docs, and Drive, Gemini Advanced integrates directly into your workflow. Also great for developers who need the cheapest API with the longest context window.
Not for
Anyone who needs the best raw output quality. Claude and GPT-4 both write better. Also not for anyone spooked by Google's history of abandoning products.
Our Verdict
Gemini's strength is the ecosystem play. The 2M context window is genuinely useful for long documents, and the Google Workspace integration is something neither OpenAI nor Anthropic can match. But purely as an LLM, the output quality is a step behind Claude and GPT-4. Pick Gemini if you're deep in Google's ecosystem; otherwise, the other two are better standalone.
Sources
- Google AI for Developers: deprecations (accessed 2026-04-21)
- Google Blog: Gemini 3.1 Flash TTS (accessed 2026-04-21)
- Google Gemini official site (accessed 2026-04-21)
- LMSYS Chatbot Arena rankings (accessed 2026-04-13)
- Reddit r/Bard (accessed 2026-04-13)
Alternatives to Gemini (Google)
Claude (Anthropic)
Anthropic's flagship LLM -- Opus 4.7 (launched April 16, 2026) with 1M-token context, high-res vision, new xhigh reasoning level, and the most natural conversational style. Note: 2026-04-04 policy excluded third-party agent harnesses (OpenClaw etc.) from Pro/Max flat-rate, and 2026-04-16 Enterprise pricing dropped bundled tokens
Claude Mythos Preview
Anthropic's most capable model -- a gated research preview via Project Glasswing, cybersecurity-specialized. 73% success on expert CTF tasks, 32-step autonomous network attacks. Not generally available.
Grok
xAI's irreverent chatbot with a direct line to X/Twitter -- real-time data meets unfiltered personality. Grok 4.3 production launched 2026-05-02 with Custom Voices cloning + Imagine Agent Mode + ~40% API price cut to $1.25/$2.50 per 1M tokens
Muse Spark (Meta)
Meta's first model from its Superintelligence Lab -- natively multimodal with Contemplating mode for multi-agent reasoning
GPT-Rosalind (OpenAI)
OpenAI's first domain-specific model -- life sciences, drug discovery, translational medicine. Launched 2026-04-16 as a Trusted Access research preview. Launch partners: Amgen, Moderna, Allen Institute, Thermo Fisher. Paired with a Life Sciences Codex plugin (50+ scientific tool integrations)
GPT-5.4-Cyber (OpenAI)
OpenAI's defensive-cybersecurity variant of GPT-5.4, launched 2026-04-16. Lowered refusal boundary for security-research tasks and native binary reverse-engineering. Access gated via Trusted Access for Cyber (TAC) program -- thousands of verified defenders, hundreds of teams, no public pricing
Hunyuan 3 (Tencent Hy3)
Tencent's Hy3 Preview launched 2026-04-23 -- 295B total / 21B active MoE, 256K context, open-sourced on HuggingFace under tencent/Hy3-preview. Cheapest frontier-class API at ~1.2 RMB per million input tokens. Integrated into Yuanbao, WeChat, QQ
MiMo (Xiaomi)
Xiaomi's MiMo-V2.5 family launched 2026-04-22 -- Pro (1T total / 42B active MoE, 1M context, native vision+audio reasoning), Multimodal base, TTS (3 sub-models: base, VoiceDesign, VoiceClone), and ASR (open-source, English + Chinese + major dialects). Full voice pipeline for the agent era. Extra-charge 1M-context tier removed at launch