Gemini (Google)
A Tier · 8.3/10
Google's LLM with deep Google Workspace integration, 2M token context window, and native code execution -- Gemini 3.5 Flash GA 2026-05-19 (I/O 2026), Gemini 3.5 Pro rolling out June 2026, Gemini Spark agent + Managed Agents public preview in the Gemini API
Score Breakdown
Benchmark Scores
Benchmarks for Gemini 3.5 Flash (vendor-published 2026-05-19; third-party verification pending) -- legacy 3.1 Ultra retained below for context
| Benchmark | Description | Score | |
|---|---|---|---|
| Terminal-Bench 2.1 | 76.2% | ||
| MCP Atlas | 83.6% | ||
| CharXiv Reasoning (multimodal) | 84.2% | ||
| MMLU (3.1 Ultra baseline) | 90.5% | ||
| SWE-bench Verified (3.1 Ultra baseline) | 80.6% |
Last updated: 2026-05-19
Personality & Tone
The Google research assistant
Tone: Neutral, thorough, and slightly corporate. Gemini leans academic, cites sources readily in Deep Research mode, and keeps its tone even across topics -- rarely funny, rarely snarky.
Quirks: Tightly integrated with Google products -- pulls from Search and Workspace by default, which is useful for grounded answers but means you hear Google's worldview. Can feel evasive or overly safe on opinionated or politically charged questions.
The Good and the Bad
What we like
- +2 million token context window is the largest available -- can process entire books and full codebases in one prompt
- +Best Google Workspace integration (Gmail, Docs, Drive, Calendar)
- +Free tier is more generous than Claude's
- +Gemini Advanced includes 2TB Google One storage -- real added value
- +API pricing is very competitive, especially for Flash model
What could be better
- −Output quality for creative writing is the weakest of the big three (GPT-4, Claude, Gemini)
- −Hallucination rate is higher than Claude in our testing
- −Google's track record of killing products makes long-term commitment feel risky
- −The Gemini app UI feels like Google slapped AI onto an existing product
Pricing
Free
- ✓Gemini 3.5 Flash (GA 2026-05-19)
- ✓Basic features
- ✓Google integration
Google AI Pro
- ✓Gemini 3.1 Ultra (Gemini 3.5 Pro rolling out June 2026)
- ✓2M token context
- ✓Code Execution sandbox
- ✓2TB Google storage
- ✓Workspace integration
- ✓Lyria 3 access
Google AI Ultra
- ✓Gemini 3.1 Ultra (max usage)
- ✓Gemini 3.1 Flash Live audio
- ✓Gemini Spark agent access (US 18+, rolling out post-I/O 2026)
- ✓Lyria 3 Pro full access
- ✓Highest API priority
- ✓30TB Google storage
API
- ✓All models
- ✓2M context
- ✓Flash-Lite at $0.25/M input
- ✓Grounding with Google Search
- ✓Code Execution
- ✓Mandatory spend caps (April 2026)
Known Issues
- PRICE CUT (2026-06-09): **Google AI Plus dropped from $7.99 to $4.99/mo** and doubled storage 200GB → 400GB (US; tier includes Gemini Omni Flash video gen, Flow, and NotebookLM). TechCrunch framed it as 'a warning shot in the AI subscription price wars' -- it undercuts ChatGPT Go ($8/mo) by nearly half. Current Google AI plan ladder: Plus $4.99 / Pro $19.99 / Ultra 5x from $99.99 / Ultra 20x $199.99Source: blog.google (Google One subscriptions post), TechCrunch (2026-06-09), Engadget · 2026-06-09
- PARTNERSHIP WIN (2026-06-08, WWDC): Apple's rebuilt **Siri AI** runs on next-generation Apple Foundation Models that Apple says were 'custom-built in collaboration with Google and its Gemini models' -- press reports the deal at ~$1B/year for a ~1.2T-parameter custom Gemini variant. This puts Gemini-derived models behind the default assistant on qualifying iPhones, iPads, and Macs when iOS 27 / macOS 27 ship this fall. Separately, iOS 27's Extensions framework lets users select Gemini itself as the system assistant behind Siri, Writing Tools, and Image Playground -- distribution ChatGPT used to hold exclusively. Arguably Google's biggest AI distribution win to date; see the siri-ai page for the Apple-side detailSource: Apple newsroom (apple.com/newsroom/2026/06/apple-unveils-next-generation-of-apple-intelligence-siri-ai-and-more/), TechCrunch, CNBC, SiliconANGLE · 2026-06-08
- SHUTDOWN NOW IN EFFECT (2026-06-18, TODAY): As of today, **Gemini CLI and the Gemini Code Assist IDE extensions have stopped serving requests** for Google AI Pro & Ultra subscribers and free-tier Gemini Code Assist for individuals (per Google's vendor post: 'On June 18, 2026, Gemini CLI and Gemini Code Assist IDE extensions will stop serving requests for Google AI Pro and Ultra, as well as those using it free of charge'). 'Login with Google' auth for these consumer surfaces also stopped working, and there is no grace period -- any CI/CD or script calling `gemini` on a consumer plan breaks now. **Enterprise customers** (Gemini Code Assist Standard/Enterprise licenses, or Code Assist for GitHub via Google Cloud) are UNAFFECTED and keep access with ongoing model updates. Replacement: **Antigravity CLI** (available to everyone now) -- Google says 'there won't be 1:1 feature parity out of the gate' but it keeps the critical Gemini CLI features (Agent Skills, Hooks, Subagents, Extensions, now as Antigravity plugins). If you were scripting against Gemini CLI on a consumer plan, migrate immediately.Source: Google Developers Blog (developers.googleblog.com/en/an-important-update-transitioning-gemini-cli-to-antigravity-cli/), 9to5Google · 2026-06-18
- GEMINI API MODEL SHUTDOWNS (announced 2026-06-15, via the Gemini API changelog): a cluster of older media-generation model IDs are being retired. **Veo 2.0 + Veo 3.0 video models shut down 2026-06-30** (migrate to `veo-3.1-generate-preview` / `veo-3.1-fast-generate-preview` or the 3.1 GA models). **Nano Banana preview image IDs `gemini-3.1-flash-image-preview` + `gemini-3-pro-image-preview` shut down 2026-06-25** (migrate to the GA `gemini-3.1-flash-image` / `gemini-3-pro-image`, released 5/28). **Imagen 4.0 models (`imagen-4.0-generate-001`, `-ultra-`, `-fast-`) shut down 2026-08-17.** Only the model IDs change -- update integrations before each date to avoid service interruption. (Separately, the consumer Gemini CLI shutdown above is the 6/18 event.)Source: Gemini API changelog (ai.google.dev/gemini-api/docs/changelog, 2026-06-15 entries) · 2026-06-15
- I/O 2026 SHIP (2026-05-19): **GEMINI OMNI** announced -- Google's natively multimodal video-generation model, first variant **Gemini Omni Flash**. Generates video from image / audio / video / text input, supports conversational editing inside the Gemini app, physics-grounded outputs, SynthID watermarking. Availability: Gemini app for AI Plus / Pro / Ultra subscribers globally; YouTube Shorts + YouTube Create app at no extra cost; Developer API 'in the coming weeks'. Direct competitive shot at OpenAI's Sora-2 (Sora 1 retired 2026-04-26) + Runway Gen-4.5 + Pika + Luma. The differentiator is in-conversation editing inside Gemini rather than a separate video-gen app. **Aggregator-circulated 'Veo 4' name is NOT this product** -- DeepMind models page still lists Veo 3.1 as current; no Veo 4 exists. Omni is the video-gen ship that Veo's lineup didn't get at I/O 2026.Source: blog.google (blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/) · 2026-05-19
- I/O 2026 SHIP (2026-05-19): GEMINI 3.5 FLASH GA. Available immediately in Gemini app (global), AI Mode in Google Search, Google Antigravity platform, Gemini API via Google AI Studio + Android Studio, Gemini Enterprise Agent Platform, and Gemini Enterprise. Vendor-published benchmarks: Terminal-Bench 2.1 = 76.2%, GDPval-AA = 1656 Elo, MCP Atlas = 83.6%, CharXiv Reasoning (multimodal) = 84.2%, claimed 4x faster than other frontier models. Vendor framing: 'outperforming Gemini 3.1 Pro on challenging coding and agentic benchmarks' with richer interactive web UIs and graphics vs. Gemini 3. Pricing not disclosed in launch post -- check ai.google.dev/pricing for canonical.Source: blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/, ai.google.dev/gemini-api/docs/changelog (2026-05-19 entry releases gemini-3.5-flash GA) · 2026-05-19
- I/O 2026 UPCOMING (2026-05-19): GEMINI 3.5 PRO is in internal testing with rollout 'next month' (June 2026). No specific date or pricing yet. Existing AI Pro subscribers will likely receive 3.5 Pro as the in-tier flagship replacing 3.1 Ultra for new sessions when GA hits. Vendor language consistent: 'rollout planned for next month'.Source: blog.google Gemini 3.5 announcement post (2026-05-19) · 2026-05-19
- I/O 2026 SHIP (2026-05-19): MANAGED AGENTS IN THE GEMINI API now in public preview, including the general-purpose ANTIGRAVITY AGENT (model id: antigravity-preview-05-2026). New paradigm: stateful autonomous agents executing in Google-hosted sandbox environments via the Gemini API. Differentiates the Gemini API from a stateless completion endpoint -- competes structurally with OpenAI Responses API + Anthropic Managed Agents (Dreaming/Outcomes/Multiagent Orchestration shipped 2026-05-06).Source: ai.google.dev/gemini-api/docs/changelog (2026-05-19 entries) · 2026-05-19
- I/O 2026 UPCOMING (2026-05-19): GEMINI SPARK announced -- Google's 24/7 proactive agent that 'takes action on your behalf' and runs in the background 'even if your phone and laptop are turned off'. Rolling out post-I/O to Google AI Ultra subscribers (18+, US-only at launch) plus 'select business users'. Powered by Gemini 3.5 Flash + Antigravity stack. Integrates Gmail, Calendar, Drive, Docs, Sheets, Slides, YouTube, Maps. Capabilities: task tracking, scheduled automation, custom reusable skills, file/workspace org, email categorization. Operates under user direction with approval gates for sensitive actions -- not continuous monitoring by default. Direct competitor to Anthropic Orbit (Code with Claude 5/6 announcement) and Microsoft Copilot Cowork.Source: gemini.google/overview/agent/spark/ (vendor-primary), blog.google Sundar Pichai I/O 2026 keynote post · 2026-05-19
- GEMINI 3.1 FLASH-LITE GA (2026-05-07): Generally available on the Gemini Enterprise Agent Platform. Fastest + most cost-efficient Gemini 3 series model. **2.5x faster Time-to-First-Answer-Token vs Gemini 2.5 Flash; +45% output speed**. Pricing per third-party reference: $0.25/M input, $1.50/M output (vendor blog itself omits direct pricing -- check ai.google.dev/pricing for canonical). Customer signals at GA: Gladly reports ~60% lower cost vs thinking-tier models; OffDeal cites sub-second p95 for classifiers.Source: Google Cloud blog (cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-lite-is-now-generally-available), blog.google · 2026-05-07
- Gemini 2.5 family retirement dates EXTENDED (ai.google.dev deprecations page, checked 2026-04-24): Gemini 2.5 Pro, 2.5 Flash, AND 2.5 Flash-Lite now all retire 2026-10-16 (pushed out from original 2026-06-17 / 2026-07-22 dates). Gives ~6 more months to migrate to gemini-3.1-pro + gemini-3-flash. Production code still calling 2.5 model names continues to work through Oct 16, but do not ship new code on retiring endpointsSource: ai.google.dev/gemini-api/docs/deprecations (verified 2026-04-24) · 2026-04-24
- Gemini 3.1 Flash TTS launched 2026-04-15 as a preview on Gemini API, AI Studio, Vertex AI, and Google Vids. 70+ languages, audio tags for vocal style/pace/delivery embedded in the text prompt, Elo 1,211 on Artificial Analysis TTS leaderboard. Positions Google as a direct competitor to ElevenLabs v3 on the TTS stackSource: blog.google Gemini 3.1 Flash TTS, MarkTechPost · 2026-04
- Image generation of people was temporarily disabled after generating historically inaccurate results, partially restored but still limitedSource: The Verge, Google Blog · 2026-01
- Gemini Pro model access removed from free API tier on April 1, 2026 -- mandatory spend caps and prepaid billing now required for new accountsSource: Google AI for Developers, FindSkill.ai · 2026-04
- Google AI Ultra at $249.99/mo is hard to justify against Claude Max ($200) and ChatGPT Pro ($200) unless you specifically need Lyria 3 ProSource: Reddit r/Bard · 2026-04
Best for
Google Workspace power users. If you live in Gmail, Docs, and Drive, Gemini Advanced integrates directly into your workflow. Also great for developers who need the cheapest API with the longest context window.
Not for
Anyone who needs the best raw output quality. Claude and GPT-4 both write better. Also not for anyone spooked by Google's history of abandoning products.
Our Verdict
Gemini's strength is the ecosystem play. The 1M context window is genuinely useful for long documents, and the Google Workspace integration is something neither OpenAI nor Anthropic can match. But purely as an LLM, the output quality is a step behind Claude and GPT-4. Pick Gemini if you're deep in Google's ecosystem. Otherwise, the other two are better standalone.
Sources
- Google Developers Blog: Transitioning Gemini CLI to Antigravity CLI (shutdown took effect 2026-06-18) (accessed 2026-06-18)
- Gemini API changelog: Veo 2.0/3.0 (6/30), Nano Banana preview (6/25), Imagen 4.0 (8/17) shutdowns (accessed 2026-06-18)
- Google Blog: Gemini 3.5 frontier intelligence with action (2026-05-19) (accessed 2026-05-20)
- Gemini API Changelog: 3.5 Flash GA + Managed Agents + Antigravity preview (2026-05-19) (accessed 2026-05-20)
- Gemini Spark product page (vendor-primary) (accessed 2026-05-20)
- Google AI for Developers: deprecations (accessed 2026-04-21)
- Google Blog: Gemini 3.1 Flash TTS (accessed 2026-04-21)
- LMSYS Chatbot Arena rankings (accessed 2026-04-13)
- Reddit r/Bard (accessed 2026-04-13)
Explore more Gemini (Google) rankings
Deeper leaderboards, benchmarks, task-specific tier lists, and status/pricing pages for Gemini (Google).
The Tier List Tuesday
Weekly newsletter: tier movers, new entrants, and the VS of the week. Built from our daily AI-tool sweeps. No spam, unsubscribe anytime.
Alternatives to Gemini (Google)
Claude (Anthropic)
Anthropic's flagship LLM family. Claude Fable 5 (launched June 9, 2026) was the first publicly available Mythos-class model -- but on June 12, 2026 a US government export-control directive ordered access suspended, and Anthropic disabled Fable 5 + Mythos 5 for ALL customers to comply (every other Claude model is unaffected). Opus 4.8 (May 28) is the available flagship: $5/$25 per 1M, 1M-token context, effort control, and a cheap fast mode
Claude Mythos 5
Anthropic's unrestricted frontier model -- launched June 9, 2026 alongside Claude Fable 5 (the same model made safe for general use). ACCESS SUSPENDED June 12, 2026: a US government export-control directive forced Anthropic to disable both Mythos 5 and Fable 5 for all customers; all other Claude models are unaffected. Mythos 5 had been gated to ~150 Project Glasswing orgs and select biology researchers.
Grok
xAI's irreverent chatbot with a direct line to X/Twitter -- real-time data meets unfiltered personality. Grok 4.3 production launched 2026-05-02 with Custom Voices cloning + Imagine Agent Mode + ~40% API price cut to $1.25/$2.50 per 1M tokens
Muse Spark (Meta)
Meta's first model from its Superintelligence Lab -- natively multimodal with Contemplating mode for multi-agent reasoning
GPT-Rosalind (OpenAI)
OpenAI's first domain-specific model -- life sciences, drug discovery, translational medicine. Launched 2026-04-16 as a Trusted Access research preview. Launch partners: Amgen, Moderna, Allen Institute, Thermo Fisher. Paired with a Life Sciences Codex plugin (50+ scientific tool integrations)
GPT-5.4-Cyber (OpenAI)
OpenAI's defensive-cybersecurity variant of GPT-5.4, launched 2026-04-16. Lowered refusal boundary for security-research tasks and native binary reverse-engineering. Access gated via Trusted Access for Cyber (TAC) program -- thousands of verified defenders, hundreds of teams, no public pricing
Microsoft MAI-Thinking-1
Microsoft's first in-house reasoning model -- launched 2026-06-02 at Build as the flagship of seven new MAI models. 35B-active / ~1T-total sparse Mixture-of-Experts, 256K context. AIME 2025 97.0%, matches leading models on SWE-Bench Pro, and beat Claude Sonnet 4.6 in human-preference testing. Available on Microsoft Foundry + OpenRouter / Fireworks / Baseten
Hunyuan 3 (Tencent Hy3)
Tencent's Hy3 Preview launched 2026-04-23 -- 295B total / 21B active MoE, 256K context, open-sourced on HuggingFace under tencent/Hy3-preview. Cheapest frontier-class API at ~1.2 RMB per million input tokens. Integrated into Yuanbao, WeChat, QQ
MiMo (Xiaomi)
Xiaomi's MiMo-V2.5 family launched 2026-04-22 -- Pro (1T total / 42B active MoE, 1M context, native vision+audio reasoning), Multimodal base, TTS (3 sub-models: base, VoiceDesign, VoiceClone), and ASR (open-source, English + Chinese + major dialects). Full voice pipeline for the agent era. Extra-charge 1M-context tier removed at launch