Grok
B Tier · 7.5/10
xAI's irreverent chatbot with a direct line to X/Twitter -- real-time data meets unfiltered personality. Grok 4.3 reached production on 2026-05-02 with Custom Voices voice cloning, Imagine Agent Mode, and an API price cut (~40% on input, ~60% on output) to $1.25/$2.50 per 1M tokens.
Score Breakdown
Benchmark Scores
Benchmarks for Grok 4.20
| Benchmark | Description | Score |
|---|---|---|
| MMLU | Knowledge across 57 subjects | 88.5% |
| GPQA Diamond | Graduate-level science questions | 85% |
| HumanEval | Python code generation | 90% |
| Humanity's Last Exam | Frontier difficulty questions | 50.7% |
Last updated: 2026-04-13
Personality & Tone
The irreverent contrarian
Tone: Casual, jokey, and willing to swear. Grok takes strong positions without hedging, leans into an edgy 'based' persona, and cracks jokes far more often than Claude, ChatGPT, or Gemini.
Quirks: Engages with topics other chatbots refuse, pulls live context from X so it reflects whatever is trending that hour, and will freely mock things -- including itself. In SuperGrok's multi-agent mode it can sound like several personalities arguing with each other.
The Good and the Bad
What we like
- Real-time access to X/Twitter data is genuinely useful for tracking breaking news and trending topics
- Grok 3 benchmarks are competitive with GPT-4o and Claude 3.5 -- this is not a vanity project anymore
- The personality is refreshing if you're tired of overly cautious AI assistants -- it'll actually joke around
- DeepSearch mode does solid multi-step research, pulling from web and X data simultaneously
What could be better
- The snarky personality gets old fast when you're trying to get serious work done
- Tied to the X ecosystem -- you need an X account, and the real-time data skews toward X's user base
- SuperGrok at $30/mo is steep when Claude Pro and ChatGPT Plus are $20 with arguably better core models
- Image generation and analysis capabilities lag behind what you get from ChatGPT or Gemini
Pricing
Free
- ✓~10 prompts per 2 hours
- ✓Basic Grok access
- ✓Requires X account
X Premium
- ✓Higher query limits
- ✓Grok 4.20 access
- ✓Bundled X social features
X Premium+
- ✓Higher Grok 4.20 access
- ✓Ad-free X
- ✓Priority responses
SuperGrok
- ✓Full Grok 4.20 (4-agent multi-agent system)
- ✓DeepSearch mode
- ✓Highest rate limits
- ✓Think mode
- ✓$300/yr option (~17% off the $30/mo rate)
SuperGrok Heavy
- ✓Grok 4 Heavy model
- ✓Highest priority
- ✓Multi-agent at scale
- ✓Note: Grok 4.3 beta-gating ended 2026-05-02
API (Grok 4.3)
- ✓Production launch 2026-05-02 (~40% input / ~60% output price cut vs 4.20)
- ✓1M context window
- ✓Reasoning tokens billed at output rate
- ✓Native video input + PDF/PPT/spreadsheet output
- ✓Custom Voices voice cloning free on console (80+ presets, 28 languages)
- ✓Imagine Agent Mode (creative workflow agent, beta)
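To make the API pricing above concrete, here is a minimal cost-estimation sketch. It uses only the rates quoted in this review ($1.25 input / $2.50 output per 1M tokens, with reasoning tokens billed at the output rate). The function name and the example token counts are hypothetical, and the sketch assumes simple linear billing with no minimums, caching discounts, or rounding rules, none of which are confirmed here.

```python
# Rates quoted in this review for the Grok 4.3 API (USD per 1M tokens).
INPUT_USD_PER_M = 1.25
OUTPUT_USD_PER_M = 2.50


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      reasoning_tokens: int = 0) -> float:
    """Estimate the cost of one request (hypothetical helper).

    Assumes linear per-token billing; reasoning tokens are charged
    at the output rate, as stated in the pricing list above.
    """
    billable_output = output_tokens + reasoning_tokens
    total = input_tokens * INPUT_USD_PER_M + billable_output * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative request: 200k input, 10k visible output, 30k reasoning tokens.
    print(f"${estimate_cost_usd(200_000, 10_000, 30_000):.2f}")
```

Because reasoning tokens count as output, a reasoning-heavy request can cost noticeably more than its visible answer length suggests.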
Known Issues
- PRODUCT (2026-05-14): xAI launched **Grok Build CLI** in early beta -- an agentic, terminal-native CLI for coding, app development, and workflow automation. It spawns up to **8 concurrent agents** in parallel and is powered by Grok 4.3 beta with a 16-agent Heavy architecture and a **2M token context window**. Vendor-primary launch posts are at x.ai/news/grok-build-cli and x.ai/cli, plus Musk's public invitation to wider beta testers on X. **Access gate**: launched first to the SuperGrok Heavy tier ($299/mo, intro offer $99/mo for 6 months) -- not yet available to standard Premium / SuperGrok subscribers. This positions Grok as a direct competitor to Claude Code, Codex CLI, and Cursor CLI for terminal-first agentic coding workflows. The 8-agent parallelism plus 2M context is the differentiating feature -- the longest context window of any production coding CLI as of 2026-05-14. Source: xAI news (x.ai/news/grok-build-cli), xAI product page (x.ai/cli), Musk on X · 2026-05-14
- xAI joined SpaceX on 2026-02-02 -- SpaceX acquired xAI. Procurement, billing, and compliance workflows now route through SpaceX's vendor pipeline. For regulated industries (healthcare, finance, US government) this may require re-qualifying xAI as a vendor even if Grok itself was previously approved. Source: xAI announcement (x.ai/news/xai-joins-spacex), SpaceX updates · 2026-02
- Grok Speech (STT + TTS) APIs launched 2026-04-17 as separate products from the chatbot -- see /tools/grok-voice on this site. Built on the same stack Grok Voice uses. Not included in Premium/SuperGrok consumer tiers; billed separately at $0.10/hr STT batch and $4.20/1M char TTS. Source: xAI Grok STT/TTS announcement · 2026-04
- Real-time X data can surface misinformation from viral posts without adequate fact-checking. Source: Reddit r/artificial · 2026-02
- Free tier rate limits are aggressive -- many users report hitting caps within a few queries. Source: X/Twitter user reports · 2026-03
- Grok 4.20's 4-agent system (Grok, Harper, Benjamin, Lucas) can take 30+ seconds on complex queries while the agents debate internally. Grok 4.20 Beta 2 (landed ~2026-04-07) improved instruction-following, LaTeX, and image search and reduced hallucinations, which partially addresses the slowness and reliability complaints from early 4.20 feedback. Source: Reddit r/grok, IBTimes · 2026-04
- PRODUCTION LAUNCH (2026-05-02): Grok 4.3 went broadly available beyond the SuperGrok Heavy beta. New consumer + API features: **Custom Voices voice cloning suite** (clones a voice from ~1 minute of speech in <2 minutes, with a two-stage passphrase + speaker-embedding consent gate, 80+ preset voices, 28 languages, free on console); **Imagine Agent Mode** (creative production workflow agent, beta); native video input + reasoning-by-default; native PDF / PowerPoint / spreadsheet output. **API pricing: $1.25 input / $2.50 output per 1M tokens** -- a ~40% input cut and ~60% output cut vs Grok 4.20. 1M context window. Reasoning tokens billed at output rate. xAI tends to ship silently via the grok.com model selector and console UI rather than a vendor blog post, so vendor-primary verification here comes from grok.com itself plus 4+ tier-1 press sources (VentureBeat, Winbuzzer, The Decoder, Phemex). Source: VentureBeat (venturebeat.com/technology/xai-launches-grok-4-3-at-an-aggressively-low-price-and-a-new-fast-powerful-voice-cloning-suite), Winbuzzer 2026-05-03, The Decoder, grok.com console · 2026-05-02
- Grok 4.3 Beta dropped 2026-04-17 as a SuperGrok Heavy exclusive ($300/mo tier). Elon Musk clarified on 2026-04-18 that the live checkpoint was ~0.5T params and that the full 1T version was ~5 days from finishing training. Beta gating ended 2026-05-02 with the broader rollout (see the production launch entry above). Source: PiunikaWeb, BuildFastWithAI, xAI release notes, Musk posts on X (2026-04-18) · 2026-04
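The separately billed Grok Speech rates listed in the issues above ($0.10/hr for batch STT, $4.20 per 1M characters for TTS) are easy to misjudge at scale, so here is a rough budgeting sketch. The helper names are hypothetical, and the code assumes strictly proportional billing with no per-request minimums or rounding, which this review does not confirm.

```python
# Rates quoted in this review for the Grok Speech APIs.
STT_BATCH_USD_PER_HOUR = 0.10    # batch speech-to-text, per hour of audio
TTS_USD_PER_M_CHARS = 4.20       # text-to-speech, per 1M characters


def stt_batch_cost_usd(audio_seconds: float) -> float:
    """Estimated batch transcription cost (assumes proportional billing)."""
    return audio_seconds / 3600 * STT_BATCH_USD_PER_HOUR


def tts_cost_usd(characters: int) -> float:
    """Estimated synthesis cost (assumes proportional billing)."""
    return characters / 1_000_000 * TTS_USD_PER_M_CHARS


if __name__ == "__main__":
    # Illustrative workload: 50 hours of audio to transcribe, 2M characters to speak.
    print(f"STT: ${stt_batch_cost_usd(50 * 3600):.2f}")
    print(f"TTS: ${tts_cost_usd(2_000_000):.2f}")
```

Under these assumptions, synthesis dominates the bill for most mixed workloads, since an hour of transcription costs about as much as roughly 24,000 characters of TTS.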
Best for
People who live on X/Twitter and want an AI that can tap into that data in real-time. Also good for users who find mainstream chatbots too sanitized and want something with more personality.
Not for
Enterprise users who need reliable, consistent outputs. Also not the best pick if you don't use X -- the real-time data advantage disappears and you're left with a solid-but-not-best-in-class LLM.
Our Verdict
Grok has come a long way from being dismissed as Elon's pet project. The Grok 3 models are legitimately competitive, and the real-time X integration is a unique differentiator that no other chatbot can match. But the value proposition gets muddier when you strip away the X angle -- at $30/mo for SuperGrok, you're paying a premium for personality and Twitter data. If those matter to you, Grok is great. If not, Claude or ChatGPT give you more for less.
Sources
- VentureBeat: xAI launches Grok 4.3 with voice cloning (2026-05-02) (accessed 2026-05-05)
- Winbuzzer: xAI Grok 4.3 + Custom Voices (2026-05-03) (accessed 2026-05-05)
- xAI official site (accessed 2026-04-17)
- xAI Grok 4.20 announcement (accessed 2026-04-17)
- IBTimes: Grok 4.20 Beta 2 April 2026 (accessed 2026-04-17)
- BuildFastWithAI: Grok 4.3 Beta 2026-04-17 (accessed 2026-04-17)
- Artificial Analysis: Grok 4.20 (accessed 2026-04-17)
- Reddit r/grok, r/artificial (accessed 2026-04-17)
Alternatives to Grok
Claude (Anthropic)
Anthropic's flagship LLM -- Opus 4.7 (launched April 16, 2026) with 1M-token context, high-res vision, new xhigh reasoning level, and the most natural conversational style. Note: 2026-04-04 policy excluded third-party agent harnesses (OpenClaw etc.) from Pro/Max flat-rate, and 2026-04-16 Enterprise pricing dropped bundled tokens
Claude Mythos Preview
Anthropic's most capable model -- a gated research preview via Project Glasswing, cybersecurity-specialized. 73% success on expert CTF tasks, 32-step autonomous network attacks. Not generally available.
Gemini (Google)
Google's LLM with deep Google Workspace integration, 2M token context window, and native code execution
Muse Spark (Meta)
Meta's first model from its Superintelligence Lab -- natively multimodal with Contemplating mode for multi-agent reasoning
GPT-Rosalind (OpenAI)
OpenAI's first domain-specific model -- life sciences, drug discovery, translational medicine. Launched 2026-04-16 as a Trusted Access research preview. Launch partners: Amgen, Moderna, Allen Institute, Thermo Fisher. Paired with a Life Sciences Codex plugin (50+ scientific tool integrations)
GPT-5.4-Cyber (OpenAI)
OpenAI's defensive-cybersecurity variant of GPT-5.4, launched 2026-04-16. Lowered refusal boundary for security-research tasks and native binary reverse-engineering. Access gated via Trusted Access for Cyber (TAC) program -- thousands of verified defenders, hundreds of teams, no public pricing
Hunyuan 3 (Tencent Hy3)
Tencent's Hy3 Preview launched 2026-04-23 -- 295B total / 21B active MoE, 256K context, open-sourced on HuggingFace under tencent/Hy3-preview. Cheapest frontier-class API at ~1.2 RMB per million input tokens. Integrated into Yuanbao, WeChat, QQ
MiMo (Xiaomi)
Xiaomi's MiMo-V2.5 family launched 2026-04-22 -- Pro (1T total / 42B active MoE, 1M context, native vision+audio reasoning), Multimodal base, TTS (3 sub-models: base, VoiceDesign, VoiceClone), and ASR (open-source, English + Chinese + major dialects). Full voice pipeline for the agent era. Extra-charge 1M-context tier removed at launch