Codex (OpenAI)

A Tier · 8.3/10

OpenAI's coding agent -- runs parallel tasks, opens PRs, and as of 2026-07-09 is merged into the unified ChatGPT desktop app (Chat + Work + Codex, on every plan including Free), with GPT-5.6 across the lineup

Last updated: 2026-07-09 · Free tier available · Powered by GPT-5.6 Sol/Terra/Luna (GA 2026-07-09; Terra default on Free/Go) / GPT-5.2-Codex / GPT-5.5

Score Breakdown

Ease of Use	8.0
Output Quality	8.0
Value	8.0
Features	9.0

Benchmark Scores

Benchmarks for GPT-5.2-Codex (launched 2026-04-23 -- SOTA on SWE-Bench Pro and Terminal-Bench 2.0; first-party scores below pending detailed third-party verification)

Benchmark	Description	Score
SWE-bench	Real GitHub issue fixing	72%
HumanEval	Python code generation	95%

Last updated: 2026-04-25

The Good and the Bad

What we like

+ Lives inside ChatGPT -- if you already pay for Plus ($20/mo), Codex is included at no extra cost
+ Parallel task execution is a real differentiator -- assign 5 tasks at once and come back when they're done
+ Code review catches bugs and suggests improvements before you merge -- genuinely useful, not just a gimmick
+ A sandboxed environment per task means it can't break your local setup -- tests run safely in the cloud
+ GitHub integration lets it propose PRs directly, read your repo, and work on real issues end-to-end
+ CLI, web, and IDE extension give you three ways to interact, depending on your workflow

What could be better

− Usage limits run out fast -- 20-100 messages per 5 hours on Plus means heavy users hit the wall mid-task
− Can't be corrected mid-task -- once you send a prompt, you wait for the full result with no way to steer
− Struggles with complex refactors and architectural decisions -- great at straightforward tasks, mediocre on nuanced ones
− Cloud-based GitHub integration is unintuitive to set up -- many users find the workflow confusing
− No image input yet -- you can't show it a screenshot of a UI bug and ask for a fix
− Latency can spike to 3+ minutes per response during peak hours

Pricing

Free

✓ Basic Codex access
✓ Quick coding tasks only
✓ Explore capabilities

Go

$8/month

✓ Lightweight coding tasks
✓ Codex CLI access

Plus

$20/month

✓ Codex web + CLI + IDE extension
✓ GPT-5.4 + GPT-5.3-Codex
✓ 20-100 local messages per 5h
✓ Slack integration
✓ Cloud code review

Pro

$100/month

✓ 10-20x higher rate limits
✓ GPT-5.3-Codex-Spark (research preview)
✓ Up to 2,000 messages per 5h
✓ Priority processing

Business

Pay as you go, per seat

✓ 30-150 local messages per 5h
✓ 10-60 cloud tasks per 5h
✓ 20-50 code reviews per 5h
✓ Admin controls
✓ Larger VMs

Known Issues

APP MERGER + GPT-5.6 (2026-07-09): The **standalone Codex app is merging into the new unified ChatGPT desktop app** (Mac/Windows) -- update the Codex app and it becomes the ChatGPT app with Chat, Work, and Codex surfaces; developers can set Codex as the default view and keep the Codex icon; desktop Codex projects are accessible from the ChatGPT mobile app; the old ChatGPT desktop app is renamed 'ChatGPT Classic.' Codex itself gains: **inline editing within diffs, PR review in the side panel, faster computer use (powered by GPT-5.6), and multi-repo projects**. Model lineup: **GPT-5.6 GA in Codex same day** -- Free/Go get Terra; Plus and above can pick Sol/Terra/Luna with per-model effort; `max` can be toggled by all GPT-5.6 users; **`ultra` (4 parallel agents) available from Plus up in Codex**. Scale disclosure: 5M+ weekly Codex users, 1M+ using it for non-development work. Net read: Codex stops being a separate app and becomes the engine inside ChatGPT's agent stack (ChatGPT Work is built on Codex technology -- see /tools/chatgpt-work). Source: OpenAI (openai.com/index/chatgpt-for-your-most-ambitious-work/), OpenAI (openai.com/index/gpt-5-6/) · 2026-07-09
FEATURE CLUSTER (May-June 2026, all vendor changelog): **5/14 Codex in the ChatGPT mobile app** (iOS/Android) -- monitor and drive Codex sessions from your phone by connecting to a Mac running the Codex app (remote control, not standalone mobile execution). **5/21 Goal mode GA** -- out of experimental, available in Codex app + IDE extension + CLI. **5/29 Computer Use on Windows** + remote control of Windows devices. **6/1 Amazon Bedrock support** -- Codex can use supported OpenAI models through Bedrock. **6/2 Sites preview** -- create AND deploy websites/web apps to OpenAI-hosted infrastructure from inside Codex. The Sites ship is the notable one: Codex now competes directly with Lovable/Bolt/v0 on the build-and-host loop, not just the code-generation step. Source: OpenAI Codex changelog (developers.openai.com/codex/changelog) · 2026-06-02
GPT-5.2-Codex shipped 2026-04-23 as a coding-specialized variant separate from the consumer GPT-5.5 launch. Available to all paid ChatGPT users across Codex web/CLI/IDE surfaces at launch, with API access promised in the following weeks. Posts SOTA on SWE-Bench Pro and Terminal-Bench 2.0. Improvements: long-horizon agentic coding via context compaction, large refactors and migrations, Windows environment performance, and cybersecurity. Direct upgrade over GPT-5.3-Codex for serious agentic work -- if you're on Plus or Pro, your Codex defaults are already on the new model. Source: OpenAI: Introducing GPT-5.2-Codex (openai.com/index/introducing-gpt-5-2-codex/), OpenAI Codex changelog · 2026-04-23
Codex Chronicle launched 2026-04-20/21 as an opt-in research preview for ChatGPT Pro on macOS only (NOT available in EU/UK/Switzerland). It captures screen content and builds persistent memories so Codex understands what you're working on without you restating context. Privacy details: screenshots are stored locally in $TMPDIR/chronicle/screen_recording/ and auto-deleted after 6 hours; generated memories live unencrypted as markdown at ~/.codex/memories_extensions/chronicle/; OpenAI servers don't retain processed screenshots and don't train on them. OpenAI explicitly flags 'increased prompt-injection attack surface from screen content' -- pause Chronicle before meetings or sensitive material. Currently consumes rate limits aggressively. Closest comparison is Microsoft Recall, but with stronger local-storage guarantees. Source: OpenAI Chronicle docs (developers.openai.com/codex/memories/chronicle), Help Net Security, 9to5Mac · 2026-04
Security vulnerability: a crafted branch parameter allowed shell command injection during environment setup -- fixed by OpenAI with improved input validation. Source: BeyondTrust Phantom Labs, TechRadar · 2026-03
CLI was macOS-only at launch, frustrating Windows and Linux users -- broader platform support now rolling out. Source: Reddit r/openai, GitHub issues · 2026-04
Code written for complex tasks often needs significant human review before merging -- according to developer feedback, Codex is better at code review than code writing. Source: Hacker News, Reddit r/programming · 2026-04
2026-04-16 Codex 'super app' update is substantially bigger than the initial Mac-app control headline suggested. Full feature set per OpenAI: (1) macOS computer-use agent that opens apps, clicks, and types with its own cursor in background while you use your machine, (2) image generation via gpt-image-1.5 INSIDE Codex, (3) persistent memory + user preferences across sessions, (4) in-app browser built on the Atlas browser stack, (5) 90+ new plugins combining skills, app integrations, and MCP servers. OpenAI also disclosed 3M weekly Codex users with 70% month-over-month growth. Windows / Linux computer-use support still pending. Not available in EEA, UK, or Switzerland. Source: BigGo Finance, gHacks, Blockchain News, OpenAI release notes · 2026-04
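
The branch-parameter injection listed above is a general class of bug, and the usual defense is easy to illustrate. What follows is a minimal Python sketch, not OpenAI's actual fix: it assumes a hypothetical setup step that checks out a user-supplied branch. The names `SAFE_BRANCH`, `is_safe_branch`, `checkout_argv`, and `checkout` are invented for illustration, and the allowlist is deliberately stricter than git's real ref-name rules.

```python
import re
import subprocess

# Conservative allowlist (an assumption for illustration; real git ref rules are broader).
# The first character must be alphanumeric, so a value can never look like an option flag.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9][A-Za-z0-9._/-]{0,254}")


def is_safe_branch(name: str) -> bool:
    """Return True only for branch names made entirely of allowlisted characters."""
    if not SAFE_BRANCH.fullmatch(name):
        return False
    # Reject sequences that git itself disallows in ref names.
    return ".." not in name and "//" not in name and not name.endswith(("/", ".lock"))


def checkout_argv(branch: str) -> list:
    """Build an argument list for `git checkout`; raise on suspicious input."""
    if not is_safe_branch(branch):
        raise ValueError(f"unsafe branch name: {branch!r}")
    return ["git", "checkout", branch]


def checkout(branch: str, repo_dir: str) -> None:
    """Run the checkout without a shell, so metacharacters are never interpreted."""
    subprocess.run(checkout_argv(branch), cwd=repo_dir, check=True)
```

The two layers are independent: the allowlist rejects values such as `main; rm -rf /` outright, and passing an argument list to `subprocess.run` (rather than a shell string) means that even an unexpected value would reach git as a single literal argument.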

Best for

Developers already paying for ChatGPT Plus who want a coding agent at no extra cost. Especially good for parallel task execution -- assign multiple bug fixes or feature branches and let Codex work them simultaneously.

Not for

Developers who need fine-grained control mid-task (use Claude Code or Cursor instead). Also not ideal for complex architectural refactors where the AI needs human guidance throughout the process.

Our Verdict

Codex is OpenAI's answer to Claude Code and Devin, and it has one killer advantage: it's bundled with ChatGPT Plus. If you're already paying $20/mo for ChatGPT, you get a cloud coding agent at no extra cost. Parallel task execution is the standout feature -- fire off 5 tasks and check back later. But the rough edges are real: you can't steer it mid-task, complex refactors fall flat, and the usage limits feel tight. For straightforward coding tasks and code review, it's excellent. For anything nuanced, Claude Code's interactive approach is still better.

Sources

OpenAI: ChatGPT Work + desktop app merger (2026-07-09) (accessed 2026-07-09)
OpenAI: GPT-5.6 GA (Codex availability matrix) (accessed 2026-07-09)
OpenAI: Introducing GPT-5.2-Codex (2026-04-23) (accessed 2026-04-25)
OpenAI Codex changelog (accessed 2026-04-25)
OpenAI Chronicle docs (Apr 2026) (accessed 2026-04-22)
Help Net Security: Chronicle screen-context memories (accessed 2026-04-22)
OpenAI official Codex page (accessed 2026-04-17)
developers.openai.com/codex/pricing (accessed 2026-04-17)
VentureBeat: Codex Mac-app control + GPT-Rosalind launch 2026-04-16 (accessed 2026-04-17)
Reddit r/openai, r/programming (accessed 2026-04-17)

Alternatives to Codex (OpenAI)

GitHub Copilot

AI code assistant that lives in your editor -- autocomplete on steroids. Usage-based billing went LIVE 2026-06-01: AI Credits + token metering across all plans, code completions still free. New Copilot Max tier added the same day. New signups for Student/Pro/Pro+/Max remain PAUSED. As of 2026-06-02 (Microsoft Build), Microsoft's own MAI-Code-1-Flash is rolling into the VS Code model picker

8.3/10

Free tier · From $0

Inline code completions feel magical -- ... · Works directly in VS Code, JetBrains, Ne...

Updated 2026-07-09

Cursor

AI-native code editor, agent-first in Cursor 3 -- and now home to Grok 4.5 (launched 2026-07-08), the frontier MoE model Cursor trained jointly with SpaceXAI on trillions of Cursor tokens ($2/$6 per 1M, all plans, desktop/web/iOS/CLI/SDK), with Composer 2.5 as the fast lower-cost tier

8.3/10

Free tier · From $0

Cursor 3's agent-first redesign (April 2... · Composer 2 is Cursor's own frontier codi...

Updated 2026-07-10

Devin Desktop (formerly Windsurf)

Windsurf is now **Devin Desktop** -- Cognition retired the Windsurf brand via OTA update on June 2, 2026. Same editor, plans, pricing, settings, and extensions; the bundled agent is now 'Devin Local' and Devin Cloud agent access starts on the $20 Pro plan. Agent Command Center, Spaces, and Devin Review all carry over

7.5/10

Free tier · From $0

Windsurf 2.0 (launched 2026-04-15) is a ... · Embedded Devin cloud agent (via Cognitio...

Updated 2026-07-09

Tabnine

AI code completion that runs locally and keeps your code private -- the enterprise-friendly alternative to Copilot

6.3/10

Free tier · From $0

Privacy-first approach -- code never lea... · Works as a plugin in any major IDE (VS C...

Updated 2026-03-27

Claude Code

Anthropic's terminal-based coding agent that reads your whole repo and makes real changes -- not just suggestions. v2.1.131 (2026-05-06 Code with Claude conf) shipped Code Review GA + Remote Agents + CI Auto-Fix + Routines, plus 2x rate-limit increase from the SpaceX compute deal

7.8/10

From $20

Reads and understands your entire codeba... · Actually executes code, runs tests, and ...

Updated 2026-07-10

Lovable

Describe the app you want in plain English and watch it build itself -- $500M ARR (June 2026) and ~1M new projects/week say it works

7.8/10

Free tier · From $0

The ease of use is unmatched -- describe... · Built-in Supabase integration means you ...

Updated 2026-07-04

Devin

The most autonomous AI coding agent -- now a full product family: Devin Cloud, **Devin Desktop (the renamed Windsurf IDE, June 2 2026)**, and Devin Review. Cognition raised $1B+ at a $26B valuation (May 27). Recent shipments: Claude Fable 5 support day-one (6/9), Auto-Triage (5/18), Windows VMs (5/21), Android Emulator (5/13)

7.4/10

From $20

Genuine autonomy -- you can describe a t... · Desktop / GUI testing via computer-use (...

Updated 2026-06-10

Replit

Cloud IDE with an AI agent that can build full apps from prompts. **Agent 4 shipped May 2026** with parallel task execution (Replit reports automatic merge-conflict resolution ~90% of the time) -- coding optional, but recommended

7.0/10

Free tier · From $0

Zero setup -- open a browser, describe y... · Full development environment in the clou...

Updated 2026-06-10

Google Antigravity

Google's agent-first AI IDE -- deploys up to 5 autonomous coding agents in parallel on a VS Code fork. Antigravity 2.0 (I/O 2026) is the runtime substrate for Gemini Spark, and the Antigravity CLI is now the official successor to Gemini CLI, which stopped serving consumer tiers on 2026-06-18

8.0/10

Free tier · From $0

Up to 5 autonomous agents working in par... · Mission Control Manager View lets you di...

Updated 2026-06-18

Codestral 2 (Mistral)

Mistral's dedicated code model -- Codestral 2 (launched 2026-04-08) relicensed under Apache 2.0, removing the commercial-use restrictions of the original. 22B dense, strong FIM (fill-in-middle), available via Mistral API + Hugging Face

7.5/10

Free tier · From $0

Relicensing to Apache 2.0 is the real ne... · FIM (fill-in-middle) performance is clas...

Updated 2026-04-18

Roblox Assistant

Roblox Studio's agentic AI that plans, builds, and playtests games. Planning Mode (2026-04-16) + Mesh Generation + Procedural Models brings 3D-native creation to 70M+ daily creators

8.0/10

Free tier · From $0

Planning Mode (2026-04-16) turns Assista... · Agentic loop is genuinely real: Assistan...

Updated 2026-05-26