
Codex — OpenAI's coding agent family

Three shipping surfaces on one backend. Codex CLI (Rust, terminal), ChatGPT Codex (agent inside ChatGPT), Codex Cloud (remote parallel tasks in sandboxed VMs). All backed by gpt-5-codex via the Responses API. Not the retired 2021 Codex model — this is the modern agent-first product line.

gpt-5-codex default
CLI + ChatGPT + Cloud
MCP client + server
AGENTS.md aware
Sandbox by default
ChatGPT-backed API is not a public surface

The three surfaces

one backend, three harnesses

```
[Codex CLI]           [ChatGPT Codex]           [Codex Cloud]
Rust · terminal       web · desktop · mobile    remote · parallel
open source           ChatGPT plan required     async · sandboxed VM
      └───────────────────────┴───────────────────────┘
                       shared auth
        ChatGPT OAuth (default) or OPENAI_API_KEY
                           │
               Responses API /v1/responses
             gpt-5-codex (agent-tuned GPT-5)
                           │
                       tool surface
           shell exec · apply_patch · file edit
      MCP client + server · web search · browser
           AGENTS.md-guided · approval gate
```

At a glance

| Key | Value |
| --- | --- |
| Default model | gpt-5-codex |
| CLI install | npm install -g @openai/codex or brew install codex |
| User config | ~/.codex/config.toml |
| Session store | ~/.codex/sessions/ |
| Auth (default) | ChatGPT OAuth → ~/.codex/auth.json |
| Auth (scripted) | OPENAI_API_KEY |
| Repo primer file | AGENTS.md |
| Docs | developers.openai.com/codex |
| Source (CLI) | openai/codex |
Phase 01 · Mental model

Three surfaces · one backend

Pick the surface based on where the work fits, not on what Codex can do — all three run the same model.

Codex CLI

Terminal-native. Local repo. Approval-gated shell + file edit. Rust binary; open source. Best for inline dev work, scripting, CI.

ChatGPT Codex

Inside ChatGPT on web / desktop / mobile. Each task runs in an attached container. Codex edits files in the sandbox, opens PRs on GitHub. Best for tasks you want to launch and monitor from a browser.

Codex Cloud

Remote agent. Async. Fires off in a sandboxed VM. Good for 15-minute-plus work you don't want to babysit — overnight runs, parallel branch experiments.

When to reach for which

| Use case | Surface |
| --- | --- |
| Quick edit while you're in the terminal anyway | CLI |
| Scripted / CI / automation | CLI with codex exec + --dangerously-bypass-approvals-and-sandbox |
| Browsing issues, want to fire off 5 fixes in parallel | ChatGPT Codex |
| Phone or tablet; you want to kick off work away from your laptop | ChatGPT Codex mobile |
| "Do this thing, take as long as needed, open a PR when done" | Codex Cloud |
| Long-running multi-branch experiment | Codex Cloud with per-repo setup script |
Same auth, same plan
All three surfaces share your ChatGPT OAuth (default) or API key. ChatGPT-plan users get usage included. API-key users are billed per-token. One login covers the lot.
Phase 02 · Setup

Install & first run

Codex CLI installs in seconds; ChatGPT Codex needs nothing beyond a ChatGPT plan.

Codex CLI

```shell
# install
npm install -g @openai/codex    # or: brew install codex

# first launch — opens interactive TUI
codex

# one-shot
codex exec "summarise this repo's architecture"

# resume last session
codex resume --last

# pin a model
codex --model gpt-5-codex
```

ChatGPT Codex

  1. Sign in to ChatGPT with a Plus, Pro, Business, or Enterprise plan.
  2. Open a chat, select the Codex tool from the composer.
  3. Connect a GitHub repo via OAuth (first time only).
  4. Fire off a task; Codex picks up the repo, runs in-sandbox, reports back.

Codex Cloud

No separate install — reached from ChatGPT Codex by choosing "run remotely" when starting a task, or from the API. Per-repo setup script + environment variables get configured once in the ChatGPT Codex settings.

Platforms

  • macOS — first-class. Seatbelt sandbox isolates shell calls.
  • Linux — first-class. Landlock + seccomp sandbox.
  • Windows — WSL2 recommended; native Windows support is improving.
  • Docker — pair with --dangerously-bypass-approvals-and-sandbox for CI pipelines.
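The Docker/CI pairing above can be sketched as a guarded step. The skip-when-unconfigured checks and the task prompt are this sketch's own conventions, not Codex behaviour:

```shell
# Hypothetical CI step: run Codex non-interactively in a throwaway container.
# Skips cleanly when the environment isn't set up, so the pipeline stays green.
set -eu

run_codex_step() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "OPENAI_API_KEY not set; skipping Codex step"
    return 0
  fi
  if ! command -v codex >/dev/null 2>&1; then
    echo "codex binary not on PATH; skipping Codex step"
    return 0
  fi
  codex exec \
    --dangerously-bypass-approvals-and-sandbox \
    "run the test suite and fix any trivial lint failures"
}

run_codex_step
```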
Phase 03 · Terminal

Codex CLI

Rust binary. Interactive TUI + non-interactive exec mode. Sandboxed by default; approval-gated.

Commands

| Command | Effect |
| --- | --- |
| codex | Interactive TUI in the current dir |
| codex exec "<prompt>" | Non-interactive one-shot — prompt → tool calls → response → exit |
| codex resume | Interactive session picker |
| codex resume --last | Resume the most recent session |
| codex login / logout | Swap ChatGPT auth on/off (falls back to OPENAI_API_KEY) |
| codex whoami | Show the active auth method |
| codex mcp | Run Codex itself as an MCP server — other agents can call it |
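codex exec composes with ordinary shell loops for batch work. A sketch, where the repo names and the skip-if-missing behaviour are illustrative:

```shell
# Run a one-shot Codex summary in each repo checkout. Directory names are
# placeholders; skipping missing dirs (and a missing binary) is this
# sketch's own convention.
summarise_repos() {
  for repo in "$@"; do
    if [ -d "$repo" ] && command -v codex >/dev/null 2>&1; then
      (cd "$repo" && codex exec "summarise recent changes") > "$repo.summary.txt"
    else
      echo "skip: $repo"
    fi
  done
}

summarise_repos api web worker
```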

Flags

| Flag | Effect |
| --- | --- |
| --model <name> | Pin model (gpt-5-codex, gpt-5, gpt-4o) |
| -p <profile> | Load profile from config.toml |
| --cwd <path> | Override working directory |
| -a, --ask-for-approval <mode> | Set approval mode at launch |
| --dangerously-bypass-approvals-and-sandbox | YOLO mode. For sandboxed external contexts only. |

Approval modes

read-only (default)Every shell command + file write needs approval. Safest for unfamiliar repos.
autoAuto-approve edits inside the working directory. Shell commands still prompt.
full-accessAuto-approve everything — but sandbox still active. Network-access still gated separately.
danger-full-accessBypass approval + sandbox. For CI / Docker / intentional YOLO.
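One way to keep the mode choice explicit in scripts is a tiny dry-run mapper. The trust labels below are this sketch's own convention; the flag and mode names come from the tables above:

```shell
# Map a trust label to an approval-mode invocation. Prints the command
# instead of running it (dry run); the labels are made up for illustration.
codex_cmd_for() {
  case "$1" in
    trusted) echo "codex --ask-for-approval auto" ;;
    ci)      echo "codex --ask-for-approval danger-full-access" ;;
    *)       echo "codex --ask-for-approval read-only" ;;
  esac
}

codex_cmd_for trusted
```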

Keyboard

  • Esc — interrupt the current action.
  • Y / N — approve / deny the pending action.
  • Shift+A — approve this type of action for the rest of the session.
  • Ctrl+C — quit.
  • / — cycle history.
Phase 04 · Browser agent

ChatGPT Codex

Codex inside ChatGPT. Each task runs in a sandboxed container tied to your repo.

Task lifecycle

| Step | What happens |
| --- | --- |
| 1 · Attach repo | OAuth to GitHub once; enable per repo. Codex respects branch-protection rules. |
| 2 · Kick off task | In ChatGPT, select Codex from the composer, write the task prompt. |
| 3 · Sandbox spins up | Fresh container with the repo cloned. Setup script (if configured) runs first. |
| 4 · Agent works | Codex plans, runs shell, edits files, iterates. You watch the diff live. |
| 5 · PR or push | Codex offers a PR (default) or a branch push. Review inline before merging. |

Parallel tasks

The pattern that makes ChatGPT Codex distinct from the CLI: you can fire off multiple tasks in parallel against the same repo on separate branches. No working-tree contention. Great for batch fixes ("apply these five linter fixes in parallel") or exploring alternative implementations.

Mobile

ChatGPT Codex works on the ChatGPT mobile apps. Kick off a task from your phone, walk away, come back to a PR when it's done. Same sandbox + PR flow as web/desktop.

Plan tiers

| Plan | Codex access |
| --- | --- |
| Free | Not included — Codex requires a paid plan |
| Plus | Codex included, standard quota |
| Pro | Higher quota + priority |
| Business / Enterprise | Org-scoped repo access · SSO · audit log |
Phase 05 · Remote async

Codex Cloud

Codex running on OpenAI infra in a sandboxed VM. Async. Good for long-running work you'd rather not tie up a chat for.

When to use Cloud over ChatGPT Codex

  • Task exceeds ~15 minutes. ChatGPT chat sessions are a poor fit for long waits; Cloud tasks run in the background and notify on completion.
  • You need many parallel agents. Cloud scales better than sequential ChatGPT Codex tasks.
  • You want repo-scoped setup done once. Cloud sandboxes reuse the configured setup script — faster cold start than spinning up a fresh ChatGPT container each time.
  • You need environment secrets. Cloud supports repo-scoped env vars injected at run time.

Setup script

Per-repo hook that runs once when the Cloud VM is provisioned. Install dependencies, warm caches, pre-build. Same shape as a GitHub Action step.

```shell
# example setup script
pnpm install --frozen-lockfile
pnpm build
pnpm test -- --listTests > /dev/null   # warm jest cache
```

Environment + secrets

Configure per-repo env vars in the ChatGPT Codex settings. Injected into the Cloud VM at run time. Never echoed to chat output. Typical use: DATABASE_URL for a test DB, OPENAI_API_KEY for a repo that itself calls the OpenAI API, signing tokens.

Cloud gotchas
  • No free tier. Cloud consumption counts against your plan quota.
  • Secrets are powerful. A repo-scoped DATABASE_URL means Codex can hit that DB. Don't ship production creds to Cloud; use a test environment.
  • Setup script errors block the run. If npm install fails, the task fails. Treat the setup script like CI.
Phase 06 · Who are you

Auth

Two paths. ChatGPT OAuth (default, entitlement-funded) or API key (metered).

ChatGPT OAuth (default)

  1. codex login — opens a browser.
  2. Approve Codex in ChatGPT.
  3. Token cached at ~/.codex/auth.json.
  4. Inference covered by your ChatGPT plan quota. No per-call OpenAI API billing.
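A quick sanity check that the OAuth step completed is to look for the cached token file from step 3. The helper shape below is this sketch's own; the path is from the source:

```shell
# Check for the cached ChatGPT OAuth token under a given home directory.
# Absence just means `codex login` hasn't been run for that user yet.
auth_status() {
  if [ -f "$1/.codex/auth.json" ]; then
    echo "ChatGPT auth cached"
  else
    echo "no cached auth; run: codex login"
  fi
}

auth_status "$HOME"
```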
The ChatGPT-backed API surface
Under the hood, ChatGPT auth routes inference through chatgpt.com/backend-api/codex — a.k.a. codex_responses. It is not a public API surface, and it intermittently returns an empty response.output=[] under load. Fine for interactive use, where you can just retry; third-party wrappers (like the Hermes Pi harness) need self-heal paths to handle the flakiness.

API key (scripted / CI)

  1. export OPENAI_API_KEY=sk-...
  2. Codex picks up the env var and skips ChatGPT OAuth.
  3. Inference billed per-token against the OpenAI account that owns the key.
  4. Uses the stable public Responses API — no flaky backend.
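For CI, a guard that fails fast with a readable message beats a cryptic auth error mid-run. A minimal sketch; the task prompt and the guard-and-message shape are illustrative:

```shell
# Refuse to start a scripted Codex run without the key; otherwise hand
# off to `codex exec` (flags from the tables above).
scripted_codex() {
  if [ -z "${OPENAI_API_KEY:-}" ]; then
    echo "error: OPENAI_API_KEY must be set for scripted Codex runs"
    return 1
  fi
  codex exec --model gpt-5-codex "$1"
}

scripted_codex "update the CHANGELOG for the pending release" || true
```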

Switching

  • codex login — swap to ChatGPT auth.
  • codex logout — clear cached creds.
  • OPENAI_API_KEY set → Codex prefers API-key mode regardless.
  • codex whoami — show which path is active + current quota status.

Recommendation

| Scenario | Auth |
| --- | --- |
| Daily dev on your laptop with a ChatGPT plan | ChatGPT OAuth |
| Shared team CI | API key in the CI secret store |
| Scripted third-party agent | API key (stable surface) |
| Personal home-lab that wants entitlement-funded calls | ChatGPT OAuth + self-heal fallback (see Hermes Pi harness) |
Phase 07 · What it can do

Tools & MCP

Built-in tool surface + MCP in both directions.

Built-ins

shell exec

Run commands inside the sandbox. Subject to approval mode. Long-running commands stream output.

file read + write

Read/write files relative to the working directory. Writes prefer apply_patch format for reliability.

apply_patch

Unified-diff-like block. Atomic application. Use this for any multi-file edit instead of freeform writes.

web search

Live search. On by default in ChatGPT Codex; opt-in via config in the CLI.

browser (ChatGPT)

Full browser tool in ChatGPT Codex. Take screenshots, click through pages, scrape.

image output

Codex can emit images (screenshots of running apps, generated assets) — returned as attachments in ChatGPT, stored as files in the CLI.

MCP client

Codex is an MCP client. Configure external MCP servers in ~/.codex/config.toml:

```toml
[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "."]

[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_PERSONAL_ACCESS_TOKEN = "${env:GH_TOKEN}" }
```

Codex as an MCP server

Inverse direction — expose Codex itself so other MCP clients can delegate coding tasks to it. Run codex mcp; register the resulting stdio process in Claude Code, Cursor, or any MCP-compatible host. Useful for heterogeneous agent pipelines where a supervising agent wants a Codex-class tool as a sub-agent.
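The delegation hookup can be sketched as a registration script. The claude mcp add syntax is an assumption that may differ across Claude Code versions, so the script is a no-op when the claude CLI isn't installed:

```shell
# Hypothetical: register `codex mcp` as a stdio MCP server in Claude Code.
# Guarded so it does nothing on machines without the claude CLI.
register_codex_mcp() {
  if command -v claude >/dev/null 2>&1; then
    claude mcp add codex -- codex mcp
  else
    echo "claude CLI not found; nothing registered"
  fi
}

register_codex_mcp
```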

apply_patch format

```
*** Begin Patch
*** Update File: src/lib/foo.ts
@@
-export function bar() {
-  return 1
-}
+export function bar(n: number) {
+  return n * 2
+}
*** End Patch
```

Also supports *** Add File: <path> followed by the full content and *** Delete File: <path>. Claude Code uses a different format (exact-string replacement); if you move between the two, remember they're not interchangeable.

Phase 08 · Config

Config

TOML files and the one repo file that matters.

~/.codex/config.toml

```toml
[default]
model = "gpt-5-codex"
approval = "auto"
reasoning_effort = "medium"

[profile.fast]
model = "gpt-5"
reasoning_effort = "low"
approval = "auto"

[profile.yolo]
approval = "danger-full-access"

[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "."]

[tools]
web_search = true

[sandbox]
writable_paths = ["./tmp", "./build"]
network_access = false
timeout_seconds = 120
```

Profiles

Named presets. Select at launch with codex -p fast. Useful for switching model, approval mode, or MCP-server set without editing the default block.

AGENTS.md

Repo-root markdown file read by every Codex surface on every turn. Equivalent role to Claude Code's CLAUDE.md. Cascading — a nested src/AGENTS.md applies when Codex is working under src/. Conventional contents:

  • Project overview.
  • Build / test / lint commands.
  • Style conventions the model should preserve.
  • Forbidden patterns (don't touch generated/, never commit to main).
  • Approval hints — "for this repo, always run tests before declaring done."
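The conventional contents above fit in a very short file. A sketch that writes an illustrative AGENTS.md into a scratch directory — every rule in it is invented for illustration, not a recommendation:

```shell
# Write a minimal example AGENTS.md; the commands and style rules are
# made-up instances of the conventional contents listed above.
dir=$(mktemp -d)
cat > "$dir/AGENTS.md" <<'EOF'
# Project primer
Build: pnpm build · Test: pnpm test · Lint: pnpm lint
Style: TypeScript strict mode; no default exports.
Never touch generated/; never commit directly to main.
Always run tests before declaring a task done.
EOF
echo "wrote $dir/AGENTS.md"
```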

Sessions directory

Transcripts live under ~/.codex/sessions/. One .jsonl file per session. Safe to prune. codex resume reads from here.
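Since transcripts are safe to prune, periodic cleanup can be a single find over the directory. The 30-day cutoff and the helper shape are this sketch's own choices:

```shell
# Delete session transcripts older than 30 days. Takes the directory as
# an argument so it can be pointed at a test dir instead of the real one.
prune_sessions() {
  if [ -d "$1" ]; then
    find "$1" -name '*.jsonl' -mtime +30 -delete
    echo "pruned: $1"
  else
    echo "no sessions directory: $1"
  fi
}

prune_sessions "$HOME/.codex/sessions"
```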

Phase 09 · Positioning

Codex vs Claude Code

Same product class, different vendors, different sharp edges.

Side by side

| Dimension | Codex | Claude Code |
| --- | --- | --- |
| Binary | Rust, codex | Node.js, claude |
| Default model | gpt-5-codex | claude-opus-4-7 |
| Repo primer file | AGENTS.md (cascading) | CLAUDE.md (project-level) |
| Config file | ~/.codex/config.toml | ~/.claude/settings.json |
| Project config | AGENTS.md + profiles | .claude/settings.json + skills |
| Safety surface | Approval modes + sandbox (seatbelt/landlock) | Permission modes + hooks + allowlist |
| Custom procedures | Implicit via AGENTS.md | Explicit .claude/skills/<name>/SKILL.md |
| Interception | Approval prompts only | 6 hook types (PreToolUse, PostToolUse, UserPromptSubmit, Stop, SubagentStop, SessionStart) |
| Remote surface | Codex Cloud (parallel async) | None first-party; Duraclaw covers the space |
| Patch format | apply_patch unified diff | Exact-string replace (Edit tool) |
| MCP | Client + server | Client + server |
| Free tier | Not available — paid plan required | API key pay-as-you-go |
| Entitlement path | ChatGPT OAuth, entitlement-funded | API key only |

Where Codex wins

  • Parallel browser-side tasks — Codex Cloud + ChatGPT parallel-agent pattern has no first-party equivalent in Claude Code.
  • Mobile — ChatGPT Codex on iOS/Android works well; no Claude Code mobile story.
  • apply_patch format — more reliable for large multi-file edits.
  • Sandbox by default — Codex's seatbelt/landlock gating is tighter than Claude Code's convention-driven permissions.
  • Entitlement-funded — if you're paying for ChatGPT anyway, Codex adds zero marginal inference cost.

Where Claude Code wins

  • Hook system — six hook types give operators interception points Codex's approval flow doesn't.
  • Skills — explicit versioned procedures as markdown files; Codex's AGENTS.md is monolithic by comparison.
  • Plugin / marketplace — shared skills + settings via a plugin system.
  • Opus 4.7 1M context — large-context wins on big-repo exploration.
  • Long-context agent SDK — better tooling for building custom agents on top.
They're not mutually exclusive
Both are MCP clients and MCP servers — run Codex as an MCP server (codex mcp) and call it from Claude Code, or vice versa. For heterogeneous pipelines where one agent is supervising and another is executing, this is the cheapest interop.