Two terminal-native AI coding agents from the two leading AI labs. Both work in your existing editor, both understand your codebase, both can run commands on your behalf. The differences are real — in model quality, pricing model, project configuration, sandboxing approach, and what happens when things get complex.
This comparison covers the technical and practical differences that affect your actual workflow, not just the surface-level feature matrices.
| Factor | Claude Code | Codex CLI |
|---|---|---|
| Underlying model | Claude Sonnet 4.6 / Opus 4.6 | OpenAI frontier models |
| Context window | Up to 1M tokens | Not disclosed |
| Install | `npm install -g @anthropic-ai/claude-code` | `npm install -g @openai/codex` |
| License | Proprietary | Apache-2.0 (open source) |
| GitHub stars | Closed source — no public repo | 80k+ |
| Project config file | CLAUDE.md | AGENTS.md |
| Pricing model | Pay-per-token (API key) | ChatGPT subscription or pay-per-token |
| MCP support | Yes (deep integration) | Yes |
| OS-level sandboxing | Permission prompts | Seatbelt (macOS) / bwrap (Linux) |
| Sub-agent / orchestration | Yes (Task tool) | Yes (parallel worktrees) |
Both install via npm:

```bash
# Claude Code
npm install -g @anthropic-ai/claude-code

# Codex CLI
npm install -g @openai/codex
```
Claude Code requires an Anthropic API key; token usage is billed directly. Codex CLI lets you sign in with a ChatGPT account (Plus, Pro, Business, or Enterprise) and charges against your subscription's message allocation, or you can provide an OpenAI API key for pay-per-token access.
The subscription path is a material difference: if you already pay for ChatGPT Plus ($20/month), Codex CLI usage is included up to your tier's limits — no additional API spend. Claude Code has no equivalent subscription tier; it's always token-billed.
Claude Code uses Anthropic's Claude family — Sonnet 4.6 by default, with Opus 4.6 for hard problems and Haiku 4.5 for fast, cheap tasks. You switch models mid-session with /model.
Codex CLI uses OpenAI's frontier models and pulls whichever version is current for your subscription tier. You also switch models with /model inside the interactive session.
The honest answer on output quality is: it depends on the task. Claude tends to produce more coherent, well-structured multi-file changes with better contextual reasoning. GPT-4o class models are strong at code generation but can be more mechanical. The 1M-token context window in Claude Code is a genuine differentiator for large codebases — you can load entire repos into context in ways that Codex CLI's undisclosed context limit can't match.
Both tools read a project config file that you commit to the repo:
- **CLAUDE.md** — loaded at every session start. Put your stack, conventions, test commands, and things to avoid here. Supports nested CLAUDE.md files in subdirectories. Heavily documented; see the CLAUDE.md guide for what to include.
- **AGENTS.md** — same concept. Originally an OpenAI format, now stewarded by the Linux Foundation as a cross-tool standard, adopted by 60,000+ open-source projects. The cross-tool standardization means you can write one file and have it work with Codex and, potentially, other agents that adopt the standard.

In practice, CLAUDE.md supports more Claude-specific features (sub-agent behavior, tool permissions, memory). AGENTS.md is slightly more portable but slightly less powerful within its home tool.
Neither is clearly better if you're only using one tool. If your team is split between Claude Code and Codex CLI, AGENTS.md's cross-tool positioning is genuinely useful — one project config, two tools.
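A minimal AGENTS.md sketch (the headings and contents are illustrative; neither format enforces a schema):

```markdown
# AGENTS.md

## Stack
- TypeScript, Node 20, pnpm

## Commands
- Test: `pnpm test`
- Lint: `pnpm lint`

## Conventions
- Prefer named exports.
- Never hand-edit files under `src/generated/`.
```

Teams running both tools often keep a single file and symlink the other name to it.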
This is a real difference in approach.
Codex CLI uses OS-level sandbox enforcement:
- macOS: Seatbelt, via `sandbox-exec`
- Linux: `bwrap` (Bubblewrap) + seccomp

The default sandbox mode (`workspace-write`) lets Codex read and write files in your workspace and run routine local commands, but blocks network access. The `danger-full-access` mode removes all restrictions; the `read-only` mode prevents all mutations.
Claude Code uses a permission prompt system: before any destructive or external action, Claude asks for approval unless you've configured auto-approval for specific tool categories. The --dangerously-skip-permissions flag removes all prompts (equivalent to Codex's danger-full-access).
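Pre-approved and denied actions live in `.claude/settings.json`. A sketch using the documented `allow`/`deny` rule syntax (the specific rules are examples, not recommendations):

```json
{
  "permissions": {
    "allow": [
      "Bash(npm run test:*)",
      "Read(src/**)"
    ],
    "deny": [
      "Read(./.env)",
      "Bash(curl:*)"
    ]
  }
}
```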
Codex's OS-level sandboxing is more trustworthy in security-sensitive environments — the kernel enforces the boundary, not the model's self-restraint. Claude Code's permission system is more granular (you can allow Bash while requiring approval for Write to specific paths) but relies on the model correctly describing its intent before each action.
For most development work, both are adequate. For CI pipelines, automated scripts, or running untrusted code, Codex's OS sandbox is stronger.
Codex CLI has three named approval policies:
- `on-request` — explicit approval before any action outside sandbox limits
- `untrusted` — auto-approves safe reads, prompts for state-mutating commands
- `never` — disables all approval prompts (full autonomy)

An auto-review agent mode can also evaluate risk before passing approval decisions to you — useful for long-running autonomous tasks.
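Both the approval policy and the sandbox mode can be pinned in `~/.codex/config.toml` rather than passed per invocation (key names follow Codex's config format; the values shown are just one sensible default):

```toml
# ~/.codex/config.toml
sandbox_mode    = "workspace-write"
approval_policy = "on-request"
```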
Claude Code has plan mode, auto mode, and per-tool permission configuration. You can also wire hooks to intercept and block specific tool calls programmatically — something Codex doesn't have a direct equivalent for.
Codex CLI has notably deep git integration. It analyzes your repo history for context, auto-generates meaningful commit messages, warns when operating in untracked directories, and supports parallel agents in isolated git worktrees for simultaneous non-conflicting work streams.
Claude Code reads git context but is less opinionated about commits — it will make changes, run your tests, and let you commit on your terms. If you want auto-commit behavior in Claude Code, you need to instruct it explicitly or configure a PostToolUse hook.
For teams that want a tighter git workflow with automatic commit generation, Codex's native approach is cleaner.
Both tools support Model Context Protocol for wiring in external tools. Claude Code's MCP integration is more mature and better documented — it was one of the first tools to ship MCP support, and the ecosystem of Claude Code MCP servers is broader. See the Claude Code MCP guide for setup and useful servers.
Codex CLI supports stdio and streaming HTTP MCP servers via ~/.codex/config.toml, and can itself run as an MCP server when integrated with the OpenAI Agents SDK. For MCP depth and ecosystem breadth, though, Claude Code is currently the better-positioned tool.
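A stdio MCP server entry in `~/.codex/config.toml` might look like this (the server package is an example; any stdio MCP server works the same way):

```toml
# ~/.codex/config.toml
[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
```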
Claude Code has a hooks system — scripts that run at lifecycle events (before/after tool calls, on session start/stop). You can block dangerous commands, log all file writes, trigger notifications, auto-lint after edits. See the hooks guide.
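A concrete sketch of such a guardrail: a PreToolUse hook that blocks Bash commands touching paths you've declared off-limits. The exit-code contract (2 blocks the call, stderr goes back to the model) follows Claude Code's hooks documentation; the pattern list and helper name are illustrative.

```python
import json
import sys

# Illustrative deny-list; adapt to your repo's protected paths.
BLOCKED_PATTERNS = ("rm -rf", "dist/", ".env")

def check(payload: dict) -> int:
    """Return 2 to block the tool call, 0 to allow it."""
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in BLOCKED_PATTERNS:
        if pattern in command:
            # Exit code 2 blocks the call; stderr is fed back to the model.
            print(f"Blocked: command matches '{pattern}'", file=sys.stderr)
            return 2
    return 0

# Installed as a hook, the script reads the payload from stdin:
#   sys.exit(check(json.load(sys.stdin)))
```

Register the script under a `PreToolUse` matcher for `Bash` in `.claude/settings.json` so it runs before every shell command.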
Codex CLI doesn't have an equivalent first-class hooks system. You get the sandbox and approval modes, but not the programmatic lifecycle interception that Claude Code's hooks provide. For teams that want to enforce custom guardrails (block writes to generated files, audit all bash executions, auto-run tests after edits), Claude Code has a real advantage here.
Claude Code's 1M-token context window is one of its most practical advantages. You can load entire codebases — multiple large files, full git history, extensive documentation — into a single session without chunking or summarizing. This matters most for large-scale refactors, cross-cutting debugging, and architectural work that spans the whole repository.
Codex CLI doesn't disclose its effective context limit. In practice, it compresses or truncates older context in long sessions. For most day-to-day tasks (bug fixes, feature additions within a module, short automation scripts), this doesn't matter. For large-scale architectural work, Claude Code's context advantage is material.
| Path | Claude Code | Codex CLI |
|---|---|---|
| Entry point | Anthropic API key (token-billed) | ChatGPT Plus ($20/mo) or API key |
| Typical dev (1 hr/day) | $30–80/mo (Sonnet) | Included in ChatGPT Plus |
| Heavy use (4+ hr/day) | $150–400/mo (Sonnet/Opus mix) | Hits rate limits; needs Pro tier ($100+/mo) |
| Teams | Token costs scale linearly per seat | Business tier pricing (custom) |
| Cost visibility | /cost command shows real-time spend | Usage tied to subscription allocation |
The real comparison is: do you already pay for ChatGPT Plus? If yes, Codex CLI is effectively free up to your monthly allocation — a significant cost advantage. If you're starting from zero, both require meaningful monthly spend at serious usage levels.
Claude Code's token-based billing is more predictable for teams (cost is proportional to usage) but can surprise you with a large bill when running long agentic sessions with many tool calls. Codex's subscription model caps your cost but also caps your throughput at the highest usage volumes.
Codex CLI is Apache-2.0 licensed with 80,000+ GitHub stars and over 760 releases since April 2025. You can read the source, fork it, audit what it does with your code, and contribute to the project. This matters in enterprise environments where security and compliance teams want full auditability.
Claude Code is proprietary. Anthropic doesn't publish the source. For teams where open source is a hard requirement, Codex wins by default.
Both tools solve the same problem: AI that understands your codebase and can make changes across it. The choice is less about which has more features and more about which model you prefer, which vendor relationship you have, and whether your workflow is better served by token-billed depth (Claude Code) or subscription-based breadth (Codex CLI).
Neither tool has built-in memory across sessions. Start a new Claude Code session and it doesn't know what it did yesterday — CLAUDE.md gives you project context, but not session history. Start a new Codex CLI session and AGENTS.md does the same. Every session starts fresh.
For most tasks this doesn't matter. For long-running projects where you need the AI to remember decisions made across multiple sessions — architectural choices, debugging history, previous attempts that didn't work — neither tool has a built-in answer.
Beyond Your Ability (BYA) gives Claude Code persistent memory across sessions — so it remembers your project decisions, what was tried and why it didn't work, and how your architecture evolved. No more re-explaining the same context every morning.
See how BYA works →