If you’ve used Claude Code, you know what terminal-based AI collaboration feels like: staying in the command line while an AI reads your codebase, proposes changes, runs tests, and helps you commit — without breaking your flow.

OpenAI launched a direct equivalent in April 2025: Codex CLI. It’s a lightweight open-source coding agent, built in Rust, that runs locally in your terminal, and as of 2026 it is widely used among developers.

TL;DR

Codex CLI is a serious terminal coding agent from OpenAI. It’s open source, runs locally, and supports multiple safety modes for controlling how autonomously it operates. If you’re already in the OpenAI ecosystem or want an open-source alternative to Claude Code, it’s worth a proper look.

What It Is

Codex CLI’s core capabilities:

  • Repository access: reads files in your working directory directly, no copy-paste required
  • File editing: modifies code directly or proposes changes for your review depending on the mode
  • Command execution: runs tests, builds, checks logs
  • Conversational interface: describe what you want in natural language; it breaks down the task and executes

It is built in Rust for performance, ships as an npm package, is open source under the Apache-2.0 license, and supports MCP (Model Context Protocol) for integrating third-party tools.
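For a sense of what MCP integration means on the wire: MCP messages are JSON-RPC 2.0, and a client asks a server to run a tool with a `tools/call` request. The sketch below only builds such a message. The tool name `search_issues` and its arguments are hypothetical, and this is not a description of how Codex CLI itself constructs requests.

```python
import json


def tool_call_request(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 message asking an MCP server to invoke a tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # The tool name and arguments are whatever the server advertises;
        # the values passed in below are made up for illustration.
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(tool_call_request(1, "search_issues", {"query": "auth"}))
```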

Safety Modes

One of the more thoughtful aspects of the design: Codex CLI offers three levels of autonomy.

Suggest mode: Every action requires your confirmation. Codex proposes, you decide. Good for unfamiliar codebases or when you want fine-grained control.

Auto-edit mode: Codex can modify files automatically, but still needs confirmation before running system commands. A reasonable middle ground for routine work on familiar repos.

Full-auto mode: Codex operates autonomously — reads, edits, and executes without per-step confirmation. Intended for isolated environments (containers, worktrees) or well-understood automation pipelines.

This tiered approach lets you calibrate risk tolerance to the situation rather than making a binary trust/don’t-trust decision.
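One way to picture the three tiers is as a small policy table. The sketch below is a conceptual model only, not Codex CLI’s actual implementation; the names `Mode`, `Action`, and `needs_confirmation` are assumptions made for illustration.

```python
from enum import Enum


class Mode(Enum):
    SUGGEST = "suggest"
    AUTO_EDIT = "auto-edit"
    FULL_AUTO = "full-auto"


class Action(Enum):
    EDIT_FILE = "edit"
    RUN_COMMAND = "run"


# Hypothetical policy table: the actions each mode performs without asking.
AUTO_APPROVED = {
    Mode.SUGGEST: set(),
    Mode.AUTO_EDIT: {Action.EDIT_FILE},
    Mode.FULL_AUTO: {Action.EDIT_FILE, Action.RUN_COMMAND},
}


def needs_confirmation(mode: Mode, action: Action) -> bool:
    """Return True when the user must approve the action in this mode."""
    return action not in AUTO_APPROVED[mode]


if __name__ == "__main__":
    for mode in Mode:
        for action in Action:
            print(mode.value, action.value, needs_confirmation(mode, action))
```

Framing it as a table makes the trade-off visible: moving up a tier only ever adds auto-approved actions, it never removes one.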

Getting Started

Requires Node.js:

npm install -g @openai/codex

Set your API key:

export OPENAI_API_KEY="sk-..."

Launch in your project directory:

codex

This opens a full-screen terminal UI. Natural language prompts work directly:

Find potential race conditions in auth.ts and explain them
Standardize all API response shapes to { data, error, meta }
Add clearer error messages where the CI is failing

Codex reads relevant files, explains its understanding, proposes specific changes, and — depending on your safety mode — either executes or waits for your approval.
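To make the second prompt concrete, here is roughly what a standardized `{ data, error, meta }` response shape might look like. This is a hypothetical sketch, written in Python rather than the TypeScript the example prompts imply; the helper names `envelope`, `ok`, and `fail` are assumptions for illustration, not output from Codex.

```python
from typing import Any, Optional


def envelope(data: Any = None, error: Optional[str] = None,
             meta: Optional[dict] = None) -> dict:
    """Wrap any handler result in the same three-key shape."""
    return {"data": data, "error": error, "meta": meta or {}}


def ok(data: Any, **meta: Any) -> dict:
    """Successful response: payload in `data`, `error` left empty."""
    return envelope(data=data, meta=meta)


def fail(message: str, **meta: Any) -> dict:
    """Failed response: message in `error`, `data` left empty."""
    return envelope(error=message, meta=meta)


if __name__ == "__main__":
    print(ok([{"id": 1}], page=1))
    print(fail("not found", request_id="abc"))
```

Every caller can then check `error` first and read `data` second, regardless of which endpoint produced the response.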

Codex CLI vs. Claude Code

Both tools do essentially the same job. The differences are mostly ecosystem and implementation:

| | Codex CLI | Claude Code |
| --- | --- | --- |
| Underlying model | GPT series (incl. GPT-5.4) | Claude series |
| Open source | Yes (Apache-2.0) | No |
| Safety modes | 3 tiers | Configurable sandbox |
| MCP support | Yes | Yes |
| Implementation | Rust | TypeScript |
| Best fit | OpenAI API users, open-source preference | Anthropic API users, claude.ai users |

Both support MCP, so tool integration capabilities are converging. The core differences come down to model quality preferences and which ecosystem you’re already invested in.

Features Worth Noting

Subagent parallelism: Spin up multiple Codex subagents to work on different tasks in parallel — useful for large-scale refactors or cross-module changes.
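The underlying pattern is a plain fan-out over independent tasks. The sketch below shows that pattern with a thread pool; `run_subtask` is a stand-in that only echoes its input, since how Codex actually dispatches subagents is not covered here.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subtask(description: str) -> str:
    # Placeholder for handing one task to an agent; here it only echoes.
    return f"done: {description}"


def run_in_parallel(tasks: list[str], max_workers: int = 4) -> list[str]:
    """Run independent tasks concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves input order even though tasks run concurrently.
        return list(pool.map(run_subtask, tasks))


if __name__ == "__main__":
    print(run_in_parallel(["refactor billing module", "refactor auth module"]))
```

The pattern only pays off when the tasks touch separate files or modules; overlapping edits would still need to be merged afterwards.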

Built-in code review: Ask a separate Codex agent to review your changes before committing, functioning as an automated reviewer.

Worktree support: Run automated workflows on isolated git worktrees, keeping your main working directory clean.
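Setting up that isolation by hand uses standard git commands: `git worktree add -b <branch> <path>` creates a new branch checked out in a separate directory, and `git worktree remove <path>` cleans it up. A minimal sketch, assuming git is installed; the function names, branch name, and path are hypothetical, and this is not Codex’s own mechanism.

```python
import subprocess
from pathlib import Path


def worktree_add_command(path: Path, branch: str) -> list[str]:
    # Creates a new branch and checks it out in its own directory.
    return ["git", "worktree", "add", "-b", branch, str(path)]


def worktree_remove_command(path: Path) -> list[str]:
    # Removes the worktree directory once the automated run is finished.
    return ["git", "worktree", "remove", str(path)]


def create_worktree(repo: Path, path: Path, branch: str) -> None:
    """Run the add command inside `repo`; raises if git reports an error."""
    subprocess.run(worktree_add_command(path, branch), cwd=repo, check=True)


if __name__ == "__main__":
    # Printing instead of executing keeps this safe to run anywhere.
    print(" ".join(worktree_add_command(Path("wt-refactor"), "agent/refactor")))
    print(" ".join(worktree_remove_command(Path("wt-refactor"))))
```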

Overall Assessment

Codex CLI is a genuinely capable terminal coding agent. If you’re already using OpenAI’s APIs, the setup friction is minimal and the workflow integrates naturally. If you prefer open-source tools, it’s a permissively licensed (Apache-2.0) option in this space.

One caveat: this category of tool is evolving fast. Model capabilities — which drive most of the practical quality difference — are shifting every few months. Any comparison valid today may not be valid in six months. The best way to evaluate is to run it on real work and see how it performs for your actual use cases.

🇺🇸 English

If you've spent time with Claude Code in the terminal, you have a feel for what this category of tool does: stay in your flow, read the codebase, make edits, run tests — all without breaking your concentration or switching to a browser. OpenAI launched their answer to that in April 2025, and by now it's become a real presence in developer workflows.

It's called Codex CLI. It's open source under the Apache-2.0 license, built in Rust for performance, ships as an npm package, and runs entirely in your local terminal. You get a full-screen conversational interface where you describe what you want in plain language, say "find potential race conditions in the auth module" or "standardize all our API response shapes," and Codex reads the relevant files, explains what it's seeing, and proposes or makes the actual changes.

What sets Codex CLI apart in its design is a three-tier safety model, and this is genuinely thoughtful. At the most cautious level, Suggest mode, every action Codex wants to take requires your explicit confirmation. It proposes, you decide. This is the right choice when you're working in an unfamiliar codebase or you just want maximum visibility into what's happening.

One step up is Auto-edit mode: Codex can modify files on its own, but it still has to ask before running any system commands — builds, tests, anything that touches your environment beyond the code itself. A reasonable middle ground for day-to-day work on repos you know well.

And then there's Full-auto, where Codex just runs — reads, edits, executes — without asking at each step. This one's designed for isolated environments: containers, separate git worktrees, CI pipelines. The key insight behind all three tiers is that this isn't a binary trust-it-or-don't decision. You calibrate the autonomy to the situation. That's a thoughtful piece of product design.

Getting set up is minimal friction. Install it globally as an npm package, set your OpenAI API key as an environment variable, and launch it in your project directory. From there you're in the terminal UI, typing natural language.

Now, the obvious comparison is Claude Code. Both tools do the same job. The real differences come down to ecosystem and philosophy. Codex CLI is open source under the permissive Apache-2.0 license, which matters to a lot of teams with licensing constraints or a preference for auditability. It runs on the GPT model family. Claude Code runs on the Claude series and isn't open source, but it has its own sandbox configuration. Both support MCP, the Model Context Protocol, for pulling in third-party tools, so integration capabilities are converging. The practical choice really comes down to which model you prefer and which API you're already paying into.

A couple of other things worth knowing: you can run multiple Codex subagents in parallel on independent tasks — useful for large refactors touching different parts of the codebase simultaneously. And you can spin up a separate Codex agent to review your changes before committing, essentially an automated code reviewer baked into the same workflow.

So here are the things to carry away from this.

First: Codex CLI is a legitimate, capable terminal coding agent. If you're already in the OpenAI ecosystem, the setup friction is close to zero, and it fits naturally into an existing workflow.

Second: the three-tier safety model is the most interesting design decision here. It reframes the question from "do I trust this AI?" to "how much autonomy is appropriate for this specific task?" That's the right question to be asking.

Third: this whole space is moving fast. Model quality — which drives most of the real-world quality difference between tools — shifts every few months. Any comparison that's accurate today may not be accurate by year-end. The honest evaluation is to run it on your actual work and see what happens.

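🇹🇼 Chinese

In April 2025, OpenAI released a tool called Codex CLI. If you've used Claude Code, you'll understand immediately what it does: it puts an AI right in your terminal to write code alongside you. No switching windows, no copy-pasting. The AI reads your code, edits your files, and runs your tests, all from the command line.

Codex CLI positions itself as a "lightweight local coding agent." It's written in Rust and open-sourced on GitHub under the Apache-2.0 license. Its core capabilities are reading files in your working directory directly, modifying code, and executing commands, all through a natural-language conversation in which you describe the task and it handles the breakdown and execution.

One design choice I think is worth discussing is its safety modes, which come in three tiers.

The most conservative is Suggest mode. Codex needs your confirmation for every action: it proposes changes, and you decide whether to apply them. It suits codebases you don't know well, or times when you simply want to see exactly what it's doing.

The middle tier is Auto-edit mode. It edits files on its own, but running system commands still needs your approval. For everyday changes in a familiar repo, this mode fits well.

The highest tier is Full-auto. It reads, edits, and executes entirely on its own, with no per-step confirmation. This is better suited to isolated environments such as a Docker container, or to cases where you're confident the operation is safe.

This three-tier design lets you adjust the risk boundary to the situation, instead of choosing between full trust and confirming everything by hand. I find that approach more pragmatic.

Installation only needs a Node.js environment: run one npm command to install, set your OpenAI API key, then launch it in your project directory to enter the full-screen terminal interface. You describe tasks directly in Chinese or English, such as "find possible race conditions in the auth module" or "unify the response format of all the APIs," and it reads the relevant files, explains its understanding, and proposes changes.

Compared with Claude Code, the two tools overlap heavily in what they do. The main difference is the underlying model: Codex CLI uses the GPT series, and Claude Code uses the Claude series. Codex CLI is also open source, while Claude Code is not. Both support MCP, so their tool-integration capabilities are already very close. Put bluntly, the choice comes down to which ecosystem you know better and where your API key lives.

A few features are particularly interesting. It can launch multiple subagents to run different tasks in parallel, which suits large refactors. There is also a code review mode that has an independent agent look over your changes before you commit, effectively adding an automated reviewer. And there is worktree support: automated workflows can run on a separate git worktree without touching the main branch you're working on.

To wrap up, here are three key points.

First, Codex CLI is a full-featured terminal coding agent in the same class as Claude Code. If you already use OpenAI services or lean toward open-source tools, it's worth a try.

Second, the three-tier safety model is very practical: you tune it to the risk yourself, rather than choosing between full trust and fully manual confirmation.

Third, and most important: how these tools actually perform depends heavily on the underlying model, and the gaps in this field are closing fast. Today's comparison may no longer hold in six months. The best way to evaluate is still to run a few rounds yourself on real tasks.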
