OpenAI is rolling out Codex, a cloud-based AI agent for software development that automates tasks like bug fixes and feature implementation. The company says Codex is designed to eventually establish a new paradigm for how code gets written.
The name "Codex" nods to OpenAI's original code model, which was retired in 2023. The new agent runs on codex-1, a variant of the o3 model tuned specifically for software development tasks. Codex is available now as a research preview for ChatGPT Pro, Enterprise, and Team users, with support for Plus and Edu subscribers coming soon.
Each Codex task runs in its own isolated cloud container, preloaded with the relevant code repository. Developers launch tasks with simple text prompts in the ChatGPT sidebar, using "Code" for implementations or "Ask" for code-related questions. Within its container, Codex can read and modify files, run tests, execute commands like linters or type checkers, and log its results. According to OpenAI, most tasks take between one and thirty minutes to complete.
Verifiable changes and AGENTS.md files
When a task is done, Codex documents its changes with terminal logs and test results, so developers can check every step. The environment can be configured to closely match the team's real-world dev setup.
Codex can also be guided using special AGENTS.md files. These text files act as instructions for the agent, covering things like testing conventions, code structure, and PR messages. Rules in AGENTS.md apply recursively throughout the project directory tree, and if there are conflicts, files deeper in the hierarchy take precedence.
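The precedence rule can be illustrated with a minimal Python sketch. It is not OpenAI's implementation: the function name `applicable_agents_files` and the directory layout in the demo are hypothetical, and the sketch only assumes that every AGENTS.md between the repository root and the file being edited applies, with deeper files overriding shallower ones.

```python
import tempfile
from pathlib import Path


def applicable_agents_files(repo_root: Path, target: Path) -> list[Path]:
    """Return the AGENTS.md files governing `target`, shallowest first.

    Hypothetical helper: when instructions conflict, the caller should
    treat later (deeper) entries as taking precedence.
    """
    repo_root = repo_root.resolve()
    target = target.resolve()
    start = target if target.is_dir() else target.parent
    # Raises ValueError if the target lies outside the repository.
    relative = start.relative_to(repo_root)

    # Walk from the repository root down to the target's directory.
    directories = [repo_root]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    return [d / "AGENTS.md" for d in directories if (d / "AGENTS.md").is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp).resolve()
        (root / "services" / "api").mkdir(parents=True)
        (root / "AGENTS.md").write_text("Run the full test suite.\n")
        (root / "services" / "api" / "AGENTS.md").write_text("Run only the API tests.\n")

        target = root / "services" / "api" / "handler.py"
        for path in applicable_agents_files(root, target):
            print(path.relative_to(root))
```

In the demo, the root file is listed first and the file under `services/api` last, so for a change to `handler.py` the more specific testing instruction would win under the rule described above.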
OpenAI trained codex-1 using reinforcement learning on real-world software development tasks, aiming to mirror human coding style and pull request preferences. According to OpenAI, internal benchmarks show that codex-1 performs well even without AGENTS.md files or customized environments.
OpenAI already uses Codex internally for tasks like refactoring, test generation, and bug fixing. Early partners, including Cisco, Temporal, Superhuman, and Kodiak, are putting Codex to work as well. Temporal, for example, uses Codex for error analysis and connecting components, while Superhuman lets product managers make small code changes on their own. Kodiak applies Codex to improve test coverage and debugging tools in its autonomous driving software.
To get the most out of Codex, OpenAI recommends splitting work into well-defined tasks and assigning them to multiple agents running in parallel. The long-term vision is a workflow where developers move fluidly between real-time collaboration (for example, with Codex CLI) and asynchronous delegation.
Limitations, safety, and pricing
Right now, Codex doesn't support image inputs, and developers can't intervene or redirect the agent while a task is running. The model operates without internet access, so it can see only the provided repository and pre-installed dependencies. Codex was trained to recognize and refuse requests related to malware while still supporting legitimate low-level coding tasks. An addendum to the o3 system card details these safeguards.
For now, Codex is free to use, with flexible pricing planned for the future. The smaller codex-mini-latest model, used for Codex CLI, is already available and costs $1.50 per million input tokens and $6 per million output tokens, with a 75% discount through prompt caching.
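To make the codex-mini-latest rates concrete, here is a small Python sketch of the arithmetic. The function name `estimate_cost_usd` and the token counts in the example are illustrative, and the sketch assumes the 75% discount applies only to input tokens served from the prompt cache.

```python
# Rates quoted above for codex-mini-latest, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 6.00
# Assumption: the discount applies to cached input tokens only.
CACHE_DISCOUNT = 0.75


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> float:
    """Estimate the cost of one request under the assumptions above."""
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    uncached_input_tokens = input_tokens - cached_input_tokens
    cached_rate = INPUT_USD_PER_MILLION * (1 - CACHE_DISCOUNT)

    input_cost = (uncached_input_tokens * INPUT_USD_PER_MILLION / 1_000_000
                  + cached_input_tokens * cached_rate / 1_000_000)
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 100k output tokens.
    print(f"{estimate_cost_usd(1_000_000, 100_000):.2f}")           # 2.10
    # Same workload with 800k of the input tokens served from cache.
    print(f"{estimate_cost_usd(1_000_000, 100_000, 800_000):.2f}")  # 1.20
```

In this example, caching 80% of the input cuts the input cost from $1.50 to $0.60, while the $0.60 output cost is unchanged.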
Looking ahead, OpenAI plans to integrate Codex more deeply with popular developer tools. In the future, developers may be able to delegate tasks to Codex directly from issue trackers or CI systems. Features like adjusting tasks mid-execution and collaborative strategy planning are also on the roadmap.