OpenAI's plan for GPT-5 as the grand unifier remains in place, and ChatGPT's screen agent "Operator" is set to get an update soon.
During a recent Reddit Q&A with the Codex team, OpenAI VP of Research Jerry Tworek described GPT-5 as the company's next foundational model. The goal, it seems, isn't to launch a radically different system but to "just make everything our models can currently do better and with less model switching."
One of the main priorities is tighter integration between OpenAI's tools. Tworek said components like the new Codex code agent, Deep Research, Operator, and the memory system should work more closely together so that users experience them as a unified system, instead of switching between separate tools.
Operator, OpenAI's screen agent, is also due for an update. The tool is still in the research phase and already offers basic features like browser control—but it's not yet reliable. Tworek said the upcoming update, expected "soon," could turn Operator into a "very useful tool."
Tworek's comments were more measured than OpenAI's earlier messaging around GPT-5. Back in February, the company said GPT-5 would merge the GPT and "o" model series into a single system, eliminating the need for users to switch between them. But by April, OpenAI had walked that back. CEO Sam Altman said full integration turned out to be more difficult than expected, so the company released o3 and o4-mini as standalone reasoning models instead.
Token use keeps growing—with no sign of slowing down
Tworek also weighed in on the growing demand for tokens—the small units of text that models use to process and generate language and code. A Reddit user imagined a future where AI assistants continuously handle around 100 tokens per second to read sensor data, analyze emails, or interpret user behavior. Would that kind of growth eventually hit a ceiling?
Tworek doesn't think so. He said token usage is a tradeoff between utility and cost, and both have been moving in the right direction with no end in sight. "Even if models stopped improving," he said, they could still deliver "a lot of value" just by scaling up. "That's the reason for large buildouts in infrastructure capable of producing those tokens," Tworek explained.
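To give a sense of the scale behind that scenario, here is a back-of-the-envelope sketch of the Reddit user's hypothetical: an assistant continuously consuming around 100 tokens per second. The per-token price used below is a made-up placeholder for illustration, not an OpenAI figure.

```python
# Rough scale of a hypothetical always-on assistant at ~100 tokens/second.
TOKENS_PER_SECOND = 100
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

tokens_per_day = TOKENS_PER_SECOND * SECONDS_PER_DAY
tokens_per_year = tokens_per_day * 365

# Hypothetical price of $1 per million tokens -- an assumption, not a real rate.
PRICE_PER_MILLION_USD = 1.00
cost_per_day = tokens_per_day / 1_000_000 * PRICE_PER_MILLION_USD

print(f"Tokens per day:  {tokens_per_day:,}")    # 8,640,000
print(f"Tokens per year: {tokens_per_year:,}")   # 3,153,600,000
print(f"Cost per day at $1/M tokens: ${cost_per_day:.2f}")  # $8.64
```

Even at this modest per-user rate, a single always-on assistant would burn through billions of tokens a year, which helps explain the infrastructure buildouts Tworek points to.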
Still, he doesn't believe AI will fully replace human labor. "In my view, there will always be work only for humans to do," he said. That work may change, but it won't disappear. In the end, the "last job" could be supervising AI systems—making sure they act in humanity's best interest.
Benchmarks don't mean what they used to
When asked how GPT compares to other models like Claude or Gemini, Tworek downplayed the value of traditional benchmarks. He said they no longer reflect how people actually use these systems and are often skewed by targeted fine-tuning.
Instead, he prefers to evaluate models based on real-world tasks. That's the only way to know whether a system can actually solve practical problems. OpenAI's long-term goal, Tworek said, is to eliminate the need for users to choose a model manually—by automatically giving them the most effective one for the job.
"Different models and products have different strengths," he wrote, "but our goal is to resolve this decision paralysis by making the best one."