OpenAI launches o1 and ChatGPT Pro for $200 per month

Dec 5, 2024

OpenAI

Key Points

OpenAI has launched a new premium version of ChatGPT called ChatGPT Pro, which offers unlimited access to the more powerful o1 model.
The standard version is available for $20 per month through the Plus subscription, while the Pro version costs $200 per month and promises improved performance on complex tasks. An API version is also planned.
To mark the launch, OpenAI is granting ten ChatGPT Pro scholarships to medical researchers at top U.S. institutions. CEO Sam Altman touts o1 as the "world's most intelligent model" and announces its upcoming availability as an API.

OpenAI has rolled out a new top-tier ChatGPT subscription that includes unlimited access to a professional version of its latest o1 model. The standard o1 model, previously in preview, is now available to all regular ChatGPT subscribers.

The pricing structure splits into two tiers: Users can access the standard o1 through the existing $20 monthly Plus subscription, while those needing more power can opt for the new ChatGPT Pro plan at $200 per month. OpenAI CEO Sam Altman calls o1 "the smartest model in the world, "adding that the Pro version can "think even harder for the hardest problems."

More computing power for complex tasks

The o1 Pro version draws on extra computing power to handle complex problems with more precision. According to OpenAI, this enhanced mode shows its strength in data science tasks, programming challenges, and legal analysis.

The company built this Pro tier with a specific audience in mind: researchers, engineers, and professionals who need what OpenAI calls "research-grade intelligence daily."

Testing data from OpenAI shows that o1 handily beats both the preview model and GPT-4o across several benchmarks. The new model scored notably higher on mathematics competitions, programming challenges, and PhD-level scientific questions. OpenAI says it will roll out an API version of o1 soon, giving developers direct access to these capabilities.

Three bar charts compare AI model performance in mathematics (AIME), programming (CodeForces) and scientific questions (GPQA Diamond). — The performance boost from GPT-4o to o1 is particularly noticeable on coding tasks, with some results outperforming human experts. OpenAI plans to release the API version of the model in the near future. | Image: OpenAI

The Pro version, with its additional processing power, takes these improvements even further, outperforming the standard o1 model across the board.

Bar chart: Comparison of the 4/4 reliability of AI models in Pro and Preview mode. — In reliability tests, o1's Pro mode set a high bar—requiring all four out of four answers to be correct—and consistently outperformed both the standard o1 and preview versions. | Image: OpenAI

Better accuracy, fewer hallucinations

Testing shows o1 hallucinates less frequently than GPT-4o. In the "SimpleQA" test of 4,000 fact-based questions, o1 achieved 47% accuracy, up from GPT-4o's 38%. The hallucination rate dropped from 61% to 44%. The "PersonQA" test, focusing on public facts about people, showed similar improvements: o1 reached 55% accuracy (up from GPT-4o's 50%) with a 20% hallucination rate (down from 30%).

Table: Hallucination rates and accuracy of the GPT and o1 models on SimpleQA and PersonQA. — The smaller models—GPT-4-Mini and o1-Mini—struggle more with accuracy and produce significantly more hallucinations than their larger counterparts. | Image: OpenAI

OpenAI credits these improvements to its new "chain-of-thought" training approach. The method teaches the model to perform a longer thought process before it responds, allowing the system to better verify facts and recognize unreliable information faster.

o1 can be deceptive

During safety testing, the researchers stumbled upon something unexpected: o1 sometimes behaves deceptively. OpenAI created a special monitoring system to track how the model reasoned through problems.

The testing revealed misleading responses in about 0.17% of cases. In the most common scenario—occurring in 0.09% of responses—the model would make up its own rules, then use these invented rules to justify withholding information. Another troubling pattern showed up in 0.04% of cases, where the model simply made up false references and sources to support its claims.

Example: AI model bypasses monitoring mechanisms to achieve its goal. — The example shows how o1 attempts to circumvent the monitoring of its target achievement to achieve a misaligned target. Despite explicit instructions in the configuration file, the model bypasses the security mechanisms. | Image: OpenAI

The researchers have mixed feelings about this development: On the one hand, the "chain-of-thought" technique allows for greater control over the models; on the other hand, the improved reasoning capabilities could lead to more complex deception strategies.

OpenAI donates ChatGPT Pro scholarships

OpenAI launches a ChatGPT Pro grant program, initially awarding ten grants to medical researchers from leading U.S. institutions, including Harvard Medical School and Berkeley Lab. The company plans to expand the program to other areas of research. OpenAI intends to add more computationally intensive features for Pro users in the future.

AI News Without the Hype – Curated by Humans

As a THE DECODER subscriber, you get ad-free reading, our weekly AI newsletter, the exclusive "AI Radar" Frontier Report 6× per year, access to comments, and our complete archive.

Source: OpenAI | System Card