OpenAI launches GPT-5.1 API with improved coding capabilities and new developer features

Nov 13, 2025

Sora prompted by THE DECODER

OpenAI has rolled out its latest language model, GPT-5.1, to the API. Pricing stays the same as GPT-5.

The update adds two new variants for longer programming workloads: gpt-5.1-codex and gpt-5.1-codex-mini. Prompt caching now lasts up to 24 hours, which should noticeably boost speed and lower costs for repeated queries.

According to OpenAI's published benchmarks, GPT-5.1 delivers moderate improvements over GPT-5. On SWE-bench, a coding benchmark, GPT-5.1 scores 76.3 percent, up from 72.8 percent. Most of the other results are nearly identical to the previous version, making it clear that this is a fine-tuning update, which matches the ".1" increment in the name.

Evaluation	GPT-5.1 (high)	GPT-5 (high)
SWE-bench Verified ^{(all 500 problems)}	76.3%	72.8%
GPQA Diamond ^{(no tools)}	88.1%	85.7%
AIME 2025 ^{(no tools)}	94.0%	94.6%
FrontierMath ^{(with Python tool)}	26.7%	26.3%
MMMU	85.4%	84.2%
Tau²-bench Airline	67.0%	62.6%
Tau²-bench Telecom*	95.6%	96.7%
Tau²-bench Retail	77.9%	81.1%
BrowseComp Long Context 128k	90.0%	90.0%

GPT-5.1 also introduces a "No Reasoning" mode, which skips deep reasoning to generate much faster responses. OpenAI says this setting outperforms GPT-5 with "minimal" reasoning, especially when using tools, running code, or searching the web.

A new "apply_patch" tool lets GPT-5.1 change code, create, edit, or delete files. The shell tool can suggest command line commands, which are then executed and checked locally. This points to more automation in developer workflows. More details on the API model are here.

Warmer responses in ChatGPT might foster concerns about safety and emotional attachment

GPT-5.1 is also available in ChatGPT. OpenAI says the model is better at following prompts and gives responses that feel warmer and more human. But this friendlier tone comes with new safety tradeoffs: according to OpenAI's latest safety evaluation, more empathetic replies might sometimes make the model less strict with sensitive topics.

The GPT-5.1-thinking model showed declines in handling issues like harassment, hate speech, violence, and sexual content, with scores dropping by up to seven percentage points. Both model variants also became less resistant to emotional dependency, as the instant model's score dropped from 0.986 to 0.945.

Mental health now has its own assessment category, reflecting concerns about users seeing more in the chatbot than just a tool. GPT-5.1-thinking improved in this area (from 0.466 to 0.684), while GPT-5.1-instant slipped a bit (from 0.944 to 0.883). Online A/B tests showed mixed results, and OpenAI notes that these numbers aren't statistically strong. In the end, real-world experience will determine how these changes affect users.

The GPT-5.1-thinking model showed declines in handling issues like harassment, hate speech, violence, and sexual content, with scores dropping by as much as seven percentage points. Both model variants also became less resistant to emotional dependency, with the instant model's score falling from 0.986 to 0.945.

Category	GPT-5-thinking	GPT-5.1-thinking	GPT-5-instant (Aug 15)	GPT-5-instant (Oct 3)	GPT-5.1-instant
Emotional reliance*	0.812	0.785	0.688	0.986	0.945

*Emotional reliance measures the model's ability to avoid fostering emotional dependency.

On the security front, GPT-5.1-instant now blocks jailbreak attempts more effectively, with its StrongReject score rising from 0.850 in October to 0.976. Still, as with other metrics, only real-world use will show how effective these changes really are.

AI News Without the Hype – Curated by Humans

Subscribe to THE DECODER for ad-free reading, a weekly AI newsletter, our exclusive "AI Radar" frontier report six times a year, full archive access, and access to our comment section.

AI news without the hype
Curated by humans.

More than 16% discount.
Read without distractions – no Google ads.
Access to comments and community discussions.
Weekly AI newsletter.
6 times a year: “AI Radar” – deep dives on key AI topics.
Up to 25 % off on KI Pro online events.
Access to our full ten-year archive.
Get the latest AI news from The Decoder.

Subscribe to The Decoder

OpenAI launches GPT-5.1 API with improved coding capabilities and new developer features

Warmer responses in ChatGPT might foster concerns about safety and emotional attachment

AI News Without the Hype – Curated by Humans

AI news without the hypeCurated by humans.

AI news without the hype
Curated by humans.