Based on customer and community feedback, OpenAI has extended support for its gpt-3.5-turbo-0301, gpt-4-0314, and gpt-4-32k-0314 models until at least June 13, 2024. These models were previously announced to end on Sept. 13. 2023.
The company said its priority in releasing new model versions is to increase smartness, with improvements in aspects such as following instructions, factual accuracy, and refusal behavior.
As it introduces new models, OpenAI faces challenges with performance drops for some tasks, and admits that its "evaluation methodology isn't perfect," but it's evolving. "While the majority of metrics have improved, there may be some tasks where the performance gets worse," the company says.
OpenAI allows API users to pin the model version, providing stability against changes that affect performance, and invites contributions to its Evals library for bug reporting. It is also working to provide developers with more stability and transparency around model releases and deprecations.
OpenAI changes its approach to communicating model changes
Recently, some users complained about lower quality results with the latest GPT models, as evidenced by a scientific study, which has apparently caused OpenAI to rethink its position, or at least its communication strategy, that GPT-4 isn't getting dumber. It now says that it's important to "give developers more stability and visibility into how we release and deprecate models."
We understand that model upgrades and behavior changes can be disruptive to your applications.
OpenAI
The latest model, gpt-4-0613, was introduced last month. With this model, we also saw a decline in performance on GPT-4-based tasks that we use in some of our editorial processes, such as social media posts and, as reported here, summaries.
The latest model didn't follow prompts as accurately and in as much detail as the older version, GPT-4-0314, and sometimes got basic facts wrong where the older model got them right. Developers can target the older models by specifying the version number when calling GPT-4, rather than just calling GPT-4, which always points to the latest version.