OpenAI releases a greatly expanded version of its Model Spec, a document that defines how AI models should behave.
OpenAI has significantly expanded its Model Spec, the document that outlines how its AI models should behave, since its initial release in May 2024. The new 63-page document focuses on three core principles: customizability, transparency, and intellectual freedom.
A key shift in the new specification involves the handling of sensitive topics. Rather than defaulting to extreme caution, models are now expected to join users in seeking the truth and to take clear stances on issues like disinformation. "We can’t create one model with the exact same set of behavior standards that everyone in the world will love," Joanne Jang of OpenAI's behavior team told The Verge.
Addressing user feedback with more mature interactions
The updated guidelines introduce new approaches to adult content, including plans for a "grown-up mode" that would allow certain adult content in appropriate contexts while maintaining strict barriers against harmful material. CEO Sam Altman had previously hinted at this development.
The spec also tackles sycophancy, OpenAI's term for the tendency of AI models to be overly agreeable. Future versions will aim to provide more honest feedback and "behave more like a firm sounding board that users can bounce ideas off of — rather than a sponge that doles out praise." The change responds to criticism of that tendency.
Whether these new guidelines will be implemented in the upcoming GPT-4.5 and GPT-5 models remains to be seen.