OpenAI's "Model Spec" aims to guide AI behavior - and the company wants your input

With the publication of the "Model Spec", OpenAI wants to stimulate a public discussion about how AI models should behave. The document defines the objectives, rules, and standard behaviors that shape how its models respond.

OpenAI has released the first version of its "Model Spec," a document that specifies desired behavior for AI models in the OpenAI API and ChatGPT, the company announced. The spec contains a set of core objectives, rules, and standard behaviors. Model behavior, or how models respond to user input, is crucial to how people interact with AI, according to OpenAI. However, designing this behavior is still a young science, as models are not explicitly programmed but learn from a variety of data.

The Model Spec reflects OpenAI's documentation, research, and experience in shaping model behavior, as well as ongoing work that will influence the development of future models, the company said.

Objectives, rules, standard behaviors - OpenAI pursues a multi-level approach

The Model Spec serves as a guideline for researchers and "AI trainers" to generate data for Reinforcement Learning from Human Feedback (RLHF). In the long term, OpenAI wants to investigate whether AI models can also learn directly from the Model Spec.

The Model Spec distinguishes between objectives, rules, and standard behaviors:

Objectives provide a general direction for desirable behavior, but are often too broad to give specific instructions.
Rules resolve conflicts between objectives and ensure safety and legality. They cannot be overridden by developers or users.
Standard behaviors are defaults that align with the objectives and rules but ultimately leave control to developers and users. They also show how to prioritize conflicting goals.

Objectives include supporting developers and end-users, benefiting humanity, and representing OpenAI well. Rules include following instructions by priority, obeying laws, avoiding illegal or harmful content, and protecting copyrights and personality rights.

Standard behaviors include assuming good intentions, asking clarifying questions, being objective, avoiding influencing opinions, expressing uncertainty, and being efficient while respecting length limits.
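
The difference between fixed rules and adjustable standard behaviors can be illustrated with a small sketch. This is only an illustration of the override idea described above, not OpenAI's implementation: the class `BehaviorPolicy`, the method `resolve`, and all key names are hypothetical and were chosen to mirror the examples in the article.

```python
from dataclasses import dataclass


@dataclass
class BehaviorPolicy:
    """Hypothetical container, not part of any OpenAI API."""
    rules: frozenset   # fixed: cannot be overridden by developers or users
    defaults: dict     # standard behaviors: can be adjusted

    def resolve(self, overrides: dict) -> dict:
        """Apply developer or user overrides to the standard behaviors."""
        # Any attempt to touch a rule is rejected outright.
        locked = set(overrides) & self.rules
        if locked:
            raise PermissionError(f"rules cannot be overridden: {sorted(locked)}")
        # Only known standard behaviors may be adjusted.
        unknown = set(overrides) - set(self.defaults)
        if unknown:
            raise KeyError(f"unknown standard behaviors: {sorted(unknown)}")
        return {**self.defaults, **overrides}


# Illustrative names only, taken loosely from the article's examples.
policy = BehaviorPolicy(
    rules=frozenset({"obey_laws", "avoid_harmful_content"}),
    defaults={"ask_clarifying_questions": True, "express_uncertainty": True},
)

# A developer switches off one standard behavior; the rest stay in place.
settings = policy.resolve({"ask_clarifying_questions": False})
```

In this sketch, a call such as `policy.resolve({"obey_laws": False})` raises a `PermissionError`, which reflects the article's point that rules sit above developer and user instructions while standard behaviors do not.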

OpenAI: Model Spec will continue to evolve

The company sees the release as part of an ongoing public discussion about how models should behave, how desired model behavior is defined, and how the public can best be involved in these discussions. OpenAI now aims to involve representative stakeholders from around the world, such as policymakers, trusted institutions, and experts.

Over the next two weeks, OpenAI invites the general public to provide feedback on the objectives, rules, and standard behaviors in the Model Spec. Like the models themselves, the Model Spec will be continuously developed based on the feedback received.

The complete model spec is available in the documentation.
