OpenAI has introduced six security measures to complement existing security controls and help safeguard "advanced AI."
To be useful, OpenAI says, AI models must be accessible online, whether for services like ChatGPT or for research purposes. That makes them vulnerable to attack. According to OpenAI, AI models are already being targeted by malicious cyber actors.
When it comes to infrastructure, OpenAI is particularly focused on protecting model weights, which are the product of expensive AI training. So there's a lot of money in these weights, which makes it worthwhile for bad actors.
"The resulting model weights are sequences of numbers stored in a file or series of files. AI developers may wish to protect these files because they embody the power and potential of the algorithms, training data, and computing resources that went into them."
Since model weights are just files that could potentially be stolen, the AI computing infrastructure needs to be as secure as possible. To this end, OpenAI proposes six security measures:
- Trusted computing for AI accelerators using new encryption and hardware security technologies. The goal is to ensure GPU accelerators can be cryptographically verified and model weights stay encrypted until loaded onto the GPU. Additionally, model weights and inference data should only be decryptable by authorized GPUs.
- Network and client isolation assurances to reduce attack surfaces and data exfiltration. AI systems should be able to operate offline, isolated from untrusted networks. Also, strong tenant isolation should guarantee AI workloads can't be compromised by infrastructure provider vulnerabilities.
- Innovations in data center operational and physical security. This covers comprehensive access controls, 24/7 monitoring, bans on data storage media, and data destruction requirements. New approaches like remote-controlled "kill switches" or tamper-evident systems are also being explored.
- AI-specific audit and compliance programs. Existing security standards (SOC2, ISO/IEC, etc.) should be expanded to include AI-specific requirements.
- AI for cyber defense to assist defenders. AI can be incorporated into security workflows to take some load off security engineers. OpenAI uses custom models to analyze high-volume security telemetry.
- Resilience, redundancy, and ongoing security research to keep up with threats. Controls should deliver defense-in-depth and work together to maintain resilience even if individual controls fail.
OpenAI itself is working on implementing these measures, which it details in a blog post. The company also wants to encourage the AI and security community to participate in research and development. To support this, OpenAI is offering a $1 million grant program. OpenAI is also working with the U.S. military on cybersecurity.
When will OpenAI's "advanced AI" arrive?
The safety guidelines outlined here may provide a glimpse into the protections for OpenAI's next large language model, GPT-5. OpenAI CEO Sam Altman recently confirmed plans to release a new AI model this year. GPT-5 could mean a leap in performance similar to the jump from GPT-3 to GPT-4.
In addition to infrastructure security risks, there are also application-level dangers, most notably prompt injections, which can cause AI models to generate unwanted output, such as instructions on how to build a bomb. There is currently no foolproof protection against prompt injections, which have been a known problem since at least GPT-3.