OpenAI reportedly launching ChatGPT's first browser agent "Operator" this week

Jan 22, 2025

Midjourney prompted by THE DECODER

Key Points

OpenAI is set to introduce "Operator," an AI agent system capable of independently handling computer tasks such as coding and travel bookings, with plans for a research preview and API for developers in January.
OpenAI CEO Sam Altman believes that AI agents represent the next frontier for artificial intelligence, focusing on more intelligent use of existing language models.
In addition to OpenAI, companies like Anthropic, Microsoft, and Google are also competing in the race to develop automated AI assistants that can streamline entire work processes by connecting subtasks.

Update: January 22, 2025

According to a report from The Information, OpenAI plans to launch "Operator" as a new ChatGPT feature for browser control later this week.

The feature will offer several task categories, including food and events, delivery services, shopping, and travel planning. Each category will come with its own set of suggested prompts to help users get started.

When users enter a prompt, a mini-screen opens within the chatbot, showing a browser window and the agent's actions in real time. The agent asks follow-up questions as needed, such as the time and number of guests for a restaurant reservation.

ChatGPT users can take control of the screen while the operator is working, and they can save and share their operator tasks with other users. The system can also work with websites that require users to log in, although Google's Gmail is reportedly an exception to this feature.

Update: January 7, 2025

OpenAI might release its AI computer agent "Operator" this month, according to a new report from The Information that confirms earlier Bloomberg coverage from November. The launch has apparently been delayed by safety concerns around "prompt injections" - a security vulnerability where users can manipulate an AI system into ignoring its built-in rules and restrictions.

Despite being a known issue since at least GPT-3, there's still no reliable defense against prompt injection attacks. This risk is particularly concerning for autonomous AI agents since users have less direct oversight of what content the model processes.

OpenAI co-founder Wojciech Zaremba recently criticized Anthropic for releasing its AI agent without proper safeguards, saying that OpenAI would have faced "tons of hate" if it had tried the same thing.

Original Article: November 14, 2024

OpenAI is reportedly planning AI agent "Operator" for January launch

OpenAI will introduce an AI assistant called "Operator" in January that can perform computer tasks on its own, according to Bloomberg, citing two people familiar with the matter.

The sources say OpenAI executives announced in an internal meeting that the tool will first launch as a research preview and through an API for developers.

While designed as a general-purpose assistant, Operator will focus on browser-based tasks. The move aligns with broader industry efforts to automate complex workflows.

OpenAI CEO Sam Altman views AI agents as the next phase of AI growth. This shift may stem from slower progress in traditional language model development. Altman suggests the future lies in using existing models more effectively.

Industry-wide push for AI agents

Several major AI labs are developing similar AI assistants to automate multi-step tasks with minimal user supervision.

Anthropic has already launched an assistant that processes screen content and performs real-time actions. Microsoft has integrated automation features into its Copilot platform.

Google is developing its own solution called "Project Jarvis," a Chrome-based AI assistant designed to handle tasks like online shopping and travel booking. The company plans to launch it alongside its new Gemini language model in December.

There's no standard definition for agentic AI systems yet, but they function as small programs or prompts that handle individual subtasks and coordinate with other assistants. This can be within one language model or by bridging different language models and AI systems.

By connecting multiple assistants that reliably perform specific tasks, companies aim to automate entire workflows through seamless coordination.

OpenAI has taken a first public step in this direction by releasing "Project Swarm" on GitHub. This experimental open-source framework allows developers to create and manage multiple-assistant systems. It demonstrates how assistants can transfer control between each other and execute defined task steps with specific tools. According to OpenAI, Project Swarm serves as a practical demonstration of how their assistant concept works in real-world applications.

AI News Without the Hype – Curated by Humans

As a THE DECODER subscriber, you get ad-free reading, our weekly AI newsletter, the exclusive "AI Radar" Frontier Report 6× per year, access to comments, and our complete archive.

Source: Bloomberg | The Information