
OpenAI's new ChatGPT Operator is getting its first workout from US users who have early access to the tool.

While OpenAI's launch event showed basic uses like booking restaurants and planning trips, users are pushing the boundaries to see what this AI agent can really do.

Dan Mac shared a video of the operator scanning job listings against his uploaded resume. The agent found a position that matched his background, handling the task well despite working somewhat slowly.

Video: via Dan Mac

Software developer Kieran Klaassen found a different use case, testing the operator on local development environments.

Video: via Klaassen

Meanwhile, Alex Volkov spent 40 minutes putting the system through its paces. He liked how it could juggle multiple tasks and understand concepts like tweet quoting, but noticed some hiccups with cookie handling and task completion times. At one point, the operator seemed confused about its own capabilities, asking if it should keep monitoring a chat when nothing was happening.

Chris Koerner tested a more entrepreneurial approach. He had the operator automatically message Facebook Marketplace sellers offering piano pickup services for $200. After some initial hand-holding, the system started working independently, even logging its outreach efforts in Google Sheets.

Video: via Chris Koerner

Mixed results show promise and limitations of automated browsing

Not all tests went smoothly. One Reddit user tried getting the operator to compile information about 50 financial YouTubers, including LinkedIn profiles and email addresses. While the agent knew to open a web browser, it searched Bing instead of YouTube and struggled to find a suitable spreadsheet tool. After 20 minutes, the user gave up, left with an incomplete table on an unfamiliar Office website that listed just 18 influencers, with incorrect contact details.

Table view of a spreadsheet file with a list of financial influencers, their contact details and channel descriptions on various financial topics.
To create a list of influencers, the operator uses Microsoft's Bing search engine - but doesn't think to use Excel for a spreadsheet. | Image: u/No-Definition-2886/Reddit

Some users report that websites block the operator. A post on r/webdev claimed eBay prevented mass price collection, though this may be general bot protection rather than a block aimed specifically at the operator. The system appears to use a virtual Chrome browser running on Microsoft Azure servers, and there is not yet a specific robots.txt parameter that site owners can use to control its access.
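
To illustrate why a missing robots.txt parameter matters, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleBot` token, the `is_allowed` helper and the example.com URLs are all hypothetical, and the sketch assumes the agent identifies itself like an ordinary browser, which the reports above do not confirm. Rules in robots.txt are matched by user-agent name, so a visitor without a name of its own can only be covered by the catch-all `*` group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is blocked site-wide,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The named crawler matches its own group and is blocked everywhere.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://example.com/listing/1"))   # False

# A browser-like user agent (an assumption) has no group of its own,
# so only the catch-all "*" rules apply to it.
print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", "https://example.com/listing/1"))  # True
print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", "https://example.com/private/x"))  # False
```

Note that robots.txt is advisory: it only has an effect if the visiting software reads and honors it, which is why sites such as eBay also rely on separate bot protection.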

Reddit seems to have similar protections in place, but user Rowan Cheung showed how the operator found a workaround using Bing search results instead.

Video: via Rowan Cheung

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

According to these reports, ChatGPT Operator seems to fulfill its basic premise of navigating the Internet autonomously. It probably works better than previous approaches in part because the system not only accesses the DOM of a web page, but also evaluates screenshots using the multimodal GPT-4o.

As with older agent systems, testers are initially impressed by the autonomy. However, it still makes too many mistakes for human users to rely on it for important tasks without constant supervision.

Summary
  • Early users are putting OpenAI's new ChatGPT Operator through its paces, showcasing its potential for job hunting, testing web applications, and handling several tasks at once.
  • But when it comes to more complex tasks, like digging up data on influencers, the operator stumbles. It looks in the wrong places, struggles with spreadsheets, and even "hallucinates" information. Plus, it's slow.
  • While ChatGPT Operator is better at navigating the web on its own than previous attempts, it's still too error-prone to be trusted with important tasks without constant human supervision.
Jonathan works as a freelance tech journalist for THE DECODER, focusing on AI tools and how GenAI can be used in everyday work.