
Hugging Face has released a new open-source AI agent designed to handle basic computer tasks. For now, though, it feels more like a shaky tech demo than a useful assistant.


The Open Computer Agent, which runs in a web browser, interacts with applications like Firefox on a Linux-based virtual machine—meaning it can browse the web and perform simple searches.

But even Hugging Face admits there are major limitations. The agent responds slowly, frequently gets tripped up by CAPTCHAs, and often needs a full restart to work again. By default, the agent logs requests to help improve the technology, though users can turn this off.

Agent struggles with even simple demo tasks

Tests by THE DECODER show just how rough things are. When prompted to complete Hugging Face's own demo task, finding the company's headquarters on Google Maps, the agent instead searches for a "3d printing supply store," missing the mark entirely. A plain Google search does better and returns the address directly: 20 Jay St Suite 620, Brooklyn, New York, USA.

Screenshot: User interface of the Open Computer Agent with an input field for tasks and a view of the agent's activity in Google Maps.
The Open Computer Agent runs on a Linux virtual machine, where it can operate programs such as Firefox. | Image: Screenshot/THE DECODER

At least the project looks good. Hugging Face put extra effort into the design, giving the interactive Linux interface a shiny, retro-futuristic frame. The style seems inspired by Apple's hit series "Severance," complete with a toggle labeled "Innie/Outie" to switch the effect on or off.

In a demo by Hugging Face employee Aymeric Roucher, the computer agent answers the question of how long Alexander's soldiers had walked from their departure in Macedonia to India when they decided they were too tired to go any further. | Video: Aymeric Roucher/Hugging Face

The agent is built on "smolagents," a minimalist framework for AI agents that Hugging Face introduced in December 2024. This open-source library lets developers create agents with very little code, allowing the AI to write Python code directly instead of using traditional JSON commands. The idea is to streamline workflows and make agents more efficient.
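To make the difference between code actions and JSON commands concrete, here is a minimal sketch in plain Python. It does not use smolagents and is not how the library is implemented; the tools `search` and `first`, the `$prev` placeholder, and both runner functions are hypothetical names invented for this illustration.

```python
import json

# Hypothetical tools, invented for this illustration only.
def search(query):
    """Pretend web search that returns canned results."""
    return [f"result for {query!r} #{i}" for i in range(3)]

def first(items):
    """Return the first element of a list."""
    return items[0]

TOOLS = {"search": search, "first": first}

def run_json_actions(raw):
    """JSON style: one tool call per step; the runner wires results together."""
    result = None
    for step in json.loads(raw):
        # "$prev" is an assumed placeholder meaning "output of the previous step".
        args = {k: (result if v == "$prev" else v) for k, v in step["args"].items()}
        result = TOOLS[step["tool"]](**args)
    return result

def run_code_action(source):
    """Code style: the model writes Python that composes the tools itself."""
    # Emptying __builtins__ limits what the snippet can reach, but exec is
    # NOT a real sandbox; a production agent needs proper isolation.
    namespace = {"__builtins__": {}, **TOOLS}
    exec(source, namespace)
    return namespace["answer"]

json_plan = json.dumps([
    {"tool": "search", "args": {"query": "Hugging Face headquarters"}},
    {"tool": "first", "args": {"items": "$prev"}},
])
code_plan = 'answer = first(search("Hugging Face headquarters"))'

print(run_json_actions(json_plan))
print(run_code_action(code_plan))
```

Both plans produce the same answer, but the code version nests the two calls in a single line, while the JSON version needs an extra convention for passing results between steps.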

Under the hood, the agent also uses Alibaba's Qwen-VL vision model, which can locate elements in images and interact with user interfaces. In benchmarks, the latest Qwen2.5-VL-32B model (released in March 2025) even outperformed larger models like Qwen2-VL-72B, showing particular strength at analyzing complex visual information.
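Once a vision model has located an element in a screenshot, an agent still has to turn that location into a click on the real screen. The sketch below shows one way to do this, assuming the model reports a bounding box in pixels of a resized screenshot. The `Box` type, the `click_point` function, and the resolutions are assumptions made for illustration, not the output format of Qwen-VL or the Open Computer Agent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates of the image the model saw."""
    left: float
    top: float
    right: float
    bottom: float

def click_point(box, model_size, screen_size):
    """Map the center of a box from model-image pixels to screen pixels."""
    model_w, model_h = model_size
    screen_w, screen_h = screen_size
    # Center of the detected element in the model's coordinate system.
    cx = (box.left + box.right) / 2
    cy = (box.top + box.bottom) / 2
    # Rescale to the real screen resolution.
    x = round(cx * screen_w / model_w)
    y = round(cy * screen_h / model_h)
    # Clamp so a slightly off prediction never clicks outside the screen.
    return (min(max(x, 0), screen_w - 1), min(max(y, 0), screen_h - 1))

# Hypothetical example: the model saw a 960x540 screenshot of a 1920x1080 desktop.
search_field = Box(left=100, top=50, right=300, bottom=150)
print(click_point(search_field, (960, 540), (1920, 1080)))
```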

More tech demo than practical tool

The launch of the Open Computer Agent—inspired by OpenAI's experimental ChatGPT Operator—is the latest in a series of open-source efforts from Hugging Face that follow the lead of commercial solutions. Back in February, the company unveiled Open Deep Research, a competitor to OpenAI's Deep Research, built in just 24 hours.


Despite rising interest from businesses—KPMG reports that 65 percent of companies are already experimenting with AI agents—the state of the Open Computer Agent shows how early things still are. Agents that use computers like humans remain stuck in the experimental phase. For developers and researchers, it's an interesting playground, but it's nowhere near ready for everyday use.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Hugging Face has introduced the Open Computer Agent, which is designed to perform basic PC tasks via a web browser. However, tests show clear technical weaknesses - the agent is slow to respond and fails even at simple demo tasks.
  • The tool is based on the "smolagents" framework and uses Alibaba's Qwen-VL vision model. It can interact with a Linux-based virtual machine and programs like Firefox.
  • The release is more of an experimental tech demo than a fully developed product. For developers, the agent provides a testing platform, but it is not yet ready for everyday use, and it is questionable whether it ever will be with this approach.
Jonathan writes for THE DECODER about how AI tools can make our work and creative lives better.