A team at Hugging Face, led by chief researcher Thomas Wolf, has created an open-source version of OpenAI's Deep Research system in 24 hours.

According to the Hugging Face blog, the team aims to make the proprietary technology accessible to everyone by replicating the agent framework behind OpenAI's Deep Research. Their agent writes its actions directly as program code instead of JSON. This approach reduces the number of steps by about 30 percent, which lowers costs and improves performance compared to JSON-based agents.

Comparison of two LLM agent implementations: text/JSON vs. code-based approach with APIs for country price comparison of a smartphone.
When calculating the price of a smartphone in different countries, the JSON-based solution requires separate actions for each step (get exchange rate, look up price, calculate taxes). The Code Agent, by contrast, can perform the entire calculation in a single loop. | Image: via Hugging Face
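
The difference described in the figure can be illustrated with a minimal Python sketch. This is not Hugging Face's implementation: the three tool functions, the country list, and all numbers below are hypothetical placeholders chosen only to show how a JSON-style agent emits one action per tool call, while a code agent can cover every country in a single code action.

```python
# Hypothetical placeholder data; none of these values are real prices or rates.
LOCAL_PRICES = {"USA": 1000.0, "Japan": 150000.0, "Germany": 900.0}
RATES_TO_USD = {"USA": 1.0, "Japan": 0.01, "Germany": 1.1}
TAX_RATES = {"USA": 0.10, "Japan": 0.10, "Germany": 0.20}


# Hypothetical tools an agent could call (stand-ins for real APIs).
def lookup_phone_price(country: str) -> float:
    return LOCAL_PRICES[country]


def get_exchange_rate(country: str) -> float:
    return RATES_TO_USD[country]


def get_tax_rate(country: str) -> float:
    return TAX_RATES[country]


def json_agent_actions(countries: list[str]) -> list[dict]:
    """List the separate JSON-style actions needed: one per tool call."""
    tools = ("lookup_phone_price", "get_exchange_rate", "get_tax_rate")
    return [
        {"tool": tool, "arguments": {"country": country}}
        for country in countries
        for tool in tools
    ]


def code_agent_action(countries: list[str]) -> dict[str, float]:
    """A single code action: loop over all countries and combine the tools."""
    final_prices = {}
    for country in countries:
        price = lookup_phone_price(country) * get_exchange_rate(country)
        final_prices[country] = round(price * (1 + get_tax_rate(country)), 2)
    return final_prices


if __name__ == "__main__":
    countries = ["USA", "Japan", "Germany"]
    print(len(json_agent_actions(countries)), "separate JSON actions")
    print("1 code action:", code_agent_action(countries))
```

In this toy setup, the JSON-style agent needs one action per tool call and country, while the code agent expresses the same work as one action. The roughly 30 percent reduction in steps reported by the team comes from their own evaluation, not from this sketch.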

For the actual implementation, the team borrowed two key pieces from Microsoft's Magentic-One agent framework: a text-based web browser for searching and a text inspector that can read various file formats.

Testing the system's research capabilities

The team evaluated their system using the GAIA benchmark, which tests how AI agents handle complex research tasks. One example asks: "Which of the fruits shown in the 2008 painting 'Embroidery from Uzbekistan' were served as part of the October 1949 breakfast menu for the ocean liner that was later used as a floating prop for the film 'The Last Voyage'? Give the items as a comma-separated list, ordering them in clockwise order based on their arrangement in the painting starting from the 12 o'clock position. Use the plural form of each fruit."

To solve this puzzle, the AI agent needs to:

  • Identify the fruit in the painting through image processing
  • Determine which ocean liner appeared in the movie
  • Locate its breakfast menu from 1949
  • Present the information in the required format

Hugging Face's system scored 55.15 percent on these multi-step challenges. That's better than Microsoft Magentic-One's 46 percent, but still trails OpenAI's 67 percent with Deep Research.

The team acknowledges they still have work ahead to match OpenAI's Deep Research, particularly in improving browser interactions. One key difference: Hugging Face relies on available open-source language models, while OpenAI uses its own o3 model, specifically trained for web tasks using reinforcement learning.

Still, Hugging Face's results on the GAIA benchmark, coming on the heels of OpenAI's Deep Research release, suggest the gap between open-source and proprietary AI may be closing faster than expected. After the DeepSeek dilemma, this is another indication that proprietary AI may not be the strongest business model.

The team's next step is to develop GUI agents that can interact directly with screens, mice, and keyboards. The code is available on GitHub, and a live demo is also online. Other developers have created their own open-source versions, including dzhng, assafelovic, and Jina AI. Hugging Face plans to analyze and document these different approaches.

Summary
  • Hugging Face, the open-source AI company, is working on making OpenAI's proprietary Deep Research technology accessible to the public. The project is spearheaded by co-founder Thomas Wolf.
  • The Hugging Face system achieved a 55.15 percent success rate on the GAIA benchmark, which tests AI systems on complex, multi-step tasks. This places their performance between Microsoft's Magentic-One (around 46 percent) and OpenAI Deep Research (approximately 67 percent).
  • A significant improvement in Hugging Face's approach was the use of a code agent, which expresses actions using program code instead of JSON. The team plans to explore GUI agents in the future, but recognizes that significant work is needed to achieve full parity with OpenAI's system.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.