Matthias is the co-founder and publisher of THE DECODER, exploring how AI is fundamentally changing the relationship between humans and computers.

Update April 1, 2025:

Amazon has released additional examples demonstrating Nova Act's capabilities, claiming the system operates more reliably than existing solutions. According to Amazon, the system breaks down complex workflows into discrete commands like searching, making payments, or answering questions about screen content. Developers can add custom instructions, call APIs, and interact directly with browsers through the Playwright library.
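To illustrate the idea of splitting one workflow into discrete commands, here is a minimal Python sketch. It does not use the Nova Act SDK: `StubAgent`, `StepResult`, `run_workflow`, and the example commands are hypothetical names invented for this illustration, and the agent only records commands instead of driving a browser.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StepResult:
    """Outcome of one discrete command."""
    command: str
    ok: bool
    detail: str = ""


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent (not the Nova Act API).

    It only records the commands it receives. `fail_on` simulates
    commands that the agent cannot complete.
    """
    fail_on: frozenset = frozenset()
    log: List[str] = field(default_factory=list)

    def act(self, command: str) -> StepResult:
        self.log.append(command)
        if command in self.fail_on:
            return StepResult(command, ok=False, detail="simulated failure")
        return StepResult(command, ok=True)


def run_workflow(agent: StubAgent, commands: List[str]) -> List[StepResult]:
    """Run commands in order and stop at the first failure."""
    results = []
    for command in commands:
        result = agent.act(command)
        results.append(result)
        if not result.ok:
            break  # later steps depend on earlier ones, so do not continue
    return results


# One larger task, written as small commands instead of a single prompt.
ORDER_COFFEE_MAKER = [
    "search for a coffee maker",
    "select the first result",
    "add the item to the cart",
    "answer: what is the total price shown on screen?",
]

if __name__ == "__main__":
    agent = StubAgent(fail_on=frozenset({"add the item to the cart"}))
    for r in run_workflow(agent, ORDER_COFFEE_MAKER):
        print("ok  " if r.ok else "FAIL", r.command)
```

Keeping each command small makes it visible which step failed, so a developer can tighten the instruction for that step or replace it with a direct API or browser call, rather than rerunning one large prompt.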

Amazon reports that in internal testing, Nova Act achieved a success rate of more than 90 percent on UI interactions like date selection and popup handling. According to the company, the system also outperformed comparable models from Anthropic and OpenAI on established benchmarks like ScreenSpot and GroundUI Web.

Amazon says Nova Act works effectively even in completely unfamiliar environments like browser games, despite not being specifically trained for them. The company says it has integrated the model into its Alexa+ voice assistant.

Amazon views Nova Act as an initial step toward more sophisticated AI agents. Rather than relying solely on supervised fine-tuning, the company plans to emphasize reinforcement learning across various environments. This approach mirrors OpenAI's Computer-Using Agent (CUA), which also used reinforcement learning on web data for training.

The company's long-term vision involves creating AI agents capable of handling multi-step tasks like wedding planning or complex IT operations independently. A demonstration shows Nova Act setting up an out-of-office message in Outlook.

Video: Amazon

Currently, these agents require significant human supervision. Companies developing these systems aim to make them more reliable, faster, and able to run in parallel so they can handle many office tasks automatically.

Original article from March 31, 2025:

Amazon launches AI agent toolkit with Nova Act SDK

Amazon has released Nova Act, a new AI agent development system, along with a web service to access its existing AI models.

U.S. developers and customers can now access a preview version of the Nova Act SDK. The accompanying website, nova.amazon.com, provides access to Amazon's language models Nova Micro, Lite, and Pro, as well as models for image generation (Nova Canvas) and video creation (Nova Reel). The models are already available through Amazon Bedrock, but the new website aims to make them more accessible.

"Nova.amazon.com puts the power of Amazon's frontier intelligence into the hands of every developer and tech enthusiast, making it easier than ever to explore the capabilities of Amazon Nova," says Rohit Prasad, SVP of Amazon Artificial General Intelligence.

The Nova Act SDK allows developers to build AI agents that can navigate browsers and perform actions, similar to OpenAI's Operator. According to Amazon, Nova Act helps developers break down complex processes into manageable commands for tasks like web searching, payment processing, and question answering. The platform includes features for adding detailed instructions to improve task reliability.

Video: via Amazon

"We think of agents as systems that can complete tasks and act in a range of digital and physical environments on behalf of the user. Today, such agents are still in an early stage," Amazon writes.

The release represents Amazon's entry into the growing field of AI agents capable of performing tasks across digital environments. Some industry observers consider this technology a potential next growth frontier for AI, since agents that operate computers and execute tasks faster than humans could automate various white-collar jobs.