Adept is working on universal language control for computer software. The goal: one day, we will control computers by language alone.
In April 2022, Adept AI Labs introduced itself publicly for the first time. The startup is funded with around 65 million US dollars, and its team includes several former researchers from DeepMind, Meta and Google who have been working together since around December 2021.
Adept co-founder Ashish Vaswani was the lead author of the research paper that introduced the Transformer, a neural network architecture built around an attention mechanism that has laid the foundation for many advances in computational linguistics in recent years.
Adept's goal is to develop an AI assistant that works with and for humans on computers and learns through human feedback. Natural language processing will serve as an interface to one day operate any software imaginable through words, the startup says.
"We believe the clearest framing of general intelligence is a system that can do anything a human can do in front of a computer," the team writes. According to the team, the next era of computing will be defined by direct language input rather than by manual operation.
ACT-1: Universal transformer for language-controlled software
Now the startup is showing the first demo: Adept confidently announces the Transformer-based AI model ACT-1 as the "next frontier of models that can take actions in the digital world."
For a demo, Adept trained ACT-1 to operate a traditional browser using text input. The model is integrated via a Chrome extension. The video below shows how it navigates a real estate site based on a text prompt to find a home for a family of four in Houston with a budget of up to $600,000.
1/7 We built a new model! It’s called Action Transformer (ACT-1) and we taught it to use a bunch of software tools. In this first video, the user simply types a high-level request and ACT-1 does the rest. Read on to see more examples ⬇️ pic.twitter.com/mq7c0Vyd7N
— Adept (@AdeptAILabs) September 14, 2022
In other demos, Adept shows how the AI model operates Salesforce on the web and Excel, and how it autonomously searches Wikipedia to answer user questions. The model can also link actions across websites and programs: for example, the AI searches for a refrigerator under $1,000 on Craigslist and contacts the seller via Gmail.
4/7 The model can also complete tasks that require composing multiple tools together; most things we do on a computer span multiple programs. In the future, we expect ACT-1 to be even more helpful by asking for clarifications about what we want. pic.twitter.com/fEyFATqcvx
— Adept (@AdeptAILabs) September 14, 2022
One of the most interesting features of ACT-1 is that the system can learn and improve its actions based on human feedback. This mechanism is what gives it the flexibility it needs to be a useful digital assistant across many tasks.
The following video shows ACT-1 creating a new column in Excel at the user's request, but the column contains an error. The user then types a hint about the correct column formula, and the model picks up the correction from the text and fixes the column.
According to Adept, these demos only scratch the surface of an Action Transformer's capabilities. The startup says it is "making great progress towards Adept being able to do arbitrary things on a computer."
Talk to your computer as if there had never been a mouse
Adept predicts that "in a few years" most computer interactions will be through natural language rather than graphical user interfaces.
"We’ll tell our computer what to do, and it’ll do it," the startup writes. Today's input methods would then seem outdated by comparison.
The language-based interface would also allow many more people to use software more effectively without having to go through training first. Documentation and instructions would be processed by AI models instead of humans. The resulting efficiencies could accelerate human progress in all areas, Adept believes.