
Amazon is creating a new multimodal AI system called Olympus that focuses on image and video analysis.


The visual focus is meant to make up for the company's weaknesses in text processing and complex problem solving, where its models lag those of OpenAI and Anthropic, according to a report from The Information citing multiple sources familiar with the project.

Sources say Olympus goes far beyond basic video AI capabilities. The system can track movement in sports footage with high accuracy, such as the exact moment a basketball leaves a player's hand, and estimate the ball's trajectory, replacing work typically done by human analysts.

It's also capable of searching video archives with high accuracy. This capability could appeal to sports analytics firms and media companies managing large video libraries. Industrial applications could include AI-powered inspections of underwater drilling equipment to find corrosion or leaks.


Combining text with visual models

Late last year, Amazon's AI chief Rohit Prasad laid out plans for four large text models, sources told The Information. Two notable models in development are one with 400 billion parameters and another with two trillion parameters, which would make it larger than the original GPT-4. While parameter count matters less now due to new efficiency techniques, it still provides a rough measure of model capability.

At the time, Amazon planned to pair those text models with a smaller visual processing model. The exact technical details behind Olympus remain under wraps for now, though the company may reveal more at next week's AWS re:Invent cloud conference.

Boosting AWS cloud appeal

It makes sense for Amazon to keep investing in its own models, even after putting $8 billion into Anthropic. Major cloud providers need exclusive AI offerings alongside standard options to attract customers. As Microsoft works with OpenAI and Google builds Gemini, Amazon wants its own edge in the market.

Like Google, Microsoft, and Meta, Amazon is developing its own AI chips to reduce its exposure to Nvidia's pricing power and to shortages in the AI chip market. With its own models and those from Anthropic, Amazon would have clear use cases for these custom chips.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • Amazon is developing a new multimodal AI system called Olympus that excels at video analysis, aiming to compensate for its limitations in text processing compared to competitors.
  • Olympus can perform advanced tasks like precisely tracking basketball movements in sports footage, potentially appealing to sports analytics firms, media companies, and industrial applications.
  • Amazon is reportedly working on large text models with up to two trillion parameters, which it plans to combine with a smaller visual processing model to create Olympus, though the exact technical details remain undisclosed.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.