Adept recently introduced Fuyu-Heavy, a new multimodal AI model for digital agents. According to the company, Fuyu-Heavy is the third most capable multimodal model, behind GPT-4V and Gemini Ultra, and excels at multimodal reasoning and UI understanding. It performs well on traditional multimodal benchmarks and matches or exceeds comparable models in its class on standard text-based benchmarks. The model scores similarly to Claude 2.0 on chat evaluations and slightly better than Gemini Pro on the MMMU benchmark. Fuyu-Heavy will soon power Adept's enterprise product, and lessons learned from its development have already been applied to its successor. The following video demonstrates the model's ability to understand a user interface.