According to Bloomberg reporter Mark Gurman, Apple is working to bring Apple Intelligence to the Vision Pro headset. One challenge is optimizing the features for mixed reality. The AI features will not arrive on the Vision Pro until next year; Apple Intelligence launches on all other supported devices in the fall. By then, Gurman expects a deal with Google or Anthropic to support additional AI models. Longer term, he speculates, the company may be planning a monthly subscription service such as "Apple Intelligence+" that offers additional features to monetize the technology. Apple already takes a cut of subscription revenue from any AI partner it brings on board. "The company will be less reliant on hardware tweaks to drive its business and will actually be making money from AI — something everyone in Silicon Valley is hoping to pull off," Gurman says.


LMSYS Org has added image recognition to the Chatbot Arena to compare vision language models (VLMs) from OpenAI, Anthropic, Google, and other AI vendors. Within two weeks, the arena collected more than 17,000 user preference votes in more than 60 languages. GPT-4o and Claude 3.5 Sonnet performed significantly better at image recognition than Gemini 1.5 Pro and GPT-4 Turbo. While Claude 3 Opus outperforms Gemini 1.5 Flash as a language model, the two perform similarly as VLMs. The open-source model Llava-v1.6-34b is slightly better than Claude-3-Haiku. The data collected shows common applications such as image description, math problems, document comprehension, meme explanation, and story writing. Next, the team plans to add support for multiple images, as well as PDFs, video, and audio. The Large Model Systems Organization (LMSYS Org) is an open research organization founded by UC Berkeley students and faculty in collaboration with UCSD and CMU.
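To illustrate how arena-style evaluations turn raw "which answer is better?" votes into a leaderboard, here is a minimal sketch using Elo-style rating updates, one common approach for pairwise preference data (the Chatbot Arena itself uses a Bradley-Terry-based variant; the model names and votes below are made up for illustration).

```python
def update_elo(ratings, winner, loser, k=32):
    """Update two models' ratings in place after one pairwise vote."""
    ra, rb = ratings[winner], ratings[loser]
    # Expected score of the winner under the Elo logistic model
    expected_a = 1 / (1 + 10 ** ((rb - ra) / 400))
    ratings[winner] = ra + k * (1 - expected_a)
    ratings[loser] = rb - k * (1 - expected_a)

# Hypothetical votes: each tuple is (preferred model, rejected model)
votes = [
    ("gpt-4o", "gemini-1.5-pro"),
    ("claude-3.5-sonnet", "gpt-4-turbo"),
    ("gpt-4o", "claude-3.5-sonnet"),
    ("claude-3.5-sonnet", "gemini-1.5-pro"),
]

# Every model starts at a neutral 1000 rating
ratings = {m: 1000.0 for pair in votes for m in pair}
for winner, loser in votes:
    update_elo(ratings, winner, loser)

leaderboard = sorted(ratings.items(), key=lambda x: -x[1])
for model, rating in leaderboard:
    print(f"{model}: {rating:.0f}")
```

Because each update adds to the winner exactly what it subtracts from the loser, the total rating mass is conserved; rankings stabilize as votes accumulate across many model pairs.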

Image: LMSYS

At I/O Connect in Berlin, Google unveiled its Gemma 2 model series, which will be available to researchers and developers through Vertex AI starting next month. The model comes in two versions, with 9 and 27 billion parameters. The open-source models are said to be more efficient and secure than their predecessors and to outperform models twice their size in overall performance. At Google I/O in May, the company showed the first benchmarks of the large version, which kept up with Meta's 70-billion-parameter Llama 3. According to Google, the new models are aimed at developers who want to integrate AI into their apps or into edge devices such as smartphones, IoT devices, and PCs.
