Short

LMSYS Org has added image recognition to the Chatbot Arena, allowing it to compare vision language models (VLMs) from OpenAI, Anthropic, Google, and other AI vendors. Within two weeks, more than 17,000 user preference votes were collected across more than 60 languages. GPT-4o and Claude 3.5 Sonnet performed significantly better at image recognition than Gemini 1.5 Pro and GPT-4 Turbo. And while Claude 3 Opus clearly outperforms Gemini 1.5 Flash on the language leaderboard, the two are roughly on par as VLMs. The open-source model LLaVA-v1.6-34B scores slightly higher than Claude 3 Haiku. The collected data shows common use cases such as image description, math problems, document understanding, meme explanation, and story writing. Next, the team plans to add support for multiple images, as well as PDFs, video, and audio. The Large Model Systems Organization (LMSYS Org) is an open research organization founded by students and faculty at UC Berkeley in collaboration with UCSD and CMU.

Image: LMSYS
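The Arena derives its rankings from exactly these pairwise human votes. The following is a minimal, illustrative sketch of an Elo-style rating update over such votes; the vote list and the constants K and BASE are assumptions for the example, not LMSYS's actual pipeline.

```python
from collections import defaultdict

# Illustrative Elo-style update over pairwise preference votes.
# Constants and sample votes are made up for the example.
K = 32           # update step size per vote
BASE = 1000.0    # initial rating for every model

def expected_score(r_a, r_b):
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update_ratings(votes):
    """votes: iterable of (winner, loser) pairs from user preferences."""
    ratings = defaultdict(lambda: BASE)
    for winner, loser in votes:
        e_w = expected_score(ratings[winner], ratings[loser])
        ratings[winner] += K * (1.0 - e_w)
        ratings[loser] -= K * (1.0 - e_w)
    return dict(ratings)

if __name__ == "__main__":
    sample_votes = [
        ("gpt-4o", "gemini-1.5-pro"),
        ("claude-3-5-sonnet", "gpt-4-turbo"),
        ("gpt-4o", "claude-3-5-sonnet"),
    ]
    ranked = sorted(update_ratings(sample_votes).items(),
                    key=lambda kv: kv[1], reverse=True)
    for model, rating in ranked:
        print(f"{model}: {rating:.1f}")
```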
Short

At I/O Connect in Berlin, Google unveiled its Gemma 2 model series, which will be available to researchers and developers through Vertex AI starting next month. The models come in two sizes, with 9 and 27 billion parameters. According to Google, the open models are more efficient and safer than their predecessors and outperform models twice their size in overall performance. At Google I/O in May, the company showed the first benchmarks of the larger version, which kept pace with Meta's 70-billion-parameter Llama 3. Google says the new models are aimed at developers who want to integrate AI into their apps or run it on edge devices such as smartphones, IoT devices, and PCs.
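For developers going the open-weights route rather than the Vertex AI path mentioned above, a minimal sketch of loading such a model with the Hugging Face transformers library might look like the following. The checkpoint ID "google/gemma-2-9b" is an assumption for illustration.

```python
# Minimal sketch: load an open-weight model locally with Hugging Face transformers.
# The model ID below is an assumption for illustration purposes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-2-9b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to reduce memory use
    device_map="auto",           # place layers on available devices
)

prompt = "Explain what an open-weight language model is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```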

Short

OpenAI is pushing back the launch of ChatGPT's advanced voice capabilities. Originally scheduled for late June with a small test group of ChatGPT Plus subscribers, the rollout has been delayed by a month for safety reasons. The company says it is still refining the system, including improving its ability to detect and refuse inappropriate content. OpenAI says it is also improving the user experience and the underlying infrastructure. The rollout will be iterative, starting in late July, with all Plus users gaining access in the fall; the exact timeline depends on meeting the company's safety and reliability standards. New video and screen-sharing features are being developed and will be released separately.
