
According to internal documents obtained by TechCrunch, Google has been benchmarking its Gemini AI model against Anthropic's Claude. Google contractors are given up to 30 minutes per prompt to evaluate which model produces the better output, judging criteria such as truthfulness and comprehensiveness. According to TechCrunch, Claude tends to give more safety-conscious answers than Gemini. A Google DeepMind spokesperson confirmed that the company compares results across models, but stressed that it does not use Anthropic's models to train or directly improve Gemini, which would violate Anthropic's terms of service. This kind of competitive benchmarking is common in the AI industry, where companies regularly test their models against rivals to see where they stand. Google is also an investor in Anthropic.


OpenAI CEO Sam Altman recently took to X to ask users what they want from the company in 2025. The most requested features were AGI, AI agents, a "much better 4o upgrade," better memory capabilities, and longer context windows. Users also called for an "adult mode" that would dial back OpenAI's content moderation, along with enhanced research capabilities and improvements to the company's Sora video generator. One of these wishes, better memory, is already in testing, though OpenAI hasn't revealed any details about the underlying technology. Interestingly, Altman noted that many of the features OpenAI actually plans to release in 2025 barely made it onto users' wishlists, or didn't show up at all.
