Anthropic's Claude 3 beats OpenAI's GPT-4. Right? In the benchmarks published by the company, the largest model, Opus, beats GPT-4, but a closer look reveals that it is complicated: Anthropic tested its latest model against the first version of GPT-4, not newer versions like GPT-4 Turbo. The reason: OpenAI has so far only published benchmarks for the old GPT-4 model, which can only be accessed via the API. However, there are GPT-4 Turbo results for some benchmarks that do not come directly from OpenAI. AI researcher Lawrence Chan has compiled them. A look at these numbers makes it clear: In every benchmark where Claude 3 and GPT-4 Turbo were compared, the OpenAI model still beats the best model from Anthropic - even if only by a few percentage points. However, the models are so close that the question of which model is better depends very much on the task at hand - and is mostly a matter of taste.
Ad
Support our independent, free-access reporting. Any contribution helps and secures our future. Support now:
Sources
News, tests and reports about VR, AR and MIXED Reality.
What happens next with MIXED
My personal farewell to MIXED
Meta and Anduril are now jointly developing XR headsets for the US military
MIXED-NEWS.com
Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.