Inflection claims that its new language model, Inflection-2, outperforms direct competitors such as Google PaLM-2 and Claude 2, and is second only to GPT-4.

The new model is said to be significantly more powerful than its predecessor, Inflection-1, and, according to the startup, demonstrates improved factual knowledge, better style control, and significantly improved reasoning.

Inflection-1 was released in July and was roughly on par with GPT-3.5 and PaLM-540B. Inflection-2 is now meant to catch up with GPT-4, the company says.

Inflection-2 Outperforms Claude 2 and PaLM 2 Large on Benchmarks

Inflection-2 was trained on 5,000 Nvidia H100 GPUs in fp8 mixed precision, using about 10²⁵ FLOPs of compute. According to Inflection, this puts it in the same training compute class as Google's flagship PaLM 2 Large, which will soon be replaced by Gemini.
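To put the compute figure in perspective, here is a rough back-of-the-envelope sketch of how long such a training run might take. Only the 10²⁵ FLOPs and the 5,000 GPUs come from Inflection. The per-GPU peak of about 2 PFLOP/s for dense fp8 and the utilization values are assumptions made for illustration, not numbers Inflection has published, and the function name is hypothetical.

```python
def training_days(total_flops, num_gpus, peak_flops_per_gpu, utilization):
    """Estimate wall-clock training time in days for a given compute budget."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Sustained throughput of the whole cluster in FLOP/s.
    sustained_flops_per_s = num_gpus * peak_flops_per_gpu * utilization
    seconds = total_flops / sustained_flops_per_s
    return seconds / 86_400  # seconds per day


TOTAL_FLOPS = 1e25    # from the article
NUM_GPUS = 5_000      # from the article
PEAK_FP8 = 2e15       # ASSUMPTION: ~2 PFLOP/s dense fp8 peak per H100

if __name__ == "__main__":
    # ASSUMPTION: real-world utilization is unknown, so try a range.
    for util in (0.2, 0.3, 0.4, 0.5):
        days = training_days(TOTAL_FLOPS, NUM_GPUS, PEAK_FP8, util)
        print(f"utilization {util:.0%}: ~{days:.0f} days")
```

Under these assumptions, the estimate lands somewhere between a few weeks and about two months. The actual duration depends on details Inflection has not disclosed.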

However, Inflection-2 outperforms PaLM 2 Large on most standard AI performance tests, including the widely used MMLU benchmark, which covers a broad range of tasks from high school to professional level, and other benchmarks such as TriviaQA, HellaSwag, and the math test GSM8k.

Comparison of Inflection-1, Google's PaLM 2-Large and Inflection-2 for a number of commonly used academic benchmarks. (N-values in parentheses) | Image: Inflection

On 10-shot HellaSwag, Inflection-2 scored 89.0, approaching GPT-4's score of 95.3. In addition, Inflection says its latest LLM outperforms Claude 2 even when Claude 2 uses chain-of-thought reasoning, i.e., an already optimized prompting method.

Final results of the MMLU language comprehension test. As always, benchmark results and real-world use may differ. | Image: Inflection

Inflection-2 falls well short of GPT-4 on coding and math tasks, but performs better than Meta's Llama 2, for example. Inflection-2 is not optimized for coding, Inflection writes, so there is room for improvement in future models.

Inflection-2 in coding and math benchmarks compared to the competition. | Image: Inflection

Pi chatbot will soon run on Inflection-2

Inflection-2 will soon power the company's Pi chatbot. The infrastructure is being upgraded from Nvidia A100 to H100 GPUs, which should speed up inference, i.e., the processing of input by the AI model. Despite being several times larger (175 billion parameters), Inflection-2 should be cheaper and faster than Inflection-1.

Inflection is already planning to train even larger models on the full capacity of the 22,000-GPU cluster. The next AI model will be about ten times larger and will be released in about six months, the company says. You can test Pi at Pi.ai/talk.

In terms of safety and responsibility, Inflection has voluntarily signed on to the White House's July 2023 commitments.

Inflection has some big names on board

Inflection launched publicly in March 2022. Founded by LinkedIn co-founder Reid Hoffman, DeepMind co-founder Mustafa Suleyman, and former DeepMind researcher Karén Simonyan, the AI startup focuses on using natural language as a personal interface to computers.

In May 2022, Inflection AI closed a $225 million investment round, and in June 2023, the company announced another investment round in which Microsoft, Reid Hoffman, Bill Gates, Eric Schmidt, and Nvidia invested a total of $1.3 billion. At the time, the company was valued at $4 billion.

Since the startup's announcement, AI researchers such as Heinrich Kuttler of Meta AI and Maarten Bosma and Rewon Child, formerly of Google Brain, are said to have joined Inflection AI. Former DeepMind and Google product manager Joe Fenton is helping Inflection AI develop its products and business model.

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.
Summary
  • AI startup Inflection claims its new language model, Inflection-2, outperforms competitors like Google PaLM-2 and Claude 2, and is second only to GPT-4 in performance, with improved factual knowledge, style control, and reasoning skills.
  • Inflection-2 outperforms PaLM 2 Large on standard AI performance tests such as the MMLU benchmark, TriviaQA, HellaSwag, and GSM8k, but lags behind GPT-4 on coding and math tasks.
  • The company plans to use Inflection-2 to power its Pi chatbot and is already working on a larger AI model, with significant funding from investors including Microsoft, Reid Hoffman, Bill Gates, Eric Schmidt, and Nvidia.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.