
From now on, ChatGPT Plus subscribers can send up to 50 messages to GPT-4 within three hours. Since the model's introduction in March, the limit had been 25 messages in two hours, for computational and cost reasons.

The higher message limit may be related to the recently introduced GPT-4 version "0613", which may be a more efficient model but could also sacrifice quality (see below).

Noticeable compared to the March model "0314" is the significantly higher writing speed of the new model. A similar speedup occurred in the switch from GPT-3 to GPT-3.5, which did not come with a loss of quality. The new GPT-4 model "0613" also handles significantly more RPM (requests per minute) and TPM (tokens per minute).

Overview of GPT models in the API. The latest GPT model processes six times more tokens per minute. | Image: Screenshot OpenAI

GPT-4 offers more volume and is faster - at the expense of quality?

In our tests via the API, the new GPT-4 version follows the prompt templates we created for the March release less reliably and in less detail, and it is more prone to factual errors.


For example, we generate some summaries below our articles with GPT-4. In the summary of this news item about Barry Diller's lawsuit plans, the old GPT-4 model correctly captures that Diller wants to team up with publishers to sue AI companies.

The March GPT-4 model correctly deduces from the article that Diller aims to team up with publishers to sue AI companies. | Image: THE DECODER

The new version of GPT-4 instead claims that Diller wants to sue the publishers. This error could be reproduced across multiple regenerations with the same prompt: the old version was consistently correct, the new one consistently wrong.

The new GPT-4 model captures the core of the message incorrectly. | Image: THE DECODER
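Such a comparison is straightforward to reproduce because OpenAI exposes both snapshots under dated model names. The following minimal sketch, written against the pre-1.0 OpenAI Python client, runs the same summarization prompt several times against "gpt-4-0314" and "gpt-4-0613"; the prompt text and variable names are placeholders for illustration, not our actual editorial templates.

```python
# Sketch: compare summaries from the two dated GPT-4 snapshots.
# Assumes the pre-1.0 "openai" Python package and an OPENAI_API_KEY
# environment variable; the prompt below is a placeholder, not our
# actual editorial template.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

ARTICLE_TEXT = "..."  # full article text goes here
PROMPT = (
    "Summarize the following article in three bullet points. "
    "State precisely who is suing whom.\n\n" + ARTICLE_TEXT
)

def summarize(model: str) -> str:
    """Request one summary from the given GPT-4 snapshot."""
    response = openai.ChatCompletion.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
        temperature=0,  # keep outputs as deterministic as possible
    )
    return response.choices[0].message.content

# Regenerate several times per model to check whether an error is stable.
for model in ("gpt-4-0314", "gpt-4-0613"):
    for run in range(3):
        print(f"--- {model}, run {run + 1} ---")
        print(summarize(model))
```

Pinning the dated model names rather than the generic "gpt-4" alias is what makes this kind of before-and-after comparison possible in the first place.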

Of course, this is just an anecdotal observation. However, a more systematic investigation also suggests that ChatGPT's quality with GPT-3.5 and GPT-4 has declined since March. Ultimately, however, this is not conclusively proven, and the possible reasons are not known.

Criticism of the model's allegedly declining quality has been voiced for several weeks, especially by frequent users on social media and Reddit. OpenAI has consistently maintained that there has been no degradation in quality and that, if anything has changed, it is an improvement; the examples sometimes cited are likely bugs. Now that the study has been published, OpenAI wants to investigate the examples and possible quality degradation.

LLMs must be reliable for everyday work

Regardless of whether the criticism of potentially poor performance proves true, OpenAI would be well advised to ensure the reliability of its model across model changes and to communicate better and more transparently. Especially for business customers who resell services, quality fluctuations in day-to-day use can jeopardize the business model.


This may not even be a degradation of the model, but rather a deviation in prompt compatibility, for example. Since understanding of exactly how these models work is still limited, OpenAI faces a challenging task here, one that will only become more difficult as its customer base grows.

Summary
  • OpenAI recently increased the limit for ChatGPT Plus: subscribers can now send up to 50 messages to GPT-4 within three hours instead of 25 messages in two hours.
  • The new GPT-4 model processes more requests and tokens per minute and is significantly faster than its predecessor. In testing via the API, our editorial team found unreliability and content errors with existing prompt templates, which could indicate a degradation in model quality, or at least a significant change.
  • OpenAI needs to ensure the reliability of its models, especially for business customers for whom variations in quality could jeopardize the business model.
Online journalist Matthias is the co-founder and publisher of THE DECODER. He believes that artificial intelligence will fundamentally change the relationship between humans and computers.