AI text recognition: each detector has its opinion

Mar 4, 2023


With the advent of AI text generators such as ChatGPT, the question arose: how can AI text be distinguished from text written by humans?

This question cuts across all domains, from teachers who need to grade assignments, to agencies that hire copywriters, to search engines that try to rank content. AI text detectors are supposed to answer it, but so far they do not do so reliably.

So far, solutions such as DetectGPT, GPTZero, and even OpenAI's own text classifier have not delivered convincing results for ChatGPT, GPT-3, or other AI generators. Neither AI-written nor human-written text is reliably recognized as such, which can have negative consequences if decision-makers, for example in the education sector, rely on the results.

AI detectors do not seem to work reliably

Author Brandon Gorrell of the newsletter Pirate Wires ran a more extensive test, feeding various texts written by himself and by ChatGPT into the most popular AI detectors: OpenAI's classifier as well as GPTZero, Content at Scale, Writer.com, Corrector.app, and CopyLeaks. His tests show that the tools rarely agree, or are at least vague in their judgment.

In a test run of five texts that the author submitted during the week of February 13, the detectors never unanimously and unambiguously classified the texts as AI-generated.

The results of the tools for an AI-generated description of zebras:

GPTZero: “Your text is likely to be written entirely by AI”

OpenAI: “The classifier considers the text to be possibly AI-generated.”

Content at Scale: “Likely both AI and Human!”

Writer.com: “75% human generated content”

Corrector.app: “Fake 42.55%”

CopyLeaks: “AI content detected”

The results of the tools for an AI-generated wedding invite:

GPTZero: “Your text is likely to be written entirely by AI”

OpenAI: “The classifier considers the text to be possibly AI-generated.”

Content at Scale: “Unclear if it is AI content!”

Writer.com: “13% human generated content”

Corrector.app: “Fake 99.97%”

CopyLeaks: “AI content detected”

According to the experiment, the tools worked better with human-written text, and in some cases they were all correct. However, Gorrell also notes that the results varied greatly over the course of the study, making systematic evaluation virtually impossible. That variation is itself a further sign of unreliability.

Reliable AI text recognition may not be realistic

Tech journalist Jon Stokes, co-founder of Ars Technica, thinks he knows why. Some detectors are likely familiar with the probabilities of one particular model but would be overwhelmed by text from a different model, he said.

This is all the more questionable because most AI detectors tout their abilities as being independent of any particular model. As language models become easier to customize, which is also likely to make detection more difficult, this does not reflect well on services that often charge for access.
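Stokes's point about model-specific probabilities can be illustrated with a toy sketch. It is not how any of the detectors named above works. It assumes a hypothetical detector that flags text as "generated" when the text looks highly probable (low perplexity) under one reference model, and it uses a tiny character-bigram model as a stand-in for a real language model. All names, training strings, and the threshold are invented for illustration.

```python
import math
from collections import Counter


def train_bigram(corpus: str):
    """Toy stand-in for a language model: character bigram counts."""
    pairs = Counter(zip(corpus, corpus[1:]))   # counts of (char, next_char)
    contexts = Counter(corpus[:-1])            # counts of each left-hand char
    vocab = set(corpus)
    return pairs, contexts, vocab


def perplexity(model, text: str) -> float:
    """Perplexity of `text` under the bigram model, with add-one smoothing."""
    pairs, contexts, vocab = model
    v = len(vocab) + 1  # one extra slot for characters the model never saw
    log_prob = 0.0
    steps = 0
    for a, b in zip(text, text[1:]):
        p = (pairs[(a, b)] + 1) / (contexts[a] + v)
        log_prob += math.log(p)
        steps += 1
    return math.exp(-log_prob / steps)


def looks_generated(model, text: str, threshold: float) -> bool:
    """Hypothetical detector rule: low perplexity under `model` means 'generated'."""
    return perplexity(model, text) < threshold


# Two toy "generators" with different habits (assumed training data).
model_a = train_bigram("the zebra is a striped animal. " * 20)
model_b = train_bigram("please join us for our wedding day. " * 20)

text_from_a = "the zebra is a striped animal."
text_from_b = "please join us for our wedding day."

# A detector built around model A recognizes A-style text ...
flag_a = looks_generated(model_a, text_from_a, threshold=6.0)
# ... but B-style text looks improbable to it, so it is not flagged.
flag_b = looks_generated(model_a, text_from_b, threshold=6.0)
```

In this sketch the detector built around model A flags the A-style sentence and misses the B-style one, although both came from a "generator". Real detectors and real language models are far more complex, so this only shows the direction of the effect Stokes describes, not its size.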

When it released its classifier, OpenAI itself admitted that the tool reliably and correctly classifies only a small fraction of AI content. OpenAI CEO Sam Altman has also publicly stated several times that there is no such thing as a permanently reliable AI text detector and that the education system should not rely on such tools.
