An analysis of 14 million PubMed abstracts shows that AI text generators influenced at least 10 percent of the scientific abstracts published in 2024, following ChatGPT's introduction. In some fields and countries, the share is even higher.
Researchers from the Universities of Tübingen and Northwestern examined linguistic changes in 14 million scientific abstracts published between 2010 and 2024. They found that ChatGPT and similar AI text generators caused a sharp increase in the frequency of certain style words.
The researchers first identified words that appeared significantly more frequently in 2024 compared to previous years. These included many verbs and adjectives typical of ChatGPT's writing style, such as "delve," "intricate," "showcasing," and "underscores."
Based on these markers, the researchers estimate that in 2024 AI text generators influenced at least 10 percent of all PubMed abstracts. For some marker words, the jump in frequency exceeded even that of words like "Covid," "pandemic," or "Ebola" at the peak of their use.
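The underlying idea, comparing how often marker words appear in 2024 against their historical baseline, can be sketched in a few lines. This is a simplified illustration, not the study's actual pipeline: the word list is taken from the article, but the sample abstracts and the baseline rate are invented for demonstration.

```python
def lower_bound_llm_share(abstracts, markers, baseline_rate):
    """Conservative lower bound on the share of LLM-influenced abstracts:
    the fraction containing at least one marker word, minus the rate at
    which such words appeared before LLMs, clipped at zero.
    baseline_rate is a hypothetical pre-2023 value for illustration."""
    hits = sum(any(m in a.lower() for m in markers) for a in abstracts)
    return max(hits / len(abstracts) - baseline_rate, 0.0)

# Marker words identified in the study as typical of ChatGPT's style.
markers = ["delve", "intricate", "showcasing", "underscores"]

# Invented example abstracts (the study analyzed 14 million real ones).
sample = [
    "We delve into the intricate mechanisms of gene regulation.",
    "A randomized trial of drug X in 200 patients.",
    "Results showcasing improved survival in the treatment arm.",
    "Standard cohort analysis of cardiovascular outcomes.",
]

# Assuming 5% of pre-LLM abstracts already used one of these words,
# the excess share is a lower bound on LLM influence in the sample.
print(lower_bound_llm_share(sample, markers, baseline_rate=0.05))
```

Because authors who edit out marker words are invisible to this kind of count, any estimate built this way is a floor, not a ceiling, which is exactly why the researchers call their 10 percent figure a lower bound.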
The researchers found that in PubMed subgroups from countries such as China and South Korea, around 15 percent of abstracts showed signs of ChatGPT use, compared with just 3 percent in the UK. However, this doesn't necessarily mean that UK authors use ChatGPT less.
In fact, according to the researchers, the actual use of AI text generators is likely to be much higher. Many researchers edit AI-generated text to remove typical marker words. Native speakers may have an advantage here because they're more likely to notice such phrases. This makes it difficult to determine the true proportion of AI-influenced abstracts.
Where it was measurable, AI use was particularly high in journals from publishers such as Frontiers and MDPI, at about 17 percent, and reached 20 percent in IT journals. Among Chinese authors publishing in IT journals, the share climbed to 35 percent.
Meta was too early
AI could assist scientific authors and make articles more readable. According to study author Dmitry Kobak, using generative AI specifically for abstracts isn't necessarily problematic.
However, AI text generators can also invent facts, reinforce biases, and even plagiarize. They could also reduce the diversity and originality of scientific texts.
The researchers call for a reassessment of guidelines for using AI text generators in science.
In this context, it seems almost ironic that Meta's scientific open-source language model "Galactica," published shortly before ChatGPT, faced harsh criticism from parts of the scientific community, forcing Meta to take it offline.
This clearly didn't stop generative AI from entering scientific writing, but it may have prevented the development of a system optimized for that task.