The success of large language models such as GPT-3 has sparked intense debate in AI research. Despite the progress, there is deep disagreement about the future of AI, and a survey of researchers in the field shows just how divided the scientific community is.
Large language models such as OpenAI's GPT-3 demonstrated previously unseen capabilities in processing text. They also provided the blueprint for training other large AI models, such as code models or generative image models like DALL-E 2.
The specific architectures differ, but nearly all current large AI models rely on some form of transformer, together with big data and massive computing power.
The role of scaling on the path to better AI is controversial.
To some researchers, such models are stochastic parrots. Others see them as foundation models that can be specialized for downstream tasks. Still others think that more scaling is all that is needed to reach the big goal: artificial general intelligence.
The debate about the potential of large AI models is prominent on Twitter. A few months ago, it was particularly fueled by DeepMind's Gato, a transformer-based AI system that can handle many different tasks.
Does scaling up a system like Gato pave the way to AGI, or is the system simply running many mediocre specialist models in parallel?
There are many opinions on this question. Cognitive scientist and deep learning critic Gary Marcus criticized DeepMind's "Scaling-Uber-Alles" approach as short-sighted. Meta's AI chief Yann LeCun also doesn't see scaling alone as the right way forward.
Study provides insight into NLP community opinion
A new study now provides deeper insight into the opinions of AI researchers in the field of natural language processing. A total of 480 researchers participated in the survey.
The study's authors estimate that this means they reached about five percent of all researchers who had at least two publications at the Association for Computational Linguistics conference between 2019 and 2022.
The survey includes detailed questions on resource use, language understanding, AGI, ethics, the role of benchmarks, research, and industry. Most participants were based in the U.S. (58 percent) or Europe (23 percent), with a smaller portion from other regions of the world. Sixty-seven percent reported being men, 25 percent women.
"Scaling-Uber-Alles" doesn't work, no clear verdict on language understanding
Only 17 percent of respondents think scaling can solve almost every important problem in NLP research, and 72 percent think their field puts too much focus on scaling. Scaling also drives up energy consumption: 60 percent consider the CO₂ footprint of training large models problematic.
By contrast, respondents are split on a key question: given enough data and computing power, could models like GPT-3 understand language in some non-trivial sense? 51 percent of respondents think so; 49 percent see no signs of this.
Meanwhile, 58 percent consider artificial general intelligence a priority in NLP research, and 57 percent think the development of large language models is an important step on the road to AGI.
Seventy-three percent see AI-enabled text automation as potentially problematic for society. This concern may be connected to industry's growing dominance: 82 percent of participants believe that industry will be responsible for the most cited papers of the next ten years, and 74 percent think industry already has too much influence on the research field.
The study's authors hope the survey will help educate the NLP community about its own beliefs and thus contribute to productive discourse. All results can be viewed on the NLP Survey website.