Update, November 18, 2022:
Meta AI and Papers with Code have responded to the criticism of Galactica: the demo remains offline. The models are still available to researchers interested in working with them and replicating the results from the paper.
Meta's Chief AI Scientist Yann LeCun defended the project on Twitter, saying Galactica is meant to be a demo, not a finished product, and not a replacement for doing scientific work and thinking for yourself, but a convenience – much like a driving assistant in a car.
“Real articles will contain new and interesting science. That will include articles whose authors used Galactica to help them write those papers,” LeCun writes. According to LeCun, the project is now “paused”.
Debate over human interaction with AI
Ultimately, the debate is less about Galactica’s inability to deliver accurate results at all times. Rather, it is about the risk of misuse when humans adopt Galactica’s output unquestioned, for example out of convenience, and thereby consciously or unconsciously increase the volume and apparent credibility of misinformation in the scientific process.
There were similar debates about the risk of misuse when GPT-3 was first introduced, for instance over a possible flood of fake news. As a result, OpenAI released GPT-3 only incrementally and today employs numerous methods to reduce the risk of misuse.
However, similarly powerful large language models are now available as open source, and an AI-driven flood of fake news does not seem to have materialized yet.
Opponents of Galactica might object that the language model is used in an academic context where accuracy is particularly important. In the future, however, researchers may use general-purpose language models to support their work, and those may be even less accurate than Galactica. Stopping work on Galactica does not seem a sensible, much less a definitive, solution to the problem outlined.
Original article, November 17, 2022:
Just two days ago, Meta introduced “Galactica”, a large language model (LLM) trained on scientific data. It is supposed to simplify scientific research and speed up routine tasks. Some scientists are warning against the model.
Together with the platform “Papers with Code”, Meta AI trained the large language model Galactica on 48 million scientific documents, including papers, textbooks, and reference material.
In benchmarks on reasoning and mathematical tasks, Galactica achieved better results than other language models, some of which were larger. But the relevance – especially in science – lies in the details.
Is Galactica a threat to science?
On Twitter, some scientists are speaking out, sharply criticizing Galactica and Meta’s communication about the language model. Meta AI called it the first step toward a new interface for science.
The gist of the criticism: like all large language models, Galactica can convincingly output false information. The errors can be gross or subtle, such as an incorrect date or reference.
PROBLEM: "Researchers are buried under a mass of papers, increasingly unable to distinguish between the meaningful and the inconsequential."
SOLUTION: build a model that generates limitless reams of text that sounds plausible but may contain serious mistakes @PapersWithCode
— Dan Elton (@moreisdifferent) November 16, 2022
Gary Marcus calls Galactica a danger to science. If the language model is not stopped, this would be the “tipping point in a gigantic increase in the flow of misinformation,” Marcus writes, calling it an “epochal event.”
A Wikipedia-style text about Marcus generated by Galactica contained 85 percent incorrect information, according to the researcher, yet was phrased plausibly. A “decent AI system” could check such information online, but Galactica doesn’t provide that feature, Marcus said.
“This is no joke. Galactica is funny, but the uses it will be put to are not.”
Fake reviews for fake papers
Michael Black, director at the Max Planck Institute for Intelligent Systems in Tübingen, Germany, conducted his own tests in which Galactica cited non-existent papers. Galactica, he said, was an interesting research project, but not useful for scientific work and dangerous to boot.
“Galactica generates text that’s grammatical and feels real. This text will slip into real scientific submissions. It will be realistic but wrong or biased. It will be hard to detect. It will influence how people think,” Black writes.
This could lead to a new “deep scientific fakes” era, he says, in which researchers receive citations for papers they never wrote. These false citations would then be carried over into other papers. “What a mess this will be,” Black writes.
A hint of possible AI hallucinations isn’t enough, he says: “Pandora’s box is open and we won’t be able to stuff the text back in.”
Galactica is not an accelerator for science and is not even useful as a writing aid, Black said. On the contrary, it distorts research and is a danger.
If we are going to have fake scientific papers, we might as well have fake reviews of fake papers. And then we can also have fake letters of reference for fake academics who get promoted to tenure at fake universities. I can then retire as there is nothing left for me to do.
Fundamental criticism of language model search
Linguist Emily Bender of the University of Washington is particularly blunt, dismissing Galactica’s release as garbage and pseudoscience.
“Language models have no access to ‘truth’, or any kind of ‘information’ beyond information about the distribution of word forms in their training data. And yet, here we are. Again,” Bender writes.
Bender and her colleague Chirag Shah had previously criticized the use of large language models as search engines, particularly Google’s plans in this area, in a March 2022 scientific paper.
Search based on language models could lead to further proliferation of fake news and increased polarization, they argue, because a search system needs to be able to do “more than matching or generating an answer.”
It needs to offer users different ways to interact and make sense of information, rather than “just retrieving it based on programmed notions of relevance and usefulness,” the researchers write.
In their view, information seeking is “a socially and contextually situated activity with a diverse set of goals and needs for support that must not be boiled down to a combination of text matching and text generating algorithms.”
Similar critiques of Galactica are currently piling up on Twitter. Meta AI and Papers with Code have not yet commented, but they have disabled the demo feature of the Galactica website.