On paper, Galactica, an AI developed by Meta's artificial intelligence division, was an extraordinary advance. Engineers at what was previously known as Facebook Artificial Intelligence Research trained the model on “a large body of scientific knowledge produced by humanity”, with the aim of making science easier to access. Galactica was reportedly trained on more than 48 million scientific papers, books, course notes and websites such as Wikipedia.
Galactica was meant to “organize” science like no search engine before it. For now, surveying the state of knowledge on a specific subject is particularly time-consuming. Database search engines like arXiv or PubMed are semantic. And students and researchers often have to read many scientific papers before finding the information they were looking for in the first place.
Galactica AI shut down 48 hours after going live
In theory, Galactica lets you ask a question directly about a specific scientific subject. The AI reviews the subject using its vast database and delivers an answer in a few seconds, ideally complete and accompanied by a bibliography. Beyond that, the AI can “solve complex mathematical problems, generate Wikipedia articles, write code, annotate molecules and proteins, and much more”.
The problem is that, very soon after it went online, many Internet users and academics realized that the AI's answers could be absurd, spread false information, or mislead. CNET cites the example query “Do vaccines cause autism”, which returned a surprisingly contradictory answer, when the expected answer was a clear no.
Meta researchers made it clear that the site was only a demo, and even added this warning in capital letters: “NEVER FOLLOW ADVICE FROM A LANGUAGE MODEL WITHOUT VERIFICATION”. But as experts point out, even with this warning, texts produced by the AI could be picked up and reused despite their many errors. This seems to have prompted Meta to shut down the site two days after it went live.
Beyond that, this example raises larger questions about the potential emerging dangers of AI. The model on which Galactica is based does produce convincing, human-sounding language. But as explained by Carl Bergstrom, a professor at the University of Washington quoted by CNET, what Galactica produces mostly resembles the output of “a random nonsense generator”.
According to him, the errors stem from the way the AI is trained. During training, the AI mainly analyzes words and the connections between them, and from this produces texts whose tone resembles that of its sources. These texts can therefore give an impression of authority and be convincing while very often being full of errors and contradictions.
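To make that idea concrete, here is a deliberately tiny sketch in Python. It is not Galactica's actual method (Galactica is a large neural language model); it is a toy bigram model, chosen here as an assumption to illustrate the general point. The function names and the three-sentence corpus are invented for the illustration. The program only records which word follows which, then chains those observations together, so the result borrows the tone of its sources without any notion of whether a statement is true.

```python
import random
from collections import defaultdict


def build_bigram_table(text):
    """Map each word to the list of words seen right after it in the text."""
    words = text.split()
    table = defaultdict(list)
    for current, following in zip(words, words[1:]):
        table[current].append(following)
    return table


def generate(table, start, length, seed=0):
    """Chain words by repeatedly picking a random observed successor."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    output = [start]
    for _ in range(length - 1):
        successors = table.get(output[-1])
        if not successors:  # the last word was never followed by anything
            break
        output.append(rng.choice(successors))
    return " ".join(output)


# Hypothetical mini-corpus, invented for this illustration only.
corpus = (
    "studies show that vaccines are safe . "
    "studies show that the claim is false . "
    "the claim is that vaccines cause harm ."
)

table = build_bigram_table(corpus)
print(generate(table, "studies", 12))
```

Every pair of adjacent words in the output appeared somewhere in the corpus, yet the chain can stitch fragments of different sentences into a fluent statement that none of the sources made, which is the failure mode described above on a much smaller scale.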
🪐 Introducing Galactica. A large language model for science.
Can summarize academic literature, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.
— Papers with Code (@paperswithcode) November 15, 2022