Fixing bugs in AI chatbots will take time

BOSTON (AP) — White House officials concerned about the potential for societal harm from AI-powered chatbots, and about the Silicon Valley powerhouses rushing them to market, are heavily invested in a three-day competition that ends Sunday at the DefCon hacker convention in Las Vegas.

Some 3,500 competitors have tapped away on laptops seeking to expose flaws in eight leading large language models representative of technology’s next big thing. But don’t expect quick results from this first-ever independent “red-teaming.”

The findings won’t be made public until about February. And even then, fixing flaws in these digital constructs, whose inner workings are neither fully reliable nor fully understood even by their creators, will take time and millions of dollars.

Current AI models are simply too unwieldy, brittle and malleable, academic and corporate research shows. Security was an afterthought in their training, as data scientists amassed staggeringly complex collections of images and text. They are prone to racial and cultural bias, and are easy to manipulate.

“It’s tempting to pretend we can sprinkle some security magic dust on these systems after they’re built, patch them into submission, or bolt special security gadgets on the side,” said Gary McGraw, a cybersecurity veteran and co-founder of the Berryville Institute of Machine Learning.

Michael Sellitto of Anthropic, which provided one of the AI testing models, acknowledged at a news conference that understanding their capabilities and safety issues “is kind of an open area of scientific research.”

Conventional software uses well-defined code to issue explicit, step-by-step instructions. OpenAI’s ChatGPT, Google’s Bard and other language models are different. Trained largely by ingesting and classifying billions of data points from internet crawls, they are perpetual works in progress, an unsettling prospect given their transformative potential for humanity.

After publicly launching chatbots last fall, the generative AI industry has repeatedly had to plug security holes exposed by researchers and experts.

Tom Bonner of the AI security firm HiddenLayer, a speaker at this year’s DefCon, tricked a Google system into labeling a piece of malware as harmless simply by inserting a line saying “this is safe to use.”

“There are no good guardrails,” he said.

Another researcher had ChatGPT create phishing emails and a recipe to violently wipe out humanity, a violation of its ethics code.

A team including Carnegie Mellon researchers found that leading chatbots are vulnerable to automated attacks that also produce harmful content. “It is possible that the very nature of deep learning models makes such threats unavoidable,” they wrote.

The alarms had sounded before

In its 2021 final report, the U.S. National Security Commission on Artificial Intelligence said that attacks on commercial artificial intelligence systems were already occurring and that “with rare exceptions, the idea of protecting artificial intelligence systems has been an afterthought in the engineering and deployment of artificial intelligence systems, with inadequate investment in research and development.”

Serious hacks, regularly reported just a few years ago, are now barely disclosed. The stakes are simply too high, and in the absence of regulation, “people can and are sweeping things under the rug right now,” Bonner said.

The attacks fool the logic of artificial intelligence in ways that may not even be clear to their creators. And chatbots are especially vulnerable because we interact with them directly in plain language. That interaction can alter them in unexpected ways.

Researchers have found that “poisoning” a small collection of images or text within the vast sea of data used to train AI systems can wreak havoc and easily go undetected.

A study co-authored by Florian Tramèr of the Swiss university ETH Zurich found that corrupting just 0.01% of a model’s training data was enough to spoil it, and cost as little as $60. The researchers waited for the registration of a handful of websites used in the web crawls behind two models to expire. Then they bought the domains and posted bad data on them.

Hyrum Anderson and Ram Shankar Siva Kumar, who red-teamed AI while they were colleagues at Microsoft, call the state of AI security for text- and image-based models “regrettable” in their new book “Not with a Bug, But with a Sticker.” One example they cite: the AI-powered digital assistant Alexa being tricked into interpreting a Beethoven concerto clip as a command to order 100 frozen pizzas.

Surveying more than 80 organizations, the authors found that the vast majority had no response plan for a data-poisoning attack or dataset theft. Most of the industry “wouldn’t even know it happened,” they wrote.

Andrew W. Moore, a former Google executive and dean of Carnegie Mellon, says he dealt with attacks on Google’s search software more than a decade ago. And between late 2017 and early 2018, spammers tampered with Gmail’s AI-powered detection service four times.

Big AI players say safety and security are top priorities and voluntarily committed to the White House last month to submit their models, largely “black boxes” whose contents are kept secret, to outside scrutiny.

But there are concerns that companies are not doing enough.

Tramèr expects search engines and social media platforms to be manipulated for financial gain and disinformation by exploiting weaknesses in AI systems.

A savvy job seeker might, for example, figure out how to convince a system that they are the only correct candidate.

Ross Anderson, a computer scientist at the University of Cambridge, worries that AI bots will erode privacy as people engage them to interact with hospitals, banks and employers, and as malicious actors leverage them to siphon financial, employment or health data out of supposedly closed systems.

AI language models can also contaminate themselves by retraining from junk data, research shows.

Another concern is company secrets being ingested and then regurgitated by artificial intelligence systems. After a Korean business news outlet reported on such an incident at Samsung, corporations including Verizon and JPMorgan banned most employees from using ChatGPT at work.

While the major AI players have security staff, many smaller competitors likely do not, meaning poorly protected plug-ins and digital agents could multiply. Startups are expected to launch hundreds of offerings based on licensed pre-trained models in the coming months.
