The “deep voice”, or how AI enables voice attacks

How do you embezzle $35 million with a few well-crafted emails and a phone call? Simply by using "deep voice" tools that can imitate almost anyone's voice.

You have probably heard of deepfakes, those AI-manipulated videos that can make anyone appear to say anything in a fairly convincing way. From now on you will also have to reckon with "deep voices": tools that make it possible to clone a known voice. And the technique has, unsurprisingly, already been used for scams.

Betrayed by emails and a phone call

As detailed in a Forbes article published on October 14, 2021, a banker in the United Arab Emirates authorized a transfer of $35 million after thinking he recognized a customer's voice on the phone. The scam took place in early 2020, when a fraudster called a local bank branch posing as the CEO of a large corporation. By disguising his voice with a "deep voice" tool, the man convinced the banker to transfer the comfortable sum to several accounts located in the United States, under the pretext of an upcoming business acquisition. "The caller's voice sounded like that of the company's director, and the branch manager therefore believed that the call was legitimate," details the complaint filed with the US Department of Justice.


To lend credibility to the ruse, several emails had been sent to the branch manager, all supposedly from the client's official email address. One of them even contained an authorization letter from the supposed CEO to one of his lawyers handling the deal. Convinced by the phone call and the legitimate appearance of the emails, the banker authorized the transfer.

Attacks that will multiply

This isn't the first time a scam of this type has taken place. In 2019, a criminal used the same tools to impersonate the CEO of a German company, who supposedly needed an urgent transfer to pay one of his suppliers. The manager, thinking he "recognized the slight German accent of his boss and the melody of his voice on the phone," transferred the tidy sum of $243,000 to a Hungarian bank account.

Asked by Forbes, a cybersecurity expert explains that "audio manipulation, which is easier to orchestrate than making fake videos, is only going to increase. Without education around this new type of attack, along with better authentication methods, many businesses are likely to fall for these very convincing conversations."

The need for strong authentication

By collecting excerpts from interviews, podcasts, or videos, it becomes possible to recreate a known voice and make it say anything. As a result, it is increasingly difficult to trust a phone call to gauge the authenticity of anything. As the expert interviewed by Forbes explains, voice is no longer a sufficient authentication factor. In the case of the $35 million scam, a strong two-factor authentication system (such as biometric validation) would probably have reduced the risk. Even official-looking emails cannot serve as a validation method, since sender addresses can be forged, via the technique of email spoofing among others.
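Email spoofing works precisely because the visible From: address proves nothing; what carries weight are the SPF, DKIM, and DMARC checks that the *receiving* mail server records in the Authentication-Results header. As a rough illustration, here is a minimal Python sketch, using only the standard library and a hand-written message with hypothetical addresses, that inspects that header instead of trusting the sender field:

```python
import email
from email import policy

# A raw message whose From: line claims to be the CEO. The From header is
# trivially forgeable; the Authentication-Results header below simulates what
# the receiving mail server would add after running SPF/DKIM/DMARC checks.
raw_message = """\
From: ceo@example-corp.com
To: branch-manager@example-bank.com
Subject: Urgent transfer for the acquisition
Authentication-Results: mx.example-bank.com; spf=fail smtp.mailfrom=example-corp.com; dkim=none; dmarc=fail
Content-Type: text/plain

Please wire the funds today.
"""

def looks_authenticated(raw: str) -> bool:
    """Return True only if the receiving server's SPF and DMARC checks passed."""
    msg = email.message_from_string(raw, policy=policy.default)
    results = str(msg.get("Authentication-Results", ""))
    return "spf=pass" in results and "dmarc=pass" in results

print(looks_authenticated(raw_message))  # → False: the spoofed mail fails the checks
```

In practice these checks are run (and the header written) by the mail infrastructure itself; the point of the sketch is simply that a legitimate-looking sender address, like a familiar-sounding voice, is not evidence of identity.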

In another vein, a recent documentary retracing the life and death of chef Anthony Bourdain caused controversy because a few lines of an email were read aloud by a digital clone of Bourdain's voice. This kind of manipulation is made ever easier by the rise of artificial intelligence. Companies like Replica and Descript have even made it their business (for legitimate uses, of course). Other firms, like Pindrop, specialize in detecting synthesized voices.
