After ChatGPT, this new artificial intelligence from Microsoft is already causing concern

Published on 01/18/2023 at 06:40

While ChatGPT continues to make headlines for its AI-generated text, another technology is beginning to cause concern: Vall-E, an AI capable of imitating voices.

2023 looks set to be the year of artificial intelligence. Proof of this is the popularity of ChatGPT, the OpenAI model that fascinates as much as it worries because it can generate texts of all kinds that are coherent enough to be usable. The technology is certainly still imperfect, but the speed at which it is improving raises questions about the future it is preparing for us.

But we have hardly had time to recover from the discovery of ChatGPT before another artificial intelligence comes to the fore: Vall-E. This time, it is an AI developed by Microsoft that is able to imitate any human voice from just three seconds of recorded audio.

Vall-E, towards a voice synthesis deepfake?

Microsoft researchers built their artificial intelligence on EnCodec, a technology developed by Meta and unveiled in October 2022. They then trained it on 60,000 hours of speech from 7,000 different speakers. Currently, Vall-E only speaks English, but the AI already delivers very convincing results.

The software's only real weakness for now is accents: if the person whose voice the AI is to reproduce has an accent, the result leaves room for improvement. There are also a few bugs, including mispronounced words, but nothing that can't be fixed in the future.

A tone of voice captured in just three seconds

“To synthesize the personalized speech, VALL-E generates the corresponding acoustic tokens conditioned on the acoustic tokens of the 3-second enrolled recording and the phoneme prompt, which respectively constrain the speaker and the content information. Finally, the generated acoustic tokens are used to synthesize the final waveform with the corresponding neural codec decoder,” summarizes the paper recently published by Microsoft researchers.
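The data flow described in that quote can be pictured with a toy sketch. This is not Microsoft's code: every function name below is hypothetical, the "codec" is a simple quantizer standing in for a neural codec such as EnCodec, and the "model" is a seeded random generator standing in for the trained network. It only illustrates the three stages: encode the short voice sample into acoustic tokens, generate new tokens conditioned on those tokens and on the phonemes, then decode the tokens into a waveform.

```python
import random
from typing import List

CODEBOOK_SIZE = 1024  # assumed number of distinct acoustic tokens (illustrative)


def encode_audio(samples: List[float]) -> List[int]:
    """Toy codec encoder: map samples in [-1, 1] to integer token ids."""
    return [
        min(CODEBOOK_SIZE - 1, int((s + 1.0) / 2.0 * CODEBOOK_SIZE))
        for s in samples
    ]


def decode_tokens(tokens: List[int]) -> List[float]:
    """Toy codec decoder: map each token id back to a sample in [-1, 1]."""
    return [(t + 0.5) / CODEBOOK_SIZE * 2.0 - 1.0 for t in tokens]


def generate_tokens(phonemes: List[str], prompt_tokens: List[int],
                    n_tokens: int) -> List[int]:
    """Placeholder for the trained model.

    The real system predicts acoustic tokens with a neural network. Here the
    two conditioning inputs merely seed a random generator, so the output
    depends on both the text (phonemes) and the voice sample (prompt tokens).
    """
    text_part = sum(ord(ch) for p in phonemes for ch in p)
    seed = text_part * 1_000_003 + sum(prompt_tokens)
    rng = random.Random(seed)
    return [rng.randrange(CODEBOOK_SIZE) for _ in range(n_tokens)]


def synthesize(phonemes: List[str], voice_sample: List[float],
               n_tokens: int = 50) -> List[float]:
    """Voice sample -> prompt tokens -> generated tokens -> waveform."""
    prompt_tokens = encode_audio(voice_sample)
    new_tokens = generate_tokens(phonemes, prompt_tokens, n_tokens)
    return decode_tokens(new_tokens)
```

Calling `synthesize(["HH", "AH", "L", "OW"], voice_sample)` with a list of samples between -1 and 1 returns a list of `n_tokens` output samples. In the real system, the first argument constrains what is said and the second constrains who appears to be saying it.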

A GitHub page lets you listen to some examples of voice synthesis produced with Vall-E. You can compare the three-second sample spoken by a real person with the result generated by the artificial intelligence. Sometimes it's not perfect, but other times it's really hard to tell which is the AI and which is the human.

An artificial intelligence that is not accessible to the public

For the time being, Microsoft researchers have decided not to make Vall-E available to the public. The reason is fairly obvious: “Since Vall-E could synthesize speech that maintains speaker identity, it may carry potential risks of misuse of the model, such as voice spoofing or impersonating a specific speaker.”

The Vall-E developers are therefore currently working on a method that should ultimately make it possible to clearly identify whether a voice message has been manipulated by artificial intelligence. It is certain that we will hear more about Vall-E in the future.
