Google invents a ChatGPT for music: here is MusicLM

After Midjourney for images, another type of AI will trouble the art world: MusicLM. This algorithm designed by Google, still at a prototype stage, generates sounds based on descriptions.

Music and artificial intelligence: is the duel beginning? Image-generating AIs such as Midjourney are already on the market, while ChatGPT has made its name in text generation. The two programs work in a broadly similar way: you enter a “prompt”, a command that leads the algorithm to generate the desired response, whether that is a frog with an umbrella on Mars for Midjourney or a cooking recipe for ChatGPT.

The applications of this type of AI are numerous, and a giant like Google is working in many fields, such as medicine with MedPaLM. But the work the group posted on January 26, 2023 relates to the world of music. The program is called MusicLM.

This AI generates sounds from prompts. Concretely, you first enter a text description, such as “a soothing violin melody backed by a distorted guitar riff”. In response, MusicLM generates “24 kHz music that stays consistent for several minutes”, the engineers write.

To accompany their research paper, the Google researchers also put a demo site online with examples of what MusicLM can do. The site showcases the following features:

  • Audio generation from a complex description: a prompt such as “a funky track with a strong, danceable beat and a prominent bass line. A catchy keyboard melody adds a layer of richness and complexity to the song” yields a 30-second clip.
  • Long generation: the same process, but for a complete song of up to 5 minutes.
  • Story mode: on the same principle, the sound can change over time following a sequence of prompts, a bit like a medley going from jazz to rock and then to pop.
  • Text and melody conditioning: this is similar to asking Midjourney for a particular style of painting. You can ask for Bella Ciao to be played on the piano, on the guitar, whistled, and so on.
  • Image conditioning: MusicLM generates a sound from a painting and its caption. Jacques-Louis David’s Bonaparte is thus transformed into a triumphant audio sequence, for example.
Screenshot of examples where MusicLM generates sound from a painting. // Source: Google Research
  • 10-second generation from text: short sounds, like samples, for instruments (“guitar”, “flute”), a specific genre (“rap”, “90s Berlin house”…), an atmosphere linked to a place or situation (“beach in the Caribbean”, “escape from prison”), an era (a club from the 80s vs. a club from the 2000s), and even a level of skill (a piano played by a beginner or by a professional).
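
To give a sense of scale for the clip lengths listed above, here is a minimal sketch in Python that counts how many audio samples each one contains at the 24 kHz rate quoted by the engineers. The names `SAMPLE_RATE_HZ`, `CLIP_SECONDS` and `sample_count` are illustrative, not part of MusicLM, and the sketch assumes a single (mono) channel, which the article does not specify.

```python
# Output rate quoted by the researchers: 24 kHz.
SAMPLE_RATE_HZ = 24_000

# Clip lengths mentioned in the demo list, in seconds (labels are ours).
CLIP_SECONDS = {
    "10-second sample": 10,
    "complex description": 30,
    "long generation": 5 * 60,
}


def sample_count(seconds: int, rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Return the number of samples in one channel of a clip of this length."""
    return seconds * rate_hz


if __name__ == "__main__":
    for name, seconds in CLIP_SECONDS.items():
        print(f"{name}: {seconds} s -> {sample_count(seconds):,} samples")
```

At this rate, a five-minute song amounts to 7.2 million samples per channel, which helps explain why staying consistent “for several minutes” is presented as a notable result.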

The platform shows that the algorithm can do a lot, but the results so far are rather mixed. The sounds are not very pleasant to listen to: they are sometimes dissonant, distorted or very flat. The sung vocals are also very poor, which the engineers themselves acknowledge in their paper as an area for improvement in future work.

The technical approach itself, on the other hand, performs well, at least according to the Google engineers, who say it surpasses all previous models in both audio quality and fidelity to the text description. On the platform, for example, we see that the algorithm is capable of “diversity”, generating a different sound each time for the same description.

There is little doubt that the arrival of such an algorithm will, in turn, raise contentious ethical questions. Just a few days ago, singer Nick Cave was annoyed by fans sending him lyrics they had asked ChatGPT to write “in the style of Nick Cave”. He denounced the texts as empty and “shitty”. More generally, artists are warning about the social (and philosophical) implications of these tools in the art world.

Google goes on the offensive in AI

The publication of this work is no surprise. First, because we already knew that Google was working on music-related AI. This type of algorithm was used in 2021 by the Google Arts & Culture teams, in collaboration with real musicians, to transform a famous Kandinsky painting into musical sequences.

Second, the New York Times has revealed how Google is preparing its counterattack against ChatGPT. The GAFA giant reportedly wants to release no fewer than 20 products related to artificial intelligence, starting next May. The MusicLM paper has not been formally published yet: it still needs to be accepted by a scientific journal, a step that attests to the value of the research and may take a few months. That timeline is fairly consistent with public announcements in the spring of 2023.
