Google AI Announced the First-ever Text-to-Music Generator AudioLM

In Brief

AudioLM can produce music just by listening to sounds

AudioLM has been trained to continue human speech and piano music


With GPT-3 and similar models, generative AI keeps moving forward. Image generators have also introduced inpainting and outpainting, in which AI completes an image while keeping its theme and style. What about music?

Since all of this is based on AI language models that retain meaning, it was only a matter of time before the technology was applied to music. That time has come.


According to recent Google research, a new framework for audio generation called AudioLM can learn to create realistic speech and piano music simply by listening to audio. Thanks to its long-term consistency and high fidelity, AudioLM surpasses earlier systems and advances audio generation, with applications in voice synthesis and computer-assisted music.

The researchers say they have also developed a system to recognize synthetic sounds produced by AudioLM, using the same AI concepts that underpinned the creation of their previous models.

AudioLM from Google AI can extend an acoustic passage while keeping its "intent." So far, it has been trained to continue human speech and piano music from a short sample of input audio.

The criterion for speech was straightforward: listeners were asked to assess whether the continuation sounded like human speech. For music, the continuation of a supplied passage was found to be far superior in quality to the output of current generators that create music from scratch, such as Jukebox. Given a prompt at the input, the AI continues the music considerably better.


Human raters listened to audio samples to confirm the results. They judged whether they were hearing a real continuation of recorded human speech or an artificial continuation produced by AudioLM. The data indicate a 51.2% success rate, which is close to the 50% expected from random guessing. As a result, it will be challenging for the average listener to distinguish between speech produced by AudioLM and actual human speech.
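As a rough illustration of why a 51.2% success rate reads as near chance, the sketch below compares an observed rate with the 50% expected from guessing. The article does not state how many ratings were collected, so the value of `HYPOTHETICAL_N` is purely an assumption for illustration, and the function name is made up here.

```python
import math


def z_score_vs_chance(observed_rate: float, n_ratings: int, chance: float = 0.5) -> float:
    """Return how many standard errors the observed rate lies from chance.

    Uses the normal approximation to the binomial distribution.
    """
    standard_error = math.sqrt(chance * (1 - chance) / n_ratings)
    return (observed_rate - chance) / standard_error


# ASSUMPTION: the number of ratings is not given in the article.
HYPOTHETICAL_N = 1000

z = z_score_vs_chance(0.512, HYPOTHETICAL_N)
print(f"z = {z:.2f}")
```

Under this assumed sample size, the score stays well below the value of about 2 that is commonly treated as a meaningful departure from chance. A much larger number of ratings would change that picture.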

Does text-to-music technology alter the music business?

Another AI company, Mubert, recently announced a text-to-music generator based on the Mubert API. Mubert creates a different set of sounds for each request you send, and the likelihood of a repeat is very slim. Music is created when a request is made; it is not pulled from a database of finished tunes. A common question is how truly generative this music is.


The sounds are selected rather than synthesized. Both the input prompt and the Mubert API tags are encoded as latent space vectors by a transformer neural network. The tag vector closest to each query is then chosen, and the corresponding tags are sent to the Mubert API to create the music. No neural network was used to construct any of the sounds (separate loops for bass, leads, etc.); all of them were produced by musicians and sound designers.

Mubert's next significant step is to take inputs from the real world, such as photos, videos, scenarios, and presentations, and create the music of the world around you.
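The tag-matching step described above can be sketched as a nearest-neighbor search over vectors. This is only a minimal sketch: the `embed` function is a toy character-trigram stand-in for the transformer encoder, the tag list is invented, and none of the names below come from the Mubert API.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a transformer encoder: counts of character trigrams.

    ASSUMPTION: a real system would use learned embeddings instead.
    """
    padded = f"  {text.lower()}  "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def closest_tags(prompt: str, tags: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k tags whose vectors lie closest to the prompt vector."""
    prompt_vec = embed(prompt)
    ranked = sorted(tags, key=lambda t: cosine(prompt_vec, embed(t)), reverse=True)
    return ranked[:top_k]


# Hypothetical tag vocabulary, for illustration only.
TAGS = ["ambient", "techno", "lofi hip hop", "calm piano", "drum and bass"]

selected = closest_tags("calm piano music for studying", TAGS)
print(selected)
```

In a pipeline like the one described, the selected tags, rather than the raw prompt, would then be passed on to the music service, which assembles a track from loops recorded by musicians.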


This is the initial stage in building a more sophisticated and precise generative algorithm, which will take time and money.

However, text-to-music technology is already available, so you can generate albums in bulk by replacing the "input prompt" with a script that writes random prompts. It seems artists are no longer required.


Damir Yalalov

Damir is the Editor/SEO/Product Lead at He is most interested in SecureTech, Blockchain, and FinTech startups. Damir earned a bachelor's degree in physics.
