January 08, 2023

VALL-E: Microsoft’s new zero-shot text-to-speech model can duplicate anyone’s voice from a three-second sample

In Brief

With just a three-second sample of any voice, the transformer-based TTS model VALL-E can produce speech in that voice.

This is a significant advancement in the direction of more natural-sounding TTS systems.

Microsoft has not yet released the code, but the few samples it has published show a significant development in TTS technology.

Since the release of the first text-to-speech (TTS) model, researchers have been looking for ways to improve the way these systems generate speech. The latest model from Microsoft, VALL-E, is a significant step forward in this regard.

VALL-E is a transformer-based TTS model that can generate speech in any voice after hearing only a three-second sample of that voice. This is a significant improvement over previous models, which required a much longer training period to generate a new voice.

VALL-E is an amazing technological feat that has the potential to change the way we interact with digital media.

Additionally, the intonation, emotional tone, and style of the voice are all kept intact in the generated speech. This is an important step forward in making TTS systems sound more natural.


The model is transformer-based, with an architecture reminiscent of DALL-E 1, not to be confused with the diffusion-based DALL-E 2. Microsoft has not yet released the code, and some users are skeptical that it ever will.


However, Microsoft has released a few examples of the model in action, and it is clear that this is a major advance in TTS technology.



About The Author

Damir is the team leader, product manager, and editor at Metaverse Post, covering topics such as AI/ML, AGI, LLMs, Metaverse, and Web3-related fields. His articles attract a massive audience of over a million users every month. He appears to be an expert with 10 years of experience in SEO and digital marketing. Damir has been mentioned in Mashable, Wired, Cointelegraph, The New Yorker, Inside.com, Entrepreneur, BeInCrypto, and other publications. He travels between the UAE, Turkey, Russia, and the CIS as a digital nomad. Damir earned a bachelor's degree in physics, which he believes has given him the critical thinking skills needed to be successful in the ever-changing landscape of the internet. 

Damir Yalalov