Meta is Training its AI on Religious Texts

by Valeria Goncharenko

Published: May 25, 2023 at 8:00 am Updated: May 25, 2023 at 8:19 am

by Karolina Gaszcz

Edited and fact-checked: May 25, 2023 at 8:00 am

In Brief

Meta has developed AI-powered speech technology that can identify more than 4,000 spoken languages.

The project’s aim is to preserve languages.

The company is using the Bible and other religious texts to train its Massively Multilingual Speech models.

Tech giant Meta has announced its Massively Multilingual Speech (MMS) project, a set of AI models for speech-to-text and text-to-speech. According to the announcement, the models can also identify more than 4,000 spoken languages. The initiative aims to help preserve languages. Notably, the company is using religious texts such as the Bible to train the models.

“Collecting audio data for thousands of languages was our first challenge because the largest existing speech datasets cover 100 languages at most. To overcome this, we turned to religious texts, such as the Bible, that have been translated into many different languages and whose translations have been widely studied for text-based language translation research,” writes Meta in a blog post.

According to the company, the source data comes from the Bible: the Meta AI team obtained audio recordings and the accompanying text from FaithComesByHearing.com, GoTo.Bible, and Bible.com.

Meta says the project draws on recordings in more than 6,255 languages and dialects, including Bible stories, evangelistic messages, scripture readings, and songs. It also states that its models perform equally well for male and female voices, even though the readings are usually performed by men.

Readings of the New Testament provide approximately 32 hours of audio per language, and overall the dataset covers over 1,100 languages. According to Christian ethicists who advised Meta AI on this project, most Christians do not consider the New Testament and its translations too sacred to be used in machine learning. The same applies to other religious texts.

“While the content of the audio recordings is religious, our analysis shows that this doesn’t bias the model to produce more religious language,” states the blog post.

In other words, Meta's analysis found that the religious training data does not bias the systems toward a particular point of view, nor does it lead them to produce religious-style text.

About The Author

Valeria is a reporter for Metaverse Post. She focuses on fundraises, AI, metaverse, digital fashion, NFTs, and everything web3-related. Valeria has a Master’s degree in Public Communications and is pursuing a second major in International Business Management. She dedicates her free time to photography and fashion styling. At the age of 13, Valeria created her first fashion-focused blog, which developed her passion for journalism and style. She is based in northern Italy and often works remotely from different European cities.