November 07, 2023

Whisper V3 by OpenAI Goes Open Source, Expanding Voice Recognition Across Languages

In Brief

OpenAI announced the open-source release of Whisper V3, a state-of-the-art model for speech recognition in multiple languages.

OpenAI Unveils Whisper V3: Revolutionizing Voice Recognition Across Languages

Artificial intelligence (AI) research company OpenAI has taken a significant step in speech recognition by open-sourcing its state-of-the-art model, Whisper large-v3, during its Developer Day event.

This latest iteration of the Whisper model demonstrates a remarkable ability to understand and transcribe voice in a multitude of languages, broadening its applicability beyond the English-centric models of the past.

Whisper large-v3 thrives in diverse conditions, adeptly handling various language inputs. According to OpenAI, English-only models such as tiny.en and base.en tend to perform better in English-only applications, while Whisper large-v3’s accuracy varies depending on the language being transcribed.

Whisper first launched in September 2022. Version 2 of the large model followed in December 2022 with expanded language support, though OpenAI did not specify which languages benefited.

Whisper large-v3, available under a permissive license on GitHub, lets users transcribe various forms of content with high accuracy. Its timestamp feature adds significant value and could streamline subtitle generation on video platforms like YouTube.
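To illustrate how timestamps feed into subtitles, here is a minimal sketch that converts timed segments into SRT subtitle text. It assumes segments shaped like the "segments" list that the open-source whisper package returns from transcribe(), each with "start", "end" and "text" fields. The helper names and sample segments are hypothetical and are not part of Whisper itself.

```python
def format_srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn a list of {"start", "end", "text"} dicts into SRT subtitle text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_srt_time(seg["start"])
        end = format_srt_time(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}")
    return "\n\n".join(blocks) + "\n"


# Hypothetical segments, shaped like the "segments" list from transcribe().
example_segments = [
    {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
    {"start": 2.5, "end": 6.04, "text": " Today we look at speech recognition."},
]
print(segments_to_srt(example_segments))
```

Each subtitle block consists of a running index, a time range and the spoken text, which is the layout most video players expect from an .srt file.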

OpenAI’s Multilingual Speech Recognition Breakthrough

Whisper large-v3 processes audio by first segmenting it into 30-second clips and then running it through a complex system that includes an encoder and decoder to generate the output.
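The segmentation step can be sketched in a few lines of plain Python. This is a simplified illustration, not OpenAI's implementation: it assumes mono audio already resampled to 16 kHz (the rate Whisper works with) and cuts it into fixed, zero-padded 30-second windows, whereas the real pipeline slides its window based on predicted timestamps. The function and constant names are hypothetical.

```python
SAMPLE_RATE = 16_000      # samples per second; Whisper resamples audio to 16 kHz
WINDOW_SECONDS = 30       # window length described in the article


def split_into_windows(samples, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Split a sequence of audio samples into fixed-length windows.

    The final window is padded with zeros (silence) so that every window
    has the same length.
    """
    window_size = sample_rate * window_seconds
    windows = []
    for start in range(0, len(samples), window_size):
        window = list(samples[start:start + window_size])
        window.extend([0.0] * (window_size - len(window)))
        windows.append(window)
    return windows


# 70 seconds of dummy audio gives three windows: 30 s, 30 s, and 10 s plus padding.
audio = [0.1] * (70 * SAMPLE_RATE)
windows = split_into_windows(audio)
print(len(windows), [len(w) for w in windows])
```

Padding the last window keeps the input shape constant, so the encoder always sees the same amount of audio regardless of the clip length.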

These components work in unison to predict the textual transcription of the spoken words. Among the technical highlights of Whisper large-v3 are language identification and the ability not only to transcribe multilingual speech but also to translate it into English.
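As a rough sketch of how language identification and translation might be combined, the example below uses the open-source whisper Python package (installed as openai-whisper, with ffmpeg available). The wrapper function names and the sample probabilities are hypothetical. The whisper import sits inside the function so the small helper can be tried without downloading a model.

```python
def most_likely_language(probabilities):
    """Return the language code with the highest probability."""
    return max(probabilities, key=probabilities.get)


def identify_and_translate(audio_path, model_name="large-v3"):
    """Detect the spoken language of a file, then translate it into English.

    Requires the openai-whisper package and ffmpeg; the import is kept
    inside the function so the helper above works without them.
    """
    import whisper

    model = whisper.load_model(model_name)

    # Language identification looks at a single 30-second window.
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)
    _, probabilities = model.detect_language(mel)
    language = most_likely_language(probabilities)

    # task="translate" asks the decoder for English text instead of a
    # transcript in the source language.
    result = model.transcribe(audio_path, language=language, task="translate")
    return language, result["text"]


# Hypothetical probabilities, shaped like the dict detect_language() returns.
print(most_likely_language({"en": 0.05, "de": 0.90, "fr": 0.05}))
```

Calling identify_and_translate("interview.mp3"), with a file name of your own, would download the model on first use, so expect a large download and a noticeable delay.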

While initial plans suggested integration with the popular ChatGPT to facilitate direct voice interaction with the chatbot, OpenAI has opted to grant the public direct access to Whisper large-v3. It’s worth noting that the current target audience for Whisper is primarily researchers, not the general public.

OpenAI’s commitment to advancing robust speech processing is evident in its decision to open-source Whisper large-v3. The organization underscores its objective to foster the development of practical applications and further research in this field.

OpenAI trained Whisper on a vast dataset of 680,000 hours of supervised audio gathered from the internet, including a substantial share of non-English audio. The open-source release aims to fuel innovation and broaden the scope of voice recognition technology worldwide.

About The Author

Nik is an accomplished analyst and writer at Metaverse Post, specializing in delivering cutting-edge insights into the fast-paced world of technology, with a particular emphasis on AI/ML, XR, VR, on-chain analytics, and blockchain development. His articles engage and inform a diverse audience, helping them stay ahead of the technological curve. Possessing a Master's degree in Economics and Management, Nik has a solid grasp of the nuances of the business world and its intersection with emergent technologies.

Nik Asti