November 07, 2023

Whisper V3 by OpenAI Goes Open Source, Expanding Voice Recognition Across Languages

In Brief

OpenAI announced the open-source release of Whisper V3, a state-of-the-art model for voice recognition in multiple languages.

OpenAI Unveils Whisper V3: Revolutionizing Voice Recognition Across Languages

Artificial intelligence (AI) research company OpenAI has taken a significant leap in speech recognition by open-sourcing its state-of-the-art model, Whisper large-v3, during its Developer Day event.

This latest iteration of the Whisper model demonstrates a remarkable ability to understand and transcribe voice in a multitude of languages, broadening its applicability beyond the English-centric models of the past.

Whisper large-v3 thrives in diverse conditions, adeptly handling various language inputs. According to OpenAI, English-only models such as tiny.en and base.en tend to perform better for English-only applications. Whisper large-v3’s accuracy, however, varies depending on the language being transcribed.

Originally focused on English at its launch in September 2022, the model expanded its capabilities with version 2 in December 2022 to include support for a range of languages, though OpenAI did not specify which ones.

Whisper large-v3, available under a permissive license on GitHub, enables users to transcribe various forms of content with high accuracy. Its timestamp feature adds significant value and could simplify subtitle generation on video platforms like YouTube.
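
To illustrate how timestamps can feed subtitle generation, here is a minimal Python sketch that turns timestamped segments into SubRip (SRT) text. It assumes each segment is a dictionary with "start" and "end" times in seconds and a "text" field; the function names format_timestamp and segments_to_srt are hypothetical and are not part of any OpenAI release.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT subtitle text from timestamped transcript segments.

    Assumption: each segment is a dict with "start", "end" (seconds)
    and "text" keys. This layout is illustrative, not a guaranteed API.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # One SRT cue: counter, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
        {"start": 2.5, "end": 5.0, "text": " Today we look at speech recognition."},
    ]
    print(segments_to_srt(demo))
```

The resulting text can be saved with an .srt extension and loaded by most video players.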

Source: OpenAI

OpenAI’s Multilingual Speech Recognition Breakthrough

Whisper large-v3 processes audio by first segmenting it into 30-second clips and then running them through an encoder and a decoder to generate the output.
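
The segmentation step can be sketched in a few lines of Python. This is a simplified illustration rather than OpenAI's implementation: it assumes mono audio already loaded as a list of samples at 16 kHz, cuts it at fixed 30-second boundaries, and pads the final clip with silence. The name split_into_chunks is hypothetical, and the released code may place clip boundaries differently.

```python
SAMPLE_RATE = 16_000                       # samples per second (assumed input rate)
CHUNK_SECONDS = 30                         # clip length described in the article
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS


def split_into_chunks(samples, chunk_samples=CHUNK_SAMPLES):
    """Split a sequence of audio samples into fixed-length clips.

    The last clip is zero-padded (silence) so every clip has the same
    length, which is what a fixed-size encoder input requires.
    """
    chunks = []
    for start in range(0, len(samples), chunk_samples):
        chunk = list(samples[start:start + chunk_samples])
        # Pad a short final clip with zeros up to the full clip length.
        chunk.extend([0.0] * (chunk_samples - len(chunk)))
        chunks.append(chunk)
    return chunks


if __name__ == "__main__":
    # 70 seconds of placeholder audio yields three 30-second clips.
    audio = [0.1] * (70 * SAMPLE_RATE)
    clips = split_into_chunks(audio)
    print(len(clips), [len(c) for c in clips])
```

Each clip would then be converted to model input features and passed to the encoder and decoder in turn.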

These components work in unison to predict the textual transcription of the spoken words. Technical highlights of Whisper large-v3 include language identification, multilingual transcription, and translation of speech from other languages into English.

While initial plans suggested integration with the popular ChatGPT to facilitate direct voice interaction with the chatbot, OpenAI has opted to grant the public direct access to Whisper large-v3. It’s worth noting that the current target audience for Whisper is primarily researchers, not the general public.

OpenAI’s commitment to advancing robust speech processing is evident in its decision to open-source Whisper large-v3. The organization underscores its objective to foster the development of practical applications and further research in this field.

OpenAI trained Whisper on a vast dataset of 680,000 hours of supervised audio gathered from the internet, including a substantial share of non-English audio. The open-source release aims to fuel innovation and broaden the scope of voice recognition technology worldwide.


About The Author

Nik is an accomplished analyst and writer at Metaverse Post, specializing in delivering cutting-edge insights into the fast-paced world of technology, with a particular emphasis on AI/ML, XR, VR, on-chain analytics, and blockchain development. His articles engage and inform a diverse audience, helping them stay ahead of the technological curve. Possessing a Master's degree in Economics and Management, Nik has a solid grasp of the nuances of the business world and its intersection with emergent technologies.

Nik Asti
