January 29, 2026

Qwen Open-Sources Advanced ASR And Forced Alignment Models With Multi-Language Capabilities

In Brief

Alibaba Cloud has open-sourced its Qwen3-ASR and Qwen3-ForcedAligner AI models, delivering state-of-the-art speech recognition and forced alignment performance across multiple languages and challenging acoustic conditions.

Alibaba Cloud announced that it has made its Qwen3-ASR and Qwen3-ForcedAligner AI models open-source, offering advanced tools for speech recognition and forced alignment. 

The Qwen3-ASR family includes two all-in-one models, Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, which support language identification and transcription across 52 languages and accents, leveraging large-scale speech data and the Qwen3-Omni foundation model. 

Internal testing indicates that the 1.7B model delivers state-of-the-art accuracy among open-source ASR systems, while the 0.6B version balances performance and efficiency and can transcribe 2,000 seconds of speech per second under high concurrency.

The Qwen3-ForcedAligner-0.6B model uses a non-autoregressive LLM approach to align text and speech in 11 languages, outperforming leading forced-alignment solutions in both speed and accuracy.

Alibaba Cloud has also released a comprehensive inference framework under the Apache 2.0 license, supporting streaming, batch processing, timestamp prediction, and fine-tuning, aimed at accelerating research and practical applications in audio understanding.

Qwen3-ASR And Qwen3-ForcedAligner Models Demonstrate Leading Accuracy And Efficiency

Alibaba Cloud has released performance results for its Qwen3-ASR and Qwen3-ForcedAligner models, demonstrating leading accuracy and efficiency across diverse speech recognition tasks. 

The Qwen3-ASR-1.7B model achieves state-of-the-art results among open-source systems, outperforming commercial APIs and other open-source models in English, multilingual, and Chinese dialect recognition, including Cantonese and 22 regional variants. 

It maintains reliable accuracy in challenging acoustic conditions, such as low signal-to-noise environments and child or elderly speech, and can even transcribe singing: on songs with background music, it achieves average word error rates of 13.91% in Chinese and 14.60% in English.
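For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below shows that standard definition only; the function name and the example sentences are illustrative, and the article does not describe the scoring or text normalization behind the reported figures. Chinese transcripts are often scored per character rather than per whitespace-separated word, which this sketch does not attempt.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (not from any benchmark): one dropped word out of six.
example_wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER: {example_wer:.2%}")
```

Under this definition, a word error rate of 14.60% corresponds to roughly 15 word-level edits per 100 reference words.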

The smaller Qwen3-ASR-0.6B balances accuracy and efficiency, delivering high throughput and low latency under high concurrency; it can transcribe up to five hours of speech in online asynchronous mode at a concurrency of 128.
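To put the throughput claims in perspective, the following back-of-the-envelope sketch converts the article's figure of 2,000 seconds of speech per second of wall-clock time into a real-time factor and estimates how long five hours of audio would take at that rate. It assumes the 2,000x figure is an aggregate rate sustained across all concurrent requests, which the article does not state explicitly; the helper names are hypothetical.

```python
def real_time_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Wall-clock time spent per second of audio (lower is faster)."""
    return wall_seconds / audio_seconds


def wall_time_for(audio_seconds: float, throughput: float) -> float:
    """Wall-clock seconds needed at a given audio-seconds-per-second throughput."""
    return audio_seconds / throughput


# Figure quoted in the article for the 0.6B model; assumed here to be
# an aggregate rate across all concurrent requests.
THROUGHPUT = 2000.0  # seconds of speech transcribed per wall-clock second

five_hours = 5 * 3600  # 18,000 seconds of audio
rtf = real_time_factor(audio_seconds=2000.0, wall_seconds=1.0)
seconds_needed = wall_time_for(five_hours, THROUGHPUT)

print(f"Real-time factor: {rtf}")
print(f"Five hours of audio at {THROUGHPUT:.0f}x: {seconds_needed} s")
```

At the quoted rate, five hours of audio works out to about nine seconds of processing, so the two figures in the article are of the same order.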

Meanwhile, the Qwen3-ForcedAligner-0.6B outperforms leading end-to-end forced alignment models, including NeMo-Forced-Aligner, WhisperX, and Monotonic-Aligner, offering superior language coverage, timestamp accuracy, and support for varied speech and audio lengths.

About The Author

Alisa, a dedicated journalist at the MPost, specializes in cryptocurrency, zero-knowledge proofs, investments, and the expansive realm of Web3. With a keen eye for emerging trends and technologies, she delivers comprehensive coverage to inform and engage readers in the ever-evolving landscape of digital finance.

Alisa Davidson