Markets News Report
August 08, 2023

Alibaba Introduces Open-Source Qwen-7B Language Model

Alibaba has unveiled Qwen-7B, its first open-source large language model (LLM). The model is built on 7 billion parameters.

For context, Qwen-7B was trained on 2.2 trillion tokens. The context length during training was 2,048 tokens, which users can extend to a maximum of 8,192 at inference time. By comparison, Llama-2 offers a context length of 4,096.

Benchmarks are essential for gauging the performance of such models, and here the Chinese developers assert that Qwen-7B surpasses Llama-2. One standout metric is the HumanEval coding benchmark, where Qwen-7B scores 24.4 against Llama-2's 12.8. It is prudent, however, to view these numbers with a degree of caution. Some benchmarks indicate that Qwen-7B outperforms not only the base Llama-2-7B model but also the Llama-2-13B variant, yet when pitted against the fine-tuned versions of Llama-2, the margin narrows. It should also be noted that the developers have not explicitly detailed Qwen-7B's exact training methodology.
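The article does not say how the HumanEval score was computed, but coding benchmarks of this kind commonly report pass@k: the estimated probability that at least one of k sampled completions passes the unit tests, computed with an unbiased estimator over n generated samples. A minimal sketch of that estimator (illustrative, not taken from Qwen's evaluation code):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    completions drawn from n samples (c of which are correct) passes."""
    if n - c < k:
        return 1.0  # too few failures to fill k draws without a success
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# Example: 200 samples per problem, 30 of them correct.
# For k=1 the estimator reduces to the raw success rate c/n.
score = pass_at_k(200, 30, 1)  # 0.15
```

A benchmark score like 24.4 would then be this per-problem estimate averaged over all problems in the suite, expressed as a percentage.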

Paralleling Llama-2-Chat, Alibaba has also released a chat-oriented version named Qwen-7B-Chat. This model is optimized for interacting with users and can incorporate external tools and APIs to enhance its responses.
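Tool and API integration in chat models of this kind is typically done through ReAct-style prompting: the prompt lists available tools and a format the model should follow, and the application parses the model's "Action" lines to invoke tools. The template below is a hypothetical illustration of the general technique, not Qwen's actual prompt:

```python
# Hypothetical ReAct-style tool prompt builder. The tool names and
# template wording here are illustrative assumptions, not Qwen's own.

def build_react_prompt(question: str, tools: list) -> str:
    tool_lines = "\n".join(
        f"- {t['name']}: {t['description']}" for t in tools
    )
    return (
        "Answer the question, using the tools below when helpful.\n"
        f"Available tools:\n{tool_lines}\n\n"
        "Use this format:\n"
        "Thought: reasoning about what to do next\n"
        "Action: the tool to call\n"
        "Action Input: arguments for the tool\n"
        "Observation: the tool's result\n"
        "... (repeat Thought/Action/Observation as needed) ...\n"
        "Final Answer: the answer to the question\n\n"
        f"Question: {question}\n"
    )

tools = [{"name": "search", "description": "web search for a query"}]
prompt = build_react_prompt("Who won the 2022 World Cup?", tools)
```

The application loop would feed this prompt to the model, execute any tool the model names, append the result as an "Observation:" line, and continue until a "Final Answer:" appears.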

Those interested in technical specifics should note that Qwen-7B's architectural foundation resembles LLaMA's, with several distinguishing features:

  1. Untied input and output embeddings.
  2. Rotary positional embeddings (RoPE).
  3. No biases, with the exception of the QKV projections in attention.
  4. RMSNorm in place of LayerNorm.
  5. SwiGLU activation instead of the standard ReLU.
  6. Flash attention to expedite training.
  7. 32 layers, an embedding dimension of 4096, and 32 attention heads.
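Several of the components listed above can be sketched in a few lines of NumPy. The following is an illustrative re-implementation of RMSNorm, a SwiGLU feed-forward block, and rotary embeddings, written to show what each component computes; it is not Qwen's actual code, and the shapes are toy-sized:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the root-mean-square of the features.
    # Unlike LayerNorm, there is no mean subtraction and no bias term.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu_ffn(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: a SiLU-gated linear unit replaces the
    # plain ReLU MLP of the original Transformer.
    silu = lambda z: z / (1.0 + np.exp(-z))
    return (silu(x @ w_gate) * (x @ w_up)) @ w_down

def rotary_embed(x, base=10000.0):
    # Rotary positional embedding (RoPE): rotate each pair of channels
    # by an angle proportional to the token's position.
    # x has shape (seq_len, head_dim), head_dim even.
    seq_len, dim = x.shape
    inv_freq = 1.0 / base ** (np.arange(0, dim, 2) / dim)
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out
```

Because RoPE encodes position as a rotation (which preserves vector norms) rather than as a learned table of fixed size, it is one reason such models can run at context lengths beyond the one used in training.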

In terms of licensing, Qwen-7B takes a similar approach to Llama-2: commercial use is permitted, but with a cap on user volume. Llama-2 sets the cap at 700 million monthly active users; Qwen-7B's threshold is 100 million.

Those seeking an in-depth examination can refer to the technical report available on GitHub. Additionally, a demonstration of Qwen-7B, provided in the Chinese language, is accessible for those interested in a practical exploration of the model’s capabilities.



Damir is the team leader, product manager, and editor at Metaverse Post, covering topics such as AI/ML, AGI, LLMs, Metaverse, and Web3-related fields. His articles attract a massive audience of over a million users every month. He appears to be an expert with 10 years of experience in SEO and digital marketing. Damir has been mentioned in Mashable, Wired, Cointelegraph, The New Yorker, Inside.com, Entrepreneur, BeInCrypto, and other publications. He travels between the UAE, Turkey, Russia, and the CIS as a digital nomad. Damir earned a bachelor's degree in physics, which he believes has given him the critical thinking skills needed to be successful in the ever-changing landscape of the internet.

Damir Yalalov
