In Brief

MiniMax concluded its week of product announcements with the launch of Hailuo Video Agent, an AI-driven video creation tool, and Voice Design, a multilingual text-to-speech generator.

Chinese AI company MiniMax announced that it has launched Hailuo Video Agent in beta. The AI-driven application converts basic text or image prompts into short, high-quality video clips with a single click. It relies on techniques such as frame-by-frame physics simulation, motion-based prompting, and multimodal parsing to make creative video production accessible.

The beta release is an early step in the product’s development. It introduces initial creative capabilities intended to stimulate ideation, and MiniMax presents it as the start of a new era in AI video generation.

On the platform, users select a preferred creative agent style, describe their idea in plain language with no technical knowledge required, and receive a fully rendered, polished video generated by the Hailuo Agent.

The Hailuo Video Agent is being developed in three distinct stages. Stage One includes prebuilt video agent templates that generate high-quality, creative videos from user-submitted text or images with a single command.

Stage Two will introduce semi-customizable video agents, giving users the option to modify all aspects of the video creation process, including the script, visuals, and voiceover. Stage Three will deliver a fully autonomous, end-to-end video agent capable of transforming creative input into a final-cut video with minimal manual involvement.

MiniMax stated that it intends to roll out the Stage Two agent creation tools gradually over the summer.

In addition, MiniMax has unveiled Voice Design, a zero-shot text-to-speech model. It uses a learnable speaker encoder to replicate the vocal timbre of a reference voice without requiring a transcript of the reference audio, which enables expressive, high-quality speech synthesis as well as one-shot voice cloning. The model supports output in 32 languages and offers features such as emotion modulation and professional-grade voice customization.

MiniMax Launches MiniMax-M1 LLM And Hailuo 02 Video Model

MiniMax is widely regarded as one of China’s prominent emerging AI startups. The company develops large-scale multimodal AI systems spanning text, voice, image, and video generation, including its Hailuo video model.

Its infrastructure supports the production of billions of text tokens and millions of video segments. MiniMax is supported by significant investors such as Alibaba, Tencent, and IDG, and is categorized among a select group of high-growth Chinese AI startups often referred to as the Little Dragons, which have collectively attracted substantial venture capital over the past year.

Last week, the company launched several new technologies, including a large language model (LLM) named MiniMax-M1. MiniMax presents the model as more efficient than other proprietary models in China, and it reportedly outperforms DeepSeek’s R1-0528 model across various benchmark tests. The company also introduced a new version of its video generation model, Hailuo 02, which offers native 1080p resolution, closer adherence to user instructions, and improved simulation of complex physical environments.

