Chinese Tech Companies Lead the Open-Sourcing Wave of Video Generation Models

Chinese tech firms, notably Tencent and Alibaba, are leading the open-sourcing of AI video generation models. Tencent has released an open-source video generation model, while Alibaba’s Wan 2.1 ranks highest on the VBench Leaderboard. StepFun plans to follow by open-sourcing its own models. The trend aims to broaden access despite the heavy computational demands of video generation.

The trend of open-sourcing artificial intelligence models in the Chinese tech sector is extending to video generation tools. Tencent Holdings recently released an open-source image-to-video model that lets users create five-second videos from a single image. Alibaba Group made its video- and image-generating AI tool, Wan 2.1, publicly available on February 25.

Furthermore, the AI startup StepFun announced plans to open-source its text-to-video and text-to-speech models, developed in collaboration with Geely Auto. The move follows the example of industry leader DeepSeek. Tencent’s model has 13 billion parameters and can generate realistic videos, including animated characters. The Hunyuan LLM team has shared resources such as weights and inference code on GitHub and Hugging Face.

Using Tencent’s model, users can design scenes, select camera angles, and provide text and audio for lip-synced outputs. Demonstrations show fluid subject movements and realistic speech patterns. However, video generation requires far more computing power and data than image generation, which has made developers reluctant to open-source their costly models.
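To give a rough sense of the computing demands involved, the sketch below estimates the memory needed just to store the weights of a 13-billion-parameter model such as Tencent’s. The bytes-per-parameter values are the standard sizes of the listed numeric formats; the function name and the choice of formats are illustrative assumptions, not details from Tencent, and real inference needs additional memory for activations and other components.

```python
# Back-of-the-envelope estimate of weight storage only (illustrative sketch).
# Standard sizes of common numeric formats, in bytes per parameter.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 13_000_000_000  # 13 billion parameters, as reported for Tencent's model
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(params, dtype):.0f} GB")
```

Even at 16-bit precision the weights alone come to about 26 GB under these assumptions, which is more than many consumer graphics cards offer.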

Tencent previously unveiled HunyuanVideo, which also has more than 13 billion parameters, making it one of the largest open-source video generation tools available. Experts suggest that while the technical direction of AI video generation remains uncertain, open-source tools can help drive advances while the technology is still being refined.

According to the VBench Leaderboard, Wan 2.1 currently ranks first among video generation models with a score exceeding 86%, ahead of OpenAI’s Sora. HunyuanVideo holds the 12th position, while Zhipu AI’s CogVideoX1.5-5B ranks 15th.

The surge in Chinese tech companies embracing open-source video generation models is notable, particularly with Tencent’s and Alibaba’s recent releases. This initiative opens opportunities for innovation, despite challenges posed by computational requirements. As the industry navigates the complexities of AI video generation, open-source resources are expected to play a crucial role in driving progress and accessibility.

Original Source: www.yicaiglobal.com
