Alibaba Cloud has released Qwen2-VL, a new vision-language model with enhanced visual understanding, video comprehension, and multilingual text-image processing. Qwen2-VL reportedly outperforms models such as Meta's Llama 3.1 and OpenAI's GPT-4o, and it supports applications including real-time video analysis and tech support. The model comes in three sizes: 2B and 7B, both open-source under the Apache 2.0 license, and 72B, which is coming soon.
Alibaba has released more than 100 open-source AI models as it works to improve its technology and compete with rivals. The new Qwen 2.5 models, which bring upgraded math and coding capabilities, target applications ranging from automobiles to gaming. Alibaba has also launched a new proprietary model, Qwen-Max 2.5, and a text-to-video tool to strengthen its AI and cloud service offerings.