
Alibaba's Qwen research team has released another open source artificial intelligence (AI) model in preview. Called QVQ-72B, it is a vision-based reasoning model that analyzes visual information in images and understands the context behind it. The tech giant also shared benchmark scores for the AI model and highlighted its ability to surpass OpenAI's o1 model on a specific test. Notably, Alibaba has recently released several open source AI models, including the QwQ-32B and Marco-o1 reasoning large language models (LLMs).
Alibaba's Vision-Based QVQ-72B AI Model Launched
In a listing on Hugging Face, Alibaba's Qwen team detailed the new open source AI model. The researchers call it an experimental research model and emphasize that QVQ-72B comes with enhanced visual reasoning capabilities. Notably, visual understanding and reasoning are two separate capabilities that the researchers have merged in this model.
Vision-based AI models are fairly common. They include image encoders that can analyze visual information and the context behind it. Reasoning-focused models (such as o1 and QwQ-32B), on the other hand, rely on test-time compute scaling, which lets them spend more processing time on a query. This allows a model to break a problem down, solve it step by step, evaluate its own output, and correct it against a verification step.
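As a rough illustration of how test-time compute scaling can work, the Python sketch below spends a configurable amount of inference-time effort on a problem by sampling several attempts and keeping the first one that passes a verification step. This is a generic pattern rather than a description of QVQ-72B's internals, and the propose_answer and verify functions are hypothetical placeholders.

```python
import random
from typing import Optional


def propose_answer(problem: str, rng: random.Random) -> int:
    # Hypothetical stand-in for one step-by-step reasoning attempt by a model;
    # here it simply guesses a small integer.
    return rng.randint(0, 9)


def verify(problem: str, candidate: int) -> bool:
    # Hypothetical validator that checks a candidate answer.
    # For this toy arithmetic example, Python evaluates the expression directly.
    return candidate == eval(problem)


def solve_with_more_compute(problem: str, budget: int = 64) -> Optional[int]:
    # A larger budget means more attempts, i.e. more test-time compute.
    rng = random.Random(0)
    for _ in range(budget):
        candidate = propose_answer(problem, rng)
        if verify(problem, candidate):
            return candidate
    return None  # budget exhausted without a verified answer


print(solve_with_more_compute("3 + 4"))  # prints 7 if an attempt verifies, else None
```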
With the QVQ-72B preview model, Alibaba combines these two capabilities. The model can now analyze the information in an image and answer complex queries about it using reasoning-centric structures. The team stressed that this greatly improves the model's performance.
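For readers who want to try the preview model, the sketch below shows how one might pass it an image and a question through Hugging Face's transformers library, assuming the model follows the Qwen2-VL interface. The model identifier, image path, and generation settings are assumptions and may need adjusting.

```python
# Minimal sketch, assuming QVQ-72B-Preview exposes the Qwen2-VL interface on Hugging Face.
# Requires the transformers and qwen-vl-utils packages; the image path is a placeholder.
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from qwen_vl_utils import process_vision_info

model_id = "Qwen/QVQ-72B-Preview"  # assumed repository name
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# One user turn containing an image plus a reasoning-style question about it.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "path/to/diagram.png"},
            {"type": "text", "text": "Solve the problem shown in this figure step by step."},
        ],
    }
]

# Build the chat prompt and pack the image into model inputs.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
image_inputs, video_inputs = process_vision_info(messages)
inputs = processor(
    text=[text], images=image_inputs, videos=video_inputs,
    padding=True, return_tensors="pt"
).to(model.device)

# Reasoning models tend to produce long chains of thought, so allow many new tokens.
output_ids = model.generate(**inputs, max_new_tokens=4096)
trimmed = [out[len(inp):] for inp, out in zip(inputs.input_ids, output_ids)]
print(processor.batch_decode(trimmed, skip_special_tokens=True)[0])
```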
The researchers shared evaluation results from internal testing, claiming that QVQ-72B scored 71.4 percent on the MathVista (mini) benchmark, outperforming the o1 model (71.0 percent). It is also said to have scored 70.3 percent on the Massive Multi-discipline Multimodal Understanding (MMMU) benchmark.
Despite the performance gains, the experimental model also has some limitations. The Qwen team said the AI model occasionally mixes different languages or unexpectedly switches between them, a problem known as code-switching. The model is also prone to getting stuck in recursive reasoning loops, which can affect the final output.