
Last week, Hugging Face introduced two new variants of its SmolVLM vision language model. The new artificial intelligence (AI) models come in 256 million and 500 million parameter sizes, the former being called the world's smallest vision language model by the company. The new variants focus on retaining the efficiency of the 2 billion parameter model while significantly reducing its size. The company stressed that the new models can run locally on constrained devices and consumer laptops, and potentially even support browser-based inference.
Hugging Face Introduces Smaller SmolVLM AI Models
In a blog post, the company announced the SmolVLM-256M and SmolVLM-500M vision language models, which join the existing 2 billion parameter model. The release brings two base models and two instruction fine-tuned models in the above parameter sizes.
Hugging Face says these models can be loaded directly into Transformers, MLX (Apple's machine learning framework), and ONNX (Open Neural Network Exchange), and developers can build on top of the base models. It is worth noting that these are open-source models released under the Apache 2.0 licence and are available for both personal and commercial use.
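As an illustration of what loading such a model through Transformers might look like, here is a minimal sketch. The model ID HuggingFaceTB/SmolVLM-256M-Instruct, the placeholder image URL, and the chat-template prompt format are assumptions based on common Hugging Face conventions, not details confirmed in the announcement.

```python
# Minimal sketch: running the 256M instruct variant with Transformers (model ID assumed).
import torch
from transformers import AutoProcessor, AutoModelForVision2Seq
from transformers.image_utils import load_image

MODEL_ID = "HuggingFaceTB/SmolVLM-256M-Instruct"  # assumed model identifier

# Load the processor (image preprocessing + chat templating) and the model weights.
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = AutoModelForVision2Seq.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Any local path or URL works; this one is a placeholder.
image = load_image("https://example.com/sample.jpg")

# Build a chat-style prompt containing one image and one text instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image briefly."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)

# Tokenize, generate, and decode the model's answer.
inputs = processor(text=prompt, images=[image], return_tensors="pt")
generated_ids = model.generate(**inputs, max_new_tokens=100)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```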
With the new AI models, Hugging Face aims to bring multimodal models focused on computer vision to portable devices. For example, the 256 million parameter model can run in under 1GB of GPU memory and, using 15GB of RAM, processes 16 images per second at a batch size of 64.
“For a mid-sized company processing 1 million images per month, this means significant savings in annual computational costs,” Andrés Marafioti, a machine learning research engineer at Hugging Face, told VentureBeat.
To reduce the size of the AI models, the researchers switched the vision encoder from the previous SigLIP 400M to a 93M parameter SigLIP base patch encoder. In addition, tokenization has also been optimised: the new vision models encode images at a rate of 4,096 pixels per token, compared with 1,820 pixels per token in the 2B model.
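To put those encoding rates in perspective, the short calculation below compares the approximate number of image tokens a 512 × 512 image would cost under each rate; the image size is an arbitrary example, not a figure from the announcement.

```python
# Rough token-count comparison for an example 512 x 512 image (arbitrary size).
width, height = 512, 512
pixels = width * height  # 262,144 pixels

tokens_small = pixels / 4096  # new 256M/500M models: 4,096 pixels per token -> ~64 tokens
tokens_2b = pixels / 1820     # existing 2B model: 1,820 pixels per token -> ~144 tokens

print(f"256M/500M models: ~{tokens_small:.0f} tokens per image")
print(f"2B model:         ~{tokens_2b:.0f} tokens per image")
```

Fewer tokens per image mean less memory and compute are spent on each picture, which is what allows the smaller variants to run on constrained hardware.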
It is worth noting that the smaller models lag slightly behind the 2B model in terms of performance, but the company says the trade-off is acceptable. According to Hugging Face, the 256M variant can be used for captioning images or short videos, answering questions about documents, and basic visual reasoning tasks.
Developers can use Transformers and MLX for inference and fine-tuning of the AI models, and existing SmolVLM code works with them out of the box. The models are also listed on Hugging Face.