
Hugging Face announced a plan on Tuesday to build Open-R1, a fully open reproduction of the DeepSeek-R1 model. Last week, DeepSeek, a Chinese artificial intelligence (AI) firm backed by a hedge fund, released the DeepSeek-R1 model publicly, sending shock waves through Silicon Valley and the Nasdaq. A big reason is that an AI model this advanced and large-scale, one that can outperform OpenAI's o1 model, had never before been released as open source. However, the model is not completely open source either, and Hugging Face researchers are now trying to reconstruct the missing pieces.
Why Is Hugging Face Building Open-R1?
In a blog post, the Hugging Face researchers detailed their reasons for replicating DeepSeek's celebrated AI model. In essence, DeepSeek-R1 is a so-called "black-box" release: the model weights and the code needed to run the software are available, but the training datasets and training code are not. This means anyone can download and run the AI model locally, yet it is impossible to reproduce how it was built.
The unpublished information includes the reasoning-specific datasets used to train the base model, the training code with the hyperparameters that let the model decompose and process complex queries, and the compute and data trade-offs made during training.
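While the training pipeline is withheld, running the released weights is straightforward. Here is a minimal sketch using the transformers library, assuming one of the smaller distilled R1 checkpoints on the Hugging Face Hub (the full model is far too large for most local machines):

```python
# Minimal sketch: loading openly released R1 weights locally with transformers.
# The checkpoint id below is an assumption for illustration.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed distilled variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("How many primes are there below 30?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```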
The researchers say the goal of building a fully open-source version of DeepSeek-R1 is to provide transparency about how reinforcement learning improves reasoning and to share reproducible insights with the community.
Hugging Face's Open-R1 Plan
Since DeepSeek-R1 is publicly available, researchers have already been able to understand certain aspects of the AI model. For example, DeepSeek-R1-Zero, the intermediate model built on the DeepSeek-V3 base, was trained purely with reinforcement learning, without any human supervision. The reasoning-focused R1 model, however, adds several refinement steps that reject low-quality outputs to produce polished, consistent answers.
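As a rough illustration of what pure reinforcement-learning training of this kind can look like, here is a hedged sketch using TRL's GRPOTrainer (GRPO is the RL algorithm DeepSeek described for R1-Zero) with a rule-based accuracy reward. The dataset file, model id, and reward rule are placeholder assumptions, not the actual recipe:

```python
# Hedged sketch of a pure-RL stage with TRL's GRPOTrainer. The dataset,
# model id, and reward rule are placeholders, not the Open-R1 recipe.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Hypothetical JSONL of math questions with "prompt" and "answer" columns.
dataset = load_dataset("json", data_files="math_prompts.jsonl", split="train")

def accuracy_reward(completions, answer, **kwargs):
    # Rule-based reward in the spirit of R1-Zero: 1.0 when the completion
    # contains the reference answer, 0.0 otherwise.
    return [1.0 if ref in completion else 0.0
            for completion, ref in zip(completions, answer)]

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # any small base model
    reward_funcs=accuracy_reward,
    args=GRPOConfig(output_dir="open-r1-zero-sketch"),
    train_dataset=dataset,
)
trainer.train()
```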
To this end, the Hugging Face researchers have laid out a three-step plan. First, a distilled version of R1 will be created from a synthetic reasoning dataset generated by R1 itself (a sketch of this step follows below). Next, the researchers will attempt to replicate the pure reinforcement-learning pipeline, and finally they will add supervised fine-tuning and further reinforcement learning until the model's responses align with R1's.
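The first step, distilling reasoning traces from R1 into a synthetic dataset, might look roughly like the following; the checkpoint id, seed prompts, and file format are illustrative assumptions:

```python
# Hedged sketch of step one: sampling reasoning traces from an R1 checkpoint
# to build a synthetic dataset. Checkpoint id, prompts, and file format are
# illustrative assumptions.
import json
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",  # assumed id
)

prompts = ["Prove that the square root of 2 is irrational."]  # seed questions
with open("r1_reasoning_traces.jsonl", "w") as f:
    for prompt in prompts:
        trace = generator(prompt, max_new_tokens=512)[0]["generated_text"]
        f.write(json.dumps({"text": trace}) + "\n")
```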
The synthetic dataset distilled from the R1 model, along with the training steps, will then be released to the open-source community, enabling developers to turn existing large language models (LLMs) into reasoning models through fine-tuning alone.
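Assuming a dataset of that shape, with a plain "text" column of reasoning traces, the fine-tuning itself could be a straightforward supervised run, sketched here with TRL; the model and file names are placeholders:

```python
# Hedged sketch: supervised fine-tuning an existing small LLM on a synthetic
# reasoning dataset. File and model names are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL with a "text" column holding prompt + reasoning trace.
dataset = load_dataset("json", data_files="r1_reasoning_traces.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # any small base model
    train_dataset=dataset,
    args=SFTConfig(output_dir="reasoning-sft-sketch"),
)
trainer.train()
```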
Notably, Hugging Face used a similar process with the Llama 3B AI model to show that scaling test-time compute (also known as inference-time compute) can significantly enhance small language models.
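One simple form of test-time compute is best-of-N sampling: generate several candidate answers and keep the one a scorer ranks highest. A minimal sketch follows, with an assumed model id and a stub scorer standing in for the learned reward model used in practice:

```python
# Minimal best-of-N sketch of test-time compute. The model id is an
# assumption and the scorer is a crude stand-in; in practice a learned
# reward model ranks the candidates.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-3B-Instruct"  # assumed small model
tok = AutoTokenizer.from_pretrained(model_id)
lm = AutoModelForCausalLM.from_pretrained(model_id)

def score(answer: str) -> float:
    # Placeholder reward: longer answers win, just so the sketch runs.
    return float(len(answer))

def best_of_n(prompt: str, n: int = 8) -> str:
    inputs = tok(prompt, return_tensors="pt")
    prompt_len = inputs["input_ids"].shape[1]
    candidates = []
    for _ in range(n):  # extra compute is spent at inference, not training
        out = lm.generate(**inputs, do_sample=True, temperature=0.8, max_new_tokens=256)
        candidates.append(tok.decode(out[0][prompt_len:], skip_special_tokens=True))
    return max(candidates, key=score)

print(best_of_n("What is 17 * 24?"))
```

The design intuition is that a small model spends extra compute at inference, sampling many attempts and filtering them, rather than relying on a single forward pass.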