DeepSeek: Everything you need to know about the AI chatbot app

DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts (and Google Play, too). DeepSeek's AI models, which were trained using compute-efficient techniques, have led Wall Street analysts – and technologists – to question whether the United States can maintain its lead in the AI race and whether demand for AI chips will hold up.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek's trader origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Liang, who began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management as a hedge fund in 2019 focused on developing and deploying AI algorithms.

In 2023, High-Flyer launched DeepSeek as a lab dedicated to researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab was spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip that is available to U.S. companies.

DeepSeek's technical team is said to skew young. The company reportedly recruits doctorate-level AI researchers aggressively from top Chinese universities. DeepSeek also hires people without any computer science background to help its technology better understand a wide range of subjects, according to The New York Times.

DeepSeek's strong models

DeepSeek unveiled its first set of models – DeepSeek Coder, DeepSeek LLM and DeepSeek Chat – in November 2023. But it was not until last spring, when the startup released its next-generation DeepSeek-V2 family of models, that the AI industry started to take notice.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on various AI benchmarks – and was far cheaper to run than comparable models at the time. This forced DeepSeek's domestic competitors, including ByteDance and Alibaba, to cut the usage prices of some of their models and to make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek's notoriety.

According to DeepSeek's internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models like Meta's Llama and “closed” models that can only be accessed through an API, such as OpenAI's GPT-4o.

Equally impressive is DeepSeek's R1 reasoning model. Released in January, R1 performs as well as OpenAI's o1 model on key benchmarks, DeepSeek claims.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer – usually seconds to minutes longer – to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science and mathematics.

However, there is a downside to R1, DeepSeek V3 and DeepSeek's other models. Being Chinese-developed AI, they are subject to benchmarking by China's internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek's chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan's autonomy.

A disruptive approach

If DeepSeek has a business model, it is not clear what that model is, exactly. The company prices its products and services well below market value – and gives others away for free. It is also not taking investor money, despite a ton of interest from VCs.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts, however, dispute the figures the company has supplied.

In any case, developers have taken to DeepSeek's models, which are not open source as the phrase is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek's models, developers on Hugging Face have created more than 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success against larger and more established rivals has been described as “upending AI” and “over-hyped.” The company's success was at least partly responsible for causing Nvidia's stock price to drop by 18% in January, and for eliciting a public response from OpenAI CEO Sam Altman.

Microsoft announced that DeepSeek is available on its Azure AI Foundry service, the Microsoft platform that brings together AI services for enterprises under a single banner. Asked about DeepSeek's impact on Meta's AI spending during the company's first-quarter earnings call, CEO Mark Zuckerberg said spending on AI infrastructure will continue to be a “strategic advantage” for Meta. In March, OpenAI called DeepSeek “state-subsidized” and “state-controlled,” and recommended that the U.S. government consider banning DeepSeek's models.

During Nvidia's fourth-quarter earnings call, CEO Jensen Huang emphasized DeepSeek's “excellent innovation,” saying that it and other “reasoning” models are great for Nvidia because they need so much more compute.

At the same time, some companies are banning DeepSeek, and so are entire countries and governments, including South Korea. New York state has also banned DeepSeek from being used on government devices.

As for what DeepSeek's future might hold, it is not clear. Improved models are a given. But the U.S. government appears to be growing wary of what it perceives as harmful foreign influence. In March, The Wall Street Journal reported that the United States will likely ban DeepSeek on government devices.

This story was originally published on January 28, 2025, and will be updated regularly.
