"DeepSeek AI: The Chinese Revolution That Shook the Global Tech Industry"


A Chinese AI, DeepSeek, has shaken the entire world. American tech companies and the US stock market have been shocked by something they never imagined. The release of DeepSeek AI by a Chinese company should be a wake-up call. On 20th January 2025, a Chinese research lab launched its AI chatbot.

It was called DeepSeek R1. Along with it, they published a research paper claiming that their chatbot beats the world's most advanced chatbots on many benchmarks, such as maths and reasoning.

That is, OpenAI's ChatGPT o1 model, Meta's Llama, Google's Gemini Advanced: it left them all behind in one fell swoop. DeepSeek's performance is best-in-class, it is more efficient, and it took the least time and money to build. But the most amazing thing is that using it is completely free of cost for you and for all of us.

On the other hand, OpenAI charges $200 per month for its ChatGPT o1 Pro model. Not only that, the cost of training DeepSeek is reported to be just $5.6 million, whereas American companies, whether OpenAI, Meta or Google, are spending billions of dollars to build their artificial-intelligence models.

Coincidentally, some time ago, OpenAI's founder Sam Altman was asked whether foundational AI models like ChatGPT could be built by Indians in India. He answered arrogantly that no one can do it except us; you can try, but it will be hopeless. And it's your job to, like, try anyway. Today, Sam must be feeling hopeless, because a week after its launch, DeepSeek became the most downloaded app on the App Store and Google Play Store in America, leaving ChatGPT behind. The next day, it became the number one app on the App Store in India and other countries.

And by the 27th of January, America's financial markets were in turmoil. The launch of DeepSeek, a Chinese-built chatbot, immediately rattled investors and wiped a staggering $1 trillion off the US tech index.

Before DeepSeek, the most valuable company in the world was NVIDIA, with a valuation of $3.5 trillion. But in just one day, it fell to $2.9 trillion. This company makes computer chips, and it specializes in the chips that AI models use for training and inference. NVIDIA's shares fell by 17% in a single day, and its valuation dropped by $589 billion, the biggest single-day loss any company has suffered in history.

The NASDAQ, the benchmark index of US tech companies, also fell by 3.1%. But why did NVIDIA suffer such a huge loss? There's a very interesting reason behind it, which we'll come to later. But first, let's look at the story behind DeepSeek.

Where did this AI come from, and why did it shock the world? The credit for building DeepSeek goes to a 40-year-old Chinese entrepreneur, Liang Wenfeng. Take a close look at his face, because it is hardly ever visible. To date, he has made very few public appearances and keeps a low profile; much of his life and history is not available to the public.

But we do know that in 2015 he created a hedge fund called High-Flyer, which used mathematics and AI to make investments. In 2019, he founded High-Flyer AI to research artificial-intelligence algorithms.

Then, in May 2023, he used the money he was earning from his hedge fund to start a side project: building an AI model. Liang said he wanted to create his own AI model, one better than every AI model in the world, and the reason behind it was scientific curiosity. Earning a profit was not his motive. When he assembled his research team, he didn't hire seasoned engineers. Instead, he recruited PhD students from China's top universities.

To train his AI model, the world's most difficult questions were used. And in just two years, he spent only a few million dollars to launch the DeepSeek R1 model. That cost is far less than people imagined: you don't need as much cash as we once thought. Only 200 people were involved in building it, and 95% of them were under 30. Compare that to a company like OpenAI, which has more than 3,500 employees.

Today, DeepSeek is China's only major AI firm that isn't backed by tech giants like Baidu, Alibaba and ByteDance. It's interesting to understand its architecture. DeepSeek is a chain-of-thought model.

OpenAI's ChatGPT o1 is also a chain-of-thought model, and the two are very similar to each other. The naming can be confusing, because OpenAI has a lot of weird names for its models.

Let me explain for your understanding. OpenAI first launched ChatGPT, based on GPT-3.5, in November 2022, and in March 2023 it launched GPT-4, which was better. In May 2024, GPT-4o was launched, a multimodal version of GPT-4: you could not only talk to it through text, you could also speak to it and send it photos. It could understand photos and generate photos for you. That's why it's called multimodal. On 12th September 2024, OpenAI launched its o1 model, the first model based on a chain of thought.

When you asked it questions, it would think while answering. "Thinking" here means that whenever it generated an answer, it counter-questioned itself: should I give this answer to the user, or is there a better one? It would look at an answer from many angles before replying. This is called a chain of thought, and it tries to copy the reasoning process of humans.

For example, if you ask ChatGPT-4o which is bigger, 9.11 or 9.9, it answers without thinking that 9.11 is bigger than 9.9, which is wrong. But when you ask the same thing to ChatGPT o1, it thinks while answering and counter-questions itself about whether the answer is correct. In my case, it thought for 18 seconds, but the answer was correct: 9.9 is actually bigger. If you ask the same thing to DeepSeek, you will also get the correct answer.
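This failure mode is easy to reproduce outside an LLM. Here is a toy Python sketch, purely illustrative and not how any model actually works internally, contrasting a "pattern-matching" comparison that treats the digits after the decimal point like version numbers with a checked comparison that treats both values as real numbers:

```python
def naive_compare(a: str, b: str) -> str:
    """Compare only the digits after the dot (11 vs 9), the way a model
    might pattern-match on version-number style ordering."""
    frac_a = int(a.split(".")[1])
    frac_b = int(b.split(".")[1])
    return a if frac_a > frac_b else b

def checked_compare(a: str, b: str) -> str:
    """'Think before answering': compare the values as actual numbers."""
    return a if float(a) > float(b) else b

print(naive_compare("9.11", "9.9"))    # 9.11 (the wrong answer)
print(checked_compare("9.11", "9.9"))  # 9.9  (the correct answer)
```

The naive path picks 9.11 because 11 > 9, while the checked path converts both strings to numbers first. That extra verification step is the same kind of self-check a chain-of-thought model performs before answering.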

The old AI models had a huge shortcoming, and the chain-of-thought process has largely fixed it. This is why models like o1 and DeepSeek are called advanced reasoning models: they are much better at answering logically. Another good thing about DeepSeek is that it shows you, step by step and in detail, what it is thinking before answering.

For example, I asked it this question: give me one truly original insight about humans. It starts thinking: let me think of something that makes humans unique. Then it counter-questions itself: but those are pretty well-known, the user needs something original, maybe I should think about how these traits interact with each other in an unexpected way.

If you pause the video at any point, you will notice that it is trying to answer this question from different angles. You can see on the screen how long it is thinking and how many possibilities it is considering before arriving at the actual answer. After a lot of thinking, its answer is that humans are the only species that uses storytelling like a cognitive exoskeleton.

But when this question was asked to ChatGPT-4o, the answer was instant. In the same way, if you ask these older chatbots to select a random number, they will immediately pick one for you. But if you ask a chain-of-thought model to do this, it will keep thinking for a long time.

Look at this screen recording. Poor DeepSeek is confused about which number to give. Should I give a commonly used number, or a number between 0 and 1? I could say 42, but that would be a cliché. Or I could choose 17 or 73. Should I select a number within that range or go beyond it? Because the user said "completely random", if I stay in a range, it won't be completely random. Just like an over-thinking person. When it finally answers, it also explains that it decided to keep the range from 1 to 100 and then chose the number 42. If you want more in-depth knowledge about AI chatbots and want to upskill yourself in this field, then I have my MasterChatGPT course for you.

Coming back to our topic: after the release of DeepSeek, its biggest drawback has also become clear, and that is its censorship of anything related to the Chinese government and politics. If you ask it any critical question, like what happened in Tiananmen Square in 1989, what the biggest criticisms of Xi Jinping are, whether Taiwan is an independent country, or why China is criticized for human rights abuses, DeepSeek has only one answer: sorry, I'm not sure how to answer such questions; let's talk about math, coding and logic problems instead. But if you ask DeepSeek to criticize any other world leader, like Joe Biden, Donald Trump or Putin, it gives a detailed answer.

Actually, all AI models in China are tested with around 70,000 questions to check whether they give "safe" answers to politically controversial queries. China's chief internet regulator, the Cyberspace Administration of China, runs these tests, and after them, Chinese AI models simply refuse to answer such questions. Some people say we should boycott DeepSeek completely because its answers contain only Chinese propaganda. But the important point here is that DeepSeek is open-source software.

Its code is publicly available and anyone can download it. One way to use DeepSeek is to install its app from the App Store or Google Play Store. But another way is to download the model itself and run this AI locally on your own computer. By doing this, you can change its code and modify it according to your use case. American companies have already started doing exactly that, like Perplexity AI.

They downloaded the DeepSeek R1 model and removed the censorship, so you can now use the R1 model inside Perplexity. Microsoft did the same thing. Look at this news article from 29th January: the DeepSeek R1 model is available in Azure AI Foundry, and they have announced that you can use the model in Copilot as well.

Because it is open-source software, DeepSeek, even though it is Chinese, is seen as a big win for trust. People have made fun of American companies like OpenAI, which kept "Open" in its name: in the beginning they said they would work for the public and keep everything open source, but in reality they didn't. Elon Musk has mockingly called OpenAI "ClosedAI". Today, a Chinese AI is more open than OpenAI.

I would like to show you a performance comparison of the top AI chatbots available today: OpenAI's ChatGPT, Anthropic's Claude, Google's Gemini, Alibaba's Qwen 2.5, and Meta's Llama. How do they all compare with DeepSeek? According to Artificial Analysis, in coding, DeepSeek is at the top, then ChatGPT, then Claude, then Qwen 2.5 and finally Llama.

In quantitative reasoning: DeepSeek at the top, then ChatGPT, then Qwen 2.5. In scientific reasoning and knowledge: ChatGPT, then DeepSeek, then Claude, and finally Llama. When PCMag tested AI chatbots, they found DeepSeek best for news knowledge. In calculations, ChatGPT and DeepSeek were equal; in writing poems, ChatGPT was better; in making tables, ChatGPT was better; in solving riddles, DeepSeek was better.

According to Artificial Analysis, in terms of overall quality, ChatGPT's o1 scored 90 and DeepSeek's R1 scored 89. But a big disadvantage of DeepSeek is its response time. This is a latency graph showing how many seconds each model takes to answer a question: o1 takes 31.1 seconds, while DeepSeek takes 71.2 seconds. And this response time has been increasing recently, because DeepSeek has become so popular that everyone wants to download and use it, which is overloading their servers.

And because of this, it takes even more time to answer, which is a big disadvantage of DeepSeek. Still, if we compare ChatGPT o1 and DeepSeek, DeepSeek is more innovative in many ways. One example is its mixture-of-experts method. When you ask ChatGPT o1 something, it works like a single model: no matter what your question is, ChatGPT is your engineer, doctor and lawyer all at once. DeepSeek, however, has divided itself into many specialized sub-models. Inside DeepSeek there are different "engineers", "doctors" and "lawyers", and when you ask a question, only the relevant expert is called.

Because of this, answering takes less time, and the number of parameters the AI has to keep active also drops. Traditional models keep all of their parameters active at all times, reportedly around 1.8 trillion. DeepSeek has 671 billion parameters in total, but only 37 billion are active at a time; the rest are activated only when needed. This increases its efficiency and decreases its cost.
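The routing idea can be sketched in a few lines of NumPy. This is a toy illustration only: the expert count, dimensions and gating weights below are made up, not DeepSeek's actual architecture. A router scores all experts for each input, but only the top-k experts actually run:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: 8 experts, but only the top-2 run per token.
n_experts, d_model, top_k = 8, 16, 2

# Each "expert" is just a small weight matrix in this sketch.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d_model, n_experts))  # router (gating) weights

def moe_forward(x):
    """Route token vector x to its top-k experts and mix their outputs."""
    scores = x @ gate_w                   # one routing score per expert
    top = np.argsort(scores)[-top_k:]     # indices of the top-k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()              # softmax over the chosen experts only
    # Only top_k of the n_experts matrices are touched: sparse activation.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

x = rng.standard_normal(d_model)
y = moe_forward(x)
print(y.shape)  # (16,)
```

Only 2 of the 8 expert matrices are multiplied per input, which is the efficiency win the article describes: total capacity stays large while per-question compute stays small.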

To explain what these parameters are and how they work, one video is not enough; it's a long story. But those of you who have taken my MasterChatGPT course will find it covered in detail in the first two theory chapters, where I have also explained tokens and the RLHF method on which ChatGPT is based. Another point of criticism against DeepSeek is the claim that it stole its entire AI model from OpenAI. OpenAI has said it has evidence that DeepSeek used OpenAI's proprietary models to train itself.

They say they have seen evidence of distillation, a technique that can improve the performance of small AI models by training them on the outputs of larger ones. Someone tweeted that the thief's house has been robbed, because in a way ChatGPT itself stole from the entire internet: many books were used to train its models without the writers' permission.
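For context, distillation itself is a standard, well-documented technique: a small "student" model is trained to match the softened output probabilities of a large "teacher". Here is a minimal sketch in plain Python with made-up logits; it is a generic illustration of the idea, not OpenAI's or DeepSeek's actual pipeline:

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, optionally softened by temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy between softened teacher and student distributions.

    A higher temperature exposes how the teacher ranks the wrong answers
    too, not just which answer it thinks is right."""
    p = softmax(teacher_logits, temperature)   # soft teacher targets
    q = softmax(student_logits, temperature)   # student predictions
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))

teacher = [4.0, 1.0, 0.2]        # teacher is confident in class 0
good_student = [3.8, 1.1, 0.1]   # mimics the teacher closely
bad_student = [0.1, 3.9, 1.0]    # disagrees with the teacher

print(distillation_loss(teacher, good_student) <
      distillation_loss(teacher, bad_student))  # True
```

Training the student to minimize this loss pulls its outputs toward the teacher's, which is why querying a proprietary model at scale to build training targets is the behavior OpenAI is alleging.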

This is why 17 prominent writers, including Game of Thrones author George R.R. Martin, filed a copyright-infringement case against OpenAI in September 2023. The New York Times has also sued OpenAI and Microsoft, and beyond that, 8 American newspapers and many Canadian news outlets have filed cases against OpenAI as well.

This is why such memes have gone viral. As users, all of this is good news for us, but at the country and company level, AI wars are raging. In 2022, the American government imposed export controls so that other countries could not get the computer chips required for AI. These controls especially targeted Chinese AI companies and covered NVIDIA's H100 chips, which Chinese companies could no longer buy. This was a problem for DeepSeek, because they had to use NVIDIA's older computer chips to train their AI models. America had tried its best to prevent any other country from building these foundational AI models by denying them the chips needed to train them.

But the people working at DeepSeek were forced to innovate. They created software that uses far fewer resources and works more efficiently. This is the real reason NVIDIA's stock fell the most: until a few months ago, companies like Meta, OpenAI and Google were saying that to scale AI to a higher level, they would need more chips, more energy and more money.

But DeepSeek did all this with less money, less energy and less capable computer chips. We should see it as motivation: if it can be done in China, it can be done in India too. Indians also have the capability for such innovation; we just need to focus and put our energy in the right place. On a personal level, take advantage of this opportunity to upskill yourself in the field of AI.

If you haven't started yet, you are not late. This is the right time to learn AI and to increase your productivity and efficiency with it, because there is no doubt that those who ignore this technology will be left behind.
