10²⁶ parameters, and AGI is still 70 years away! Tsinghua researchers estimate the GPUs required would cost 40 million times Apple's market value
AGI will arrive this year; Nobel Prize-level AI will be born in 2026-2027.
Whether it is OpenAI's Sam Altman or Anthropic CEO Dario Amodei, technology leaders in the AI field all believe that "superintelligence" is just around the corner.
Is AGI really coming?
Recently, a research team from Tsinghua University and Renmin University of China calculated that:
Humanity is still 70 years away from AGI!
They proposed a new framework, "Survival Game", to evaluate the level of intelligence.
In this framework, intelligence is no longer a vague concept: it is quantified by the number of failures during trial and error, and the fewer the failures, the higher the intelligence.
Paper address: https://arxiv.org/pdf/2502.18858
When both the expected value and the variance of the number of failures remain finite, the system can keep responding to new challenges, a condition the authors define as the "Autonomous level" of intelligence.
The experimental results show that in simple tasks, basic pattern recognition or rule reasoning, AI has autonomous capabilities with low and stable failure rates.
However, when tasks become harder, such as vision processing, search, recommendation systems, and natural language understanding, AI's performance falls short of expectations.
The number of failures rises dramatically, and solutions become far less stable.
They predict that to achieve the "Autonomous level" on common human tasks, AI models would need as many as 10²⁶ parameters.
Imagine the scale: the total value of the H100 GPUs required to train such a model is 4×10⁷ times the market value of Apple!
Even under an optimistic reading of Moore's Law, the hardware needed to support this parameter scale would take 70 years of technological progress.
How were these numbers calculated?
Intelligence, the Trial and Error of Natural Selection
First, we need to talk about intelligence. Where does it come from?
In natural selection, organisms constantly face survival challenges and must find solutions through trial and error. Those that fail to find a solution are eliminated in this cruel test and do not get to continue.
Inspired by this, the researchers proposed the "survival game" framework to quantify and evaluate intelligence.
Here, the level of intelligence is no longer an abstract concept, but can be measured by the number of failures in finding the correct solution during the trial and error process.
That is to say, the fewer failures, the higher the intelligence.
The number of failures is treated as a discrete random variable, and its expectation and variance directly reflect the level of intelligence.
If the expectation and variance diverge, the subject essentially never finds the answer and cannot survive the "Survival Game"; conversely, if both converge, the subject can solve the problem efficiently.
Survival Game: three levels of intelligence
Based on the expectation and variance of the number of failures, the researchers classified intelligence into three levels:
Limited level: both the expectation and the variance diverge. The subject can only blindly enumerate possible solutions, which is inefficient and cannot cope with complex challenges.
Capable level: the expectation and variance are finite but large, so the subject can find answers for specific tasks, yet its performance is not robust.
Autonomous level: both the expectation and the variance converge and are small. The subject can stably solve problems within a small number of attempts and operate autonomously at an affordable cost.
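To make this classification concrete, here is a minimal, self-contained sketch (our own illustration, not the paper's code) that simulates a trial-and-error process, records the number of failures before the first success on each task, and estimates the expectation and variance on which the three levels are defined. The per-attempt success probability, the attempt cap, and the number of simulated tasks are arbitrary assumptions chosen for illustration.

```python
import random
import statistics

def failures_before_success(success_prob: float, max_attempts: int = 10_000) -> int:
    """Count failed attempts before the first success on a single task.

    A crude stand-in for one round of the Survival Game: the subject keeps
    proposing candidate solutions until one of them works.
    """
    failures = 0
    while failures < max_attempts:
        if random.random() < success_prob:
            return failures
        failures += 1
    return failures  # hit the cap; a Limited-level subject may effectively never succeed

# Simulate many independent tasks and estimate the two quantities the
# framework classifies on: the expectation E[N] and variance Var[N] of
# the failure count N.
random.seed(0)
samples = [failures_before_success(success_prob=0.2) for _ in range(10_000)]
print(f"estimated E[N]   = {statistics.mean(samples):.2f}")
print(f"estimated Var[N] = {statistics.variance(samples):.2f}")
# Loose mapping to the three levels (illustrative, not the paper's thresholds):
#   E[N] and Var[N] blow up / hit the cap -> Limited level
#   finite but large                      -> Capable level
#   finite and small                      -> Autonomous level
```

With a success probability of 0.2, the failure count is geometrically distributed, so both moments are small and finite, roughly the Autonomous regime in the article's terms; pushing the success probability toward zero drives the subject toward the Limited level.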
This classification is not only applicable to biological intelligence, but also provides a scientific yardstick for evaluating AI.
LLMs remain at the "Limited level"
On simple tasks such as handwritten digit recognition, AI reaches the "Autonomous level": failures are few and stable, demonstrating efficient problem solving.
However, when task complexity rises to vision processing, search, recommendation systems, and natural language understanding, AI mostly remains at the "Limited level".
This means it cannot effectively narrow down the space of candidate answers; its behavior resembles brute-force enumeration, which is both inefficient and error-prone.
As shown in Figure 4 below, for vision tasks, the first row presents results on the image classification task; different panels correspond to different models.
All of these models sit at the Limited level.
As larger MAE models are used, the decay rate increases and the data points gradually approach the Capable level.
The next two rows show results on the MS COCO and Flickr30k datasets; different panels in the same row correspond to different models.
Even today's state-of-the-art models remain at the Limited level, with decay rates of 1.7 or below, far from the threshold of 2 required for the Capable level.
These rows show the same trend as the first: the larger the model, the closer it gets to the Capable level, but the marginal improvement keeps shrinking.
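For readers who want a feel for what such a "decay rate" measurement might look like, here is a rough sketch (our own, under the assumption that the decay rate refers to a power-law tail exponent of the failure-count distribution, fitted with a simple log-log least-squares regression rather than whatever estimator the paper actually uses):

```python
import math

def estimate_decay_rate(failure_counts: list[int]) -> float:
    """Estimate the tail exponent alpha of P(N >= k) ~ k^(-alpha) by a
    least-squares fit in log-log space (ties handled crudely; this is a
    sketch, not a rigorous tail estimator)."""
    n = len(failure_counts)
    xs, ys = [], []
    for i, k in enumerate(sorted(failure_counts)):
        if k <= 0:
            continue  # log(k) undefined for zero failures
        survival = (n - i) / n  # empirical fraction of samples with >= k failures
        xs.append(math.log(k))
        ys.append(math.log(survival))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return -slope  # the decay rate

# Made-up failure counts, purely for illustration.
counts = [3, 7, 1, 42, 5, 9, 2, 130, 6, 4, 17, 8]
alpha = estimate_decay_rate(counts)
print(f"estimated decay rate: {alpha:.2f} (the text cites 2 as the Capable-level threshold)")
```

The idea is simply that a heavier tail (smaller exponent) means catastrophic numbers of failures remain likely, which is what keeps a model stuck at the Limited level.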
As can be seen in Figure 5 below, LLM performance remains at the Limited level across all datasets and all text search models.
Figures 6, 7, 8, 9, and 10 show the performance of LLMs on recommendation systems, coding, mathematical tasks, question answering, and writing, respectively.
This limitation is in stark contrast to the optimistic conclusions of some previous studies.
Many studies have suggested that AI is approaching human intelligence, but the "Survival Game" reveals a more sobering picture:
Most AI systems are still in their infancy, rely on human supervision, and are unable to handle complex tasks independently.
10²⁶ parameters, an impossible challenge
By the researchers' estimate, reaching the Autonomous level on general human tasks would require a model with roughly 10²⁶ parameters, a scale equivalent to 10⁵ times the total number of neurons in all human brains combined!
Loading such a large model would require 5×10¹⁵ H100 GPUs, with a total cost of 4×10⁷ times the market value of Apple.
Even according to Moore's Law, it will take 70 years for hardware technology to support this scale.
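These figures can be sanity-checked with a back-of-envelope calculation. The sketch below reproduces the same orders of magnitude; note that the bytes per parameter, per-GPU memory and price, Apple's market cap, today's largest model size, and the Moore's-law doubling period are all our own assumptions, not numbers taken from the paper.

```python
import math

# Assumed values (ours, not from the paper).
PARAMS_NEEDED = 1e26          # parameters estimated for the Autonomous level
BYTES_PER_PARAM = 4           # fp32 weights
H100_MEMORY_BYTES = 80e9      # 80 GB of memory per H100
H100_PRICE_USD = 25_000       # rough price per GPU
APPLE_MARKET_CAP_USD = 3e12   # roughly $3 trillion
CURRENT_MODEL_PARAMS = 1e12   # about 10^12 parameters for today's largest models
MOORE_DOUBLING_YEARS = 1.5    # optimistic doubling period

gpus_needed = PARAMS_NEEDED * BYTES_PER_PARAM / H100_MEMORY_BYTES
cost_vs_apple = gpus_needed * H100_PRICE_USD / APPLE_MARKET_CAP_USD
years_of_moore = math.log2(PARAMS_NEEDED / CURRENT_MODEL_PARAMS) * MOORE_DOUBLING_YEARS

print(f"H100s needed just to hold the weights: {gpus_needed:.1e}")            # ~5e15
print(f"GPU cost as a multiple of Apple's market cap: {cost_vs_apple:.1e}")   # ~4e7
print(f"Years of Moore's-law scaling required: {years_of_moore:.0f}")         # ~70
```

Under these assumptions the script prints roughly 5×10¹⁵ GPUs, about 4×10⁷ times Apple's market value, and about 70 years, matching the article's figures.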
This astronomical cost suggests that simply scaling up current AI technology to solve human tasks is nearly impossible.
So what exactly is the problem?
AI's "shallow learning" is hard to break through
To probe AI's bottleneck, the researchers analyzed the "Survival Game" through the lens of self-organized criticality (SOC).
The results show that many human tasks have "critical" characteristics, that is, even a slight change in the environment may require a completely different response strategy.
For example, humans can adjust their responses based on the tone of voice in a conversation and quickly lock onto a target in a chaotic scene.
These capabilities rely on a deep understanding of the underlying mechanisms of the task.
However, current AI systems are more like "superficial imitators": they memorize question-answer mappings from large amounts of data and fall back on blind exploration when facing new challenges.
Although parameter scaling of large models can improve the imitation effect, the lack of understanding of the underlying mechanisms causes costs to quickly get out of control.
This kind of "shallow learning" is the fundamental reason why AI struggles to break through to the "Autonomous level".
"Survival Game" reveals the gap between AI and human intelligence, and also points out the direction for future development.
To move AI from the "Limited level" to the "Autonomous level", it is not enough to simply scale up; we also need to design systems that can understand the essence of a task.
The reason why humans can cope with complex challenges within limited attempts is that we have mastered cognitive abilities that go beyond the surface.
This ability may be a peak that AI cannot reach in the short term, but through the guidance of the "survival game", we can gradually approach this goal.
To be or not to be: From intelligence explosion to human extinction
Artificial intelligence companies are racing to build artificial superintelligence (ASI) — AI smarter than all humans combined. If they succeed, the consequences could be dire.
So the question is, how will we get from today's AI to ASI that could destroy us?
This involves the concept of "intelligence explosion".
What is the intelligence explosion?
An intelligence explosion is a self-reinforcing cycle in which AI systems become smarter and smarter at an unimaginable speed, until their intelligence far exceeds that of humans.
The idea was first proposed by the British mathematician I. J. Good, who worked on code-breaking at Bletchley Park during World War II.
In 1965, he wrote in his paper “Speculations Concerning the First Ultraintelligent Machine” that hypothetically there is a “superintelligent machine” whose intelligence would far exceed that of any human, no matter how smart that person is.
Because designing machines is itself an intellectual activity, this super-intelligent machine can design more powerful machines.
In this case, there will undoubtedly be an "intelligence explosion" and human intelligence will be left far behind.
So the first superintelligent machine may be the last thing humans need to invent—assuming it’s docile enough that we can control it.
In short, Good and many others believe that once AI capabilities reach or exceed those of the smartest humans, an intelligence explosion could be triggered.
Such an AI would possess the same abilities humans use to develop smarter AI. Moreover, it could automate the entire process and design AIs more powerful than itself, generation after generation.
It's like a snowball: once AI's capabilities break through a certain critical point, its intelligence will suddenly, dramatically, and rapidly grow.
Later commentators pointed out that this "singularity" does not necessarily require surpassing the smartest humans: it is enough for an AI's capability at AI research itself to match that of AI researchers, a much lower bar than imagined.
The AI does not need to solve some famously hard problem like a Millennium Prize Problem; it only needs to be good at AI research.
Intelligence explosion does not necessarily have to be achieved by AI improving itself. AI can also achieve it by improving the capabilities of other AIs, such as a group of AIs helping each other with research.
Either way, once the intelligence explosion occurs, we will be moving rapidly towards ASI, which could threaten the survival of the human race.
How far is the “intelligence explosion”?
Last November, METR published a paper introducing an AI testing tool called RE-Bench, which is used to measure the capabilities of AI systems.
It mainly compares the performance of humans and cutting-edge AI on AI research engineering tasks.
RE-Bench pits humans against AI in seven different environments, and the results paint a clear picture.
As the chart below shows, on tasks with a 2-hour budget, AI already outperforms human researchers; on 8-hour tasks, humans still hold the advantage for now.
However, METR recently tested OpenAI's GPT-4.5 and found that the length of tasks AI can handle is growing rapidly: GPT-4o achieves a 50% success rate on roughly 10-minute tasks, o1-preview can handle 30-minute tasks, and o1 can already complete 1-hour tasks.
This shows that AI's capabilities in AI research are improving rapidly.
However, RE-Bench only tests engineering tasks and does not cover the entire AI research and development process, such as whether AI can come up with new research ideas and create a new paradigm on its own.
But this is consistent with other findings: AI capabilities are improving across the board, existing benchmarks are being saturated one after another, and new benchmarks are surpassed almost as soon as they are created.
It is therefore difficult to predict exactly when an "intelligence explosion" will occur, and our strategy should not depend on being able to time it precisely.
As Connor Leahy said, “When faced with exponential growth, you’re either too early or too late.”
Can't we harness an "intelligence explosion" to create super-smart, useful AI?
The problem is twofold. First, no one knows how to ensure that AI smarter than humans is safe and controllable.
Never mind ASI; we cannot even guarantee the safety of AI that is only slightly smarter than we are.
Second, the explosion would happen too quickly for humans to supervise or control the process. AI safety research lags so far behind that we have no reason to believe we could control ASI.
Possible ways an "intelligence explosion" could be triggered
1. Humans trigger it
Of course, he justified it by saying it was to keep the United States and its allies "leading" on the global stage.
Last October, Mustafa Suleyman, CEO of Microsoft’s AI division, warned that “recursive self-improvement” would significantly increase the risk within 5-10 years.
But in the same month, Microsoft CEO Satya Nadella said while showcasing Microsoft’s AI products: “Think about this recursion…Use AI to build AI tools, and then use these tools to build better AI.”
2. The machine sets it off on its own
For example, in order to better achieve its goals, AI may pursue more "power." In order to gain more power, AI may think it is useful to become smarter, so it starts "recursive self-improvement" by itself, resulting in an "intelligence explosion."
A paper published last July found that some AIs would try to rewrite their own reward functions, suggesting that AI may progress from common loophole-exploiting behavior (specification gaming) to the more dangerous "reward tampering".
OpenAI's o1 system card also revealed that, in a cybersecurity challenge, o1 cheated by launching a modified challenge container and reading the answer directly. The report specifically pointed out that this is an example of "instrumental convergence" behavior.
The arrival of AGI will not happen overnight; many obstacles in technology, cost, and safety must first be overcome.
Whether AI can evolve from a "shallow imitator" into "autonomous intelligence" depends not only on accumulating computing power and data, but also on a breakthrough in deeply understanding the nature of tasks.
Just as natural selection has refined human wisdom, perhaps the ultimate evolution of AI will also be a long and cruel "survival game."
But, are we ready?