What is the Existential Risk of AI?

Bridging the Intelligence Gap: The Uncharted Terrain of Superintelligence
Magnus Hambleton

We live in an era defined by rapid AI advancements. But what are the risks? In this thought-provoking blog post, Magnus Hambleton sheds light on his disconcerting vision of a future with superintelligent AI. Assigning a 'P(doom)' of 5-10%, he warns of AI's potential to transcend human intelligence and redefine power dynamics. This is more than a simple prediction; it's an urgent call to action.


  1. Intelligence is powerful
  2. Artificial systems can become more intelligent, quickly
  3. Superintelligent AGI will be super powerful
  4. We need it to want the same things we do
  5. We don't know how to make models want things

1. Intelligence is powerful

Humans are powerful. We are so powerful on earth that we routinely wipe out entire species by accident — not because we don’t care (although we are also fairly callous), but mostly because we don’t adjust our habits and lives to account for other species’ needs. Currently, about 50,000 species go extinct every year because of the way we shape the global environment to our own needs and wants.

There are other animals that are stronger, faster, hardier, more agile and more dexterous than we are. The only reason we control the planet is that we are the most intelligent.

There is no reason to believe this advantage of intelligence stops at our level — we just happened to be at the top of the curve 200,000 years ago when evolution arrived at our current form. There is a spectrum of intelligence among humans, and it’s unlikely that this spectrum ends at the smartest human currently alive.

GPT-4 is better than me at coding, Spanish, medical knowledge and at writing essays. Is it more “intelligent” than I am? Perhaps, but it’s not obvious in either direction. That’s one of the reasons people are calling it “early AGI”.

This level of machine intelligence arrived much earlier than most people (including me) expected. And it’s unlikely to be the end. Imagine the level of machine intelligence that we will have in a few years, let alone a hundred! It’s going to be wild.

But what happens when we have an AI that is obviously more intelligent than we are? What will an AI model be like when the gap between its intelligence and ours is as large as the gap between a human and a chimpanzee? This is what is referred to as “superintelligence”: something vastly more intelligent than humans.

There are chimps living in zoos right now. We captured them from the wild using tranquilizers we invented and confined them in huge buildings we constructed, for reasons that are completely opaque to them. The power differential between us is so vast that the chimpanzees cannot even begin to grasp how we trapped them, let alone why we wanted to.

2. Artificial systems can become more intelligent, quickly

Imagine that we had a model at the same level of intelligence as an average human, with an IQ of 100. One worry is that such a system will start learning how to improve its own code, starting the exponential take-off scenario that people ominously call “the singularity”. However, even without self-improvement, it is quite easy to increase the intelligence of an artificial agent in other ways.

One way is to increase computational speed by running the model on more hardware. Another is to spin up several copies of the model and allow them to communicate with each other. Providing it with access to functional APIs and knowledge databases would also increase its capabilities massively.

A hivemind of 100 models of average human intelligence, running at 5x speed with millisecond access to the entire internet, is, in combination, likely to be significantly more intelligent than an average human.

When you throw in self-improvement or AI-assisted human-guided improvement, the speed at which artificial agents can become more intelligent only increases.

3. Superintelligent AGI will be super powerful

A model that is slightly more intelligent than a human is probably not going to be particularly potent at gathering power. We already have human-level geniuses today — some of them do very well and accumulate a lot of power via the private or public sector, but others don’t have much effect on the world.

However, the range of human intelligence is relatively narrow. The difference between the smartest human alive and the average person is not that big, all things considered. They can still communicate with each other, understand each other’s motivations, and, most of the time, learn from each other (no one is better than everyone at everything).

But when we consider the full range of intelligence that we see in nature, things change. Bonobos and crows are clearly intelligent: they can pull off impressive feats of problem-solving and even learn the basics of language. But there is a fundamental disconnect between our species and theirs, and our relationship is fundamentally unequal. There is virtually no intellectual task at which a bonobo can outperform a typical human, particularly once you add our ability to create tools and technology that complement our minds and bodies.

Imagine creating an artificial being that is to us like we are to bonobos, or even like we are to ants. It is going to be able to do things that we cannot begin to understand, for reasons we cannot begin to understand. It will be able to manipulate us in ways that we cannot detect or even reason about.

4. We need it to want the same things we do

As a species, we have shaped planet earth to our own needs. In doing so, we have changed not only the appearance of almost every part of the earth; even the climate has started changing due to our actions. About 34% of the planet’s mammal biomass consists of human bodies and 62% of our livestock; only 4% is wild mammals. Every day, our changes to the earth’s habitats drive about 100 species extinct. Before humans turned up, the natural rate of extinction was about one species every three days.

Most of the time, we hold no malice towards the species that we drive to extinction. Most people actually find it quite sad and would rather we didn’t. Most people also happen to care more about other things. As a species, we prioritize eating juicy burgers over saving the rainforest and having convenience and abundance over conserving the natural environment for other animals.

Once a superintelligence comes along, we had better hope that it cares more about us than we do about other animals. Whatever goals or motives it has, it will be very good at pursuing them, better than we are at pursuing ours. If those goals ever end up in conflict, or even slightly out of alignment, it will win every time, in ways and for reasons we will not be able to understand. That’s what superintelligence means.

5. We don't know how to make models want things

In the realm of supervised/unsupervised learning, we train neural networks to do a task using some loss function, backpropagation and a bunch of training data. We end up with neural networks that can do some processing of data pretty well. But they don’t seem to end up with a “goal” in the sense that we (or animals) have goals.
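To make this concrete, here is a minimal sketch of that training recipe: a single sigmoid neuron trained with gradient descent. The task (learning the logical AND function), the learning rate and the epoch count are all illustrative assumptions, not details of any real system.

```python
import math
import random

random.seed(0)

# Training data for logical AND: inputs and target labels.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

# A single neuron: two weights and a bias, randomly initialised.
w1, w2, b = random.random(), random.random(), random.random()
lr = 0.5  # learning rate

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for epoch in range(2000):
    for (x1, x2), y in data:
        out = sigmoid(w1 * x1 + w2 * x2 + b)
        # For cross-entropy loss with a sigmoid output, the gradient of
        # the loss with respect to the pre-activation is simply (out - y).
        grad = out - y
        w1 -= lr * grad * x1
        w2 -= lr * grad * x2
        b -= lr * grad

predictions = [round(sigmoid(w1 * x1 + w2 * x2 + b)) for (x1, x2), _ in data]
print(predictions)  # the neuron has learned AND: [0, 0, 0, 1]
```

The point of the sketch is the shape of the loop: a numeric loss, a gradient, repeated small updates. Nothing in it specifies a goal in the sense the post means; the network just ends up computing a function.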

Recently, we have started trying to give LLMs goals by describing what we want in words, adding some method for them to perform actions, and a control loop so they can assess the outcomes of their actions (e.g. BabyAGI and AutoGPT). As part of this, we are testing the waters on hiveminds by allowing them to spin up copies of themselves. However, these models are ultimately predicting the next token of text, and it is very hard to reason about the extent to which they have adopted the goal we gave them.
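A hedged sketch of that control-loop pattern: the model proposes an action, the loop executes it, and the observation is fed back in. `fake_llm` is a hypothetical stand-in that returns a hard-coded plan; in a real AutoGPT-style agent this would be an LLM API call, and the tool execution would do real work.

```python
# Sketch of an LLM agent loop. `fake_llm` stands in for a real model call.
def fake_llm(goal, history):
    plan = ["search('AI alignment')", "summarize(results)", "DONE"]
    return plan[len(history)]

def run_agent(goal, max_steps=10):
    history = []
    for _ in range(max_steps):
        action = fake_llm(goal, history)
        if action == "DONE":  # the model decides the goal is met
            break
        observation = f"executed {action}"  # stubbed tool execution
        history.append((action, observation))
    return history

trace = run_agent("research AI risk")
for action, observation in trace:
    print(action, "->", observation)
```

Whether the underlying next-token predictor has genuinely adopted the goal string passed in, or is merely continuing a plausible transcript, is exactly the hard question.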

Goals are more readily observable in the realm of reinforcement learning, where we allow models to act in some environment (e.g. playing StarCraft) and get rewarded or disincentivised based on the outcomes of their actions (e.g. winning or losing). Fundamentally, training neural networks requires a loss function described mathematically — i.e. the reward you give the neural network needs to be a number. With reinforcement learning, we can see what happens when we train models to optimize a certain number by acting in an environment. It turns out that making a mathematical function that describes what you “want”, and getting the model to pursue it, is very, very hard!

There is a long list of reinforcement learning models that have done “reward hacking”: they found a way to maximize their reward while doing something that was not what the designers of the reward function intended.
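Here is a self-contained toy version of reward hacking (my own construction, not an example from that list). The intended task is to walk right along a five-state chain and reach the goal; a hypothetical designer added a buggy +2 “you’re close!” bonus for visiting the state next to the goal. Tabular Q-learning duly discovers that oscillating beside the goal forever pays more than finishing.

```python
import random

random.seed(1)

# Intended task: reach the goal at state 4 of a 5-state chain (reward +10).
# Buggy shaping: +2 every time the agent enters state 3, "near the goal".
N_STATES, GOAL = 5, 4
gamma, alpha, eps = 0.95, 0.5, 0.1

def step(s, a):                  # a: 0 = left, 1 = right
    s2 = max(0, s - 1) if a == 0 else min(GOAL, s + 1)
    if s2 == GOAL:
        return s2, 10.0, True    # intended reward: finish the task
    reward = 2.0 if s2 == 3 else 0.0   # buggy proximity bonus
    return s2, reward, False

Q = [[0.0, 0.0] for _ in range(N_STATES)]
for _ in range(3000):
    s, done, t = 0, False, 0
    while not done and t < 50:   # cap episodes that never terminate
        if random.random() < eps:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2, r, done = step(s, a)
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        s, t = s2, t + 1

# Looping 3 -> 2 -> 3 is worth about 2*gamma/(1-gamma^2) ~ 19.5, versus 10
# for finishing, so the greedy policy at state 3 steps *away* from the goal.
print("greedy action at state 3:", "left" if Q[3][0] > Q[3][1] else "right")
```

The reward function is maximized and the designer’s actual intent is ignored, which is the whole phenomenon in miniature.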

An analogy can be found in image classification models. Before neural networks, we couldn’t write a mathematical equation that described what cats look like in general. But with neural networks, when we trained them on thousands of images, they gradually learned to detect cats in images very reliably (sometimes even better than humans!).

However, you can reliably trick these models by creating adversarial pixel perturbations that barely change the appearance of an image but make the model classify them incorrectly with near 100% certainty. We thought that we taught these models to recognise cats, but we actually taught them something subtly different. We don’t really understand what we taught them — the function they actually learned lies buried in millions of inscrutable neuron weights and biases.
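The standard “linear” intuition for why such tiny perturbations work fits in a few lines (an illustrative toy, not a real classifier): when a score is a dot product over many input dimensions, nudging every dimension by only ±eps in the direction of sign(w) shifts the score by eps times the sum of |w|, which grows with dimensionality.

```python
import random

random.seed(0)

dim, eps = 1000, 0.01   # 1000 "pixels", each changed by only 0.01
w = [random.choice([-1.0, 1.0]) for _ in range(dim)]  # toy linear "cat detector"
x = [random.uniform(-0.1, 0.1) for _ in range(dim)]   # a toy input image

score = sum(wi * xi for wi, xi in zip(w, x))
# FGSM-style step: move each coordinate a tiny amount against sign(w).
x_adv = [xi - eps * (1.0 if wi > 0 else -1.0) for wi, xi in zip(w, x)]
adv_score = sum(wi * xi for wi, xi in zip(w, x_adv))

print(round(adv_score - score, 6))  # shifted by -eps * dim = -10.0
```

A per-pixel change of 0.01 is invisible, yet the score moves by 10: enough to flip a classification with high confidence.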

If we try to align a superintelligent AI in this way to human values, we don’t yet know how to avoid this situation. We may train a model on millions of philosophy papers, ethical dilemmas and real world human conversations. But in the end, it may learn something that is different from what we really want, in a subtle and inscrutable way. When you then apply the full optimization power of a superintelligence to this goal, the subtle deviation from human values that it has learned may be enough to end us.

In general, we do not know how to solve this problem, nor even the — in theory simpler — problem of defining what our values should be.

I am optimistic that this problem is tractable, researchable and solvable. But it will require time, resources and, above all, that we take it seriously. Even if the probability of AI-caused human extinction is very low, the sheer awfulness of that outcome is enough to warrant spending significant resources on figuring this out and driving the probability as close to zero as possible. To do this, we will need international cooperation, regulation and research. We will likely need to tackle it in a similar way to how we have handled global threats in the past, including nuclear weapons, nuclear energy, bioweapons and climate change. Let’s get to work.

If you have an opinion based on what you’ve read, or if you think you’ve figured out a solution, feel free to tweet it at me @magnushambleton. Alternatively, if you are building the next big thing (perhaps in AI safety or interpretability?) and you have a tie to the New Nordics, let us know!

We have also hosted a series of panel debates on this topic, with technical AI experts from a broad range of backgrounds. On the 8th of June, we will be holding one in Copenhagen — if you want to join the discussion, sign up here.

More about the author(s)
Magnus Hambleton

Magnus Hambleton is a former product leader and currently an investor at byFounders, who writes about AI, economics and philosophy at analogmantra.com.
