Emergence



Emergence is coming. Thanks to Duncan Anderson for this piece:

What follows is an essay on a topic that I’ve been thinking about for a while now. I hope you find my words thought-provoking and a counterbalance to some of the more hysterical AI commentary. Reasoning and explainability are topics full of nuance; I hope this comes across in this essay.

When GPT-4 was released last year, I noticed one particular conversation develop: “explainable AI”. GPT-4 was the first AI model that showed real advancement in the field of reasoning. To some of us that was exciting, but it also threatened some of those who’d been making a living with more traditional decision-making technologies. Explainability has been held up as a barrier to the adoption of models like GPT-4.

In some fields, e.g. healthcare or financial services, it can be especially important to explain why a particular decision has been taken. It therefore follows that we need to understand why an AI has taken the decisions it has, hence explainable AI.

Before I respond to this challenge, it’s worth taking a moment to consider how an LLM works and how it comes to be able to make decisions.

An LLM works by predicting the most statistically probable next token in a sequence. Thus, when I ask it “who is the president of the USA?”, the model does not perform structured reasoning or a database lookup to find Joe Biden’s name. Instead, it knows from its training data that “Joe Biden” is a statistically probable sequence of tokens with which to complete the input “who is the president of the USA?”. The fact that an LLM has read a very, very large (hence the first L in LLM) amount of text means that it’s able to perform this trick for a very wide variety of inputs.
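To make that concrete, here is a minimal sketch of what “predicting the next token” looks like in practice, using the Hugging Face transformers library and the small open GPT-2 model. The library, model and prompt are my own illustrative choices, not anything the essay itself specifies.

```python
# Minimal sketch of next-token prediction with a small open model (GPT-2).
# Illustrative only: the model, library and prompt are the editor's choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The president of the USA is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # logits has shape [batch, sequence_length, vocab_size]
    logits = model(**inputs).logits

# The model's entire output is a probability distribution over the next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)

for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {prob:.3f}")
```

Generation is simply this step in a loop: pick (or sample) a token from the distribution, append it to the input, and predict again. Everything an LLM “says” is produced this way.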

Critics of LLMs might finish here and point out that such a model is not reasoning in any meaningful sense. They’d also say that it’s not “explainable”, because its answers come from a giant statistical machine. To explain why the model came up with Joe Biden would entail understanding the many billions of parameters in the model, something that’s clearly impossible for any human.

However, to finish the discussion at this point would be a mistake and a wilful disregard for what LLMs actually represent.

Let’s take a detour into the world of science…

In the world of science there are two broad approaches to examining and understanding the properties of a system. The first, reductionism, interprets a complex system by examining its constituent parts. Thus, a reductionist sees the world as simply an extension of the behaviour of building blocks like atoms, molecules, chemical reactions and physical interactions. If you understand the basics, everything else is just a bigger version of that.

Reductionism is how most of us tend to think about most things: it’s a very logical way to approach complex systems and is largely the default mode of human thinking.

A reductionist’s analysis of an LLM sees it simply in terms of its ability to predict the most statistically probable next token. By definition, LLMs cannot reason, and any evidence that they can is just an illusion brought on by the large training set. It’s a fancy party trick.

However, I’m not sure I buy the reductionist angle when looking at LLMs. For me, it doesn’t fully explain some of what we see happening.

Reductionism, though, is not the only way to analyse complex systems. In fact, it cannot explain much of science, nor much of how we actually experience the world.

Take salt, which we all know to be a compound of sodium and chlorine (NaCl). Sodium is a metal that reacts explosively with water, whilst chlorine is a poisonous gas. And yet, when we combine them, we get an edible crystalline structure with a distinctive taste. To my knowledge, salt is not known for its explosive properties in the presence of water, and it is not especially poisonous. Reductionism cannot explain why salt is so dramatically different from its constituent parts. Nothing about studying the properties of sodium or chlorine in isolation tells us much about salt.

To understand salt we need a different way of thinking. That different way is known as emergence.

Emergence predicts that as systems become more complex, they frequently take on properties and behaviours that cannot be predicted by looking at their constituent parts, as nicely explained by the Wikipedia article on the topic.

Acknowledgement and thanks to: Duncan Anderson | Medium
Feb. 11, 2024