How does the breakthrough around LLMs change our view of that relationship? What does that relationship mean for the future of AI and the business around it? Do you know what is more mind-boggling than LLMs? Split brains, like the one Split-brain Joe has.
Language and Intelligence: It Is Complicated
The two halves of the human brain are normally connected through a “bridge” called the corpus callosum. To treat epilepsy, Joe’s bridge was surgically cut, effectively giving him two independent brains. That made him a precious subject for important neurological experiments, which is exactly what Michael Gazzaniga conducted. There are a lot of fascinating things happening here, but this video elaborates on the part relevant to the LLM discussion.
- In the experiment, Michael Gazzaniga presented one image (a saw) to Joe’s left visual field and another (a hammer) to his right visual field. These images flowed to the opposite side of the brain, i.e. the left image to the right hemisphere and vice versa.
- Michael then asked Joe: “What did you see?”. Interestingly, Joe answered that he only saw what was projected into his right visual field: “a hammer”. He wasn’t aware that there had been an image in his left visual field. Keep in mind that the left hemisphere houses the language centers of the brain. Since it is now split from the right hemisphere, it only processes information coming from the right visual field: the hammer. Hence it was able to articulate it. Joe was only conscious of what he was able to articulate.
- Michael then gave Joe a pen in each hand and asked him to draw what he saw. The right hand is connected to the left hemisphere, so using his right hand he drew the hammer he had seen in his right visual field.
- Here is where it gets fascinating: using his left hand, he drew the saw that he saw ( 🙂 ) in his left visual field, despite not being aware he had seen it.
- Joe was surprised that he had drawn a saw despite having seen a hammer. When confronted with “What did you draw a saw for?”, Joe could only exclaim “I don’t know!”. Michael’s interpretation was that although the information about the saw was in Joe’s brain, it wasn’t part of his consciousness. He could still act on that information (draw the picture), but he wasn’t aware of it.
Such experiments demonstrated a strong link between language and consciousness. Of the two hemispheres, the one capable of communicating was the one with consciousness. The other hemisphere was just as intelligent; it simply didn’t have a language center. Split-brain experiments provided a valuable experimental foundation for the centuries-old philosophical debate on the nature of consciousness. They linked consciousness to language ability. At the same time, they decoupled consciousness from intelligence.
That is difficult to wrap our heads around. How can we only be conscious of what we can communicate? How can we be intelligent without being conscious? What about things that can communicate? Are they conscious? Are they intelligent?
Skipping the rabbit hole of consciousness.
We Love to Equate Language-Ability With Intelligence
We have a long history of doing this. Koko the gorilla captured the public’s imagination with her ability to use human sign language. Crows, on the other hand, were ignored for decades by the neuroscientific community, despite demonstrating amazing cognitive abilities rivaling those of gorillas. We are hardwired to put a premium on authentic communication and to assume intelligence wherever we find that ability. In fact, as soon as we could build computers, we set language ability as the measure of intelligence for AI systems. Alan Turing did.
After kick-starting modern computing in the 1930s, Alan Turing wondered what an “intelligent computer” would be like. In what was later called the “Turing test”, he proposed that an intelligent computer would be one with communication skills indistinguishable from those of a person. In the original formulation of the test, an interrogator engages in text-based conversations with a human participant and a machine, without knowing which is which. The machine’s goal is to generate responses so convincingly human-like that the interrogator cannot reliably tell it apart from the human. If the machine fools the interrogator into thinking it is human, it is said to have passed the Turing test.
Whether in animals or computers, we glorify language. The ability to communicate like a human has always been a cheat code for gaining the status of intelligence in our collective imagination. Language and intelligence are tightly intertwined in our minds. The fact that an algorithm that predicts the next word in a sentence now seemingly shows high reasoning capabilities demonstrates that. But there is intelligence where there is no language, and there isn’t always intelligence where there is language.
Language Where There Is No Intelligence
…and that seems to be the central point in such debates, from Searle’s Chinese room argument to the arguments around ChatGPT: whether an agent actually understands what it is saying or merely has the appearance of understanding. The philosophical rabbit hole here is bottomless. The engineering aspect of the question is the one that matters for the future of AI: What is the difference between engineering a system that has real understanding and one that merely appears to understand? Is there value in the appearance of understanding? What is the value of true understanding?
Where Language and Intelligence Meet
Creating an appearance of understanding is a technical breakthrough in its own right. The principle behind large language models has a deceptively simple name: next-word prediction. Using certain AI magic (transformers), these models are trained to predict the next word given the words that came before it. Feed them astronomical amounts of data, add some more AI magic, and they create an almost flawless appearance of understanding.
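To make “next-word prediction” concrete, here is a deliberately tiny sketch in Python. It is a toy: it counts which word follows which in a made-up corpus and then generates greedily, whereas real LLMs use transformers trained on astronomical amounts of text. The corpus and function names are invented for illustration, but the shape of the objective is the same: given the words so far, guess the next one.

```python
# Toy next-word prediction: count successors in a tiny corpus (a bigram
# model), then "generate" by repeatedly picking the most frequent next word.
from collections import Counter, defaultdict

corpus = "the saw cuts wood and the hammer drives nails and the saw cuts".split()

# For every word, count which words follow it in the corpus.
successors = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    successors[current_word][next_word] += 1

def predict_next(word: str) -> str:
    """Return the word that most often followed `word` during training."""
    return successors[word].most_common(1)[0][0]

# Generate a short continuation, one predicted word at a time.
word = "the"
generated = [word]
for _ in range(4):
    word = predict_next(word)
    generated.append(word)

print(" ".join(generated))  # -> "the saw cuts wood and"
```

Replace the counting with a transformer and the toy corpus with a large slice of the internet, and this same objective starts to produce the almost flawless appearance of understanding described above.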
In a recent paper, Microsoft researchers demonstrated that GPT4 has high reasoning abilities in mathematics, coding, and physics. The astounding part was that the model was (probably) able to solve problems it hadn’t seen before. Reasoning ability, after all, is the ability to create new knowledge out of existing knowledge. This behavior seems to go far beyond the mere “appearance” of understanding, and beyond anything we have seen from an AI system so far. The researchers called the paper “Sparks of A.G.I.”, announcing the dawn of Artificial General Intelligence.
But how can that be? How could a system trained only on language become that good at general reasoning? Could it be that language contains far more of our reasoning than we thought? Is it possible that a lot of our reasoning and cognitive capabilities are encoded in all the language we produce? LLMs don’t just memorize huge amounts of data; they decode and learn the underlying structure of language. That structure is what enables them to generate human-like language.
But it seems that by learning the structure of language, these models have also picked up some of the cognitive and reasoning mechanisms encoded in it. When the researchers presented GPT4 with novel problems, it appeared to use these mechanisms to solve them and thus produce “new knowledge”. Language is a highly condensed form of not only our knowledge but our intellect. GPT4 memorized that knowledge, internalized some of that intellect, and applies it to new problems it hasn’t been trained on. Whether it is AGI or not, it is a leap beyond what we’ve been seeing from AI systems.
But some linguists suspected this all along. Noam Chomsky theorized that all human languages share such a universal structure, which he called “Universal Grammar”. He argued that babies are hardwired for this structure, which is what enables them to learn languages so quickly and accurately. Maybe that is exactly what GPT4 has captured, and maybe universal grammar is what encodes a lot (but surely not all) of our reasoning capabilities. As the researchers put it: “One general hypothesis is that the large amount of data (especially the diversity of the content) forces neural networks to learn generic and useful ‘neural circuits’.”
GPT4’s capabilities could be proof of the existence of that underlying universal grammar. Ironically, Chomsky himself isn’t too impressed by ChatGPT. Basically, he thinks it is all statistics (which it is) and that statistics cannot capture the depth of our minds. Either we underestimate how much of our minds we have leaked into language, or we romantically hold our minds to be eternally superior to statistics. We just don’t like being reduced to statistics.
Either way, we aren’t even remotely close to AGI. The paper offers an in-depth analysis and examples of the shortcomings of GPT4 (and LLMs in general) in that regard. We don’t even have a consensus definition of intelligence, let alone of AGI. Any time we are faced with a system auditioning for intelligence, we seem to set ourselves up as a fuzzy standard, fetishize human-language ability, and reject anything with a statistical foundation.
That is too restrictive. What about Joe’s right hemisphere? What about the crows? A common denominator of many definitions of intelligence (or AGI) is the ability to perceive an environment and solve new problems in it in novel ways to fulfill needs or goals, without those goals being explicitly defined for every possible state of the environment. That type of intelligence is obviously not limited to humans, so why should AGI be? If anything, engineering AI away from our environments frees the endeavor from self-indulgent misconceptions about intelligence, cheat codes, misinformation, and many of the other inherent difficulties of making an artificial agent human-like.
Neuroethology (animal neuroscience) has a good way of thinking about this.
Intelligence Where There Is No Language
It is tricky to understand how intelligent animals really are. For a start, we cannot fully communicate with them. More importantly, we can’t fully understand, or even imagine, the way many of them perceive the world and react to it.
The Baltic German biologist Jakob Johann von Uexküll was curious about how animals perceive their environments. In 1909 he coined the term “Umwelt” to describe an animal’s sensory world, a meaning distinct from the word’s everyday German sense of “environment”. An animal’s Umwelt is the fusion of all its senses that forms its experience of its environment. Ed Yong’s fantastic recent book “An Immense World” entertainingly describes how diverse and strange the Umwelten of different animals are.
Bats famously perceive their environment through echoes. Bumblebees use electric fields to build their Umwelt. Airflows are a significant part of a peacock’s Umwelt. The whole body of a catfish works like a “tongue”, tasting the world across its surface much as our skin feels touch. The mantis shrimp’s Umwelt is so alien I wouldn’t even attempt to describe it.
An Umwelt isn’t just what the animal perceives; it is the model of the world in which it acts, moves, solves problems, and co-exists with other creatures. An Umwelt is where an animal exercises its intelligence. And it isn’t just a matter of alien perception systems and bodies: many of these Umwelten aren’t even describable to us. We cannot conceive what we cannot perceive, and we certainly can’t communicate about it.
Therefore, in nature, our Umwelt isn’t the only Umwelt with intelligence in it. There is intelligence where there is no human language. What about AGI, then? Why should we restrict ourselves to building AGI only in our Umwelt? (That is, if we should build AGI at all… but that is a different conversation… but we should.) “Animal-like” AGI is at least as interesting a proposition as human-like AGI, if not more so.
AGI for Other Umwelten: Hypothetical Exercise
Right.
First of all, we need to design an Umwelt: we need to decide what kind of Umwelt our AGI should exist in. In other words, what types of senses do we want our AGI to be affected by (i.e. sensors), and how do we want it to affect its environment (i.e. actuators, or what engineers call “agency”)? We are good at that kind of thing. We have come up with countless ways to sense things that aren’t available to our natural senses and found ways to act on the world that transcend our bodies. Choosing the Umwelt is also a “product” decision: it depends on where we see the value of an AGI working for us. An AGI could live in the Umwelt of financial markets, the Umwelt of a patient’s biometric data, the Umwelt of physical laws and their measurements, the Umwelt of Computer-Aided Design merged with product usage data, and so on.
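As a purely hypothetical sketch of what “designing an Umwelt” could look like in code (assuming Python and invented names throughout): the agent’s entire world is defined by what its sensors return and what its actuators accept, and the “product” decision is which such world to build.

```python
# Hypothetical sketch: an Umwelt defined entirely by sensors and actuators.
# Nothing here is a real API; the names are placeholders for the design idea.
from dataclasses import dataclass
from typing import Protocol

@dataclass
class Observation:
    readings: dict[str, float]   # whatever the chosen sensors report

@dataclass
class Action:
    command: str                 # whatever the chosen actuators accept
    magnitude: float

class Umwelt(Protocol):
    """The agent's whole world: it can only sense() and act()."""
    def sense(self) -> Observation: ...
    def act(self, action: Action) -> None: ...

class MarketUmwelt:
    """One possible choice: the agent lives in a stream of market prices
    and affects its world only by placing orders."""
    def __init__(self, prices: dict[str, float]):
        self.prices = prices

    def sense(self) -> Observation:
        return Observation(readings=dict(self.prices))

    def act(self, action: Action) -> None:
        # A real system would route an order here; the sketch just logs it.
        print(f"order: {action.command}, size {action.magnitude}")
```

Swapping MarketUmwelt for a class wrapping biometric sensors, physics measurements, or CAD-plus-usage data is exactly the “product” decision described above.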
Then, we need to encode knowledge of the Umwelt into the system while also giving it a way to continuously update that knowledge based on its own experiences. We are getting good at that sort of thing, and LLMs are a fantastic foundation for it. They will certainly get better interfaces, interfaces our AGI system can use in real time, and they will get cheaper to train.
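A hedged sketch of this second step, assuming (purely for illustration) that prior knowledge of the Umwelt is distilled from something like an LLM and then continuously augmented with the agent’s own experiences; the class and its methods are placeholders, not a real system.

```python
# Hypothetical sketch: prior knowledge of the Umwelt plus a running log of
# the agent's own experiences. Names and structure are illustrative only.
class UmweltKnowledge:
    def __init__(self, prior: str):
        self.prior = prior                # e.g. knowledge distilled from an LLM
        self.experiences: list[str] = []  # what the agent has lived through

    def update(self, experience: str) -> None:
        """Continuously fold the agent's own experiences into its knowledge."""
        self.experiences.append(experience)

    def recall(self, situation: str) -> str:
        """Combine prior knowledge with recent experience for a given situation."""
        recent = "; ".join(self.experiences[-3:]) or "no experience yet"
        return f"{situation} | prior: {self.prior} | recent: {recent}"

knowledge = UmweltKnowledge(prior="volatile assets carry higher risk")
knowledge.update("exited positions too late during the last volatility spike")
print(knowledge.recall("volatility is rising"))
```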
Here is where it gets tricky. The very definition of AGI requires the system to be able to act intelligently in different and novel situations. New situations require new goal definitions. Obviously, since the designer cannot explicitly determine all of them ahead of time, the system would need to continuously redefine its goal every time it faces a new situation. It needs to take a few high-level goals (as with animals: eat, reproduce, repeat) and figure out what they mean in a specific situation. The definition of the high-level goals would be closely tied to the choice of Umwelt (e.g. our financial-markets agent needs to generate profit rather than eat). We “just” need a way to take the perception of a specific situation, develop it into an understanding by matching it against knowledge about the environment and the high-level goals, and then derive situation-specific goals. There is a lot of research in this area.
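And a final hypothetical sketch of the tricky part: turning a handful of fixed high-level goals into situation-specific goals as the Umwelt changes. The thresholds, goal phrasings, and observation keys below are invented; doing this robustly for genuinely novel situations is where the open research lies.

```python
# Hypothetical sketch: derive a situation-specific goal from fixed high-level
# goals. The high-level goals are tied to the choice of Umwelt (here, a
# financial-markets agent); the derived goal depends on the current situation.
HIGH_LEVEL_GOALS = ("preserve capital", "generate profit")

def derive_goal(observation: dict[str, float]) -> str:
    """Match the perceived situation against the high-level goals and return
    a concrete, situation-specific goal. Thresholds are illustrative."""
    volatility = observation.get("volatility", 0.0)
    drawdown = observation.get("drawdown", 0.0)

    if volatility > 0.3 or drawdown > 0.1:
        # Turbulent situation: "preserve capital" becomes concrete.
        return "reduce exposure until volatility and drawdown normalize"
    # Calm situation: "generate profit" becomes concrete.
    return "seek positions with expected return above the cost of capital"

print(derive_goal({"volatility": 0.45, "drawdown": 0.02}))
print(derive_goal({"volatility": 0.08, "drawdown": 0.00}))
```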
AGI in other Umwelten can extend our existence, our understanding, and our agency into worlds that transcend our bodies. It doesn’t necessarily need language ability, though. It can be a Joe’s-right-hemisphere kind of intelligence.
Two Sides of Not Exactly the Same Coin
Splitting Joe’s brain showed us how deeply intertwined language and intelligence are in his left hemisphere. His right hemisphere showed us how separate language and intelligence can be. ChatGPT and the Chinese room argument demonstrate that language ability doesn’t always imply intelligence. GPT4 and Chomsky’s universal grammar show how language ability can give rise to intelligence. Animals and their very different Umwelten inspire different ways of thinking about AGI and underline the case for intelligence without language. Language and intelligence are very close, but distinct. They are two sides of not exactly the same coin.