\( \newcommand\cvec[1]{\begin{pmatrix}#1\end{pmatrix}} \newcommand\bra[1]{\langle#1\rvert} \newcommand\ket[1]{\lvert#1\rangle} \newcommand\braket[2]{\langle#1\rvert\,#2\rangle} \newcommand\ketbra[2]{\lvert#1\rangle\langle#2\rvert} \newcommand{\frasq}{\frac{1}{\sqrt{2}}} \)
In my last post I demonstrated how easily a mathematical proof can lead to completely absurd conclusions if the "base truth" is incorrectly set up. This led me down a rabbit hole, and I spent much of my leisure time thinking about mathematical proofs in relation to artificial intelligence (AI) and related concepts like learning, understanding, computability, and consciousness.
Rather unexpectedly, you might hear these concepts discussed in the context of quantum physics or even gravity. It is far more common to hear opinions on such matters from neuroscientists or philosophers, but being a "learned" physicist I cannot help it and would like to approach these subjects from another perspective.
During my time as a graduate student I have been fascinated by theories from quantum physics, general relativity, cosmology, astrophysics, and information theory. While I realize that these theories are meant to explain the reality we see and feel, ironically, they always felt to me more like fantastical stories from other worlds, some quite literally, others more metaphorically. Looking back, I realize now that the most well-established scientific theories in these fields are fundamentally flawed, and not many people seem to be much bothered by the status quo in physics.
Perhaps because it is exceptionally difficult to take a step back and question the fundamental tenets of today's prevailing theories, which we have simply learned to accept. It is even tougher to find alternative explanations for them, especially in quantum mechanics and general relativity, whose models have been tested and (for the most part) validated time and time again over the last and current century. Maybe it is difficult because scientists generally don't like to speculate about ground truths which can only be accepted but hardly proven directly, so such ideas meet strong resistance from proponents of the commonly accepted "scientific dogma". Still, I would like to spend some lines here wildly speculating, writing about my thinking process and the problems which bother me about the current state of physics, and possibly drawing lines to the origins of consciousness and the possibility of its computability.
I recently found myself sitting in the audience of a podium discussion about pathways beyond present AI. One of the central themes of the discussion was the question whether current systems should be modelled more closely in the image of the human brain. The discussion triggered many ideas in my mind, most of them probably just recognitions of patterns in the noise, though others may have some actual meaning. Is the key to fully autonomous, general artificial intelligence different hardware that more closely resembles the brain? General AI (GAI) and awareness often seem to be implicitly equated. Lacking a commonly accepted, concrete definition, one can broadly think of GAI as the ability to learn and understand intellectual tasks on a similar level to a human being. According to such a definition, it is reasonable to conclude that GAI requires awareness, which in turn requires consciousness.
Consciousness has many facets. It actively drives free will, and it is passively involved in qualia, the way things subjectively seem to us, such as the perception of the color blue, the feeling of sadness, or the impression of a musical piece. These are phenomena that are seemingly private and, by some accounts, even beyond any physical explanation, for being physical means being public and measurable, which is epistemically inconsistent with privacy. Some think of consciousness, from a physiological perspective, as an error-correction mechanism which focuses on irregularities produced, e.g., by motoric automatisms and tries to fix a perceived incorrectness, in the way a piano player practicing a piece tries to fix muscle memory which seems incorrect to him. In my opinion, consciousness is also essential to true understanding. To understand something requires conscious self-reflection and self-evaluation of some sort, the ability to assume different perspectives on a subject and recognize its relativity to other views, similar to the concepts of time and gravity. This suggests that consciousness could emerge from self-modeling: a sort of fractalized construct for which only a small seed might be necessary to start a self-referential loopback process until the emergence of consciousness is achieved, which is admittedly a very "loopy" idea. Whether these properties and abilities are computable, I am unsure; however, I will try to explain later why it may require something more than the simple deterministic hardware of the single-state machines we conventionally use as computers today. No matter what perspective is taken, computational models have to encompass all manifestations of consciousness in order to be completely conscious. So, logically speaking, if some manifestations of consciousness are incomputable (perhaps due to inherent physical limitations), computational models will never become fully conscious.
Nevertheless, most people seem to believe (or hope) that with an increasing amount of computational power, consciousness emerges naturally. Again, this is likely motivated by the fact that we often think of the brain as a biological, mushy, albeit intricate, Turing machine, i.e. a computer. With this premise, most of what you might call computation performed by the brain is done by the cerebellum and the brainstem. From what I gathered by listening to neuroscience experts, these parts are believed to contain far more neurons in total than the cerebrum, the part which enables us to speak, think, reason, problem-solve… and learn. The cerebellum coordinates muscle movements, and the brainstem regulates reflexes and bodily activities, including the heart rhythm, breathing, sneezing, and swallowing. Although it is slightly controversial, as far as we know all of these computations happen unconsciously. Speaking from experience, a pianist who plays Chopin's Nocturne in E minor (Op. 72 No. 1) does not consciously activate the muscles of his left hand's pinky to hit the low E every time a new bar starts, but is rather conscious of the harmony of the piece and his performance as a whole. This raises the question (at least for me): how can computation somehow produce consciousness if so much of the computation in the human brain is unconscious?
The idea that the brain is simply an intricate computer system, but still comparable to a conventional computer, seems more and more unlikely to me, or at least incomplete. What is the difference between unconscious computations and the fully aware computations we perform when we solve puzzles and problems? Can we really construct a mathematical system which can achieve consciousness?
Computability of mathematical systems
While the brain is still a big mystery to us (and especially to me as a layperson), computation is a mathematically well-defined concept. Among many others, Alan Turing worked out the idea of what a computation actually is. One of his greatest achievements is the universal Turing machine, an abstract mathematical model of computation that "manipulates symbols on a strip of tape according to a table of rules". You may think of Turing machines as idealized computational machines, one realization of which is the personal computer. He intended to use these models as devices to help investigate the extent and limitations of what can be computed. In a lecture in 1951, when Turing discussed the question whether the brain might be a machine similar to the ones he described in his research, he sounded reasonably optimistic that it might be possible. However, he also pointed out that this view rests on assumptions which can be challenged: the behaviour of such a machine must in principle be predictable through computation. He noted that the most critical problem for this predictability might be the indeterminacy principle of quantum mechanics, due to which no such prediction might even be theoretically possible. If the indeterminacy principle simply injected some randomness or noise into the system, this could easily be fixed by a randomization engine of some sort; however, the problem lies deeper than producing randomness. While Turing machines are very well known even among non-mathematicians, Turing's oracle machines are less so: devices generalizing and extending the Turing machine to deal with problems unsolvable by logic alone. Already then he was beginning to think about models which go beyond mathematical systems, whose inherent limitations he surely must have known.
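To make the "symbols on a tape, table of rules" picture concrete, here is a minimal sketch of a Turing machine in Python. The rule-table format, tape encoding, and names are my own illustrative choices, not Turing's notation:

```python
# A minimal sketch of Turing's tape-and-rules model (illustrative, not
# Turing's original formulation): a state, a tape, and a rule table.
def run_turing_machine(rules, tape, state="start", pos=0, max_steps=1000):
    """rules maps (state, symbol) -> (new_symbol, move, new_state)."""
    tape = dict(enumerate(tape))  # sparse tape; "_" is the blank symbol
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = tape.get(pos, "_")
        new_symbol, move, state = rules[(state, symbol)]
        tape[pos] = new_symbol
        pos += 1 if move == "R" else -1
    return "".join(tape[i] for i in sorted(tape)).strip("_")

# Example rule table: flip every bit, halt at the first blank.
flip = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}
print(run_turing_machine(flip, "10110"))  # -> 01001
```

A *universal* Turing machine is then a machine of this same kind whose tape contains an encoded rule table of another machine, which it simulates, much as a personal computer runs stored programs.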
Mathematicians and scientists in the early 20th century (and I, for some time before I attended my first mathematics lectures at university) believed that mathematics is consistent, that all statements about numbers – statements which can be expressed by a computer – are either true or false, and that there must be a set of basic rules which can determine to which category such a statement belongs. It is important to know that all mathematical frameworks are built on axioms, simple and most basic postulates (or rules) which we believe to be true, such as "the number zero exists", or "\(\,2+1=1+2\:\)". One such mathematical system is Peano arithmetic, probably the system you use most in everyday life. These frameworks can be arbitrarily extended with further rules to encompass more and more mathematics. Now, if it were really the case that every statement, whether true or false, can be proven or disproven, the logical conclusion would be that there is an ultimate set of rules that could prove or disprove any conceivable mathematical theorem. According to this logic, mathematical proof always leads to truth, and is the true source of knowledge and understanding.
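To make such "basic rules" concrete, a few of the Peano axioms can be written down explicitly, using the successor function \(S\) (so \(S(0)\) is \(1\), \(S(S(0))\) is \(2\), and so on):

\[
\begin{align}
&S(n) \neq 0 &&\text{(zero is nobody's successor)}\\
&S(n) = S(m) \;\Rightarrow\; n = m &&\text{(distinct numbers have distinct successors)}\\
&n + 0 = n, \qquad n + S(m) = S(n + m) &&\text{(addition, defined recursively)}
\end{align}
\]

From a handful of rules like these, facts such as \(2+1=1+2\) can then be derived.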
In 1931, however, Kurt Gödel, an Austrian mathematician and logician, presented his incompleteness theorems, which blew this world view out of the water with one of the most elegant recursive proofs I have come across. He demonstrated that there is a disconnect between mathematical proof and truth. He showed that there are statements about mathematics, conjectures, which may be true but can never be proven within any given mathematical framework. The way he proved what became known as his incompleteness theorems is genuinely clever: he managed to devise a method which allows mathematics to make statements about itself. He thought of a mapping that assigns to each symbol, formula, mathematical statement, and axiom a unique natural number. There are arbitrarily many possible mappings, but in his proof he used prime numbers:
Symbol | | Encoding
---|---|---
0 | \(\rightarrow\) | 1
not | \(\rightarrow\) | 5
or | \(\rightarrow\) | 7
\(\forall\) | \(\rightarrow\) | 9
\((\) | \(\rightarrow\) | 11
\()\) | \(\rightarrow\) | 13
… | | …
This mapping protocol is known as Gödel numbering, or Gödel coding. It is very similar to how computers assign a specific sequence of 1s and 0s to every symbol of the alphabet; for instance, the uppercase letter E is represented by 01000101 in ASCII code.
The importance of this protocol is that the provability, or any other property such as truth or falsehood, of a mathematical statement can be equivalently assessed by examining arithmetic properties of the Gödel number of said statement. For example (in simplified form), in some particular Gödel numbering, showing that a statement is provable could amount to showing that the statement's code is divisible by the axioms' code numbers.
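Here is a toy Gödel numbering in Python, loosely following the table above. This is an illustrative scheme, not Gödel's exact construction: a formula made of symbols \(s_1, s_2, s_3, \dots\) is encoded as \(2^{c(s_1)} \cdot 3^{c(s_2)} \cdot 5^{c(s_3)} \cdots\), where \(c\) is the symbol code. Because prime factorizations are unique, the formula can always be recovered from its number:

```python
# Symbol codes as in the table above ("forall" stands in for the symbol).
SYMBOL_CODES = {"0": 1, "not": 5, "or": 7, "forall": 9, "(": 11, ")": 13}

def primes():
    """Yield 2, 3, 5, 7, ... (trial division; fine for a toy example)."""
    n = 2
    while True:
        if all(n % p for p in range(2, int(n**0.5) + 1)):
            yield n
        n += 1

def godel_number(symbols):
    """Encode a formula as a product of prime powers."""
    n = 1
    for p, s in zip(primes(), symbols):
        n *= p ** SYMBOL_CODES[s]
    return n

# The formula "not ( 0 )" becomes 2**5 * 3**11 * 5**1 * 7**13.
print(godel_number(["not", "(", "0", ")"]))
```

Statements *about* formulas then become ordinary arithmetic statements about these numbers, which is exactly the self-reference Gödel needed.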
Gödel now put forth the challenge: "This statement cannot be proved from the axioms", formulated within that mathematical system. Since it is a mathematical statement about the system, it has a Gödel number and can, in principle, be true or false. Notice that this self-referential statement is not just about itself, but about its own provability. The trick is that Gödel constructed a mathematical statement which was the numerical equivalent of that sentence. Gödel's actual, encoded proof is quite hard to follow even for me, who speaks German and is somewhat familiar with how mathematical works are written, but we can sketch it in relatively simple terms: assume the statement is false; then "This statement can be proved from the axioms" must be true, i.e., the statement is provable. But a provable statement must be true, whereas we started our chain of thought by assuming that Gödel's statement is false, and thus we are led to a contradiction (the same statement being both false and true). Assuming that mathematics is consistent, the statement Gödel put forth must therefore be true. So we have proved the statement "This statement cannot be proved from the axioms" to be true: a statement about mathematics which is true, but cannot be proved. The brilliance of Gödel lies not in the fact that he constructed a statement which could not be proven, but that he found a statement that couldn't be proven and was true.
You might now ask yourself: how could we prove a statement that cannot be proved true? The important detail in Gödel's proof is that within a mathematical system with certain axioms, we found a true statement which cannot be proved true from within that system. We proved the statement by working outside the mathematical system. Now, you might also be tempted to add the statement as an axiom and simply expand the mathematical system to encompass more of mathematics. However, within the expanded system the same self-referential trick can be applied again, and again, and we always end up with an incomplete mathematical system. Thus, we can conclude that mathematical systems are either inconsistent or incomplete. Gödel's second incompleteness theorem says that a consistent mathematical system cannot prove its own consistency, which makes things even worse, as incomplete systems are generally preferred over inconsistent ones.
Gödel's theorems can easily be used to shine a depressing light on mathematics, as they conclude that there are conjectures we will never be able to prove. This could mean that decades of work on attempting to find a proof of the Riemann hypothesis, Goldbach's conjecture, or the twin prime conjecture might have been for naught. However, it can also be re-interpreted in the following way: for any set of consistent, computational rules \(\mathcal{R}\), we can construct a mathematical statement \(\mathcal{G}(\mathcal{R})\) which, if we believe in the validity of \(\mathcal{R}\), we must accept as true; yet \(\mathcal{G}(\mathcal{R})\) can never be proved using \(\mathcal{R}\). In this form, Gödel's theorems seem to suggest that understanding, in the sense of consciously becoming aware of the truth, is at least in some cases beyond the realm of computability. And since understanding seems to be related to some aspects of consciousness, it too is in some instances beyond computation.
Whether this conclusion can be carried further and breaks even the possibility of some generalized notion of computation – hyper-computation – reaching consciousness, I don't know and will have to think about some more. Moreover, the argument might still be refuted, maybe through careful consideration of the assumptions on which the incompleteness theorems were constructed. In particular, the consistency criterion might play a big part in this. Is the computation of today's machines mathematically consistent? This question is probably a bit harder to answer than you might think… chaos theory and the machine epsilon could play a role in this consideration. Another core matter which could be important is the notion of truth itself. Are there maybe more classes of truths than one?
A question that is particularly interesting to me is: what physical process is central in capturing the key issue of consciousness, and is it computable? As far as I know, all processes in the classical fields of physics are computable by a universal Turing machine, given unlimited storage and/or computing time. There are mathematical functions which are non-computable by a Turing machine, an example being the halting problem, which is strikingly similar to Gödel's statement in the incompleteness theorems. However, no physical processes seem to be known which embody such functions. Perhaps, entirely in Gödel's spirit, the most likely place to find non-computable physical processes is in theories which are either inconsistent or incomplete, but seemingly "truthful". I can think of two very prominent theories of "modern" physics of which one is – in my own, probably quite controversial opinion – inconsistent, and the other – as commonly accepted by the community – consistent but incomplete; the former theory is quantum mechanics, the latter general relativity. It is my suspicion that the same physical process or processes which are missing from these theories might be a building block of consciousness. To explore the possibility of such non-computable physical processes, I would like to introduce these theories in the following as I understand them and discuss some concepts which I still don't understand.
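The similarity between the halting problem and Gödel's statement is easiest to see in Turing's diagonal argument, sketched here in Python (the function names are illustrative; `halts` is the hypothetical decider whose existence the argument refutes):

```python
# Suppose, for contradiction, that some total function halts(program, x)
# could always decide whether program(x) eventually stops.
def halts(program, x):
    raise NotImplementedError("no such total function can exist")

def paradox(program):
    # Do the opposite of whatever halts() predicts about program(program).
    if halts(program, program):
        while True:   # halts() says we stop, so loop forever
            pass
    return "done"     # halts() says we loop, so stop immediately

# Feeding paradox to itself: halts(paradox, paradox) can be neither True
# (then paradox(paradox) loops forever) nor False (then it halts), so no
# such halts() can exist. Compare Gödel's statement, which is true if and
# only if it is not provable.
```

Like the Gödel sentence, `paradox` turns a claim about itself against itself; self-reference is the engine of both proofs.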
General relativity
The most influential equations of the 20th century are the Planck equation \(E = h\nu\), which led to the discovery of wave–particle duality and kickstarted advances in quantum mechanics, and Einstein's energy–momentum relation of special relativity, \(E=\sqrt{(m_0c^2)^2 + (pc)^2}\), which reduces to the more commonly known \(E = mc^2\) in the rest frame. The latter equation laid the foundation for Einstein's continuation almost ten years later, the theory of general relativity.
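The rest-frame reduction is a one-line computation: setting the momentum \(p = 0\) in the energy–momentum relation gives

\[ E = \sqrt{(m_0 c^2)^2 + (0 \cdot c)^2} = m_0 c^2, \]

and for small momenta, expanding the square root recovers the Newtonian kinetic energy on top of the rest energy:

\[ E = m_0 c^2 \sqrt{1 + \left(\frac{p}{m_0 c}\right)^2} \approx m_0 c^2 + \frac{p^2}{2 m_0}. \]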
Einstein's scientific work can be roughly divided into two segments: before and after his formulation of general relativity. After receiving his doctorate from the University of Zürich in 1905, Einstein developed basic intuitions about the nature of light, quantum mechanics, and space-time. Already then, he saw the need to generalize the principle of relativity from inertial to accelerated motion. He was captivated by the ability of acceleration to mimic gravity and by the idea of inertia as a gravitational effect. These ideas finally culminated in a theory of static gravitational fields in 1911, in which gravity bends light and slows clocks, and the speed of light varies from place to place. In collaboration with his friend, the mathematician Marcel Grossmann, Einstein managed the transition to a generalized theory of gravity described by the mathematics of curvature. Building on concepts from his former teacher Hermann Minkowski, the geometry of space-time is modelled as a four-dimensional Lorentzian manifold, from whose curvature the gravitational effect beautifully emerges. In simpler (less precise) terms, it is the fabric of the Universe itself which gets distorted by massive objects and drags other objects into a fall, producing the effect of what we perceive as gravity.
In general relativity, space-time "curves" in the presence of mass or energy (and yes, light distorts space-time too, just not by much). But what do we mean by curved space-time? To the general public, the effect is often explained using a rubber sheet with a heavy ball sitting in the middle (representing, e.g., the Earth), causing the sheet to sink and stretch, and other marbles (e.g. the Moon) to roll around it or towards the heavier ball. I personally have several issues with this comparison. To begin with, by placing objects on the rubber sheet, the demonstration suggests that objects move on space-time rather than within it, as a part of it. Next, the demonstration uses gravity to explain gravity: it suggests that objects are attracted to each other because they are pulled downwards. It is not acceptable to explain gravity inside space-time using gravity outside space-time. It is more rigorous to say that if objects follow the well created by the Earth, it is because they move in a straight line, but within a curved geometry. When objects fall, they move straight ahead, but the curvature of space-time gives us the impression that their trajectories are being deflected.
I think of curved space-time as a constant, perpetual contraction of a grid towards a massive object, such as the Earth; however, it is important to understand that the geometry does not really contract. It is simply the fact that straight lines come closer together in curved geometry which gives the impression of contraction. The concept of straightness in curved geometries is a bit counter-intuitive to people used to thinking in Euclidean terms; for instance, straight lines on a sphere (great circles) which start out parallel at the equator eventually cross at the poles. Distances also become much more difficult to measure in space-time.
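The converging-straight-lines effect can be checked with a few lines of Python. On a unit sphere (radius and numbers are illustrative), two meridians leave the equator perfectly parallel, yet the east–west distance between them at latitude \(\varphi\) is proportional to \(\cos\varphi\) and shrinks to zero at the pole:

```python
import math

# Two meridians, delta_lon radians of longitude apart, on a unit sphere.
# Each meridian is a geodesic: a "straight line" of the curved geometry.
def separation(delta_lon, lat_deg):
    """East-west distance between the meridians at a given latitude."""
    return delta_lon * math.cos(math.radians(lat_deg))

delta = 0.1  # radians of longitude between the two "parallel" lines
for lat in (0, 30, 60, 90):
    print(f"latitude {lat:2d} deg: separation {separation(delta, lat):.4f}")
# The separation falls from 0.1 at the equator to 0 at the pole, even
# though neither line ever turns.
```

No force pulls the meridians together; their convergence is purely a property of the curved geometry, which is the intuition general relativity applies to falling objects.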