Gödel, Escher, Bach: An Eternal Golden Braid

Dated Jan 20, 2023; last modified on Sun, 12 Feb 2023

Strange Loops

A Strange Loop occurs whenever, by moving upwards (or downwards) through the levels of some hierarchical system, we unexpectedly find ourselves right back where we started.

The 3-voice Canon Per Tonos from Musikalisches Opfer BWV 1079 exhibits a strange loop. Successive modulations make one expect to hopelessly far from the starting key, but after six modulations, the original key of C minor is restored!

I can’t quite perceive the rising canon. reports that I’m not alone.

Waterfall (M. C. Escher). Lithograph. Credits: www.mcescher.com
Waterfall (M. C. Escher). Lithograph. Credits: www.mcescher.com

Escher’s Waterfall has a six-step endlessly falling loop. Tightening the loop gets harder and more impressive, e.g., Escher’s Drawing Hands has a two-step strange loop in which each of two hands draws the other.

The Epimenides Paradox (“All Cretans are liars”, or more popularly phrased as, “This statement is false”) is a one-step strange loop. Gödel introduced a strange loop in mathematics when he used mathematical reasoning to explore mathematical reasoning itself, leading to Gödel’s Incompleteness Theorem: All consistent axiomatic formulations of number theory include undecidable propositions. The (gnarly) proof hinges on writing a self-referential mathematical statement: This statement of number theory does not have any proof in the system of Principia Mathematica.

Gödel’s actual phrasing is in German, but translated is:

To every \(\omega -\)consistent recursive class \(\kappa\) of formulae there correspond recursive class-signs \(r\), such that neither \(v\
\text{Gen}\ r\) nor \(\text{Neg}\ (v\ \text{Gen}\ r)\) belongs to \(\text{Flg}\ (\kappa)\) (where \(v\) is the free variable of \(r\)).

Gödel’s Incompleteness Theorem is set in a period of rising meta-mathematics. Discovery of non-Euclidean geometry challenged the idea that mathematics studied the real world. Cantor’s theory of sets (different types of infinities) was intuition-defying. On came paradoxes like Russell’s Paradox: Let \(R\) be the set of all sets that are not members of themselves. If \(R\) is not a member of itself, then its definition entails that it is a member of itself; if it is a member of itself, then it is not a member of itself, since it is the set of sets that are not members of themselves. Russell’s and Whitehead’s Principia Mathematica purported to derive all of mathematics from logic, and, to be sure without contradictions. David Hilbert challenged fellow mathematicians to demonstrate rigorously that the system derived in Principia Mathematica was both consistent (contradiction-free), and complete (every true statement of number theory could be derived within the framework drawn up in Principia Mathematica). Gödel’s paper revealed not only irreparable “holes” in the axiomatic system proposed by Principia Mathematica, but also more generally that no axiomatic system whatsoever could produce all number-theoretical truths, unless it were an inconsistent system!

Strange Loops involving rules that change themselves, directly or indirectly, are at the core of intelligence. A creature is faced with millions of situations of completely different types. In some situations, there are stereotyped situations which require “just plain” rules. Some situations are mixtures of stereotyped situations – thus they require rules for deciding which of the “just plain” rules to apply. Some situations cannot be classified – thus there must exist rules for inventing new rules… and so forth. Computers by their very nature are inflexible, desire-less and rule-following. How can intelligent behavior be programmed? Maybe put together rules in strict formalisms that tell computers how to be flexible. It is not a paradox that intelligent behavior can be programmed.

Formal Systems

Sample formal system:

  • Utilizes only \(M\), \(I\), and \(U\).
  • You start out with \(MI\).
  • Using these rules you can enlarge your collection:
    • If you possess a string whose last letter is \(I\), you can add on a \(U\) at the end, e.g., from \(MI\), you can get \(MIU\).
    • If you have \(Mx\), then you may add \(Mxx\) to your collection, e.g., from \(MIU\), you can get \(MIUIU\).
    • If \(III\) occurs in one of the strings in your collection, you may make a new string with \(U\) in place of \(III\), e.g., from \(MIIII\), you can make \(MUI\) and \(MIU\).
    • If \(UU\) occurs inside one of your strings, you can drop it, e.g., from \(MUUUIII\), get \(MUIII\).

Terminology for formal systems:

  • A theorem is a string reproducible in the formal system, e.g., \(MI\) is a theorem, but \(U\) is not.
  • The theorems given for free at the beginning, e.g., \(MI\), are called axioms.
  • The symbol-shunting rules, e.g., \(UU\) can be dropped from a theorem, are called rules of production or rules of inference.
  • A derivation of a theorem is an explicit, line-by-line demonstration of how to produce that theorem according to the rules of the formal system.

The definition of the system matters. For example, a computer program that quits when put in a hopeless position. If the system is “making chess moves in a game,” then the program has a sophisticated, preprogrammed ability to exit from the system. If the system is “whatever the computer had been programmed to do,” then the computer has no ability whatsoever to exit from the system.

Can you produce \(MU\)? Thinking within the system involves generating theorems from \(MI\) trying to get to \(MU\), e.g, \(\{MI\} \to \{MI, MIU\}\); call this Mechanical Mode (M-mode). Thinking about the system involves finding a reason why one can/cannot produce \(MU\); call this Intelligent Mode (I-mode). While machines can be made to be totally in M-mode, people cannot.

A decision procedure is a membership test that is guaranteed to terminate. For example, \(U\) is not a theorem because all theorems must begin with \(M\). Note that this test is not sufficient to test strings that begin with \(M\) but are not producible; maybe there’s a more elaborate test? Note that this is not a decision test as it may continue forever: Wait until the string in question is produced by repeatedly applying every applicable rule; when that happens, you know that \(MU\) is a theorem – and if it never happens, you know that it is not a theorem.

I have encountered decision procedures before in computer science. They are often cheaper than “doing the real thing"™, but require more ingenuity in formulating them. In GEB’s phrasing, decision procedures require I-mode.

In a formal system, the difference between the set of axioms and the set of theorems is that the former always has a decision procedure, but the latter may not.

Consider a new formal system, the pq-system:

  • It has three distinct symbols: \(p\), \(q\), and \(-\).
  • It has an infinite number of axioms, whose decision procedure is: \(\mathrm{x} p - q \mathrm{x} -\) is an axiom, whenever \(\mathrm{x}\) is composed of hyphens only.
  • It has one rule of production: Suppose \(\mathrm{x}\), \(\mathrm{y}\), and \(\mathrm{z}\) all stand for particular strings containing only hyphens, and suppose that \(\mathrm{x} p \mathrm{y} q \mathrm{z}\) is known to be a theorem; then \(\mathrm{x} p \mathrm{y} - q \mathrm{z} -\) is a theorem.

Pay attention to the definitions lest you go down the wrong rabbit hole. For example, missing that the \(\mathrm{x}\) in \(\mathrm{x} p - q \mathrm{x} -\) stands for the same string of hyphens made it hard for me to recognize the decision procedure for the pq-system.

In the pq-system, a well-formed string is one which begins with a hyphen-group, then has a \(p\), then has a second hyphen-group, then a \(q\), and then a final hyphen-group. The criterion for theoremhood is that the first two hyphen-groups should add up, in length, to the third hyphen group.

Any formal system whose rules of production only make longer theorems from shorter ones and never the reverse is guaranteed to have a theoremhood decision procedure. Starting with a string \(s\), you can test if \(s\) is an axiom, or iteratively backtrack through the rules of production to get \(\{s'_1, s'_2, … s'_n\}\), the strings that could have produced \(s\). Because \(|s'_i| < |s|\) at each iteration, you’ll eventually find that some \(s'_i\) is an axiom (and thus \(s\) is a theorem), or none are (and thus \(s\) is a non-theorem).

Isomorphism applies when two complex structures can be mapped onto each other, in such a way that to each part of one structure there is a corresponding part (i.e. one that plays a similar role) in the other structure. There is an isomorphism between pq-theorems and additions. On a lower level, there is an interpretation between symbols and words: \(p \iff \text{plus}, q \iff \text{equals}, - \iff one, - - \iff two, …\) On a higher level, there is a correspondence between true statements and interpreted theorems, e.g., \(- - p - - - q - - - - - \iff \text{2 plus 3 equals 5} \).

The perception of an isomorphism between two known structures is a significance advance in knowledge; perceptions of isomorphism create meanings in the minds of people. Note that an interpretation can be meaningless, i.e., one which we fail to see any isomorphic connection between theorems of the system and the real world, e.g., \(- p - q - - \iff \text{apple horse apple happy apple apple} \). Furthermore, unlike language where rules for making sentences increase when we learn new meanings, we do not have the right to create new theorems purely on the basis of the interpreted meanings, e.g., that \(\text{2 plus 2 plus 2 plus 2 equals 8}\) does not make \(- - p - - p - - p - - q - - - - - - - - \) a theorem; meanings in formal systems must remain passive.

Note that \(- - p - - - q - - - - - \iff \text{2 equals 3 taken from 5} \) is also a meaningful interpretation. When different aspects of the real world are isomorphic to each other (in this case, additions and subtractions), one single formal system can be isomorphic to both, and therefore take on two passive meanings.

Music 101

I didn’t take classes in music or art while in uni, so GEB provides some bits that are novel to me, but not crucial to the central themes of the book. In particular, features Bach’s Musical Offering, BWV 1079, a collection of keyboard canons and fugues and other pieces of music based on a single musical theme given to Bach by Frederick the Great.

Canon: One single theme played against itself. Levels to canon construction: staggering the copies of the theme not only in time, but in pitch; varying the speeds of the different voices; inverting the theme, e.g. melody jumps down whenever the original theme jumps up; retrograde copy in which the theme is played backwards in time. Canons typically exhibit isomorphism: the theme is fully recoverable from any of the copies.

Beethoven Symphony No. 4, canonic passage from the 1st movement.

Fugue: Begins with a single voice singing its theme. When done, a second voice enters, scaled up/down, while the first voice goes on to sing the countersubject, secondary theme chosen to provide rhythmic, harmonic and melodic contrast to the subject. Each of the voices enters in turn singing the theme, accompanied by the countersubject in some other voice, and the remaining doing whatever the composer wants. When all voices have “arrived”, there are no rules.

J.S. Bach's Fugue No. 2 in C Minor, BWV 847, from the Well-Tempered Clavier, Book 1 (bars 7–12).

provides the fugue as a MIDI file. A MIDI file does not contain an audio recording; rather, it contains a set of instructions, e.g. pitch and tempo, and can use considerably less space than the equivalent recorded audio.

claims that improvising Ricercar a 6, a six-voice fugue, is comparable to winning sixty simultaneous blindfold games of chess. While probably exaggerated, I didn’t initially grasp how impressive Bach’s extemporization was.


  1. Gödel, Escher, Bach: An Eternal Golden Braid. Introduction: A Musico-Logical Offering. Douglas Hofstadter. 1979. ISBN: 978-0-465-02656-2 .
  2. File:Wiki fugue analysis audio.mid - Wikimedia Commons. commons.wikimedia.org . Accessed Jan 21, 2023.
  3. MIDI - Wikipedia. en.wikipedia.org . Accessed Jan 21, 2023.
  4. Russell's paradox - Wikipedia. en.wikipedia.org . Accessed Jan 21, 2023.
  5. The Musical Offering - Wikipedia. en.wikipedia.org . Accessed Jan 21, 2023.
  6. An Unusual Effect in the Canon Per Tonos from J. S. Bach's Musical Offering. Denis Collins; W. Andrew Schloss. Music Perception, Vol. 19, No. 2, 141 - 154. doi.org . 2001.
  7. File:Beethoven Symphony IV canonic passage.wav - Wikipedia. en.wikipedia.org . Accessed Jan 21, 2023.
  8. Gödel, Escher, Bach: An Eternal Golden Braid. Chapter 1: The MU-puzzle. Douglas Hofstadter. 1979. ISBN: 978-0-465-02656-2 .
  9. Gödel, Escher, Bach: An Eternal Golden Braid. Chapter 2: Meaning and Form in Mathematics. Douglas Hofstadter. 1979. ISBN: 978-0-465-02656-2 .