Carnap and symbolic spaces
Written by Sil Hamilton in May 2022.

To what extent can one reduce the axioms behind any phenomenon down to their barest nature? We deal with models on a daily basis – models govern our economics, our social interactions, our movement in and around (non-)corporeal environments of all sorts. Our models encode our expectations, our beliefs, and our understanding of what the future should entail given our priors. Models are essential, but what is essential to them is not clear.

We know their size motivates their utility. Larger models describe more complex phenomena, allowing for better predictions of the future. What limits the size of models? Perhaps our own ability to comprehend and compute them. Compressing models is therefore important. What is the minimum size a model can achieve? Rudolf Carnap sought to answer this question in 1928 with his Der Logische Aufbau der Welt, an exploration of how far scientific knowledge can be boiled down.

Carnap sought to unify the sciences by providing a framework for capturing reality in a single description, a notion epitomized in his concept of the world-sentence. The world-sentence was a string of atomic symbols describing the most basic relationships between entities occupying the same topological space. Given this world-sentence, one would be able to reconstruct the space in its entirety. It was key that the entities reproducible by the sentence be mutually present: Carnap believed all reality (whether corporeal or not) must occupy the same mathematical space for his framework to take hold.

“That is the discovery of one [single] comprehensive space. All things are in space; any two things are always spatially related to each other. So there is also a path from me to any [given] thing.”

“Every thing is accessible. If someone now claims that a thing of a particular kind exists, I can demand of him that he show me the path from me to the claimed thing.” [1]

Taking the above two quotations at face value, we find Carnap planted his endeavour in the same roots as the reductionist campaigns of the latter half of the twentieth century. Carnap yearned for an objective similar to that of those advancing cybernetics two decades later. By unifying the physical and the conceptual realms, Carnap placed all reality under the purview of mathematics. Mathematics in turn allowed him to capture reality (however reduced or simplified for his purposes) in a single model. Practitioners of the first cognitive revolution (say, Pitts and McCulloch) sought the same goal. While the Aufbau was eventually abandoned by its creator, Carnap’s goal of a unified symbolic space has since resurfaced (albeit in a different domain).

Neural models of language are successful. Predicated on distributional semantics, the most successful deep learning implementations date only to 2017. Despite training on unlabelled data, self-supervised models like BERT and GPT develop sophisticated representations of language. Take BERT as an example: the base model encodes the meaning of a word into a vector of 768 dimensions. When the vectors of multiple semantically similar words are overlaid in the same space, it becomes clear the model encodes information topologically. BERT encodes semantic and syntactic information in the same space. While Chomskyan notions of language may separate the two, language models of sufficient size show this is not the case. This has startling consequences for where our own linguistic faculties emanate from, especially since models like GPT predict in ways resembling the activity of specific neural pathways in the human brain.
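The claim is easy to probe for yourself. Below is a minimal sketch, assuming the Hugging Face transformers and PyTorch packages and the public bert-base-uncased checkpoint; the example sentences and the helper function are my own, chosen only for illustration.

```python
# Sketch: BERT (base) maps each token to a 768-dimensional vector, and
# semantically related words land nearer to one another in that space.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def word_vector(sentence: str, word: str) -> torch.Tensor:
    """Return the contextual embedding of `word` within `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]        # (seq_len, 768)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

king = word_vector("the king ruled the country", "king")
queen = word_vector("the queen ruled the country", "queen")
table = word_vector("the table stood in the kitchen", "table")

cos = torch.nn.functional.cosine_similarity
print(king.shape)                      # torch.Size([768])
print(cos(king, queen, dim=0).item())  # higher: related meanings
print(cos(king, table, dim=0).item())  # lower: unrelated meanings
```

Nothing about "king" or "queen" was labelled during training; the proximity of their vectors falls out of the distributional objective alone.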

Scholars are becoming increasingly fascinated with language models. We take note of their scale: verging on trillions of parameters. We see their awesome ability to simultaneously encode information about both language and the physical world, all from unlabelled data. But there are greater consequences. We talk of language models, but what we are seeing is that language is itself a model of reality: it captures semantic information in a web of syntactic relationships. At the same time, we see GPT and BERT becoming capable of cognition through language. Language is a model of reality. Language models are models of language. What is a model of a model?

While the Aufbau failed, it is becoming clear there is a growing need to formalize how and why producing models of language results in models of reality. Talk of language models increases daily. Most popular-science descriptions of the matter introduce language models as a black box containing little elves who conduct magic. It is true researchers have yet to develop coherent theories of why deep learning on its own is effective. We may understand the precise mechanics, but the big picture remains elusive. Struggles in BERTology echo the struggle to develop theories of consciousness. That said, there may be hope on the horizon. Mathematics has gone through a parallel revolution in the past decades. Growing fields like category theory and homotopy theory represent new attempts at unifying fields within mathematics and beyond. One feature of category theory relevant to our question is descent theory. Descent theory attempts to explain what conditions allow for the creation of losslessly reduced images of more complex objects. It is worth investigating if the topic of modelling interests you.
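For the curious, here is one standard way the descent condition is phrased, sketched in the familiar sheaf-theoretic setting; the notation is my own and makes no claim about how (or whether) it transfers to language models.

```latex
% Descent for an open cover, sketched: local data plus compatibility
% on overlaps suffices to reconstruct the global object.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Let $\{U_i\}_{i \in I}$ be an open cover of a space $X$. A \emph{descent datum}
consists of local objects $F_i$ on each $U_i$ together with isomorphisms
\[
  \varphi_{ij} \colon F_i\big|_{U_i \cap U_j} \xrightarrow{\;\sim\;} F_j\big|_{U_i \cap U_j}
\]
satisfying the cocycle condition
\[
  \varphi_{jk} \circ \varphi_{ij} = \varphi_{ik}
  \quad \text{on } U_i \cap U_j \cap U_k .
\]
Descent holds when every such datum arises, up to isomorphism, from a unique
global object $F$ on $X$ with $F_i \cong F|_{U_i}$: the reduced, local images
lose nothing about the whole.
\end{document}
```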

Language models have brought us to a strange staging area of scientific progress. We have developed symbolic spaces, wherein our symbols are not referents for particular objects but rather points in spaces, coexisting in a universe all their own. We develop models on models. Our models are expanding in size, both token-wise and parameter-wise. If we continue to climb this ladder of models upwards, we may find our map stretching and expanding.

And through this journey one may encounter a zombified Borges’ Empire, awakening once again.