# Phase Space

If we took stock of everything that we know and compared it to what we don't know, we'd find that we know a lot about almost nothing.[1] As we explore new things, we need tools which give us an idea of what we're working with even when we don't know what it is. In textual scholarship, we like to do close readings: understanding all the nuances of a text word by word so that we can tease out almost hidden meanings that rely on us understanding the text as well as its context.[2] Sometimes, we don't have a text or a context, but the effect of the text upon an audience. Or, to put it in more practical terms, we can't tell what goes on inside an author's mind, but we do have the resulting text. What can we learn about that mind from the text it produces?

I want to introduce a few terms from physics that will prove useful. A "system" is an ensemble of things that we consider a self-contained set for experimental purposes. A "closed system" is a system that does not interact with anything outside of itself. An "open system" does interact with its environment. A system has a number of measurements that we might make that would uniquely name the state of the system. No other state would have exactly the same set of measurements. If we knew how, we could reconfigure the system to reproduce the measurements and, in so doing, set the system back to the same state it was in when we took those measurements. The smallest number of measurements needed to uniquely specify the state of the system is the system's "degrees of freedom." Each measurement represents a choice in how to configure the system. If there were three degrees of freedom, then there would be three choices that we would need to make to distinguish one state of the system from any other state that the system could be in.

Knowing how many degrees of freedom are in a system is useful: it helps us eliminate certain models. If a model doesn't have enough degrees of freedom, it can't describe the system.

If we took the right number of measurements of a system and plotted them on a graph where each degree of freedom was an axis, we'd draw what is called a "phase diagram" or "state diagram" (some use the word "portrait" instead of "diagram"). The space in which the diagram lives is called "phase space" or "state space" because it is built from the degrees of freedom (or states or phases) of the system. The problem is that this requires knowing how many degrees of freedom a system has and, more importantly, what the right things are that we should measure.
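As a concrete illustration, here is a minimal sketch of a phase diagram for an idealized pendulum. It assumes small swings and no friction, so the motion has a closed-form solution; the function name `pendulum_portrait` and its parameters are made up for this example. Each point pairs the two measurements (angle and angular velocity) that pin down the pendulum's state.

```python
import math

def pendulum_portrait(amplitude=0.2, omega=2.0, steps=200, dt=0.05):
    """Return (angle, angular velocity) pairs for an ideal small-angle pendulum.

    Assumes theta(t) = amplitude * cos(omega * t), the frictionless
    small-swing solution, so no numerical integration is needed.
    """
    points = []
    for i in range(steps):
        t = i * dt
        angle = amplitude * math.cos(omega * t)
        velocity = -amplitude * omega * math.sin(omega * t)
        points.append((angle, velocity))
    return points

portrait = pendulum_portrait()
# Plotting angle against velocity traces a closed ellipse: the state
# keeps returning to the same curve in phase space.
```

With two axes, one per measurement, every point on the ellipse names exactly one state of this idealized pendulum.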

## Building a Portrait

What if we don't know enough about the system to know how many degrees of freedom it has? And if we did, what if we didn't know which things to measure because we didn't know how they were connected within the system?

Fortunately, we have some tools that can help us for certain kinds of systems.[3] I don't have room in this post to go into more detail than to say this: we assume that the behavior of many identical systems running at once is the same as the behavior of a single system running for a long time. I'm glossing over some details, but that's the gist of it. The same assumption, the ergodic hypothesis, is a foundation of statistical mechanics. In the humanities, this means that we want to see if many novels written at the same time (many "identical" authors doing the same thing) will tell us something about a single author writing many novels. Or, perhaps more importantly, if a single author writing many novels can inform our understanding of many authors each writing a single novel.
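In symbols, and glossing over the same details, the assumption says that a long time average along one system equals an average over many copies of the system taken at a single moment. The notation is mine and only illustrative: $A$ is whatever we measure, $x(t)$ is the state of the one system at time $t$, and $x_i$ is the state of the $i$-th copy.

$$\lim_{T \to \infty} \frac{1}{T} \int_0^T A(x(t))\,dt = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} A(x_i)$$

The left side is the single author writing for a long time; the right side is the crowd of "identical" authors each caught at one moment.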

This ability to switch from a large number of identical systems to a single system is important because sometimes we only have a single system (or limited number of systems) available. More importantly, we can't always measure everything about a system to identify its state. We might only be able to measure a single thing without knowing if that is the critical thing that will help us understand the system. Fortunately, Takens' 1981 paper, "Detecting strange attractors in turbulence," gives us a way out. Given a single series of data measured over time, we can build a qualitatively similar phase diagram of the system.

This is where autocorrelation becomes critical.

If we had a pendulum and wanted to reproduce the phase portrait in the above figure, we could measure position and velocity, but wouldn't have to. If all we measured was the position, we could reproduce the shape of the portrait by finding the right way to produce two-dimensional points from the single stream of numbers. Essentially, we want to take a list of numbers $f(t)$ and produce a list of pairs $(f(t), f(t + \delta))$. We just need to figure out the right $\delta$.
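Turning one stream of numbers into points can be sketched in a few lines. This assumes the usual delay-coordinate construction, where each point pairs a value with the value a fixed number of steps later; the name `delay_embed` and its parameters are my own choices for illustration.

```python
def delay_embed(series, delay, dim=2):
    """Turn a list of numbers into delay-coordinate points.

    Each point is
    (series[t], series[t + delay], ..., series[t + (dim - 1) * delay]).
    """
    if delay < 1 or dim < 1:
        raise ValueError("delay and dim must be positive integers")
    span = (dim - 1) * delay  # how far ahead the last coordinate looks
    return [
        tuple(series[t + k * delay] for k in range(dim))
        for t in range(len(series) - span)
    ]

# A short ramp makes the pairing easy to see.
pairs = delay_embed([0, 1, 2, 3, 4], delay=2)
# pairs == [(0, 2), (1, 3), (2, 4)]
```

The `dim` parameter is there so the same function can build points with more than two coordinates.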

In the figure to the left, we plot these points for the given $\delta$ (in this case, 128 is the value from last week's post on autocorrelation since we've used the same data set). You can use the slider to see how different values for $\delta$ change the picture.
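For readers who want to pick a delay from their own data, here is a minimal sketch of one common rule of thumb: use the first lag at which the autocorrelation falls to zero or below. That rule is an assumption on my part for this example, not necessarily how the 128 above was obtained, and the function names are hypothetical.

```python
import math

def autocorrelation(series, lag):
    """Autocorrelation of the series at the given lag, normalized to 1 at lag 0."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    variance = sum(c * c for c in centered)
    if variance == 0:
        raise ValueError("a constant series has no defined autocorrelation")
    covariance = sum(centered[t] * centered[t + lag] for t in range(n - lag))
    return covariance / variance

def first_zero_crossing(series, max_lag=None):
    """Return the smallest lag >= 1 whose autocorrelation is <= 0, or None."""
    if max_lag is None:
        max_lag = len(series) // 2
    for lag in range(1, max_lag + 1):
        if autocorrelation(series, lag) <= 0:
            return lag
    return None

# Example: a sine wave with a period of 40 samples decorrelates
# after roughly a quarter of a period.
wave = [math.sin(2 * math.pi * t / 40) for t in range(400)]
delta = first_zero_crossing(wave)
```

A delay near a quarter period is what turns a sine wave into something close to a circle when the pairs are plotted, which is why this heuristic is a reasonable starting point.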

## Interpreting the Portrait

So we have a phase diagram, but what does it tell us? Keep in mind that a portrait built from a single stream of data shares the qualitative features of a portrait drawn from measurements of all the right independent variables, but it might not share the quantitative ones. For example, every axis in our single time series version spans the same range of values, while each axis may span a different range when it represents one of the true independent variables.

One of the most obvious qualitative features is the embedding dimension of the diagram. This is just the number of axes used to draw the diagram. In the above example with the pendulum, the embedding dimension is two because we use two values from the series for each point. There is no limit to how many embedding dimensions can be used to build a phase portrait other than the number of data points available. The embedding dimension has to be larger than the dimensionality of the portrait itself, which is the next number we can find out.

The dimensionality of the portrait[4] (or "attractor" as some people call it since it represents the behavior that the system is attracted to after some time to settle) represents how many independent variables determine the behavior of the system. This is a bit harder to figure out because it depends on how the attractor behaves. If the attractor passes through a particular point more than once and doesn't proceed to the same point every time, then there's another independent variable that needs to be taken into account: if the same point proceeds to different points at different times, then we are missing the information that lets us decide between the succeeding points.
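The "same point, different successors" test can be tried numerically. The sketch below is a crude check in the spirit of false-nearest-neighbor methods, not a standard algorithm from any library: for each point it finds the closest other point and measures how far apart their successors land. The names `delay_embed` and `successor_spread`, the sine-wave data, and the delay of 5 are all assumptions made for the example.

```python
import math

def delay_embed(series, delay, dim):
    """Delay-coordinate points (series[t], series[t + delay], ...)."""
    span = (dim - 1) * delay
    return [
        tuple(series[t + k * delay] for k in range(dim))
        for t in range(len(series) - span)
    ]

def successor_spread(points):
    """Average distance between each point's successor and the successor
    of that point's nearest neighbor."""
    last = len(points) - 1  # the final point has no successor
    total = 0.0
    for i in range(last):
        nearest = min(
            (j for j in range(last) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        total += math.dist(points[i + 1], points[nearest + 1])
    return total / last

series = [math.sin(0.3 * t) for t in range(300)]
# With one coordinate, a rising and a falling value look identical,
# so near-identical points can head in opposite directions.
spread_1d = successor_spread(delay_embed(series, 5, 1))
# With two coordinates the rising and falling branches separate.
spread_2d = successor_spread(delay_embed(series, 5, 2))
```

For this sine wave the one-coordinate spread should come out noticeably larger than the two-coordinate spread, which is the numerical signature of a missing independent variable.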

A common way to figure out this dimensionality is to calculate the dimensionality of the portrait in succeeding embedding dimensions until the portrait dimensionality stops increasing. At that point, adding more information to each point doesn't increase the complexity of the portrait.

If the number of dimensions doesn't plateau, then the data might be random or we might not have enough data. Consider that given two sets of random data that are the same for the first $n$ numbers, there's no reason the $(n+1)$th numbers should be identical. We can always add another dimension with random data and get two different points from two previously coincident points.
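Here is a minimal sketch of the plateau test. It uses a crude two-radius version of the correlation dimension (the Grassberger-Procaccia idea of measuring how the fraction of close pairs scales with distance), which is one common estimator and not necessarily the calculation I'll use in later posts. The radii, the delay of 5, the sine-wave data, and the function names are all assumptions chosen for the example.

```python
import math
import random

def delay_embed(series, delay, dim):
    """Delay-coordinate points (series[t], series[t + delay], ...)."""
    span = (dim - 1) * delay
    return [
        tuple(series[t + k * delay] for k in range(dim))
        for t in range(len(series) - span)
    ]

def correlation_dimension(points, r_small=0.1, r_large=0.4):
    """Two-radius slope estimate of the correlation dimension.

    C(r) is the fraction of point pairs closer than r. The estimate is the
    slope of log C(r) against log r between the two radii; the total pair
    count cancels in the ratio C(r_large) / C(r_small).
    """
    near = far = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if d < r_large:
                far += 1
                if d < r_small:
                    near += 1
    if near == 0:
        raise ValueError("no pairs within r_small; use more data or larger radii")
    return math.log(far / near) / math.log(r_large / r_small)

periodic = [math.sin(0.3 * t) for t in range(400)]
rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.random() for _ in range(400)]

periodic_dims = {m: correlation_dimension(delay_embed(periodic, 5, m)) for m in (1, 2, 3)}
noise_dims = {m: correlation_dimension(delay_embed(noise, 5, m)) for m in (1, 2, 3)}
```

The periodic estimates should stay close to one as the embedding dimension grows, because the portrait is a closed curve no matter how many axes we give it. The noise estimates should keep climbing with each added axis, which is the missing plateau described above.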

Another use of the portrait is to make predictions about how the system will behave given a particular state. The more data we have, the better these predictions can be, but any predictions are limited by the Lyapunov exponents, which tell us how quickly errors grow. While the exponents we measure for a portrait built from a single time series might not match those from the true portrait built from the full set of independent variables, they share the same qualitative aspects (a similar number of positive and negative exponents) and tell us how we can expect error to behave when using the constructed portrait to make predictions.
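To make "how quickly errors grow" concrete: in the usual idealization, a small initial error $\epsilon_0$ grows roughly exponentially at a rate set by the largest Lyapunov exponent $\lambda$. The symbols $\epsilon_0$, $\lambda$, and the tolerance $\Delta$ are my notation for this sketch.

$$|\epsilon(t)| \approx |\epsilon_0|\, e^{\lambda t}$$

If $\lambda > 0$ and we can tolerate an error of at most $\Delta$, predictions stay useful only up to about

$$t_{\max} \approx \frac{1}{\lambda} \ln \frac{\Delta}{|\epsilon_0|}$$

so halving the initial error buys only an extra $\ln 2 / \lambda$ of prediction time.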

## Phase Portraits in the Humanities

I don't have any examples yet, but I want to explore a few possibilities someday. The key is figuring out how to represent the humanities information as numbers beyond simply the binary code representing the text.

One possibility is to tie this to topic modeling in some way. If the topics can be represented as measurements of a text (perhaps 500-word selections from a novel), then the variation in the strength of a topic over the series of selections can act as a measurement of a system. Together, all the topics represent a collection of measurements. Do the topics capture the complexity of the text composition, and is that complexity (as an attractor dimensionality) less than the number of topics? How does that complexity increase as the number of allowed topics increases?
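As a rough sketch of what "a measurement per selection" could look like, the code below cuts a text into fixed-size word selections and records one number for each. The keyword fraction is only a crude stand-in for the topic weights a real topic model would supply, and the function names and toy data are hypothetical.

```python
def word_windows(text, size=500):
    """Split a text into consecutive selections of `size` words.

    A trailing selection shorter than `size` is dropped. Punctuation is
    not stripped, so real text would need cleaning first.
    """
    words = text.lower().split()
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]

def keyword_strength(window, keywords):
    """Fraction of words in the window that belong to the keyword set."""
    return sum(1 for w in window if w in keywords) / len(window)

def strength_series(text, keywords, size=500):
    """One strength value per selection: a time series for a single 'topic'."""
    return [keyword_strength(w, keywords) for w in word_windows(text, size)]

# Toy example with 4-word selections so the result is easy to follow.
sample = "sea ship sea land land hill ship sea wave wave land sea"
series = strength_series(sample, {"sea", "ship", "wave"}, size=4)
# series == [0.75, 0.5, 0.75]
```

Each topic would give one such series, and the collection of series is what we would treat as the measurements of the system.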

A fair amount of work seems to have been done with music up through the mid-1990s (e.g., Jean Pierre Boon and Olivier Decroly. "Dynamical systems theory for music dynamics." Chaos 5, 501 (1995)). These studies made use of a range of tools to describe and produce music.

1. In statistics, saying that something happens "almost never" and saying that it "has zero probability" are pretty much the same. If we counted all the things that we know and divided that count by the number of things that we don't know, the result would be almost zero. It is ironic that the more we study, the closer the ratio gets to zero.
2. See Borges's "Pierre Menard, Author of the Quixote" for a humorous example of text within context.
3. Note that ergodic literature has nothing to do with ergodic theory. Don't even try to link the two meanings of "ergodic."
4. I'll cover in future posts how to calculate this and the other calculations in this section.