Resources

Trustworthy Data Visualization (Kieran Healy, Duke University) | posit::conf(2025)

Trustworthy Data Visualization
Speaker: Kieran Healy
Abstract: Visualizations are the most widespread, the most immediately accessible, and in some ways the most authoritative-looking way to present results to audiences, including when the audience is just yourself. In this talk I ask: what makes a visualization trustworthy? How do tools like R and ggplot help us achieve that goal? And what are their limits, especially now that it's easier than it has ever been to do untrustworthy things with data?
Link: Kieran Healy's website - https://kieranhealy.org/
Subscribe to posit::conf updates: https://posit.co/about/subscription-management/

Transcript

This transcript was generated automatically and may contain errors.

Good afternoon, everybody. Thank you for coming. Thank you for staying right till the end. I'm sure everyone is quite tired. The good news is this talk is mostly pictures.

Yesterday Hadley showed you all ggbot. Well, this is OG ggbot.

I am a sociologist interested in data visualization, which makes me unlike most of you. I work with quantitative data quite often, and I think visualization is a good way to help me investigate it and understand it. It's also a great way for me to take something that I've found and quickly show it to somebody else. Maybe I want to explain it to other people, or, just as often, it's a way for me to ask a question: to say, I don't understand what's happening here, or, what the hell is this?

A bit more generally, I'm also interested in how data and the tools that we have for collecting and analyzing data have transformed the world around us. In that regard, I think of data visualization as a kind of synecdoche for data science as a whole or for the use of data in society. It's part of a bigger thing that can stand for the whole thing. For me, thinking about data visualization naturally leads to these bigger questions. As researchers and data analysts, how should we think about our tools and techniques? What's the relationship between our data and the world that it's about and how does that get encapsulated by the graphs and visualizations that we draw?

All of these things, I think, have been transformed, indeed revolutionized over the past few years as the scale and scope of what we can do with data has just gotten more and more powerful. Now, data comes in many forms but speaking very generally, a common pattern is that we start from a stream of data that we get from somewhere. Maybe it's the result of our own really painstaking collection efforts or maybe more often these days it's regularly emitted by some device or some piece of infrastructure that's already in place. That might be a network of satellites or it might be a server in a giant air-conditioned shed somewhere or the phone in your pocket. And of course, these things aren't that far apart anymore.

And what we do is we take that stream of data and we construct some sort of score or sequence of scores with it. And then again, sometimes that thing is something that people habitually call a score, like a credit score. Other times it's a quantity or a coefficient that's the result of some sophisticated model, more like an estimate or a prediction of some result. Or maybe it's just a count or an average or a price, a score in the sense of a number that characterizes something. And once we have that sequence or a collection of scores, we want to use it to tell a story. And that's often the point where visualization comes in. See, look, here's what the data look like, here's what they mean, here's what we should do next.

Technical analysis and the chart men

One early stream of data that people really wanted to tell a story about was this one, the Dow Jones Industrial Average or averages. In the early days there was one for industry and one for the railroads and you can just see kind of at the bottom there one for utilities too, which actually still exists. The Dow Jones was a steady stream of data about the stock market as a whole and the desire to transmit that data far and wide in a more or less continuous stream was one of the driving forces behind the expansion of the telegraph, the development of modern communications technology generally. Deep in the bones of your terminal, of your PowerShell, of your Positron or RStudio or VS Code session, you will find the marks of the teletype and ticker tape machines that were built to transmit price data.

There was a reason that people really wanted that stream of data. They wanted to make money, and in some cases they wanted to make money by using graphs to see into the future. Charles Dow, the founder of both the Wall Street Journal and Dow Jones, was retrospectively credited with the Dow Theory, which later became technical analysis. Technical analysis started from price charts, which of course had to be drawn by hand, and combined them with a theory of what investors in the market were thinking, which was then used to predict prices. And in practice, the main technique was telling stories with graphs.

The people who did this were the chart men, and they were all men. The classic source is a book by Robert Edwards and John Magee called Technical Analysis of Stock Trends. It came out in 1948 and it is still in print. It's now in its 11th edition, I think. The idea was that if you looked at trends in stock prices, either for the market as a whole or for individual stocks, you could spot recurring patterns, patterns that told you what might happen next. Not infallibly, of course, but enough to give you an edge, to let you make an informed decision to buy or sell.

So you start with a high-low-close chart like this one, which is from Edwards and Magee, and then you start looking for the telltale patterns. What patterns, exactly? Well, it turns out there are a lot. Some of them seem more or less unobjectionable. A number will go up, a number will go down, a number will go sideways. I think this style of drawing is very nice, by the way. They're not unlike the sparklines that people put in summary tables even today.

But the chart men also thought that you should take care to look at fluctuations within the trend. They were very keen to argue that downward trends, true downward trends, had to be measured by the downward trends of the highs, whereas true upward trends had to be measured by the upward tendencies of the lows, for example. Now anyone can see a basic upward or downward trend, but at this point the theory starts to get a little richer. One idea was that after a period of steady growth, you would often see investors pull back from a stock as they lost confidence in it, and hence the pullback effect. So the trick would be to sell at the top of that third peak there.

They had a whole family of these patterns, each with a name, a distinctive visual signature, and a story associated with what to do when you spotted it. Their favorite was one called head and shoulders, which was this triple peak thing where the third rally of a stock failed and it ended up falling below its original price. And so what you really wanted to be was where the smart money was, which was selling to buyers at the top of the head there. As I say, the underlying theory here was a kind of psychology of market sentiment, which is not an unreasonable idea. John Maynard Keynes made a lot of money in the stock market in his day and once remarked that the business of the investor was to predict what average opinion thought average opinion would be.

By the way, I've often wondered if the anti-dandruff shampoo was named after this phenomenon, and I honestly think it was. Procter & Gamble introduced Head & Shoulders in 1961 and it was marketed as a medicinal cream to well-dressed men embarrassed about getting dandruff on the collars of their suits. Men who, as you can see here, looked at stock prices in the paper.

Anyway, the chart men didn't think these patterns were iron rules or laws of nature. They regularly cautioned that you can't just look at the chart and read the future, obviously, because if you could then everybody would be doing that. But they were still committed to the theory, to the story, and this led them, understandably, to develop what a philosopher of science might call auxiliary hypotheses to save the main idea. For instance, maybe some data doesn't quite fit the head and shoulders pattern. That's probably because sometimes head and shoulders patterns are complex: maybe you have two heads, or three or four shoulders. You already have two shoulders to begin with. Maybe sometimes there's a failure pattern, either at the bottom or at the top, and of course when I say a failure pattern, what I mean is that the world has failed us in some way, not the theory, because our theory knows about failure patterns. We have a name for them and everything.

So the technical analysts, the chart men, pored over charts like this and they made their recommendations. They told their stories with their data. What killed this theory in the end was the rise of the efficient markets hypothesis in its various forms: the idea that in something like the stock market, asset prices reflect all available information, or at least all publicly available information about an asset. So in the short run, prices follow a random walk. That means you can't consistently beat the market, at least not like this. The technical analysis people were just reading patterns into noise. Sure, sometimes it worked, but in the long run, you wouldn't outperform an index fund, except by luck.

I wanted to write a tiny bit of code to generate a random walk of stock prices, and I was kind of hoping it would produce one of the standard patterns. I put in my seed for today's date, and sure enough, it did on the first try.
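For readers following along, here is a minimal sketch of that sort of simulation in R with ggplot2. It is not the code from the talk; the seed and volatility are arbitrary choices, but any seed will eventually hand you a "pattern" if you go looking for one.

```r
# A random walk masquerading as a price series. Not the talk's code;
# the seed and volatility here are arbitrary.
library(ggplot2)

set.seed(20250918)
n_days <- 250
# Geometric random walk: exponentiated cumulative sum of small shocks.
price <- 100 * exp(cumsum(rnorm(n_days, mean = 0, sd = 0.01)))

walk <- data.frame(day = seq_len(n_days), price = price)

ggplot(walk, aes(x = day, y = price)) +
  geom_line() +
  labs(x = "Trading day", y = "Price",
       title = "Pure noise, ready to be read as a pattern")
```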

What makes a visualization trustworthy?

The thing about the chart men is that there's nothing untrustworthy about their data. The stream of information they used came straight from Dow Jones, and as for the score, well, the thing about technical analysis is that it wasn't that technical. Really the only score or technique they used was putting the numbers into graphs, which was a painstaking business. You had to do it by hand, and there's nothing wrong with the graphs either, considered as graphs. Some of them are really good. What's wrong, of course, is the story, which is nonsense.

Now, we don't like to think of ourselves as the sort of people who'd be taken in by something like looking for head and shoulders patterns in price charts, and certainly we don't want to think of ourselves as the sort of people who are inflicting some version of this sort of thing on our audiences. For one thing, you might say, look, I'm not trying to predict the future. I just want to describe something that happened. I just want to explore my data. Or you might say, hey, I actually have a real theory, a proper theory that can explain the thing I'm showing you. I have real evidence.

But this tendency really is pervasive. Here's an example from a few years ago. Vox published a story discussing this startling rise in the availability of fats, especially vegetable fats, in people's diets. It was pretty widely circulated. That word availability should already alert you that there might be something going on, because the availability of something in your diet isn't the same as you actually consuming it. But beyond that, the fact that something like 20 pounds of this increase happened in about a year is odd. You can probably guess how this ends, but the thing is people really, and I mean really, wanted to tell a story about this trend.

Maybe something about long-term shifts in the American diet, the insidious rise of veganism, stress eating caused by the Y2K problem, the emergence of avocado toast as a central feature in millennials' lives. That was actually one of the theories. In fact, what happened was a change in reporting requirements. A good rule of thumb is that if you see a sudden shift or discontinuity in a time series like this, your first question should be, did something about the data generating process change? You should sort of tap the instruments. Or better, try finding someone who knows where the data came from. As Tom Smith of the National Opinion Research Center likes to say, if you want to measure change, you can't change the measure.

In this case, unlike with the chart men, there really is a big signal in the data. It's just not signaling what people imagined. But in both cases, what people want to do is eyeball a graph and spin you a yarn. I don't think graphs are uniquely to blame here. People like to tell stories, and they'll tell them with whatever is at hand. But I think it's fair to say that people giving talks like the one I'm giving right now can be tempted to make quite strong claims about visualization's unique capacity to provide insights into the structure of your data. Maybe it's just the idea that visual storytelling or a strong data narrative makes good graphics uniquely compelling in a way that conveys your insights to others. And implicitly or explicitly, there's also this idea that the best data visualizations are authoritative, that they can decisively settle a question or definitively make a point.

We kind of want these things to go together, right? We want to make something that fuses an insightful, compelling, and authoritative use of data into a single, immediately accessible image. Ideally something like this, for instance, the one John Snow made with the Broad Street water pump, pinpointing the existence of a cholera outbreak before going and removing the handle from the pump. The little squares are measures of deaths in the buildings on the streets.

There really is something to the idea that a good graph can do a fantastic job here. A compelling and authoritative insight is an epiphany, a moment of truth. And a great graph is a kind of portable epiphany. The history of data visualization, or maybe I should say the history of data visualization talks, gives many examples. These often come with little stories of their own, little tiny myths, like how a particular image kick-started the field of epidemiology, or explained why the Challenger space shuttle exploded, or summed up the reality of climate change for people. But I think we need to be careful.

Don't get me wrong, some of these graphs are really good: elegantly assembled, cleanly executed, really effective. I wish I had drawn them. But we know this can't be the whole story, as it were, because the other thing data visualization talks are full of is examples of really bad graphs. Like this one, for instance. If you had to bet your life on it, what would you say those values are?

To be fair, the 3D column chart is not the default way that Excel draws bar charts. You have to make a slight effort to find the 3D option. But believe me when I say, people are willing to put the work in. We love these terrible examples, and there are so many of them. Every time you think people have finally stopped using axonometric or oblique projections to show three or four numbers, a new one comes along. But examples like this do tend to undermine this noble story of visualization as a distinctively powerful source of insightful, compelling, and authoritative narratives. They make it clear that there's a lot more going on when we put together a visualization, a lot more ways for things to go sideways, or if not sideways, maybe sort of backwards and up about 120 degrees.

And it's not just a matter of badly chosen graph types either. We have undeniable evidence that this whole business of just looking at anything is totally broken. And I don't just mean your choice of graph. I mean it's a mess like all the way down.

Now, I acknowledge that navigating a slowly cooling universe filled with blobs of condensed matter in a hellscape of electromagnetic radiation while not being eaten is a really hard problem to solve, especially on the fly. But still, whoever designed the human visual system has a lot to answer for. Visual perception is weird. The human eye is not a light meter, for example. It runs on relative contrast, not absolute brightness. Those two squares are the same shade of gray. People don't believe you, but I'm just covering it up. And methods for visualization have to deal with these things, sometimes riding around them, sometimes leveraging them.
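The two-squares demonstration is easy to recreate. Here is a hedged sketch in R with ggplot2, not the exact figure from the talk: both inner squares are drawn in the identical gray, but the one on the dark background reads as lighter.

```r
# Simultaneous contrast: two identical gray squares (#808080) on
# different backgrounds. Cover the backgrounds and the match is obvious.
library(ggplot2)

ggplot() +
  annotate("rect", xmin = 0, xmax = 1, ymin = 0, ymax = 1, fill = "grey15") +
  annotate("rect", xmin = 1, xmax = 2, ymin = 0, ymax = 1, fill = "grey90") +
  annotate("rect", xmin = 0.35, xmax = 0.65, ymin = 0.35, ymax = 0.65,
           fill = "#808080") +
  annotate("rect", xmin = 1.35, xmax = 1.65, ymin = 0.35, ymax = 0.65,
           fill = "#808080") +
  coord_fixed() +
  theme_void()
```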

But as I say, we know all this. And that's why I'm not dwelling on a parade of horrible graph crimes. We have decades of work, from William Cleveland in the 1980s, on down to the present, investigating what does and doesn't work when it comes to visually representing data. Working out how good or bad people are at reconstructing relationships in the data based on graphs that we show them. Knowing how evolved features, capacities, and limits in the human visual system fare when put to work at looking at data, a task that they did not evolve to conduct. We really know a lot, and not just the relatively simple stuff either. But super tricky things, like color perception.

And not only that, we all benefit from the many years of hard work getting to put this into practice. The work to develop a flexible and powerful conceptual language for describing graphs, like Leland Wilkinson's grammar of graphics. The work done by people in this room to develop tools to implement those ideas in software, like ggplot. The work to integrate it with other freely available tools, developed in the same spirit of openness and cooperation, in a giant collaborative enterprise that lets me sit in my kitchen and pull down, process, and visualize, say, 40 years of global sea surface temperature data, where each day has about 4 million observations. As if that was a reasonable thing to expect to do in your kitchen. It's not even a very nice kitchen. It's not even remodeled.

So this is where we are now. Version 4 of ggplot has just come out, and congratulations to everybody involved in getting it out the door. As June pointed out in the announcement, ggplot is old enough to vote now. Unlike most 18-year-olds, it is mature, well-organized, and stable. And so are other visualization packages and frameworks and toolkits, in the R community and beyond it, in Python and JavaScript and many more besides. We really are in excellent shape. Our software is reliable in the sense that it does what we expect it to do. It maps quantities and features to points, lines, shapes, and colors in a perceptually accurate way.

The way we work is also increasingly reproducible. The tools that we have for scripting, testing, and rebuilding our work are better than ever. The people who make Quarto have been doing fantastic work to make it easier than ever to produce and publish reports, articles, and books, while still keeping things based on code that we can rewrite and rerun.

And there's that last part. We want our visualizations to be trustworthy in some general sense. Are they? Can they be? And what exactly is the role of software when it comes to making our work more trustworthy? After all, when someone tells you a story, when someone shows you a graph and gives you an interpretation of it, you want to know whether you can believe them, whether you can trust it.

Here's why I think this is worth emphasizing. I think that whatever it is that makes a data visualization trustworthy, it isn't really something you can include in the graph. If you look at stock examples of bad graphs, especially older ones, it's kind of comforting because most of them look really janky. And the same goes for contemporary examples of bad marketing materials or from media outlets trying to spin some ridiculous visual summary of polling data. They don't look trustworthy. It's possible to make janky graphs with ggplot, but you kind of have to work at it. By default, ggplot makes a lot of good choices for you, and this gives it what you might call a kind of aura of plausibility. Whatever data you're showing and whatever story you're telling benefits from a kind of rhetoric of credibility that's built into the software, more or less out of the box.

Trust, reliability, and commitments

So what does that mean to trust a visualization? As you might expect, there are large social scientific and philosophical literatures about trustworthiness. And like in many social scientific literatures, if you read them, you find that empirical studies confirm that trust has many dimensions. It's also really quite hard to measure, and it's very complicated. And like in many philosophical literatures, if you read them, it turns out that trust has many senses, and it's really quite hard to give a unified account of, and it's very complicated.

Ever since the work of Annette Baier, the philosophical literature has made this distinction between trust and what gets called mere reliability. Now, in ordinary speech, trust can mean reliability too, but the idea is that our everyday way of talking is fairly loose, and it runs together something that's worth making a distinction about. And the argument is roughly that there are a lot of things, and people, that we rely on without actually trusting them. In particular, if they failed, we wouldn't say that we expected an apology from them. We wouldn't feel betrayed. We wouldn't feel that we'd been lied to.

So for example, I might rely on you having a USB cable I can borrow in an emergency if I want to give a presentation, because I know you're the kind of person, I know you're out there, who has a bag full of cables from all kinds of things, and you've always had one before when I've asked to borrow one. But if I needed one and you happened not to have your bag with you one day, I couldn't reasonably say that you had betrayed my trust. Or I might rely on my alarm clock to wake me up, but if it failed one morning, I'd be making a kind of mistake if I thought that it owed me an apology.

Another philosopher, Katherine Hawley, argues that trusting someone to do something means, first of all, believing that they have a commitment to doing a thing, and then relying on them to meet that commitment. A trustworthy person keeps their commitments. And when you mistrust someone, it's because you think they have a commitment to something, but you doubt that they can be relied on to meet it. What's interesting about Hawley's view, amongst other things, is the negative implications of that. Because being trustworthy, for her, is mostly about making sure you don't end up getting stuck with commitments that you can't, won't, or don't fulfill. Being trustworthy doesn't mean taking on as many commitments as you possibly can. Being trustworthy doesn't mean that you have an obligation to be responsive to every single demand made of you. Being trustworthy doesn't mean that you're answerable to literally anyone.

Now, in a talk like this, I can't give these ideas the room they deserve, but they're useful when we think about what we're doing when we look at data and show visualizations. Because I find that when we're talking about how we should proceed or what we should do, even in quite technical discussions, the topic moves very quickly to issues of social relations, of people's roles, and the expectations associated with those roles. Not so much the technical issues, but the context in which they're embedded: what we can reasonably expect of others, and what the limits on those expectations are.

And this is a general feature of any kind of organized knowledge production. Data visualization is a good case because a graph is such a deceptively simple and portable object. When you make a graph and tell a story with it, all kinds of norms and expectations get compacted down into this single thing, into an image meant to authoritatively convey some structured bit of information to somebody. And when you tell a story with your data, you're implicitly warranting it. You're making some commitments. You're not offering an unconditional warranty. You're not beholden to everyone for all time. But questions of trust do arise.

So when you look at this modern classic of data visualization, for example, which I made, you might ask yourself: can I trust this? Does it look like the person who made this knows what they're doing? And if you've spent any time looking at graphs, then you know the answer is absolutely not, which is too bad. I mean, look at these beautiful drop shadows, that finely textured wood grain, the tasteful use of Papyrus, the king of off-brand herbal tea fonts. What could be more warmly authentic?

No. As a seasoned consumer of visualizations, you know that you're better off relying on something more like this, something with a higher data-to-ink ratio, as Edward Tufte would say. Although you might also think we could do a little better, maybe something like this. This is a Cleveland dot plot. Here, the x-axis doesn't go to zero, it just covers the range of the data. That's ggplot's default behavior, and it's the default behavior in most modern visualization toolkits.

By the way, here's some advice. Any graph with a categorical axis and a continuous axis is really a kind of table in disguise. So put your categories on the y-axis, that is, in the rows, and the continuous value on the x-axis. No more faffing about trying to rotate your text or your head. For many audiences, a graph like this is perfectly fine, but it's easy to think of cases or audiences where the reaction might be quite different.
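In ggplot that advice is a one-liner. A minimal sketch, with data invented purely for illustration:

```r
# Categories in the rows, the continuous value on the x-axis, sorted by
# value. The countries and values here are made up for illustration.
library(ggplot2)

df <- data.frame(
  country = c("Austria", "Belgium", "Denmark", "Finland", "Ireland"),
  value   = c(42, 35, 58, 47, 31)
)

ggplot(df, aes(x = value, y = reorder(country, value))) +
  geom_point(size = 2) +
  labs(x = "Value", y = NULL)
```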

Here's a graph comparing all-cause mortality in 2020 to the years 2015 to 2019 for the United States. If I share this on social media, I guarantee you that someone will yell at me that I'm being misleading because the y-axis doesn't go to zero. They will demand that it should be changed, or they will personally insult me. I guess I should say, and they will personally insult me. I can guarantee this because it's happened several times. The people who complain insist that it should look like this instead.

This is worse. It's not wrong, but it's worse. There's no reason, there's no real need for the y-axis to go to zero. For one thing, the relevant baseline comparison is already in the graph. It's the gray lines that show what's happening for the five years prior to COVID. And second, there are over 340 million people living in the United States, and nobody reasonably thinks that there are weeks when nobody dies.
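In code, the difference between the two versions is a single line. A sketch with invented numbers, since forcing the axis to include zero is an explicit choice in ggplot rather than the default:

```r
# The default axis covers the range of the data; the zero-baseline
# version the complainers demand takes one extra line. Invented data.
library(ggplot2)

set.seed(1)
deaths <- data.frame(
  week  = 1:52,
  count = 55000 + 5000 * sin(1:52 / 8) + rnorm(52, sd = 800)
)

p <- ggplot(deaths, aes(x = week, y = count)) + geom_line()

p                         # default: axis fits the data
p + expand_limits(y = 0)  # the "worse, but not wrong" version
```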

The thing is, though, you can't do much in the graph itself to make this point to someone who is looking at it. Maybe they know and accept the point, maybe they don't. If you have the time, I suppose you could explain why any kind of iron rule that says the y-axis should go to zero would be unreasonable. Now we can look at this. For instance, here's a graph of a long time series of mean daily global sea surface temperatures with the baseline set to zero degrees Kelvin. This was calculated from real data, by the way. Over 17 million spatially weighted observations.

But you can't explain this personally to everyone who sees your COVID graph, and it probably wouldn't help to put a note at the bottom either. I do have some sympathy for the people who get upset about baselines, because, for one thing, at least they're reading the axis labels, which happens less often than you'd think, and maybe people have tried to put one over on them in the past. But my point is just that there's only so much you can do with any particular image to establish your trustworthiness. The design of the graph can't do it for you by itself.

The real problem, as always, is people. But they're also the solution. It's like Homer and beer. The pictures we make are about something. They're made for someone to look at. So we'd like to know, how much does the person know about the data that we're drawing? How much do they know about the way that I'm drawing it? Sometimes the only person looking at what you make is you, but you should still ask that question, because people have a very strong tendency to go straight to the most complicated thing that they kind of know how to do, whether they're drawing a graph or fitting a statistical model.

But in any case, someone is looking at it. And so there's a relational aspect to any informative visualization. In terms of the commitments involved, it's a two-sided relation. People may make some demands of you, but if we're making a thing, we can also have some expectations. At the level of the graph itself, there's a set of conventions of representation that your audience needs to understand in order to comprehend the graph at all. And as I've said, a lot of the basics of this get handled to a very high standard by default in good software, but the people looking at it still have to understand it, and that can be more challenging than you imagine. So it depends on what you can expect from your audience.

And beyond that, there's whatever meaning the information you're providing carries for the person looking at it. Those expectations that they have, the setting in which they encounter it, the associations they have with it, usually you have very little control over any of that.

The social life of images

These days, in fact, one of the decisive features of the social life of images is the way that their context can collapse completely, very fast. Last year, I wasn't able to see the total eclipse, and the day after it happened, I went to NOAA, the National Oceanic and Atmospheric Administration, and downloaded a sequence of images from GOES-East, the geostationary satellite that sits over the Earth and is where much US weather imagery comes from. A new GOES-East image gets posted every 10 minutes, so you can stitch them together into a little animation. So there's the eclipse from space. It's pretty nice. I posted this on my website, and the next day it ended up on the front page of Hacker News. If you don't know what that is, well, I envy you in the purity of your innocence.
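Stitching frames like that into an animation takes only a few lines. A sketch using the magick package, with a placeholder directory name standing in for wherever the downloaded frames were saved:

```r
# Stitch a folder of downloaded satellite frames into an animated GIF.
# "goes_east_frames" is a placeholder path, not NOAA's actual feed.
library(magick)

frames <- list.files("goes_east_frames", pattern = "\\.png$", full.names = TRUE)
imgs   <- image_read(sort(frames))       # one multi-frame image stack
anim   <- image_animate(imgs, fps = 10)  # 10 frames per second
image_write(anim, "eclipse_from_space.gif")
```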

So I went and looked at the comments, which is a rookie mistake, and this was one that jumped out. Now, on the one hand, leading with "is that animation fake" is a little obnoxious. On the other hand, it's kind of a reasonable question to ask of a context-free image. I mean, this person is right. You can see the clouds on the dark side of the Earth in that image if you rerun it, and you can't really see the clouds at night from space like this. So what's happening? Well, the camera on GOES-East isn't a Kodak Instamatic from 1968 or something. It's an array of image sensors that look at different parts of the spectrum. Some of those are in the visible range, but other sensors look at cloud coverage via the infrared, and so you do some image processing and you get a composite image where you can see both the light and the dark parts at once.

What commitments have I made to internet randos on Hacker News? Well, to be honest, none. I made the image for my own benefit. I put it on my website and left it at that. But then the next day, when I woke up and found to my horror that it had gotten tens of thousands of views, I did add a note explaining the image a little, providing links to the GOES satellite and its image feed. So I suppose I did feel that I owed the audience in general, if not one particularly querulous commenter, a bit. But there are limits, sharp limits, I think, to how much more I owe them.

What's at stake here, I think, in general, is what you might call a social production of trust and credibility alongside the organizational and institutional production of data. Data visualizations get made and shared. The data comes from somewhere. The image is made by someone using some pipeline of software tools. What we want to know is whether that process is reliable and whether the people behind it are trustworthy. And even though you can partially, I contend, institutionalize and formalize many aspects of that process, at its core you can't fully automate it.

Now don't get me wrong, software can help a lot, especially with reliability. I said earlier that reliability and reproducibility were related. Resources like CRAN, and tools like pak and renv and targets, can do a tremendous amount to impose order on the unruly business of analysis and reporting and visualization pipelines. But even though these tools give us a lot, their creators don't present them as some sort of panacea. They can't, and they're not meant to, eliminate the need for, or automate the process of, discovery. They can't, and they're not meant to, eliminate the need for people who know what they're doing. For one thing, you need some expertise to use these things properly. And that's why I think of these issues less in terms of transparency as a sort of abstract standard, and more in terms of webs of relations and ongoing commitments between people who know what they're doing. Transparency is no good if you don't know what you're looking at.

A second problem is that once you start working backwards along the causal chain, the requirements for full reproducibility and full automation never end. As Carl Sagan remarked, to make an apple pie from scratch, first you must create the universe.

In practice, even when you have a fantastic, fully versioned, cryptographically signed, transparently organized build process that can bootstrap and validate an analysis all the way from the bare metal to figure one, those tools still ground themselves out socially in groups of real people who know what they're doing and who aren't cheating.

Three hard questions

Although, recently, we've been facing three hard questions, questions that I'll say right away I do not have easy answers to. The first is, what happens when people have a thought that goes, hey, what if I just made up my data? What if I just pretended to do the thing I said I did?

Our chart men might have been fooling their audiences with their stories of pullbacks and head and shoulders and resistance patterns and support levels, but at least they were fooling themselves as well. There wasn't anything fraudulent about their data either, or about what they did with it. They weren't frauds, they were just wrong. That's not the world we live in now. There are very strong temptations, or, as we now politely say, incentives, to do the wrong thing. Some of those incentives are just money, some of them involve the undeniable attraction of being up on stages like this. It's very nice.

But not only that, thanks to the very same tools we use to analyze and validate real data, it's now easier than ever to generate streams and scores that are plausible looking, but fraudulent. As I say, I do not have an easy answer to this, except to say that it's a real problem. Science runs on trust, and it is not set up to detect fraud. If you had to check fully the authenticity of literally every piece of data that came across your desk, every single piece of research, not just whether it was done correctly or in line with your standards, but whether the numbers were even real, then everything we do would grind to a halt immediately. Most people meet most of their commitments most of the time. They're professionals, they believe in what they're doing, and they do the right thing. But the flip side to having a system that necessarily runs on trust is that some people take advantage of that trust and exploit it. When you do find people who are cheating, you have to punish them.

The second question goes something like, hey, what if I could just replace all those messy and difficult people with a robot? Maybe a robot running on hardware at least as powerful as my home visualization workstation here. Specifically, what if I could get a large language model to read just a tremendous quantity of stuff which I got from somewhere, and then I could ask it questions and have it fluently tell me the answer? Or at least have it emit output that the biggest matrix you've ever seen in your life says is what an answer probably looks like. Hey, I mean, if not answer, why answer-shaped?

In the language that I've been using here, LLMs, I'd say, face challenges both to reliability and trustworthiness. On the reliability side, when you want to take advantage of the tremendous leverage they undeniably offer for certain tasks, you have to proceed with a fair amount of caution just because of how they work. Everything becomes a measurement problem, which we're pretty used to in the world of data analysis. The ways these things can fail, though, can be quite unexpected and alien, and we heard about some of those challenges yesterday and how to address them maybe.

Meanwhile, on the trustworthiness side, at least in their current consumer-facing form, you have other problems. You may recall an incident from a few months ago when a vibe-coding entrepreneur had a bad experience with an AI he was using. At one point, he was so annoyed to discover that it was faking unit tests that he made it write him an apology letter. And then a bit later, when it deleted his production database, the AI explained how it had violated everyone's trust and was like really totally super sorry. It sounded very sincere, too.

Now, I think that asking an AI to write you an apology letter is a little bit like me fining my alarm clock $20 when it doesn't wake me up. But you can see how people end up here. LLMs talk to you as if they were capable of making and abiding by commitments in the way a person is. They very strongly encourage the intentional stance that people are naturally disposed to take, that disposition to interact with anything that looks like it might be conscious as if it really was. What LLMs often find very hard to do is to say: no, that's not something I can reliably do, or, I don't know the answer, or, actually I should just leave this blank. That is, to refuse to take on commitments that they can't meet, the way a trustworthy person would refuse to take on commitments that they can't meet. Instead, if a task is nominally within its scope, an LLM will often keep insisting that it has things figured out this time.

Now, you might reply: eh, skill issue. Engineer your prompt so that it doesn't do that. Fair enough, and I'm not saying it can't happen. But then you are in a strange sort of world where you're trying to evoke quasi-trustworthy behavior by writing the robot a begging letter and then sort of rolling a d20 against your charisma, while still being personally responsible for everything that it does.

Right now, it does seem like this tendency is built into the current generation of LLMs. A recent paper, just this month, by a group of OpenAI researchers is consistent with a lot of prior research on this topic. The authors note that LLMs have a really hard time saying "I don't know," and they discuss how this has its origins in pre-training and why it persists in the face of attempts to eliminate it in post-training. One of several forces pushing things in that direction is the way that benchmarking consistently rewards providing an answer over saying "I don't know." Of course, this might improve or be solved in the future, and there are ways it can be more or less mitigated. But right now, it's kind of built into AIs that they tend to overcommit in this way, and that's what makes it hard to trust them. In a way, it's like the HAL we've ended up with is untrustworthy like the old HAL, but for totally different reasons, like for the opposite reason.

Half the struggle is inductively figuring out what commitments the robot can reliably keep.

The third question is about where data comes from. If we take the perspective of, let's say, the last two centuries, there's been an absolutely mind-boggling expansion in the capacity of organizations to collect, maintain, and analyze increasingly large quantities of data. That began with the expansion of states as compulsory organizations looking to collect information about their own populations and territories. It was enhanced by other formal organizations, like the modern corporation doing similar things on a smaller scale, and it was supercharged by a series of revolutions in information technology that we're still living through. If we take the perspective of, let's say, the last two fiscal quarters, then a lot of things people have taken for granted about the state's continued capacity for the collection and analysis of data are very much in question. That word mere in the phrase mere reliability, the one that the philosophers use all the time, suddenly appears a little complacent as questions of institutional reliability and trustworthiness are on everyone's mind.

All I want to emphasize here is the tremendous degree of reliance that we have on a data collection infrastructure that consists mostly of public institutions, or the products of work funded by public institutions and done by people trained at public institutions, and on tools that are produced and contributed to voluntarily, or engineered and maintained in the public domain.

Sea surface temperatures and the bucket correction

One last example of the difficulties involved with getting a stream of data, accurately producing a score, and telling a visual story in a trustworthy way. NOAA is one of several international organizations that has collected, managed, and distributed climate data over the years. Thanks to those efforts, we have a number of carefully maintained time series of sea surface temperature measurements, some going back well into the 19th century. Some years ago, there was quite a bit of debate in the scientific literature about how this particular time series didn't quite line up with other data that we had from other sources of measurement. In particular, the sea surface data seemed to indicate a kind of pause or plateau in warming in the post-war period, beginning around 1940, when other indicators did not.

Some very careful work by various scientists, not all of them working in the same group or anything, but a lot of independent work, established what was going on. It's one of the basic but repeatedly forgotten lessons of the sociology of science that measurements come from somewhere and are eventually turned into data. The sociologist Bruno Latour liked to think that there was a moment when they passed from being a thing in the world into being a record.

How do you measure the ocean surface temperature in the 1800s? If you're part of the Royal Navy and oceans are battlefields, you do it with a wooden bucket thrown over the side of your ship. Eventually, though, sailing ships get replaced by steam and then diesel engines, and the method of getting the temperature changes. On newer ships, you measure the seawater that's pumped into the engine room to cool the engines, before it's used for cooling, and water measured that way tends to be warmer than water hoisted up in a wooden bucket, because a wooden bucket is not a good insulator. Then later, someone digitizes a lot of U.S. Navy logbooks and adds them to the time series. That makes for more warm-biased measurements, especially for years when there are suddenly a lot more American Navy ships floating around in the North Atlantic, like starting around 1940, for some reason. And so you get a spike in apparent ocean temperatures, which makes subsequent years look like they plateau for a while, and thus the so-called bucket correction is born. Our score needs to be adjusted.

Later, temperature sensing gets done in other ways that are more comprehensive, more standardized, more automated. The end result, or one end result, is me taking advantage of all this careful work, instead of remodeling my kitchen, to draw a picture on my laptop.

Actually, more than a picture, we can make an animation with this. This is an animated recreation of a graph you might have seen circulating in a few places. I'll set it running. We're just tracing changes in daily global averages, year after year. The graph employs the standard virtues of ggplot: a layered structure where we try to highlight the elements we're interested in, and where we repeat design elements in a way that lets the viewer follow the structure of the data and the graph more easily. Conceptually, it's just the same as the COVID figures I showed earlier. It's just animated, thanks to Thomas Lin Pedersen's gganimate package.
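A minimal sketch of that layered, animated structure with gganimate. This is not the code behind the talk's figure; the data frame sst, with one row per day and columns year, yday, and temp, is an assumption standing in for the processed NOAA series:

```r
# Trace each year's daily global mean SST, keeping earlier years behind
# as gray context. `sst` (columns: year, yday, temp) is assumed data.
library(ggplot2)
library(gganimate)

p <- ggplot(sst, aes(x = yday, y = temp, group = year)) +
  geom_line(color = "firebrick") +
  labs(x = "Day of year", y = "Mean global SST (°C)",
       title = "Year: {frame_time}") +
  transition_time(year) +          # one frame per year
  shadow_mark(color = "grey80")    # past years remain as context

animate(p, fps = 10)
```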

The thing is, the trustworthiness of this visualization has very little to do with the graph directly. The trustworthiness comes from a web of actors that includes government-sponsored data collection, private sector organizations, universities, scientific institutes, individual technicians, researchers, scientists, engineers, staffers, midshipmen, all making and meeting their commitments to one another. If you think all of that is fake, or a put-up job, there isn't anything I can draw, there's no mark I can put on a page or a screen that can fix that.

Closing thoughts

Data visualization is a terrific tool for learning about streams of data and making sense of the scores that you calculate from them. It's one of many tools that we have in data science. It has some special properties because of its ability to condense those streams and scores into images that you can tell a story with. But that's all the more reason for us to be wary of finding ourselves using it to spin yarns. There are a lot of ways for a story to go wrong. Our tools won't and can't save us by themselves. The important thing is not to lose sight of the collective, cooperative character of the whole enterprise, the part that Vicki Boykis calls normcore tech. That's the thing made out of all those people who just know a bunch of really specific stuff, much of it particular or local knowledge about how something works, or a specific data set, or what something is made of, or why you just broke that by mistake, and how to fix it.

They know a lot about this specific data set, or that particular method, or this weird build system. They're there in your division of labor, just knowing things. Many of you probably are that person for some other people, and we all have someone else that we rely on to be that person for us.

Data is always a bit of a mess. Our methods could always be improved. Our software could always be better. Sorry, Hadley. New technologies can be extremely weird. This collection of organized systems and loose networks, of written procedures and local knowledge, of formal organizations and social ties that sustains what we do, it isn't perfect, because of course it can't be. It's made of people. But it's where trustworthy data visualization, where trust in everyone's work, is produced and sustained. It exists not as some frozen structure, or an abstract set of standards, but as a living mesh of ongoing exchanges and commitments made and met. The challenge, I think, is keeping that trust alive in the world where we all now live and work. Thank you very much for your time.

Q&A

Thanks, Kieran. So we have time for a few questions. Remember, if you want to ask Kieran a question, you can find the Slido link in the app. I thought I would start with a personal question. I noticed you shared a number of ggplots, and none of them used the default theme. Why do you hate me?

When someone sets such a high standard, you don't want to confront it head-on. You've got to kind of ride around it.

Okay, real question. Data collection and presentation has been highly politicized and that will likely continue in the near future. How can we as data consumers survive in this environment?

You have to know where your data comes from, and who's making the visualizations, and why they're making them. I mean, I think the whole thrust of the talk is that both as data consumers and as data producers, the graph can't sustain the weight of managing the politics of truth in a polarized society. It just can't, and so we need to think about how everything else is organized too, the layers underneath. The visualization is like the tip of the spear. It's the thing that is easiest to circulate, for context to collapse around, and to become the focus of conflict. But really what we want to look at is the stuff underneath that generates the trust, or the lack thereof.

How can visualization help when segments of the population have become actively hostile to anything fact-based?

It can help in the same way that all the other tools we have can help: original data collection, careful validation, tables, models. But all of those things depend on an ongoing network of commitments that people are making to one another, and the result is that the formal products, all of those formal outputs, are not sufficient by themselves. There's no magical force that they have. It doesn't matter whether you use the default theme in ggplot or some super fancy one. That's not going to be enough to change somebody's mind. It's an important part, though. Now, it's not that it's not true: at the margin, you can change people's minds with data. But I think that in order to get to that point, you still have to be in front of people who are at least in principle receptive to the idea that their mind might be changed, or who are open to the idea of new evidence. If that is, by hypothesis in the question, fully closed off, then almost by definition there's nothing you can do. But I refuse to believe that we're quite that badly off just yet.

With AI now able to create images on its own, what does that mean? Do you think that makes reproducibility and open source even more important?

It makes it more important, and it makes it sort of weird and difficult, because it's clear that what's happening, I mean, a company like Posit, and you're representative of everybody is facing this challenge, but you think of organizations like this as really having built their reputation over the last decade on reproducibility and code. That's why you're supposed to switch to RStudio and move away from Excel and things like that. It's because you can write code that