Resources

Claus Wilke | Visualizing uncertainty with hypothetical outcomes plots | RStudio (2019)

Uncertainty is a key component of statistical inference. However, uncertainty is not easy to convey effectively in data visualizations. For example, viewers have a tendency to interpret visualizations of the most likely outcome as the only possible one. Viewers may also misjudge the likelihood of different possible outcomes or the extent to which moderately rare outcomes may deviate from the expectation. One way in which we can help the viewer grasp the amount of uncertainty present in a dataset is by showing a variety of different possible modeling outcomes at once. For example, in a linear regression, we could plot a number of different regression lines with slopes and intercepts drawn from the range of likely values, as determined by the variation in the data. Such visualizations are called Hypothetical Outcomes Plots (HOPs). HOPs can be made in static form, showing the various hypothetical outcomes all at once, or preferably in an animated form, where the display cycles between the different hypothetical outcomes. With recent progress in ggplot2-based animation, via gganimate, as well as packages such as tidybayes that make it easy to generate hypothetical outcomes, we can easily produce animated HOPs in a few lines of R code. This presentation will cover the key concepts, packages, and techniques to generate such visualizations. VIEW MATERIALS: https://docs.google.com/presentation/d/1zMuBSADaxdFnosOPWJNA10DaxGEheW6gDxqEPYAuado/edit?usp=sharing

Transcript

This transcript was generated automatically and may contain errors.

I'm going to talk about visualizing uncertainty with hypothetical outcome plots. I just tweeted my slides, so you can find me on Twitter at ClausWilke, very easy to remember, just my first name and my last name. And I'm going to talk about what is kind of an experimental package, which you can find here at this location, and again, if you find the slides on Twitter, that will be the easiest way to find all of that.

Okay, so to motivate that, let's look at this extremely original plot. It's mtcars, you may have heard of it. We did not coordinate this. So it's fuel efficiency, miles per gallon in this case, plotted versus the displacement of the engine, and you see it goes down, and I fitted a nonlinear model here, and I'm sure you have seen plots like that. So you have this line that was fitted, and then there's this gray band, and that tells us something about the uncertainty of that fit, right? Every time we model something there's uncertainty, and we are used to displaying it like this, but what does that actually mean, right? The truth is most people don't know, like I don't really know, nobody knows, but it has to be there, because if it's not there you get complaints, right? What's your uncertainty? Oh, there's a band, okay, now I know.

Okay, what that really means is that this line is just one of multiple ways that we could have fitted it, and so there's really an ensemble of different possible lines. We could plot fitted draws, and that's not bad. The fundamental problem, though, is if we have the fitted draws, okay, there's the line that goes here, and then there's a little gap here, so does that mean the line would never go here, right? Can the line ever go here? So you don't know, right? And the problem fundamentally with uncertainty is that you don't know, but when you see a plot it looks certain, right? It's really difficult to plot uncertainty because the plot is kind of certain, but it's not certain.

And so there was this idea that was popularized very recently, the hypothetical outcome plot. It was this paper in PLOS ONE, 2015, by Jessica Hullman, and the idea is to actually animate, to cycle through different potential fits, and that to some extent reduces this problem. Because it's not static, it's more clear that these are alternatives, and you don't really exactly know what it would be, but it's something like that, right? So it's a much more intuitive way of thinking or of experiencing the uncertainty.

Okay, so last August Hadley tweeted, "Hypothetical outcome plots are a great way of communicating uncertainty to non-experts," and then he put forth this challenge: "Someone should make an rstats package to make these easier to create. Would be easy on top of gganimate." Famous three words. Okay, so I was thinking, I've done a lot of work with ggplot2 lately, and I really like gganimate, and I thought, well, maybe I can do that, let's see. And so what I'm talking about today is really my discovery of what are the useful things that maybe a package could contribute in this world.

Okay, so there's really three questions that Hadley's challenge poses. So the first one is, how do we generate the outcomes? There are many different ways that they could be generated, so it's not trivial. Once we have them, how do we get them into ggplot2? I'm just assuming we do this with ggplot2 and gganimate, right? Though of course you could use some other platform also if you wanted to. And the last one is, is there anything to be done? And you'll see in a second why that is actually a meaningful question.

So is there anything to be done? Immediately after Hadley tweeted this, there was the following response: "You can make them easily with tidybayes and gganimate. The HOP examples in this talk were done that way. I'm adding examples to the tidybayes vignettes when gganimate hits CRAN." Okay, done. Problem solved. Well, okay, so this goes to how do we generate outcomes, right? So we can do Bayesian MCMC sampling. That's what tidybayes does, and that's great, but not everybody is a Bayesian, right? Or maybe in some models it just takes too long to do MCMC sampling, or whatever the reason is, you may want to do other things. So maybe you want to bootstrap, right? Resample the input data. Or maybe you want to fit a regular regression model and then just sample from the normal approximation to the uncertainty distributions that you get, right?

Bootstrapping with ggplot2

So we could argue that this is done with tidybayes. It's very natural in the Bayesian framework to get these fitted draws that you then just somehow get into gganimate, but the other two still need some work. Okay, so let's look into bootstrapping. So actually Alex Hayes's talk at the beginning of the session, if you were here, was the perfect introduction to my talk, because he explained why I was unhappy with bootstrapping. What he did to bootstrap was terribly complicated, right? It was a lot of work, which is fine if you have a sophisticated bootstrapping setup, but I don't want to do that. I just want to do a few bootstraps to plot, right?

Okay, so what I did is I bugged some people, and the person I mostly ended up bugging is Davis Vaughan, who sits right here. He's running this, and I said we need an easier way of doing this, and so that's coded now. It's not yet really released, but it's going to be in rsample eventually, and it works as follows. So if we take, say, the iris data set, we group it and then we bootstrapify. It's the magic word. It does the following. If you know the iris data set, this is the iris data set, nothing has happened, except now under groups it says species, because I've grouped by species, and it says bootstrap 15, because it's three species and we've bootstrapped each five times. So it's these virtual bootstraps that we've generated that we can use in a dplyr sequence.

So now we summarize, we calculate the mean sepal length for example, and now in the output we get five numbers out here, because we have five bootstraps, and we have the mean lengths. So that's about the level that I feel comfortable with. That's easy enough. I can live with that.
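The group, bootstrap, summarize sequence described above can be sketched in a few lines. This is a minimal, language-neutral illustration in Python using only the standard library, not the R code shown on the slides; the function name `grouped_bootstrap_means`, the column names, and the toy values are all assumptions made for the example.

```python
import random
from collections import defaultdict
from statistics import mean


def grouped_bootstrap_means(rows, group_key, value_key, times, seed=None):
    """Return one record per (group, bootstrap draw) holding the resampled mean."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    results = []
    for group, values in groups.items():
        for draw in range(1, times + 1):
            # Resample with replacement, keeping the original group size.
            resample = rng.choices(values, k=len(values))
            results.append({"group": group, "draw": draw, "mean": mean(resample)})
    return results


# Made-up measurements for three groups (hypothetical, not the iris data).
rows = [
    {"species": "a", "length": 5.1}, {"species": "a", "length": 4.9},
    {"species": "a", "length": 5.4}, {"species": "b", "length": 6.4},
    {"species": "b", "length": 5.9}, {"species": "b", "length": 6.1},
    {"species": "c", "length": 7.1}, {"species": "c", "length": 6.3},
    {"species": "c", "length": 6.8},
]

summary = grouped_bootstrap_means(rows, "species", "length", times=5, seed=1)
for record in summary:
    print(record)
```

With three groups and five draws each, the summary has fifteen rows, which mirrors the "bootstrap 15" count mentioned in the talk.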

So now we can think about how do we get this into ggplot2. So let's say this is our static plot without any bootstrap, and so now we have to bootstrap. I make a bootstrapped data table, I bootstrap 20 times here, and this gives me the name of the key column, which I group by here, and so I get bootstrapped plots. Okay, that's fine. It's still not easy enough, and so this is where I thought a little harder, and I think we can do it even easier. The problem here, which I fundamentally dislike, is that I'm piping the data into the plot, but I'm sticking other data into the plot again, right? And so if I want to do anything here in the pipe before I put it into the plot, then I also have to change this, and it's just too complicated, right? I want to bootstrap in the plot.

Now I could start writing all sorts of special layers, like geom bootstrap smooth or whatever, and I'd end up writing a new layer for each type of geom or stat you have, and that's also too complicated. So I came up with the following idea, which uses a very little-known feature of ggplot2, and so this is this library ungeviz that I wrote. The German word ungewiss means uncertain, so it's like a play on words; if you're German you understand it. That's the website. I'm German, as you probably guessed by now.

Okay, so this is what it does, right? The key line is this line. So you can actually, in ggplot2, give it a function as data that then does something to the data, and so bootstrapper creates this function, and there the 20 is the argument for how many times I want to bootstrap. And so these six lines of code generate this, right? And so now you see I feed in the data once. I plot points, so the points are just the original data set, right? And in the geom_smooth I bootstrap, and that's how I get these bootstrap lines. And then of course I can connect that with gganimate. So if you don't know gganimate, it's this really nice way of animating plots, and this is the only additional thing that I need to do. I say I want to transition between draws, and then these two numbers are just how long do I take to transition, I transition instantaneously, and how long do I stay in the state, and the one is then meaningless, it's just that they all stay the same amount of time. And I get this nice animated plot.
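The idea of handing a layer a function instead of a data table can be illustrated with a small function factory. The sketch below is plain Python and only mimics the pattern; `make_bootstrapper` and the `draw` key are hypothetical names chosen for this example and are not the package's actual interface.

```python
import random


def make_bootstrapper(times, seed=None, key="draw"):
    """Build a function that maps a data set to `times` stacked bootstrap copies."""
    def apply(rows):
        # A fresh generator per call, so a fixed seed reproduces the same draws.
        rng = random.Random(seed)
        stacked = []
        for draw in range(1, times + 1):
            for row in rng.choices(rows, k=len(rows)):
                # Tag every resampled row with the draw it belongs to.
                stacked.append({**row, key: draw})
        return stacked
    return apply


data = [{"x": i, "y": 2 * i} for i in range(6)]
bs = make_bootstrapper(times=20, seed=42)

# A plotting layer would call bs(data) itself; here we call it by hand twice.
layer_a = bs(data)
layer_b = bs(data)
print(len(layer_a), layer_a == layer_b)
```

Grouping the stacked rows by the `draw` key then gives one fitted line per bootstrap, and an animation can step through the same key one draw at a time.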

Sampling example: chocolate bar ratings

Okay, let's do another example, just to show you that we can do different things; we don't have to do just smooth regression lines. So chocolate bar ratings, this is the data set. Each dot is one chocolate bar rating, and the orange dots are the mean, just to show you the average. So here we have the USA, which is kind of similar to the others, and then Canada on average has a higher rating than the U.S., right? But there's a lot of U.S. ratings and not that many Canadian ratings. So we might want to know, does a randomly chosen Canadian chocolate bar taste better than a U.S. one? So that's a common language effect size: we just take one U.S. chocolate bar and one Canadian chocolate bar, and we say which one is better. So the blue lines that are jumping there, those are the two bars, right? You go in the store, you buy two bars, and then you ask which of the two is better. And as you can see, it's sort of even. The exact percentage is actually 53 percent of the time Canada is better, and 47 percent of the time the U.S. is better or it's a draw with Canada, and you kind of get that, right? You see from the jumping dots that it could really come out either way.

Okay, so how do we do that? So now we don't have to bootstrap, right? We now have to sample, and so Davis Vaughan also wrote code for sampling, and I piggybacked on that and wrote a feature to add that to ggplot2. So we start with our data set here, the cocoa data set. We only take Canada and the U.S., so this is an example of how we may want to process the data set before we feed it into ggplot2, right? Rating and location are what we want to plot. First we plot a cloud of all the ratings, right? That's just a geom_point. And then I invented this geom, which I think is needed and nobody had done it. I think if somebody has, tell me and I'll delete it from my package. I call this a vertical point line. It's like a point, but it's a line, right?

And so this is now the key thing here, the data. Now instead of bootstrapper I use sampler. I sample 25 times, but I actually have to group, because I don't want to sample from the entire data set, right? I want to sample individually from Canada and the U.S. What I haven't specified is my sample size. By default it's one, but I could sample different sizes, I could sample 10 or whatever, right? The group is really important. If you don't specify the group, you might end up with two U.S. chocolate bars and no Canadian one, and that wouldn't work. Okay, and then in the end, oh yeah, this is just a little trick you need to know. If you don't do this, then the lines tween, so they move smoothly from one to the other, which looks nice, but for perception it's actually not that great of a choice, so it's better to turn that off. And then again we animate, and that's the result, right?
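The "buy one bar from each country and compare" experiment can be written down directly. The following Python sketch uses made-up ratings, not the cocoa data from the talk, and the helper names `exact_win_share` and `sample_pairs` are assumptions for illustration. It computes the exact share of pairs in which the first group wins and draws grouped samples of size one, as described above.

```python
import random
from itertools import product


def exact_win_share(a, b):
    """Share of all (a, b) pairs in which the value from a is strictly larger."""
    pairs = list(product(a, b))
    return sum(x > y for x, y in pairs) / len(pairs)


def sample_pairs(a, b, times, seed=None):
    """Draw one value from each group, `times` times (grouped sampling, size 1)."""
    rng = random.Random(seed)
    return [(rng.choice(a), rng.choice(b)) for _ in range(times)]


# Hypothetical ratings: few values in one group, more in the other.
canada = [3.0, 3.5, 3.75, 4.0]
usa = [2.5, 3.0, 3.25, 3.5, 3.75, 2.75]

share = exact_win_share(canada, usa)
draws = sample_pairs(canada, usa, times=25, seed=7)
wins = sum(c > u for c, u in draws)
print(f"exact share: {share:.3f}, wins in {len(draws)} sampled pairs: {wins}")
```

Sampling within each group separately is what guarantees every draw contains exactly one value per country; sampling from the pooled data could return two values from the larger group.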

Okay, so I think that's a pretty efficient way to either bootstrap or sample. It works with any geom, any stat, right? All you need is these two functions, bootstrapper and sampler. And the nice thing about the bootstrapper is, if you assign it to a variable, then you can reuse it multiple times in multiple layers, and they'll all bootstrap exactly the same way. If you look in the documentation you can see how that works. So you could draw lines and points or whatever on top of each other, and they'll all be bootstrapped exactly the same.

Normal approximation and animation styles

Okay, but there's still this other thing, right? We can also generate outcomes by normal approximation to the sampling distribution, and I did some work on that also, and so I'll introduce that by comparing it to the bootstrapping. So we'll go back to the mtcars example; that was the example where we bootstrapped.

And so now we have to write special stats or geoms, so I wrote this stat_smooth_draws. So instead of geom_smooth I'm now using stat_smooth_draws. That's essentially a copy of stat_smooth, so it draws a smooth line, but it does it multiple times; it samples from the posterior distribution. Now I don't have to feed in the data, but I have to tell it how many draws I want, right? So this is times equals 10, and I again have to set the group, but now, because the draw column is generated by the stat itself, I have to wrap the draw column in stat(). So here I write it this way because the draw column is really part of the data set that is fed in. It's kind of done by this function, but to ggplot2 it looks like it's the data that comes in, whereas this is something that happens in the statistical transformation. And then these are the two plots, and they look kind of similar. If you do it enough, you can see that the bootstraps tend to be more grouped; you get sets of bootstraps that kind of look the same. In this particular plot it's not that obvious, but conceptually it makes sense, because with the bootstrap you add or remove entire points, right? Either the point is there or it's not there. With the fitted draws you smoothly sample the entire space of possibilities.
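To make the normal approximation concrete, here is a minimal sketch that swaps in a straight-line least-squares fit for the smooth fit used in the talk. It is plain Python with made-up data, and `fit_line` and `draw_lines` are hypothetical names. The fit is written in centered form, y = center + slope * (x - x_bar), because in that form the two estimates are uncorrelated, so each can be drawn from its own normal distribution.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares line in centered form, with standard errors of both estimates."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    rss = sum((y - y_bar - slope * (x - x_bar)) ** 2 for x, y in zip(xs, ys))
    s2 = rss / (n - 2)  # residual variance estimate
    return {
        "x_bar": x_bar,
        "y_bar": y_bar,
        "slope": slope,
        "se_center": (s2 / n) ** 0.5,
        "se_slope": (s2 / sxx) ** 0.5,
    }


def draw_lines(fit, times, seed=None):
    """Sample `times` plausible lines from the normal approximation to the fit."""
    rng = random.Random(seed)
    lines = []
    for draw in range(1, times + 1):
        center = rng.gauss(fit["y_bar"], fit["se_center"])
        slope = rng.gauss(fit["slope"], fit["se_slope"])
        # Convert back from centered form to an ordinary intercept.
        intercept = center - slope * fit["x_bar"]
        lines.append({"draw": draw, "intercept": intercept, "slope": slope})
    return lines


# Made-up data with a downward trend (hypothetical values).
xs = [1, 2, 3, 4, 5, 6]
ys = [9.8, 8.1, 6.2, 3.9, 2.3, 0.1]

fit = fit_line(xs, ys)
lines = draw_lines(fit, times=10, seed=3)
for line in lines:
    print(line)
```

Unlike the bootstrap, no data point is ever dropped here: every draw perturbs the fitted parameters continuously, which matches the remark above that fitted draws sample the space of possibilities smoothly.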

And then this actually also works with gganimate so gganimate understands that it can transition on columns that have been generated by the stat and so then you get this.

Okay, so just very quickly, because the people that invented HOPs care a lot about this: there's different ways that you could animate this, right? I've been using hard transitions. You could do hard transitions but on a background of all the outcomes; I don't know if it's sort of visible, right? Okay, you could fade outcomes. If you look, you can sort of see it, they fade out and the new one comes in. Or you can tween between outcomes. This is actually the default that gganimate will do if you don't tell it otherwise. It looks very fancy, but the perception researchers argue that this doesn't really work as well for perception. So in psychology experiments, people have a harder time judging the uncertainty from these kinds of graphs than from these kinds of graphs, so it seems that the general consensus is that this or this is the best way to do it.

And that's really it. My take-home message is: make HOPs, it's easy. There's the website. But since we have a little bit of time left, there's the bonus. So this is just to show off, right? So there's a bootstrapping example where you actually see for the individual dots how often they've been resampled, right? So for example, if you look here, now it's zero, now it's one, now it's zero, now it's three, one, right? And you see also, when this one gets zero, the line ends, and now it's there again, right? And then when this turns to zero, the line now starts here, right? So this really visualizes how the bootstrap operates. And that's the code that does it, and so that's all you need for that, right? This is the trick here: I'm assigning this bootstrapper to an object, and then I'm using it here to draw points, here to draw text, and here to draw the bootstrap lines, and so they all see exactly the same bootstrap, right? And so that makes it easy to make really quite fancy visualizations, hopefully easily. That's all I have, thank you.
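The per-point counts shown in the bonus example are simply how many times each original row was picked in a given bootstrap draw. A minimal Python sketch follows; the function name `bootstrap_counts` is an assumption and the code is not taken from the slides.

```python
import random
from collections import Counter


def bootstrap_counts(n_rows, times, seed=None):
    """For each draw, count how often each original row index was resampled."""
    rng = random.Random(seed)
    table = []
    for _ in range(times):
        picked = Counter(rng.choices(range(n_rows), k=n_rows))
        # A count of zero means the row is absent from this bootstrap sample.
        table.append([picked.get(i, 0) for i in range(n_rows)])
    return table


counts = bootstrap_counts(n_rows=8, times=5, seed=11)
for draw, row_counts in enumerate(counts, start=1):
    print(draw, row_counts)
```

Every row of counts sums to the number of data points, and fixing the seed plays the role of reusing one bootstrapper object: the points, the labels, and the fitted line would all be built from the same resample.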

Q&A

Thanks, Claus, that was really cool. Any questions? We definitely have some time.

This is really cool. Can you hear me? This is okay? Yeah. I'm curious a little bit about how much we know about how much better these work, and about how well people are interpreting uncertainty out of these sorts of things.

So yeah, there's two research papers as far as I'm aware, so it's a really active area of research. In the two papers that have been published, in direct comparisons where people had to judge how uncertain something is from a HOP versus some other visualization of uncertainty, generally people get a better sense of the uncertainty from the HOP, and people are better at actually judging accurately what the chance, for example, is. I mean, you have to judge, are you going to catch the train or not, like what's the chance that you're going to catch the train if you get a coffee now versus not, and people do better with these HOP-type visualizations of uncertainty than with just error bars or something like that.

Can you have confidence bands with animations, the standard error around the line? So for instance, in the top right graph, where you had all the possible lines, can you also add the confidence bands to that? Yeah, I mean, it's just a plot, right? So you just layer them on top of each other. With the confidence band there's a little subtle problem, in the sense that if you use geom_smooth to draw the confidence band, it uses slightly different math from my stat_smooth_draws, so you have to do a little more work, but it's like three additional lines, and it's shown in the vignettes that I wrote for the package. I mean, yeah, you just layer plot layers on top of each other and you show whatever you want to show.

Thanks, really cool talk. Just wondering, has anybody asked you about maybe incorporating p-values into this bootstrapping visualization, and if somebody were to ask you about it, how would you respond?

So in my day job I'm a scientist, I'm a biologist. I've seen lots of figures in my life. I've never seen a p-value visualization that I thought was useful or credible, so I really don't know. I mean, that doesn't mean it's not possible, it's just I've never seen it. I don't know how to visualize p-values in a way that is useful.