
Conformal Inference with Tidymodels - posit::conf(2023)
Presented by Max Kuhn

Conformal inference theory enables any model to produce probabilistic predictions, such as prediction intervals. We'll demonstrate how these analytical methods can be used with tidymodels. Simulations will show that the results have good coverage (i.e., a 90% interval should include the real point 90% of the time).

Presented at posit::conf(2023), September 19-20, 2023. Learn more at posit.co/conference.

Talk Track: Tidy up your models. Session Code: TALK-1085
Transcript
This transcript was generated automatically and may contain errors.
So, I'm here to talk about conformal inference and how to do that in tidymodels. Conformal inference, you put those two words together and if you had asked me maybe a year ago what they mean, I'd essentially be like, I don't know. So you know, it's kind of an oddly named technique, so in hindsight maybe a better title for this presentation would have been: how you can make prediction intervals for any type of model without making many statistical assumptions about your data or your model.
Just to remind you, if you want to put in some questions, here's the link for it. And also in this wee tiny little font down here, you can see the link to the slides if you want them.
What is a prediction interval?
All right, so maybe I should start first by saying, well, what's a prediction interval if you've never heard of that? A prediction interval is like a confidence interval, but it's an interval on a different type of quantity. If you had a 95% prediction interval, that means you have bounds where 95% of the time a new observation that you acquire later will fall into that interval. So whereas confidence intervals are on something like the mean prediction, this is about new observations.
And the little diagram here shows you both confidence interval, which is kind of very narrow for this data set, and a wider prediction interval. And you can see there's a couple of data points on the bottom and a few on the top that don't fall in. So I think that's a 95% interval right there.
People use these when they can get them, which is not that often for models. It gives you a sense of uncertainty about your prediction. So it gives you a sense of how much you should or shouldn't trust the prediction, and you can calibrate your expectations as to how good the prediction is.
The intuition behind conformal inference
All right, so let's start with just some data, right? So this is just like a histogram on its side. It's 500 data points. It's centered around zero, maybe like plus or minus like, I don't know, 0.15. And so you spend some time collecting this data, and then you get the 501st data point, and let's say it falls down here. And it's like, well, it's not outside the range per se, but is that like a new data point from the same distribution, or has something changed, like the model drift question and things like that?
And so what we can do, from a statistical standpoint, if we want to make some sort of judgment about whether this new data point is from our original distribution, one thing we could do is use good old-fashioned quantiles. So let's say you wanted a 90% interval: what you could do is get the 0.05 quantile on the lower end and the 0.95 quantile on the upper end. And if you were to make a completely distribution-free probabilistic statement about that, you could say that 90% of the time, when I get data from this distribution, they're going to fall inside that interval.
And so this is where the conformal part comes in. You would say that that new data point, if it fell within that interval, conforms to this original reference distribution. So if you were to take this data set and compute those particular quantiles, the lower 0.05 and the upper 0.95, you would think that any new data point that falls in between would be consistent with the original data. And you can see, of course, there are false positives, 5% of the time down here and 5% of the time up here. This particular data point you would not really consider to be consistent with the original data.
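The conformity check described above can be sketched in a few lines. This is just the quantile logic from the slide, written in Python with made-up numbers standing in for the histogram:

```python
import numpy as np

rng = np.random.default_rng(1)
# 500 reference values centered around zero (a stand-in for the histogram)
reference = rng.normal(0.0, 0.075, size=500)

# a 90% interval: the 0.05 and 0.95 quantiles of the reference distribution
lower, upper = np.quantile(reference, [0.05, 0.95])

def conforms(x):
    """A new point 'conforms' if it falls inside the reference quantiles."""
    return lower <= x <= upper
```

With this setup, a central value like 0.0 conforms, while a far-out value like 0.5 is flagged as inconsistent with the reference distribution.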
So why am I telling you this? Suppose instead of me just saying we have some data, let's say these were out-of-sample residuals. So let's say you fit some sort of regression model, you had an extra data set just laying around like we do all the time, and you took that model and predicted on this different data set, so you can compute those residuals. And that gives you a sense, for the data that was collected, specifically what we call the calibration data set, of what, on average, you'd expect the noise around your predicted values to be, based on those residuals.
So if this is a training set, and this is just a nonlinear function I fit to it: we build the model on this data set, we take that same model, apply it to the calibration data set, the 500 points I just showed you, and calculate the residuals. And that generates the histogram I just showed you. And then what we can do is take this zero-centered histogram and basically center it around the predicted values of our model. So this is like a test set: we get new data points and apply the same model fit here.
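Sketched in code (Python, with a toy nonlinear function and simulated calibration data standing in for the fitted model), the band is just the residual quantiles re-centered on each prediction:

```python
import numpy as np

rng = np.random.default_rng(2)

def model(x):
    # stand-in for the fitted nonlinear model
    return np.sin(2 * x)

# calibration set: compute out-of-sample residuals
x_cal = rng.uniform(-2, 2, size=500)
y_cal = model(x_cal) + rng.normal(0, 0.1, size=500)
resid = y_cal - model(x_cal)

# the 0.05/0.95 quantiles of the residuals define the band...
lo, hi = np.quantile(resid, [0.05, 0.95])

# ...which gets centered on the predictions for new (test) points
x_new = np.array([-1.0, 0.0, 1.0])
pred = model(x_new)
lower, upper = pred + lo, pred + hi
```

Note that `hi - lo` is a single number, which is why this band has the same width everywhere along the curve.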
Oh, and by the way, I swear to God, it's completely accidental that these colors match the ones of our T-shirts. I didn't realize until like an hour ago. I was like, oof, that's not like me.
So anyway, these form a consistent-width band around the predicted values here, and I guess we're calling that purple. And you can see some of the data points don't fall inside the band, but mostly they do. And this is basically something akin to a prediction interval, it's just going about it in a completely different way.
Typical prediction intervals are what we call parametric: we have to make some sort of statistical or probabilistic assumption about the data and the model. For a typical prediction interval you get from straight-up linear regression, we'd say that a particular data point has a coverage of, let's say, 95%. But with conformal intervals, what we would say is that, on average, across all the samples that we use, the coverage for that interval is 95%. So the additional bit here is "on average", usually, for most conformal methods at least.
Methodology and assumptions
So if you're a statistician or something like that, and you're interested in the methodology, there's been quite a lot written about this, especially lately; it's been kind of exploding both in terms of papers and in terms of visibility. It has a very strong frequentist theme to it, if you know what that means. We're basically using quantiles and empirical distributions, and when you read about this, it ties in, in an indirect way, to what we would call non-parametric inference.
So anyway, that's all well and good. So what's good about it? Well, these intervals require very, very minimal assumptions about your data. Exchangeable data means that if I just reorder the data, I get the same results. That's not true for time series, but it's generally true for a lot of other situations. And of course, time series are important, so there are specialized conformal methods for time series. Matt Dancho's modeltime package has those, so you can get those if that's your thing.
It can work with any type of regression or classification model, but so far in the probably package, I've only implemented it for regression. So we're working on that. And it's relatively fast. There are a couple of different methods I'll show you, and the ones I'll show you are all relatively fast both to train the conformal interval method and to make new predictions. There's one called full conformal, which is honestly the one where it makes the most sense to me how this thing works, but it's abominably slow. So we have that in tidymodels, but you probably don't want to use it.
Anyway, the cons: because it's making only minimal assumptions, as you'll see in a little bit, these intervals don't necessarily extrapolate well. So if you're making a prediction on a new sample and it falls well outside of your training or calibration set, the coverage, or even the interval itself, may not be very good. And that's a little bit different from what we would see in things like linear regression.
So if that's a problem, you might be saying to yourself, well, how am I going to know if I'm really extrapolating if I have a bunch of predictors? We do have something called the applicable package. It was created by one of our interns, Marley, a couple of years ago, and it's used to quantify how much you're extrapolating from your training set. So you can use applicable to get a score that gives you a sense of, like, how far out am I, or am I sort of in the middle of my training set? Also, conformal intervals are probably not great for small sample sizes, for some definition of small. You can still get intervals from probably, but it's hard to say what the coverage would be.
Method 1: Split conformal inference
So I'm going to show you some code. I'm going to have a training set that is 1,000 data points, a test set that's 500, and a calibration set, which is where we compute those residuals, about the same size at 500. So you load the tidymodels package down here, and we're going to be looking at some results that use cross-validation, like V-fold cross-validation. So I generated a cross-validation object here using tidymodels.
The model I'm going to use, which doesn't ordinarily generate prediction intervals, is called a support vector machine. Here's some code to specify the model, and then on line seven here we just fit that support vector machine model. And we can use that to generate the curve on the data set that I showed you.
All right. So there's three different conformal methods I'll go through really quickly. The first one's called split conformal inference, and you've already seen it. So it's basically you take a model, you fit it to your training set, you predict your calibration set, compute the residuals, and go on your merry way. So it's pretty simple.
In tidymodels, you get that from the probably package. The functions we're going to show start with int_conformal; the last bit of the name is the method. So what you do is you give it your fit and your calibration set. It basically predicts the calibration set and gets the residuals. And then when you want to actually get prediction intervals on new samples, you give it, let's say, the test set, and you specify whatever confidence level you want. And then you can see you get new columns here, .pred_lower and .pred_upper.
What does that look like for the data set that we just used? This, again, the purple line is our support vector machine model prediction, and here are our prediction intervals. You see some of the points down here aren't really covered by the band, and so on. So as you might imagine, it's pretty fast, pretty simple. Again, one of the downsides is that the interval widths are always the same. So whether the variability in your outcome data is really small or really large at different points, the interval is always going to be the same width.
Method 2: Cross-validation conformal inference
The downside also to this is you have to have an extra set of data laying around that you can use to estimate that distribution, and we don't always have that. So one thing you can do is use cross-validation. Every time you do cross-validation, you have some data that you're predicting, and those data points were not used by the corresponding model that generated the predictions. So the data you use for prediction is not the same data you used to fit the model inside the cross-validation. If you do tenfold cross-validation, you have ten sets of held-out residuals. And it turns out there's some math behind this that says, yes, with some slight alterations, you can do something very similar to split conformal inference.
The theory has only really been done for default cross-validation. So you can use other resampling methods in tidymodels to do this, but you get a warning saying, like, you're at your own risk. I have a link near the end to a GitHub repo where I've done a ton of simulations. And I did try bootstrapping with this, and it seemed to work pretty well. So no guarantees, but it doesn't seem like a horrible idea to try other resampling methods.
How that works is a little bit different from what you would normally do with resampling. In your control object, you have to save the out-of-sample predictions; that's not very exotic. But the other thing you have to do, if you're doing tenfold cross-validation, is save the ten fitted models that you generated during cross-validation. The easiest way to do that is an argument called extract: just give it the identity function, which returns its input unchanged. Then you use the fit_resamples function like you normally would. And then the int_conformal_cv function just processes all that data, and the predict step is the same as last time to generate the intervals.
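The residual-collection part can be sketched like this (Python, with a simple linear fit standing in for the model). The actual CV+ interval math in probably is more involved than just pooling residuals, so treat this only as an illustration of where the held-out residuals come from:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(-2, 2, size=200)
y = 2.0 * x + rng.normal(0, 0.5, size=200)

# 10-fold CV: fit on nine folds, keep residuals from the held-out fold
folds = np.array_split(rng.permutation(200), 10)
resid = []
for held_out in folds:
    train = np.setdiff1d(np.arange(200), held_out)
    coefs = np.polyfit(x[train], y[train], 1)   # the "fold model"
    resid.extend(y[held_out] - np.polyval(coefs, x[held_out]))

# every data point contributes exactly one out-of-sample residual
lo, hi = np.quantile(resid, [0.05, 0.95])
```

The key property is that each point's residual comes from a model that never saw that point during fitting.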
And so, again, on that same data set: in this particular case, these intervals are very, very close to the split intervals. That's not always the case, but it's what happened here.
Also, one little note about this I found with very small sample sizes: when CV+ conformal methods center their interval, they center it on the average prediction of the ten cross-validated models, not on the actual model that you fit on the training set. So if you have a very small sample size, those two things might be different, and your intervals might be shifted in some places. I might try to fix that, if it can be fixed, but it's just a little note.
Method 3: Conformalized quantile regression
And then the third and final method, which is very different from the others, is called conformalized quantile regression. Quantile regression is a pretty well-known technique in statistics. When you fit a linear regression and get a prediction, that's the mean of the outcome distribution. In a quantile regression, like a quantile linear regression, you can actually make predictions for whatever quantile you want. So if you want something around the center of the distribution, you would use a quantile regression with a 0.5 quantile to get the median prediction.
But if we set it to be, let's say, the 0.05 or the 0.95 quantile, that's really what we're trying to do in conformal inference: we're trying to estimate the boundaries of our predictive distribution. So it's really a more direct approach to solving this.
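Under the hood, a quantile regression swaps squared error for the asymmetric "pinball" loss. A minimal Python illustration with a constant predictor shows that minimizing that loss recovers the empirical quantile:

```python
import numpy as np

def pinball_loss(y, pred, tau):
    # penalize under-predictions by tau and over-predictions by (1 - tau)
    err = y - pred
    return np.mean(np.where(err >= 0, tau * err, (tau - 1) * err))

rng = np.random.default_rng(4)
y = rng.normal(10, 2, size=5000)

# search over constant predictions for the minimizer of the 0.95 pinball loss
grid = np.linspace(0, 20, 2001)
losses = [pinball_loss(y, g, 0.95) for g in grid]
best = grid[np.argmin(losses)]
# 'best' lands essentially on np.quantile(y, 0.95)
```

A quantile regression model does the same thing, but lets the predicted quantile vary with the predictors instead of being a constant.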
This is the same data set I showed in that second slide where I had linear regression. And you can see, it might be hard to see, but these intervals are not of equal width. So the upside to conformal inference using quantile regression is that you can get intervals whose width varies, unlike the other two methods.
Now, linear methods here are maybe not the best idea. So what we do, like others in the literature, is use a tree-based ensemble, specifically quantile random forests. And that can give us a prediction interval. And here's the same data set. For lack of a better term, I'd call this a chunky sort of line. But it will vary across the range if the variance changes across the range.
The other thing to note here, and this is true of all tree-based methods, not just random forests, is that when the data stops, the predictions just sort of go off to infinity along the same lines. And this is especially bad if you start extrapolating. You can imagine if I fit a linear regression to this particular data set, this cloud of points, that line would just keep increasing. But the conformalized quantile intervals will just keep going off in one direction. So if you extrapolate, you can get intervals that are so poor they don't even contain the predicted value. So with this particular method especially, you have to be very sure you're not extrapolating.
All right, the code's a little bit different for this one. You need to give it your model fit, and you need to give it both the training set and the calibration set; so it works best with a split sample. And also, since you're estimating the quantiles, you have to set the confidence level upfront: instead of setting it in the predict method, you have to do it here. We use a quantile regression forest under the hood, so anything you want to pass through to that function, you can do it here; I'm going to bump up the number of trees to 2,000. And then you get the actual intervals.
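The arithmetic that makes the quantile predictions "conformalized" is compact. This Python sketch uses made-up quantile predictions (a fixed ±1 band around a line) in place of a quantile forest, and is not the probably implementation; it just shows how the calibration set adjusts the band:

```python
import numpy as np

rng = np.random.default_rng(5)

# hypothetical 0.05 / 0.95 quantile predictions from some quantile model
def q_lo(x): return x - 1.0
def q_hi(x): return x + 1.0

# calibration data whose noise is wider than the model's band assumes
x_cal = rng.uniform(0, 10, size=500)
y_cal = x_cal + rng.normal(0, 1.0, size=500)

# conformity score: how far outside the quantile band each point falls
scores = np.maximum(q_lo(x_cal) - y_cal, y_cal - q_hi(x_cal))
adj = np.quantile(scores, 0.90)        # adjustment for ~90% coverage

# widen (or shrink, if adj < 0) the band for a new prediction
x_new = 5.0
interval = (q_lo(x_new) - adj, q_hi(x_new) + adj)
```

Because the underlying band already varies with the predictors, the final intervals inherit that varying width, which is the whole appeal of this method.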
What's that look like for our data set again? It's kind of interesting. This seems like you might be like, wow, I don't know, that's kind of scary. But performance of this is actually pretty good despite the lack of smoothness of these intervals. So when you do simulations and things like this, the idea that you're getting step functions across a range is not that big of a deal, performance-wise.
Checking coverage with simulations
Speaking of performance, we have to make sure it works. So this is a little bit different than your average machine learning method because we're saying, if you get intervals from this function, we think they'll have, let's say, 90 percent coverage. And so with this GitHub repo here, I did a bunch of simulations to make sure these things actually work the way they're supposed to work.
Now, for the data sets I showed you, like this one, the coverages were very close to 90 percent, for the split conformal intervals, for CV+, and for the quantile method, on those individual data sets. And there's a README at that repo. Generally speaking, I tried it with trees and regression and neural networks and things like that. The sample size matters a lot, so your coverage may not be spectacular if you have a very small sample size, but I was pretty happy with the average coverage of those methods. So they seem to be working and doing what they're supposed to do.
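The shape of such a coverage check is simple. Here's a Python sketch for the split method, assuming pure noise so the "residuals" are known to be exchangeable with new draws:

```python
import numpy as np

rng = np.random.default_rng(6)

def one_trial(n_cal=500):
    # build a 90% split-conformal interval from calibration residuals...
    resid_cal = rng.normal(0, 1, size=n_cal)
    lo, hi = np.quantile(resid_cal, [0.05, 0.95])
    # ...and check whether one new draw lands inside it
    new = rng.normal(0, 1)
    return lo <= new <= hi

coverage = np.mean([one_trial() for _ in range(2000)])
# coverage should land near the nominal 0.90
```

The real simulations swap the known noise distribution for fitted models on simulated data, but the question being answered is the same: does the observed coverage match the nominal level?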
What's next and resources
So what's next? As I mentioned, we've already implemented regression models. So probably sometime in the new year, I'll start looking at doing the same thing for classification models. In classification, this particular domain typically focuses on situations where we have a lot of classes. So let's say you're doing some sort of image classification and there are like 30 things you might classify an image as. What they focus on with conformal inference is, like, clustering them: well, you know, what is it, dog or hot dog, that whole thing? If you have a prediction and you have two class probabilities that are pretty close, one thing conformal inference does for you is tell you whether they're sort of equivocal probabilities or not. And again, it's a fast-growing field, so if a new methodology pops up, we will definitely take a look at it.
Thanks to the tidymodels and tidyverse groups for listening to me prattle about this and testing my presentation. And Joe Rickert did a lot of reading and helped me get a sense of how these things actually work. Speaking of which, if you want to learn more, there's an article on tidymodels.org. The two references I would suggest most: Christoph Molnar wrote a book on conformal prediction; it says conformal prediction with Python, but it's probably the best reference if you want a layman's sense of how these things work. It's an excellent book; you can get it off Amazon. If you're more of a statistical kind of person, Ryan Tibshirani has a really nice set of notes on the various methods and how they work. And then the Awesome Conformal Prediction repo on GitHub lists pretty much any reference that ever comes up, probably including this talk.
Q&A
We have a few questions coming in. Do these prediction intervals break down in the presence of heteroscedasticity? The first two, well, they don't break down; they're just right on average. So if you go to the article on tidymodels.org, we did a little simulation where we took something whose variance changes as you go along. And if you do the split intervals, they're bad here, they're bad there, they overdo it over here, so on average they're fine. In other words, it's not that you can't compute them; they're just not especially great. And just in case you were wondering, you would probably use the conformalized quantile regression there; those intervals align pretty well. You can see the quantile method is very tight in that area and widens out where it's supposed to. So that's what I would suggest if you think that's the case.
Can these be used for anomaly detection in deployed models and then drive some sort of emergency action to avoid anomalous predictions? Yeah, that's a good question. I've seen people on social media say that. I don't know that I would buy into it so much. If you were using this, you could look at the width of the interval to get a sense of whether you're within the mainstream of your training data. So as long as you stay within your training data, you're fine. But it actually wouldn't be very good at detecting an anomaly that's an aberration due to extrapolation, because look what happens as you go a little bit further: boop, these go off into infinity, and my fitted curve goes out here. So this is the one you would probably use if you want to judge the uncertainty by the interval width, because the other two methods have a fixed width. And it would really only detect anomalies that are sort of where your data already live. So that's my answer to that question.
When evaluating the performance of prediction intervals, for example coverage, is that something you picture living in probably or some kind of new package like yardstick? Oh, like the evaluation of whether these things work or not? No, I'll just stuff that in a GitHub repo, because it's like a million files of simulations and things like that. But if the question is, if you're going to develop methods for interval estimates like this, then yeah, probably is the best place for them. Edgar actually did a lot of work and added a bunch of calibration tools for models, and we put those in probably. probably tends to focus on things you would do after the model, and this seems like the kind of thing that would fit there. So yeah, I'd most likely put them there.

