Resources

Julia Silge | Monitoring Model Performance | RStudio

0:00 Project introduction
1:50 Overview of the setup code chunk
3:05 Getting new data
4:05 Getting model from RStudio Connect using httr and jsonlite
6:20 Bringing in metrics
9:45 Using the pins package
10:50 Using boards on RStudio Connect
13:30 Benefits of using pins
14:00 Visualizations using ggplot and plotly
17:00 Knitting the flexdashboard
18:10 Project takeaways

You can read Julia's blog post, "Model Monitoring with R Markdown, pins, and RStudio Connect", here: https://blog.rstudio.com/2021/04/08/model-monitoring-with-r-markdown/

Modelops playground GitHub repo: https://github.com/juliasilge/modelops-playground
pins package documentation: https://pins.rstudio.com/
flexdashboard documentation: https://rmarkdown.rstudio.com/flexdashboard/
tidymodels documentation: https://www.tidymodels.org/


Transcript

This transcript was generated automatically and may contain errors.

Okay, let's get started with this model monitoring example. I'm going to open up a new R Markdown file from a template. I've got all these different templates from the packages I have installed here, and I am going to use a flexdashboard.

I really like the flexdashboard package for a lot of different situations, from making really pretty Shiny apps without much effort, to dashboards where I don't want to use Shiny but still want a dashboard kind of layout. So that's what we're going to use here.

I'm going to call it monitor.Rmd here, and then for the title: this is going to be a dashboard for monitoring a model for traffic crashes. The data and the model predict the probability that a crash involves an injury. These are crashes in the city of Chicago, and the data comes from the city of Chicago. So we have all this information about what the crashes are like, and then we predict whether the crash involved an injury.

Flexdashboard is a very flexible tool; you can lay things out in columns or rows. I am actually going to change this out and use the storyboard mode here, which is pretty nice.

Overview of the setup chunk

And then it turns out that almost all of the logic for the model monitoring is going to go here in the setup chunk. So we're going to have a really big setup chunk with lots of code, and then actually less code further down to create the dashboard itself.

And when it's time to monitor something, I think it's nice to kind of make clear in your head what are the pieces that you need. And so one piece that you need is your model. So your model is deployed somewhere. In this case, it's a model that I already trained, I have another video that shows how I trained it. So it's deployed using a Plumber API on RStudio Connect, but really that model could be anywhere. You just need to be able to get to it from R, which of course we have tons of options for being able to do that.

And then you need to get your new data, the new data that's coming in, so you can see how your model is doing with it. And then your metrics: the metrics that you had back at testing time, and the metrics that we measure over time as we continue to move forward. So these are the pieces that we need to get going.

Getting new data

So to get the data, often the easiest thing is just to go back to when you trained the model. This is the GitHub repo where I have all the code I used to train the model originally. And so let's just go to the beginning of that whole script and get the code that I used to train the model. But instead of getting two years of data, let's just get two weeks of data.

What will happen is that as I run this dashboard week after week, it will get newer and newer crashes as time passes. So let's run this and make sure it works. What this is doing is going to the city of Chicago, getting the new crash data, and we have the information about the weather condition, the trafficway type and whatnot, and then whether there was an injury or not. So that's one piece: we need the new data.
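As a rough sketch, the data-fetching step she describes might look like this. The dataset id, field names, and column handling here are assumptions based on the public Chicago crashes dataset, not her exact training-repo code:

```r
library(RSocrata)   # client for Socrata open-data portals like Chicago's
library(dplyr)
library(lubridate)

# Hypothetical sketch: pull only the last two weeks of crashes
# (85ca-t3if is the id of the public Chicago crashes dataset)
two_weeks_ago <- format(today() - weeks(2), "%Y-%m-%dT00:00:00")

crash_raw <- read.socrata(
  paste0(
    "https://data.cityofchicago.org/resource/85ca-t3if.json",
    "?$where=crash_date > '", two_weeks_ago, "'"
  )
)

# Keep the outcome plus a few predictors; these names are assumptions
crash <- crash_raw %>%
  as_tibble() %>%
  transmute(
    injuries = if_else(as.numeric(injuries_total) > 0, "injuries", "none"),
    crash_date = as_datetime(crash_date),
    weather_condition,
    trafficway_type
  )
```

Because the query filters by date rather than row count, rerunning the dashboard each week naturally picks up the newest crashes.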

Getting the model using httr and jsonlite

The next piece we need is the model. To handle the model, we're going to use the httr package and jsonlite. The model, in this case, is deployed as a Plumber API on RStudio Connect. It doesn't have to be this way, right? Wherever your model is published, you can go and get it from there. But this is how I'm going to do it here to demonstrate.

So this model has a predict endpoint, and it's got a metrics endpoint with the testing metrics that we expect to see. I'm going to copy this URL here. Then I'm going to set up a POST call, and as the body I'm going to send this crash data, but I need to convert it to JSON first.

Okay, so toJSON(), and I'm going to send it the crash data, but not the injuries column. That gives me the result, and then I want the content as text, UTF-8 encoded, which will be JSON at that point. So I run it through fromJSON(), and these are predictions, like so.

So let's run this. These now are predictions. It's taking all of those new crashes that I want to use for monitoring, sending them to the model, getting predictions for them, and then handling all of the output. What we have at the end, let me do it like this, is a vector, and these are probabilities of having injuries.
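Put together, the POST call she walks through looks roughly like this. The Connect URL is hypothetical, and `crash` is the new-data tibble from the previous step:

```r
library(httr)
library(jsonlite)
library(dplyr)

# Hypothetical URL of the deployed Plumber API's predict endpoint
api_url <- "https://connect.example.com/traffic-crash-api/predict"

# Send the new crashes (minus the outcome column) as JSON
results <- POST(
  api_url,
  body = crash %>% select(-injuries) %>% toJSON()
)

# Pull the response body out as UTF-8 text, then parse the JSON:
# a vector of predicted injury probabilities, one per crash
predicted <- results %>%
  content(as = "text", encoding = "UTF-8") %>%
  fromJSON()
```

Because only httr and jsonlite are involved, the same pattern works for a model deployed anywhere that exposes an HTTP endpoint, not just RStudio Connect.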

Bringing in metrics

We have our new crash data and we have our model, so let's put it together and find our metrics. We take that crash data and bind on the predictions; let's call the column pred_injuries. So now we have a column of predicted probabilities.

Let's make a column of hard class predictions: if the probability I just made is greater than 0.5, call it "injuries", and if it's not, call it "none". So this now matches the observed values. I'm making a hard class prediction, and I think I need to change anything that's a character to a factor. Right, there we go.

Okay, so because the function that I'm about to use needs factors, and those functions are from the yardstick package. So the yardstick package is from tidymodels, and you can use it with models that have been trained with tidymodels, or you can use it with, you know, any other kinds of models and get out these nice results.

Actually, before I started working for RStudio or using tidymodels at all, I used yardstick. Yardstick was actually the first part of this whole ecosystem that I used at all, because it is so nice.

And so we can call metrics(): we say the true value is in injuries, then the estimate is the hard class prediction, and then we pass the class probabilities, like this. It gives us accuracy and ROC AUC, and you look at this and think: is this bad? Is this worse than what we expect based on how things did at testing? This is what we want to know.

We probably want to do a little bit more, because this is for that whole two weeks. So let's add a mutate on crash date and do some rounding here; actually, let's use floor_date(): say as_date(crash_date), and then unit = "week". So now we're rounding to the week, and then we just group by that crash date, like this.

So now we have it for these... we've got three beginnings of weeks in this data set, so we have these different values here. So these are the metrics by week like this that we have.
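The metrics computation she describes might be sketched like this; the column names (`.pred_injuries`, `.pred_class`) are my labels for the columns she creates, and `crash` and `predicted` come from the earlier steps:

```r
library(dplyr)
library(lubridate)
library(yardstick)

crash_monitor <- crash %>%
  mutate(
    .pred_injuries = predicted,
    .pred_class = if_else(.pred_injuries > 0.5, "injuries", "none")
  ) %>%
  # yardstick's class metrics need factors, not character columns
  mutate(across(where(is.character), as.factor))

# Round each crash to the start of its week, then compute metrics
# (accuracy, ROC AUC, etc.) per week rather than over the whole window
metrics_by_week <- crash_monitor %>%
  mutate(crash_date = floor_date(as_date(crash_date), unit = "week")) %>%
  group_by(crash_date) %>%
  metrics(truth = injuries, estimate = .pred_class, .pred_injuries)
```

Grouping before calling `metrics()` is what turns one overall score into a weekly time series you can track.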

Using the pins package

And if we are just starting out with our model metrics, say we just trained our model on, you know, March 20th, we would start here, and it would be time to start keeping track of this. I think a great way to do that is the pins package.

I'm using the development version of pins as of, you know, this recording. It will go to CRAN soon. But if you see me use, you know, some of the functions and things, they might look a little different than what's on CRAN right now, but those functions will go there soon.

And so what we want to do is I actually have some stored metrics in a pin right now. And so I'm going to show you how to get them and then how to rewrite to them. But if you were starting from scratch, you would just start by writing without reading first.

Using boards on RStudio Connect

The way this works is that I register a board somewhere to store things. In pins, you can have a board on Azure, on Google Cloud Storage, on GitHub, on Kaggle; gosh, there are so many different kinds of boards you can have. I'm going to show how to use a board on RStudio Connect.

And then I need my API key, which is stored in my .Renviron file, so I'm getting it out of there here. That's a pretty nice place to store things like this. If I print out board here, it says it's a pin board on RStudio Connect, and here are all the things that are stored there, like things from my RStudio coworkers. And I have something stored there too.

If I do pin_read(), it's called, can I spell my own name? I don't know, julia.silge, and it's called traffic crash metrics. I read that and call it old_metrics, like this. So I'm reading the old metrics here.

So I've actually been monitoring this model for a while now, monitoring, monitoring, monitoring. And so now I can combine the old metrics with the new metrics and then write the result back.

So I think the way to do this: take old_metrics and filter so that these new crash dates aren't in it, like that, and then just bind them together. This is a nice way to do it so that you don't end up overwriting things.

Okay, so if I do this, there we go. I have taken my old metrics and added on anything that I have that's new. Maybe I should arrange it so it's sorted by metric and crash date, like that. There we go. And so I can write this back: we have the old metrics plus the metrics by week; let's call that new_metrics. And then I take my board and write my new metrics to it.
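The pins round trip might be sketched like this, using the development-era API she mentions (that dev version used `board_rsconnect()`; newer pins releases call it `board_connect()`). The server URL, environment-variable name, and pin name are assumptions:

```r
library(pins)
library(dplyr)

# Register a board backed by RStudio Connect; the API key lives in .Renviron
board <- board_rsconnect(
  server = "https://connect.example.com",
  key = Sys.getenv("CONNECT_API_KEY")
)

# Read the previously stored metrics (pin name is hypothetical)
old_metrics <- pin_read(board, "julia.silge/traffic-crash-metrics")

# Drop any old rows for weeks we just recomputed, then append the new
# weekly metrics, so reruns never duplicate or clobber history
new_metrics <- old_metrics %>%
  filter(!crash_date %in% metrics_by_week$crash_date) %>%
  bind_rows(metrics_by_week) %>%
  arrange(.metric, crash_date)

# Write the combined metrics back as a new version of the pin
pin_write(board, new_metrics, "traffic-crash-metrics")
```

On the first run, when no pin exists yet, you would skip the `pin_read()` and just `pin_write()` the initial metrics.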

Benefits of using pins

So what a pin gets you is a versioned way to write things back and forth, which I think is really super duper nice, because you can always go back. You can also pin and version models themselves. It's really useful for model monitoring.

Visualizations using ggplot and plotly

Okay, so look at that chunk: it's enormous, so much is going on in it. But we have our new data, we have our model, we have our metrics. And so now let's just make a few visualizations.

For the storyboard, I'm just going to put a few different things here. Let's say, for example, "Model metrics" here, and then we'll use new_metrics, because that has all of it. There were several different kinds of metrics in there, so let's just do two: filter where the metric is in accuracy or ROC AUC.

And we can plot crash date against the estimate like this, then add a geom_line(), and we'll say aes(color = metric), like that. I think we'll need a facet_wrap() by metric; let's say scales free on y. And I think this will look better if they're stacked on top of each other, like this. That looks pretty good, so we'll do something like this. And then, oh yeah, I'm going to use plotly for these.

So let's save this as p, and then do ggplotly(p), like this. We'll also need to load library(plotly), like so. Let's see what that looks like.

Whoops, no, I misspelled it, that's so silly. It's called plotly. Right, which I put there; that's right.

And I think if I save it here as an object and then do hide_legend(p)... yeah, let's do that. Nice. Okay, so that's pretty good. This is our metric. I could spend more time and make this nicer, say set the x and y labels to NULL; I can definitely keep going and make this nicer in so many different ways. But this is pretty good.
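The plot she builds up might be sketched like so, assuming the `new_metrics` tibble from the pins step with yardstick's standard `.metric`/`.estimate` columns:

```r
library(dplyr)
library(ggplot2)
library(plotly)

p <- new_metrics %>%
  filter(.metric %in% c("accuracy", "roc_auc")) %>%
  ggplot(aes(crash_date, .estimate, color = .metric)) +
  geom_line(alpha = 0.8, size = 1.4) +
  # stack the panels so the two metrics sit on top of each other,
  # each with its own y-axis range
  facet_wrap(vars(.metric), ncol = 1, scales = "free_y") +
  labs(x = NULL, y = NULL)

# Convert the ggplot to an interactive plotly widget and drop the
# legend, which the facet strips make redundant
hide_legend(ggplotly(p))
```

Since each facet already labels its metric, hiding the legend keeps the dashboard panel uncluttered.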

So that's model metrics, and I can go on: I could make a map, I could make a table. There are so many different things I could do.

Knitting the flexdashboard

And then, when I hit Knit, let's see if this works. It's going to go through now and do all the stuff in the setup chunk, which remember calls two different APIs, computes all the metrics, does all that kind of stuff, and then it should pop out a dashboard here for me.

So let's talk about what you might do with this dashboard. Let's look at it here. This dashboard does not use Shiny, and we can put more narrative in this panel, and maybe the other things we would want to show here.

With a dashboard like this, what I would do is publish it somewhere where I could update it, once a week in this case, and then look at it whenever I wanted to. I have this published in the example on RStudio Connect, which of course has this great feature where I can schedule it to run once a week, send me a nice email saying that it succeeded along with a plot, and all that kind of thing. But if you publish your R Markdown documents in a different way, you can pursue that in whatever way is appropriate to your own data infrastructure.

Project takeaways

The takeaway I want you to have here is that model monitoring requires you to think about which pieces are important to you: your model, your data. And then there are components of the R ecosystem today that let you move forward with the kind of model monitoring that you need to do.