Resources

MLOps with vetiver in Python and R | Led by Julia Silge & Isabel Zimmerman

Many data scientists understand what goes into training a machine learning model, but creating a strategy to deploy and maintain that model can be daunting. In this meetup, learn what MLOps is, what principles can be used to create a practical MLOps strategy, and what kinds of tasks and components are involved. See how to get started with vetiver, a framework for MLOps tasks in R and Python that provides fluent tooling to version, deploy, and monitor your models.

Blog post with Q&A: https://www.rstudio.com/blog/vetiver-answering-your-questions/

For folks interested in seeing what data artifacts look like on Connect, we have these for R:
⬢ Versioned model object: https://colorado.rstudio.com/rsc/seattle-housing-pin/
⬢ Deployed API: https://colorado.rstudio.com/rsc/seattle-housing/
⬢ Monitoring dashboard: https://colorado.rstudio.com/rsc/seattle-housing-dashboard/
⬢ Create a custom yardstick metric: https://juliasilge.com/blog/nyc-airbnb/
⬢ Endpoint used in the demo: https://colorado.rstudio.com/rsc/scooby

Our team's reading list (mentioned in the meetup)

Books:
⬢ Designing Machine Learning Systems by Chip Huyen: https://www.oreilly.com/library/view/designing-machine-learning/9781098107956/

Articles:
⬢ “Machine Learning Operations (MLOps): Overview, Definition, and Architecture” by Kreuzberger et al.: https://arxiv.org/abs/2205.02302
⬢ “From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors” by Bayram et al.: https://arxiv.org/abs/2203.11070
⬢ “Towards Observability for Production Machine Learning Pipelines” by Shankar et al.: https://arxiv.org/pdf/2108.13557.pdf
⬢ “The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction” by Breck et al.: https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/aad9f93b86b7addfea4c419b9100c6cdd26cacea.pdf

Web content:
⬢ _How ML Breaks: A Decade of Outages for One Large ML Pipeline_ by Papasian and Underwood: https://www.youtube.com/watch?v=hBMHohkRgAA
⬢ _MLOps Principles_ by INNOQ: https://ml-ops.org/content/mlops-principles
⬢ _Google’s Practitioners Guide to MLOps_ by Salama et al.: https://services.google.com/fh/files/misc/practitioners_guide_to_mlops_whitepaper.pdf
⬢ _Gently Down the Stream_ by Mitch Seymour: https://www.gentlydownthe.stream/

Speaker bios:

Julia Silge is a software engineer at RStudio focusing on open source MLOps tools, as well as an author and international keynote speaker. Julia loves making beautiful charts, Jane Austen, and her two cats.

Isabel Zimmerman is also a software engineer on the open source team at RStudio, where she works on building MLOps frameworks. When she's not geeking out over new data science techniques, she can be found hanging out with her dog or watching Marvel movies.

Sep 20, 2022
1h 23min

image: thumbnail.jpg

Transcript#

This transcript was generated automatically and may contain errors.

Hi friends, so nice to see you back for today's meetup. If we haven't had a chance to meet yet and this is your first meetup, I'm Rachel. I'm calling in from Boston today, actually at the RStudio office in Seaport. It's so nice to meet you and to see so many familiar names joining as well. Feel free to introduce yourselves through the chat window and say hello as well. I love getting to see where people are calling in from all over the world and also to be able to see people sharing helpful resources with each other over there in the chat as well.

I host these meetups every Tuesday at noon Eastern time, and they are all recorded and shared to the RStudio YouTube channel as well, if you want to go check out past sessions. The recording of this one will be up immediately at the same exact YouTube live link. This is a friendly meetup environment for teams to share different use cases with each other and lessons learned.

Together we're all dedicated to making this an inclusive and open environment for everyone, no matter your experience, industry, or background. You can add the whole calendar or individual events to your own calendar with a link that I will show on the screen here right now.

For a heads up about next week, we'll be back here, same place, for a talk on beautiful reports and presentations with Quarto, and the following week a talk on model monitoring and alerting at scale with RStudio Connect. Today we are so lucky to be joined by both Julia Silge and Isabel Zimmerman presenting on MLOps with Vetiver in Python and R. During the event you can ask questions on YouTube Live or LinkedIn, wherever you're watching from, and you can also ask anonymous questions through the short link that I will put up here on the screen.

And so we will try to answer as many questions as possible from there, but with that I would love to pull Julia and Isabel up here on our virtual stage with me.

Awesome. Well, hello everyone. We are so excited to see you all here for the MLOps with Vetiver in Python and R meetup today, and talking about beautiful Quarto presentations is next week. This is a Quarto presentation. I love plugging that. It's my favorite thing.

And so who are we? I am Isabel Zimmerman. I am an open source MLOps software engineer here at RStudio, and I'm joined here today with Julia Silge, who is also an open source software engineer writing great MLOps tools in R, and I write them in Python. But another important question is who are you? So who's joining here today? We have a Slido poll question for you all. What language does your team use for machine learning?

So Julia and I have worked really hard to make sure that Vetiver feels great for bilingual data science teams. So that's data science teams that are writing in both R and Python, but it also feels like a great native experience if you are just an R user or just a Python user. So if you're developing a model, you can operationalize that model, and in fact we believe that you likely should operationalize that model, which gives us a really big looming question of what is operationalization? What is MLOps?

And if you're here, you're probably on some sort of MLOps journey, whether you're just curious about what it is, or maybe digging into tools. And the MLOps landscape looks a little bit like this. There are a lot of pieces going around. Is it infrastructure? Is it analytics? Is it open source? Are we looking at APIs? And it makes us feel sometimes a little bit like this.

But there is kind of a one-liner. Everyone defines MLOps a little differently, but the way that we see MLOps is as a set of practices to deploy and maintain machine learning models in production reliably and efficiently. All of those tools are helping you do this in some way. And MLOps is not something you can pass off to your IT department. We believe that data scientists should be owning part of this process.

So there are great tools out there to help you through that data science life cycle. We have another quick Slido poll on which packages you're using the most for machine learning right now. We know there are great ways to collect data, to understand and clean data. On the R side, that's tools like the tidyverse or data.table. On the Python side, it's maybe pandas or NumPy. Once your data is cleaned and understood, you are ready to train and evaluate a model. Once again, on the R side, you're looking at tools like caret or tidymodels, or in Python, things like PyTorch or scikit-learn.

But then you enter kind of the wild west of MLOps. And it's sometimes hard to find open source MLOps tools that fit your needs. Vetiver is well scoped to helping data scientists version, deploy, and monitor their models. And we'll take a closer look at what we mean when we use those three verbs.

What MLOps means: version, deploy, monitor

So, MLOps is versioning. This is managing change in models. It's going to help you avoid the pitfalls of model-final, and then model-final-1, and then model-final-it's-actually-this-one. It helps you scale your models, but it also helps you organize and share your models by giving them context, more robust versioning, and a little bit of metadata, so you can manage, say, your production versus your staging environments and make sure your models are well organized.
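To make the versioning idea concrete, here is a minimal, standard-library-only Python sketch of what a pins-style versioned model board does under the hood: every write gets an automatic version identifier plus a little metadata sitting next to the model artifact. The function names and on-disk layout are hypothetical illustrations, not the actual pins or vetiver API.

```python
import json
import pickle
import time
from pathlib import Path


def pin_write(board: Path, name: str, model, metadata: dict) -> str:
    """Save a model under an automatically generated version, with metadata."""
    version = time.strftime("%Y%m%dT%H%M%SZ", time.gmtime())
    dest = board / name / version
    dest.mkdir(parents=True, exist_ok=True)
    # Store the model artifact and its context side by side.
    (dest / "model.pickle").write_bytes(pickle.dumps(model))
    (dest / "meta.json").write_text(json.dumps({"name": name, **metadata}))
    return version


def pin_versions(board: Path, name: str) -> list[str]:
    """List all stored versions of a model, oldest first."""
    return sorted(p.name for p in (board / name).iterdir())
```

A real board could point at a local folder, S3, Azure, or Connect; the pattern of "write once, get a version back, list versions later" is the part that replaces `model_final_v2.pickle`.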

MLOps is deploying. And we have a loaded Slido question for you: have you ever deployed a model? Deploying can mean different things to different people, but it might be something you've done before without even realizing it. Whenever you take a model out of the environment it was trained in and put it in some other computational environment, you can say that is deployment. So, things like putting a model in a Shiny app count as deploying.

But we really believe that stable models are models served as REST APIs. This is an individual computational environment that just hosts your model at an API endpoint, independent of any larger application. It makes it faster, easier, and a little more robust to host the model by itself rather than inside an application.
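As a sketch of what "a computational environment that just hosts your model at an endpoint" means, here is a tiny prediction API using only the Python standard library. A real vetiver API is built on FastAPI in Python and plumber in R; the stand-in model and its field names below are invented purely for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(row: dict) -> str:
    """Stand-in model: call the monster fake in later, well-rated episodes."""
    return "fake" if row["year_aired"] >= 2000 and row["imdb"] >= 7 else "real"


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/predict":
            self.send_error(404)
            return
        # Read the JSON request body: a list of feature dicts.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        rows = json.loads(body)
        # Respond with one prediction per row, as JSON.
        out = json.dumps([predict(r) for r in rows]).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(out)


def serve(port: int = 8080) -> None:
    """Host the model by itself, independent of any larger application."""
    HTTPServer(("", port), PredictHandler).serve_forever()
```

The point is the shape: the model lives alone behind one `/predict` route, so a Shiny app, a dashboard, or another service can all call the same endpoint without bundling the model inside themselves.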

And MLOps is monitoring. This is tracking model performance. If you are not monitoring your model, you might be blind to whether your model is performing poorly, because models can fail silently. A model can be at 0% accuracy and still be spitting out predictions left and right. So, you really have to be tracking model performance so you can tell when things are going wrong.

Vetiver demo in Python

So, we can see our data science life cycle over here. Once again, we know data scientists have effective tools for this half of the life cycle, but maybe need some more support on the other half. So, we're going to start by building a model. We're going to load in some environment variables, and we're going to make this model predict which Scooby-Doo episodes have a real monster and which ones have a fake monster. We're going to load in some data in the Arrow format, do our preprocessing, and fit our model. And we're going to put these two things together in a pipeline and deploy them as a whole.

And the way we're going to deploy these models is using a deployable model object called a Vetiver model. Notice this is the first place that Vetiver comes into the life cycle, so you get to use those same tools you're very familiar with; this is just an extension of your current workflow. This Vetiver model will take in the pipeline. I'll give my model a name, and then I'll feed it a little bit of my training data to prime the API endpoint. That way, when new data comes in, my API endpoint can check that it looks the way the model expects it to.
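The "prime the endpoint with training data" idea can be sketched in plain Python: record a prototype of the expected fields and their types from one training row, then check each incoming request against it before predicting. This is only an illustration of the concept, not vetiver's actual validation code, and the field names are hypothetical.

```python
def make_prototype(row: dict) -> dict:
    """Record each input field's expected type from one row of training data."""
    return {field: type(value) for field, value in row.items()}


def check_input(prototype: dict, row: dict) -> None:
    """Raise if a prediction request is missing fields or has wrong types.

    Note this is strict: an int where a float is expected is rejected.
    """
    missing = prototype.keys() - row.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    for field, expected in prototype.items():
        if not isinstance(row[field], expected):
            raise TypeError(f"{field}: expected {expected.__name__}")
```

Failing fast at the endpoint like this turns "the model silently predicted garbage" into "the API returned a clear 4xx error about a malformed request".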

So, next, we're going to version our model. Vetiver actually has a secret weapon called pins. Pins helps you create boards that you can write data, and also models, to, and it will automatically version your model in a really robust way. We can see our version right here. We've pinned this to a board on RStudio Connect, but it could just as easily be an S3 board, an Azure board, or even just a local folder.

So, when we write our model, we get a little prompt that says model cards provide a framework for responsible reporting. Model cards are something there has been a lot of research on; they give a really holistic view of a deployed model. Inside of Vetiver, there is a template that'll get you about 80% of the way to a model card. You can see inside this card, you get to document things like metrics and evaluation data, but also things like ethical considerations or caveats and recommendations. So, you can really capture all the knowledge and information you've generated while making your model.

So, we can generate our model card template here. And now, we will deploy our model. It is versioned. It is ready to be deployed. But before it's deployed somewhere else, we need to make sure it's working locally. So, here we can use VetiverAPI to set up our API and click run.

We will see in the browser that all of this is automatically generated by Vetiver to give you visual documentation at your API endpoint. We can see the server is running locally. We get a little bit of information about what kind of model it is, the name of the model, a health endpoint, and then we get to the predict endpoint. So, we can try it here to make sure that this API is working as we expect it to. Let's not look at year zero, but maybe the year 2000, and this Scooby-Doo episode has an IMDb rating of eight. And it's probably a fake monster in this episode.

But of course, this is not our end goal. Our end goal is not to have our model running on a local machine, but somewhere else. So, this is looking at deploying this to RStudio Connect. So, Vetiver has a kind of one-liner here to deploy to Connect. We're feeding it in a Connect server, the board that we've specified earlier, and the pin name that we just created. We can also specify a version if we want this API linked to a single version to kind of quickly iterate on that version that exists.

So, I've already deployed this model. Let's just make a prediction from this that's living up in our demo server cloud. We'll create a Vetiver endpoint. And let's generate some new episodes. And we'll predict with this model by feeding in the new episodes and the Connect endpoint.

So, this can really feel like a model that is inside your computational environment, even though I'm connecting to something that's out on a cloud somewhere else, on a different server. And this is doing the DataFrame-to-JSON-to-API-server-to-JSON-back-to-DataFrame round trip for you, so you don't have to do any of the JSON gymnastics. And we get these predictions back in DataFrame format. It feels really natural, even though this is not a model that's even on this machine.
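That round trip can be sketched in a few lines of standard-library Python. Here the `post` argument stands in for the actual HTTP call, so the serialization shape is visible and testable; `predict_endpoint` and the column names are hypothetical, not the vetiver client API.

```python
import json


def predict_endpoint(rows: list[dict], post) -> list[dict]:
    """Serialize rows to JSON, send them to an API, and parse predictions back.

    `post` is any callable that takes a JSON string and returns the
    endpoint's JSON reply (for example, an HTTP POST to /predict).
    """
    payload = json.dumps(rows)      # DataFrame-like records -> JSON
    reply = post(payload)           # -> API server -> JSON back
    preds = json.loads(reply)
    # Attach each prediction to its input row, restoring a tabular shape.
    return [dict(row, predicted=p) for row, p in zip(rows, preds)]
```

From the caller's point of view this behaves like predicting with a model in memory, which is exactly the "no JSON gymnastics" experience being described.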

Once this model is deployed, you're going to be continuing to update some metrics, and you're going to be collecting the dates you're making these predictions. So, we're going to read in some validation data that has the date the model was making these predictions, and we're going to feed this into Vetiver's compute metrics function. We'll give it the data, the date variable, and a period, and it'll do a kind of time series computation of the metrics so you can track them over time. This is for the F1 score, but it could just as easily be accuracy or any metric set that is available.
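The time-sliced metric computation can be sketched in plain Python: group labeled predictions by a date-derived key and score each group. This only illustrates the idea behind the compute-metrics step, using accuracy as the example metric; the record layout and function name are made up for the sketch.

```python
from collections import defaultdict


def metrics_by_period(records: list[dict], period_key: str) -> dict:
    """Group labeled predictions by a time period and score each group.

    Each record needs the period key (e.g. a year), the true label
    under "truth", and the model's prediction under "estimate".
    """
    groups = defaultdict(list)
    for r in records:
        groups[r[period_key]].append(r["truth"] == r["estimate"])
    return {
        period: {"n": len(hits), "accuracy": sum(hits) / len(hits)}
        for period, hits in sorted(groups.items())
    }
```

Pinning a table like this after each monitoring run is what lets you plot performance over time and catch the silent 0%-accuracy failure mode described earlier.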

Then we will take these Scooby metrics we have just created and use the built-in plot function. It looks a little small since we only have these three data points, or three years of data, and we're aggregating year over year. But you can make this as extensible and as complex as you want, adding, you know, loads of data and multiple different metrics to track.

And I think with that, we have gone through the entire data science life cycle in just about five, ten minutes. And this is what Vetiver looks like in Python. So, I'm going to pass this over to Julia to show you what it looks like in R.

Vetiver demo in R

Fantastic. Thank you so much for that, Isabel. That's such a great introduction to what we're talking about. And just to put this up here again, just to reiterate where Vetiver sits in what we'll call the MLOps cycle, and what tasks it does. An underlying assumption, or driving design decision, for Vetiver is that for these things on the right-hand side, you have great open-source tools that you like to use. It is over here on the left-hand side that Vetiver comes in, connecting what you've already done through to the MLOps tasks: versioning your model, deploying your model, and then monitoring your model.

So, let's see what that looks like in R. I want to highlight the name a little bit because I think it's a helpful metaphor. So, Vetiver, if you're like really into perfume or scented candles, you may have seen or heard the word Vetiver as an ingredient because it's a stabilizing ingredient in perfumery. It takes the more volatile fragrances, stabilizes them so that your perfume can last a long time. So, in this analogy, your models are these like more volatile, really valuable fragrances, and what Vetiver does is stabilize it so that you can deploy with confidence.

So, let's look at the same data on Scooby-Doo episodes. This Scooby-Doo dataset is originally from Tidy Tuesday, but we transformed it a little bit and saved it as an Arrow file so that on both the Python and the R side, we can read it in the same way. It has the thing we're going to predict, whether the monster is fake or real in a certain episode, the year that it aired, what the IMDb rating was, and then we've got the title in there as well so we can keep track of it.

Now, let's move on to training and evaluating a model. Here, I'm going to use a support vector machine model, so let's set up that support vector machine specification. And now I'm going to make a feature engineering recipe, because support vector machines, it turns out, require some data preprocessing. Then I'm going to combine the feature preprocessing together with the model into a workflow and fit it to the data that I have.

What I want you to notice here is that I'm using good statistical practice, treating my preprocessing together with my modeling, and I can take that whole thing and move on to my MLOps tasks. So, again, as with everything I've shown you so far, we want you to keep using the tools that you know and love, that you think are a good fit for your particular use case.

Now, what I'm going to do is I'm going to start using Vetiver. So, just like on the Python side, you know, we read data in, maybe did some EDA, trained a model, and now it's time for us to start going into the MLOps tasks. So, the first thing that we'll do here is that we'll use Vetiver to create a deployable model object.

I want you to notice what's printed out here, because it answers the question: why do I need to make a deployable model object? It's because at the time of training your model, you have a lot of information about it. You know, for example, whether it was a classification model or a regression model. You know what kind of computational engine you used to train it. You know how many features there were and what the names and data types were of those features that went into your model.

And the reason why you want to use Vetiver to create this kind of deployable model object is that it is designed to collect and store, at training time, all the information you need to make predictions in a new computational environment. Because you have all that information, let's store it in a nicely organized package bundle that you can then take somewhere else. Just like Isabel was saying, a good way to think about whether you have deployed a model is: have you taken a model that was in one computational environment and put it somewhere else so that it can integrate with your IT infrastructure?

All right. So, we have our ready-to-go, deployable model object. And now I'm going to version slash publish slash share my model. I'm going to use pins to do this. You can see from my screen here that I've connected to our demo Connect server, and I am going to write my Vetiver model there.

Pins has support for many different kinds of infrastructure upon which to version, share, and publish models. Just like on the Python side, on the R side you can write to Connect. You can write to AWS S3 buckets. You can write to Azure blob storage. You can write to a network drive if that's the way your particular organization works. The important thing is that the pinning works the same.

And it gives you a nice bit of usability: I need to get a version, and I need to store my model in a versioned way somewhere that is the appropriate place for my particular infrastructure.

So, I want to highlight here, again, that I got the little once-per-session prompt reminding me to create a model card. So, I am going to show you in R how you might go about doing that. If I go to make a new R Markdown file, I've got a couple of things here that are Vetiver templates. One of them is this model card, and what this does, just like Isabel said, is get you about 80% of the way there to documenting your model with a model card. So, if you haven't heard or read about model cards, I encourage you to look at this paper.

And what this does is it goes through, for example, things that can be automated, like, hey, we know what kind of model this is. We know what version we're using. We can automate those parts. And then it provides you an outline to the parts that take human domain knowledge to be able to answer. So, like, what are the uses of this model? Why did you choose the metric that you did? So, it allows you to walk through and do a good job of documenting your model in an appropriate and specific way.
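The split between automatable fields and human-judgment sections can be sketched as a small function: it fills in what is known from model metadata and leaves prompts for the rest. The section names and the `model_card` function below are illustrative, not the exact vetiver template.

```python
def model_card(meta: dict) -> str:
    """Render a model-card skeleton: automated fields filled in,
    human-judgment sections left as prompts to answer."""
    return "\n".join([
        f"# Model card: {meta['name']}",
        f"- Model type: {meta['model_type']}",       # automatable
        f"- Version: {meta['version']}",             # automatable
        "## Intended use",
        "<!-- Who should use this model, and for what? -->",
        "## Metrics",
        "<!-- Why were these metrics chosen? -->",
        "## Ethical considerations",
        "<!-- Caveats and recommendations. -->",
    ])
```

The value of the template is exactly this structure: the boring facts are never stale because they come from the model object, while the prompts make it hard to forget the parts only a human can write.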

Great. Okay. So, we've versioned our model, and we have thought about how we're going to document our model. Now it's time for me to deploy the model as a REST API. Here I am going to run this locally: look, I can create a REST API that runs locally on my machine right here, in this R session. Notice that my R session is kind of busy right now because it's serving this API to me. It takes me three lines of code to locally debug the model and see what it is doing.

So, let's say it's the year 1990 and something has an IMDb rating of 8.3. I can literally interact with my model locally here and understand what it needs and what it looks like when I interact with it. Let's say I'll do batch prediction. And it tells me, okay, both of those are fake. Both of those are fake monsters.

As I'm interacting with this, it also shows me the curl call. So, think of this interactive documentation as a tool for you as the model developer to understand, debug, and figure out how your model is working, and also as a tool for collaborating with other people in your organization: data engineers, IT infrastructure people, SREs. If anyone needs to know how to interact with your model, you can show them directly here.

So, I am going to stop this little local API that was running in my R session. That's good for local debugging, but how do I get it from my computer into a new computational environment? Which is exactly what we want to do here. If you want to deploy your models to Connect, we have a one-liner function that takes where you have stored your model and which model it is, and with that one function, deploys it to Connect.

We also have support for deploying to other kinds of targets. Say you want to deploy to SageMaker, you want to deploy on a different AWS way of serving things. You want to deploy on Azure or some other target. Then what you're going to want to do is you're going to want to deploy using Docker. So, to make a Docker container for Vetiver, you are going to want to first generate a plumber file.

So, this is what the generated plumber files look like. We run this line right here and it makes a file that looks like this. This file is editable by you if you have some more complex use case, but the default serves most people's use cases pretty well.

So, this gets generated here, and then what I'll do is write my Dockerfile. Let me delete the old one so I can show you what it makes. Okay, it's gone, and let me come over here and write my Dockerfile. What's happening right now is it's looking at the deployable model object v that I have and asking: okay, what packages do I need to install into the Docker container to be able to make predictions?

Just to give you a little insight into what that means, the packages that you might need: if you tuned your model, you don't need the tuning infrastructure when you deploy. You only need the packages that are required to make a prediction. So that's what was happening: when it generated the Dockerfile for me, it updated this lock file and then it made the Dockerfile.

So, this is what I just made by running that function. It looked at exactly the version of R that I'm using right now. It's going to use the public package manager to install binaries, which are fast, which is good. It knows exactly which system dependencies it needs to install in there, and then it will use this renv.lock file that I made.

This is a standardized format for saying what packages you have, exactly what version of each you have, where you installed it from, and so forth. So it is going to use that lock file to install all those packages, copy the plumber file over, and then this last part is what happens when the Docker container is run.

So, if we go over here, I'm not going to run this right now because it takes a little while. This is probably the slowest part of anything we're showing you here. The reason building a Docker container takes a little while is that it's setting up a whole computational environment, like a whole little mini computer, and so it installs everything. The key to making your Docker containers faster to build is to be very careful about what you're installing in there, and only install what you need, to keep them from getting too bloated.

Docker is not a tool you use from R or Python, but rather a tool you use in the terminal. So, I'm going to go over here to the terminal. I pre-built this, so I'll use exactly this here. If you're looking at this bit, the reason I'm using it is that the machine I'm using right now is a Mac M1, which is a different kind of chip than an Intel chip, and most places you want to take a Docker container don't use ARM chips; they use Intel chips.

So, if you're going to build your Docker container locally, and you're on a newer Mac with one of the M1 or M2 chips, then you want to use this so that the Docker container you build can go somewhere else and be deployed on that infrastructure. It also means we can use the binaries built by the package manager, which are built for Intel chips. So I did this ahead of time; it took, I don't know, five minutes or so to build.

But now what I can do is run it. So, I'm going to run this here, and let's talk about a couple of things. I'm passing in an environment file, a file of environment variables. For whatever Docker container you have, for this to work successfully, it has to be able to authenticate to wherever the binary model object is kept. If you look at the plumber file, it's reading from wherever that model is; it needs to get to that model. So, a good way to do that is via environment variables.

I'm also connecting the port on the inside to the port on the outside, and then this is the name of the container that I'm running. So, let's start this off. First it's telling me, hey, you're actually on ARM, and this was built for a different architecture, so it's being emulated and may not be as fast as it could be otherwise, which is very true. And now it's running.

So, let me talk about what can happen next. I can visit this in a browser, but I also can interact with it from R. Right now the Docker container is running, so on the little local port here on my computer, I can interact with it directly from R. This is what it would look like if you were interacting with the model from Python or R when it is somewhere else, not here locally. So, I create something called a vetiver endpoint.

Notice that this right now is just pointing at my own little local computer, but this could be an endpoint anywhere. That's kind of the point of Docker: I build this thing, it is self-contained, and I can take it where it needs to go. This could be on a cloud platform; this could be on a server that I have somewhere. I've taken that Docker container and moved it somewhere that is useful to me.

So, now let's say I have some new episodes. Let's see what those look like. I'm just generating some pretend Scooby-Doo episodes, and I can predict from my endpoint whether those episodes are real or fake. I pass the data, and it looks like I'm just treating that endpoint like it's a model in memory: in either Python or R, this is set up so that you just call predict, a predict method or a predict function, the way that you're used to. But what this is actually doing is taking your data, which is a data frame, converting it to JSON, making an HTTP call, getting JSON back, and converting it back into the nice data frame format that you have.

Monitoring models over time

Okay, so far we've talked about versioning the model and deploying the model, and now let's talk about monitoring the model. Vetiver has functions to help you monitor the statistical properties of your model over time. To do that, you need new data that has labels. So, let's say we're able to look into the future, and there are some new Scooby-Doo episodes coming out starting in 2023, and I somehow happen to magically have that data now.

Notice, to monitor the statistical properties of how your model is doing, you have to have the answers, the labels. That often means you have some kind of feedback loop that may be very short, or maybe a little bit longer, but in many situations we do have that feedback loop, and it is what we need in order to monitor. That's sort of the gold standard for how we're going to tell if our model is performing the way we want it to.

So, what we're going to do is compute some metrics over a certain time aggregation. We have about 100 future Scooby-Doo episodes here, and we're going to aggregate at the year level. What I want to highlight is that models can use time in different ways. This particular model actually uses the year as one of the features, and that happens sometimes; sometimes your model uses some kind of date-time quantity as a feature.

Monitoring, though, always involves some kind of date-time quantity. Here we've got the year from a date_aired column. That quantity does not necessarily have to be a feature in the model, but it is the dimension along which we are monitoring. So monitoring always involves collecting, over time, new validation data (or monitoring data, you might call it) that has the labels, so that we can compute and track how our model is doing over time.

So, let's come over here. First, I'm going to show you what it looks like if I use augment. I can augment using a vetiver model, and what this does is predict and bind the columns together. We've got all the data we had before, plus a predicted class from this vetiver model that I have in memory. Once we have that — the variables we're interested in monitoring alongside the predictions — we can use the vetiver_compute_metrics() function.

It supports sliding windows — quite complex windows along which to aggregate — and it sets that up for us to compute the metrics we're interested in. So I say: this is the variable I'm monitoring over, this is the aggregation unit (I want to aggregate at the year level), and here are the variables to use for computing the metrics.

So here I have now computed my metrics. For each year in this data set — 2023, 2024, 2025 — I have n, how many episodes there are (roughly 20 to 40 per year), and then a metric. That's computed using the support vetiver has for the common use cases of monitoring the statistical performance of your model over time.
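The core idea — group labeled monitoring data by a time unit and compute a metric per group — can be sketched in plain Python. This is an illustration of what vetiver_compute_metrics() does conceptually (minus the sliding-window machinery and yardstick/scikit-learn metrics); the rows and values below are made up.

```python
from collections import defaultdict

# Toy monitoring data: each row has the year it aired, the true label,
# and the model's prediction. Values are invented for illustration.
monitoring = [
    {"year": 2023, "truth": "real", "pred": "real"},
    {"year": 2023, "truth": "fake", "pred": "real"},
    {"year": 2024, "truth": "real", "pred": "real"},
    {"year": 2024, "truth": "fake", "pred": "fake"},
]

def accuracy_by_period(rows, period="year"):
    """Aggregate accuracy over a time unit: n and accuracy per period."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in rows:
        totals[r[period]] += 1
        hits[r[period]] += (r["truth"] == r["pred"])
    return {p: {"n": totals[p], "accuracy": hits[p] / totals[p]} for p in totals}

print(accuracy_by_period(monitoring))
# {2023: {'n': 2, 'accuracy': 0.5}, 2024: {'n': 2, 'accuracy': 1.0}}
```

The output mirrors the table from the demo: one row per year, with a count and a metric value.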

On both the R and Python sides, we have support for plotting metrics and for pinning metrics — we recommend pins as a place to store the aggregated metrics over time. We provide a default way of making a plot that will probably work for you, but it is just data. Here I show making a custom visualization of how the metrics are doing over time.

A design consideration for vetiver is that we provide functions that account for the most common use cases, but all of these components are extensible for more custom needs and more advanced use cases. So if I say, I don't want that default plot approach, I want to make my own plots — I can do that, because this is all just data. And here, in this case, the accuracy looks pretty good, and kappa — which estimates how well the results you are getting agree with what you would expect, taking class imbalance into account — looks reasonable too.

In 2023 and 2024, the model agrees about as much as we'd expect; in 2025, it's agreeing a little better than we might have expected.

This is a monitoring example with just three monitoring points, but if you're clicking through, I encourage you to see the more realistic monitoring example. I'll show you what that looks like. I'm going to make a new R Markdown file again, go to the templates, and pick this dashboard.

What this dashboard is, again, is a template to get you 80 percent of the way there for more complex model monitoring. A vetiver dashboard builds on top of flexdashboard on the R side, and it lets you use parameters to say where my model is and what it is called. You can style it, too, because it builds on top of all the beautiful dashboarding that already exists in R.

This dashboard provides an opinionated set of advice for how to set up a monitoring dashboard that keeps track of the statistical performance of your model over time and plots the metrics you might expect, using functions we provide. And because it is generated code, you can extend it in whatever way is appropriate to your use case: add different plots, add tables, or even drop in the API's visual documentation, so you can share the dashboard as an artifact people can understand — not only the data practitioners in your organization, but also the software engineering and IT folks, who can see how the model works and how to interact with it.

Wrapping up

All right, I'm going to go back to the slides. In these demos, we started with Python, then went to R, and walked through these different kinds of MLOps tasks. I hope this showed you that MLOps doesn't have to be daunting: there are many tools out there, but at its core, MLOps is a set of practices you can learn and adopt to deploy and maintain the models that you build in production.

Remember, we said production means a different computational environment, where the model interacts with different parts of your infrastructure. The set of practices here is meant to set you up so that the models you have are reliable, and so that you can keep these systems up and running in a reliable and efficient way.

As a wrap-up: there are a lot of tools out there that do a lot of different kinds of things. What vetiver does is let you bring your existing modeling preferences — how you like to build models — and it provides support for versioning your model in a robust way, so that you can identify which versions of models are in use at any given time, do ad hoc analyses, and get back to previous versions without rolling back whole deployment environments.

Speaking of which, vetiver provides support for deploying your model, whether with push-button publishing or single functions to deploy to Connect, or with Docker to go to cloud platforms or servers of your own. And vetiver provides support for monitoring your models: understanding, recording, summarizing, and visualizing the statistical properties of your model over time.

And vetiver is built from the ground up to work for both R and Python models. That means it's a really good fit for bilingual data science teams, where some people use R, some use Python, and some use both. Even if you, as an individual, sometimes reach for R and sometimes reach for Python when building statistical machine learning models, vetiver is a good fit, because you use the same approach for both and you can support your team.

So we've got our last poll on Slido. If you're on Slido and you click over to the polls tab, you'll see this pop up: we're really interested in what kinds of models you often use. This overlaps somewhat with the question about software, but what we're interested in here is more like: do you use a lot of deep learning models? A lot of tree-based models? Are you currently spending a lot of time using GAMs or something like that? What kinds of models do you often use?

vetiver right now supports on the order of a dozen different kinds of models, including all of scikit-learn and all of tidymodels — a lot of the big ones — but we're certainly interested in what common use cases you have, or want to pursue more, so we can make sure we're supporting people's most common use cases.

So, yes, I said this already: vetiver works on RStudio's pro products like Connect, and it also works in a public or private cloud or on a server using Docker.

We're super excited to have shared this with you today, and if you want to get started, we encourage you to learn more. The first place we recommend you look is our documentation site at vetiver.rstudio.com. It's a bilingual documentation site where you can learn about R and Python together. I also encourage you to check out Isabel's talk from rstudio::conf this past summer, the one called Demystifying MLOps.

This one is fun: instead of Scooby-Doo, it's all about cookies, using cookies and baking as an illustration to understand what MLOps is. If you want to see more about how to use Docker, especially if you're newer to it, I did a recent screencast on how to deploy a model with Docker. And there are some really great end-to-end demos from the solutions engineering team at RStudio — one for R and one for Python. They use the same kind of data on bike shares, predicting how many bikes are used in different situations. What's great about those is that they're really end-to-end: they're fairly extensive and show you how to retrain a model on a schedule and how to integrate a model into an interactive Shiny app. They're a great way to see the whole picture.

And with that, I will stop sharing, and we'll see if we can use the rest of our time to answer a few questions. Awesome — thank you so much, Julia and Isabel. I see a lot of clapping and "great job" in the chat. Awesome work; a lot of people are really excited about diving into this. There are also a ton of questions on both Slido and YouTube, so if you don't mind me throwing a lot of these at you, I'd love to jump in. Let's do this.

Q&A

Awesome. I see Stefan — hi, Stefan — asked a question on YouTube: can you make vetiver refuse to make predictions that are outside your applicability domain?

This would be an example of an advanced use case like the ones we alluded to. The auto-generated plumber files have the predict endpoint, but you can write functions in there and add other endpoints that adjust or change what is served at an endpoint, or what may be served at another endpoint. Some examples of things we know people use this for: an explainability score, say — serve a prediction and serve an explainability score along with it. Another would be something like an applicability score. You can either serve the applicability score, or you can write a function to change what is served, so that you check it and then do or do not serve the prediction.

The way you'd go about that is realizing, okay, this is a more advanced use case: I'm going to need to go into that plumber file — or, on the Python side, into that app.py — and write a function that will effectively be a handler. But we have support for all of that, including testing that it works.
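As a sketch of the kind of handler described above, here is a hypothetical pre-prediction guard you could wire into a generated app.py. The training ranges, column names, and the `model.predict` interface are all assumptions for illustration — this is not code vetiver generates.

```python
# Hypothetical applicability domain: per-feature ranges seen during training.
TRAINING_RANGES = {"imdb": (1.0, 10.0), "year_aired": (1969, 2021)}

def guarded_predict(model, row):
    """Refuse to predict, with an informative error, for out-of-domain input."""
    for name, (lo, hi) in TRAINING_RANGES.items():
        if not (lo <= row[name] <= hi):
            # Refuse rather than silently extrapolate outside the domain.
            return {"error": f"{name}={row[name]} outside training range [{lo}, {hi}]"}
    return {"prediction": model.predict(row)}

class DummyModel:
    """Stand-in for a real fitted model."""
    def predict(self, row):
        return "real"

print(guarded_predict(DummyModel(), {"imdb": 7.5, "year_aired": 2000}))
print(guarded_predict(DummyModel(), {"imdb": 7.5, "year_aired": 2030}))
```

In a real deployment, the guard would run inside the endpoint handler, and you might serve the applicability score alongside the prediction instead of refusing outright.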

Awesome, thank you. I see this is one of the most upvoted questions on Slido: can you deploy models using Docker and cloud environments such as AWS? Can you pass custom Docker images that fit your organization's security architecture?

Isabel, do you want to answer that one? Yeah, I can grab that one. The short answer is yes. The Dockerfiles that vetiver generates are just code, so any customization you need to fit your organization's security — if they have custom Docker images — you can make in the Dockerfile itself. And you can take Dockerfiles to cloud environments such as AWS, GCP, or Azure.

Yeah, as an example, one thing that is currently in a PR on the R side of vetiver is specific support for deploying as an AWS Lambda function. The reason I bring this up is that it's a special kind of Docker container: you have to add a runtime and so on, and instead of starting with a base image like a Python base image, there's a special base image for Lambda functions. Because we take the approach of generating code, accounting for the common use case, and then letting it be extensible, we can even support some of these more specialized targets directly in the software.

Great, thank you. I'm going to go back and forth between YouTube and Slido here. One question from Tony was: this may be more of a plumber question, but how do you make predictions directly in the browser? Is there a particular syntax to pass arguments via the URL?

If this is a question about the interactive documentation that we showed — where we're directly interacting with the models in a browser, passing things in with a POST request and seeing what comes back — that's because of the OpenAPI specification. vetiver, on both the Python side and the R side, creates model-aware, model-appropriate OpenAPI specifications, so that when your API is created (under the hood, via FastAPI on the Python side or plumber on the R side), you get something that is aware of the domain you're working in. Making predictions directly in the browser, as we showed, works because of the customization of the OpenAPI specification that vetiver provides on both sides.
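To illustrate what "model-aware" means here, this is a stripped-down sketch of an OpenAPI document whose request schema is derived from the model's input prototype. The field names, the `/predict` path, and the structure are assumptions for illustration, not vetiver's exact output.

```python
# Prototype of one input row, as captured at deploy time (illustrative).
ptype = {"year_aired": "integer", "imdb": "number"}

# Build a minimal OpenAPI 3.0 document whose /predict request body schema
# is generated from the prototype, so the browser docs know the fields.
spec = {
    "openapi": "3.0.0",
    "paths": {
        "/predict": {
            "post": {
                "requestBody": {
                    "content": {
                        "application/json": {
                            "schema": {
                                "type": "array",
                                "items": {
                                    "type": "object",
                                    "properties": {
                                        name: {"type": t} for name, t in ptype.items()
                                    },
                                },
                            }
                        }
                    }
                }
            }
        }
    },
}

props = spec["paths"]["/predict"]["post"]["requestBody"]["content"][
    "application/json"]["schema"]["items"]["properties"]
print(sorted(props))  # ['imdb', 'year_aired']
```

Because the schema names each input field and its type, the interactive documentation can render a ready-to-edit example request for exactly your model.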

Thank you. Another question on Slido: is there an ebook for vetiver, similar to the one we have for tidymodels? Isabel, do you want to answer that?

There is not. There are a lot of great resources out there for MLOps in general — we just had an internal book club and went through Chip Huyen's Designing Machine Learning Systems. So there are great books out there for MLOps, but not anything vetiver-specific. I do think that the documentation we have is getting pretty robust, and it's a priority for us this fall and moving forward to produce even more documentation, especially for advanced use cases. Although it's probably not a whole book's worth of content, I'm really proud of what we have at vetiver.rstudio.com; I think it gets you pretty far down the path.

Let's see. Another question over on YouTube is from Marlene: when initially deploying your ML model, how does vetiver handle data that does not meet the specs of the training data?

I can start: if an error occurs, there is really nice error handling that specifically points you to exactly what is causing the error and what it should be instead, and that leans on the initial ptype. So if you're passing in a date that looks more like a string, vetiver will point to it and say, hey, this is a string, but from your ptype, we think it should be a date.

Yeah, that's exactly right. And on the R side — do you want to quickly let me share my screen again? Notice that if I predict with good data, I get results back. But what if I predict with bad data? Say I only have a year_aired and I'm missing the other field. It is going to try to predict and fail — oh, I should have set that up with debug = TRUE. Okay, so that's actually a really good thing to talk about briefly: it will fail to predict, and there's an option, when you set up your model, to allow debugging errors to come through. It's turned off by default, because it can expose information about your training data when it returns errors, and in some cases that's a privacy issue. But in some cases you're fine with it and you'd rather have the better errors. So you have a choice between "I have really tight privacy constraints around my training data, and I need to make sure none of that information leaks" and "I would like more informative errors that help me see what's going on."
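The flavor of ptype-style checking described above can be sketched as follows. The field names come from the demo, but the checking logic is an illustration of the idea, not vetiver's implementation.

```python
# Hypothetical prototype captured at deploy time: field name -> expected type.
PTYPE = {"year_aired": int, "imdb": float}

def check_input(row):
    """Compare an incoming row to the prototype; report exactly what is wrong."""
    problems = []
    for name, expected in PTYPE.items():
        if name not in row:
            problems.append(f"missing field '{name}' (expected {expected.__name__})")
        elif not isinstance(row[name], expected):
            problems.append(
                f"'{name}' is {type(row[name]).__name__}, "
                f"but the ptype says it should be {expected.__name__}"
            )
    return problems

print(check_input({"year_aired": 2023, "imdb": 8.1}))  # [] -- valid input
print(check_input({"year_aired": "2023"}))             # wrong type + missing field
```

The value of checking against a stored prototype is that the error message can name the exact field and the expected type, instead of a generic failure deep inside the model.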

Thank you. Another Slido question: plumber also has logging filters. Is there any work being done in vetiver to publish a default logger alongside newly deployed models, for usage monitoring?

This is something I think will be interesting for us to look into in more detail. We don't have a default logger currently; you can extend it the way you extend other plumber functionality. Part of this is because we've been focused on statistical monitoring. I'm assuming this means the kind of logging filters that log: here is what the call was, here is the result, somebody made a call with this POST data and here is what we sent back. Right now, you can add those to any plumber file you have, but we don't have a default one that's model-aware. That might be good for us to look into.

I see James asked a question over on Slido: JSON and APIs can be foreign to those new to model deployment or those with no web development experience. Do you have any advice or resources to share?

I think there are some really good articles out there. At its core, JSON is just a data structure. People get quite scared of it, but it kind of looks like a data frame if you find a simple JSON file, and it just expands from there. It also helps to contextualize an API as a gateway: it hangs out independently of everything else, but different applications talk to it. We're talking to our API that has a model, and we're getting back our prediction; on the other side, maybe an application has a standardized way to talk to that same API to get predictions or give them to customers in a different scenario. There are a lot of great resources online as well that we can share links to.
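For readers new to JSON, the "it kind of looks like a data frame" point can be shown in a few lines: a JSON array of objects is structurally the same information as a small table. The titles and scores below are made up.

```python
import json

# Each JSON object is one row; each key is one column -- the same
# information as a two-row data frame.
body = '[{"title": "A", "imdb": 8.1}, {"title": "B", "imdb": 5.2}]'
rows = json.loads(body)

# Pivot the row-oriented records into column-oriented lists.
columns = {key: [r[key] for r in rows] for key in rows[0]}
print(columns)  # {'title': ['A', 'B'], 'imdb': [8.1, 5.2]}
```

Row-oriented JSON records like these are what the model API sends and receives; converting between that and a column-oriented data frame is routine plumbing that libraries handle for you.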

Okay, a question on YouTube: hello, does it integrate somehow with MLflow?

If you're thinking about whether to choose MLflow or vetiver, one main difference is that vetiver is more flexible about what you bring to the deployment process, while MLflow — because it has more support for experiment tracking — wants you to enter its ecosystem earlier. However, you can use these pieces somewhat composably: you can take pieces from different ecosystems and get them to work together in the way that's most appropriate for you.

Isabel, do you want to add anything to that? Yeah. I think my favorite part about this question is that open source is composable. You can take the experiment tracking that MLflow has invested a lot of time and effort into, and pair it with vetiver's deployment if you prefer that — maybe it's easier to make predictions from an endpoint, or faster to go from model to API. It pieces together quite nicely, and you can choose the pieces that work best for you and your organization and use them in conjunction with each other.

Something we're really excited about with vetiver is the model-to-API time: it's two lines of code to get a really basic API running. That's something we've spent a lot of time on to make sure it's a great experience, and it's a bit different from MLflow.
