Resources

Isabel Zimmerman | Demystifying MLOps | Posit (2022)

Data scientists have an intuition of what goes into training a machine learning model, but building an MLOps strategy to deploy that model can sound daunting for data science teams. Model services are not one-size-fits-all, so it is imperative to know the range of tools available. One option, Vetiver, is a framework for R and Python created to make model deployment feel like a natural extension of a data scientist’s skill set. This talk offers a high-level overview of the MLOps options available for model operationalization, and also shows a practical example of an end-to-end MLOps deployment of a model-aware REST API using Vetiver.

Session: Updates from the tidymodels team

Oct 24, 2022
18 min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

All right. Hello, everyone. I am Isabel Zimmerman. I am here to demystify machine learning operations. Mostly because machine learning operations are very hard. So infamously hard that in a past life, I was deploying a lot of models, and I found these tools so unfriendly for data scientists that I left that job, came to the company previously known as RStudio to build tools for data scientists to deploy their models. But maybe more important than this fact is I really love baking. I love baking. I love eating chocolate chips by the handful. I find them to be so delicious, especially like the dark chocolate chunks. Those are my favorite. And as many chocolate chips as I eat, it's never as good as the cookie at the end of my baking session.

And machine learning models are a lot like chocolate chips. They're really good on their own, but they're never as good as they could be, and you lose a lot of their value, when they're not deployed as part of some sort of larger system.

What is MLOps?

So what is MLOps? It's a set of practices to deploy and maintain machine learning models in production reliably and efficiently. It's a set of guidelines to help your model live outside of a notebook. So what are some MLOps practices? If we were in the kitchen together, I'd tell you to write down your recipe and remember to put your cookies in the oven and don't just eat the cookie dough and make sure you don't burn your cookies at the end. But in the data science world, this looks a little different. Some practices that you might want to keep in mind are to version your model, to deploy your model and to monitor your model.

If you were at the keynote earlier, this might look pretty familiar, except I like to think my version is a little more delicious. So it's important to realize that MLOps is not a disjoint piece of this data science life cycle, where you build your model, throw it off to your IT folks, give them a thumbs up, and tell them to deal with it. No, this is our job, too. We collect the data. We understand and clean our data with fantastic tools like the tidyverse or data.table. If you use Python like I secretly do, you might use something like siuba or pandas or NumPy. Once you've cleaned and understood your data, it's time to train and evaluate your model. Once again, we have great tools for this, like tidymodels, torch, and caret, and on the Python side, we have scikit-learn, PyTorch, and XGBoost. Great tools. But then it gets a little hazy. You know, what comes next? And that's where vetiver steps in.

So vetiver is offered in both Python and in R, which means you can pip install vetiver or install.packages("vetiver"), and it helps you do these things of versioning, deploying, and monitoring a model.

Versioning your model

So we'll start with versioning. How do we track and manage change? When I bake, I might try out a bit of a new recipe, like throwing in some gluten-free flour, or, oops, I bought the wrong ingredients and I'm going to have to make it work. And I'll write down my changes so I can reproduce it later. But in the data science world, it looks more like this: you save a model. You do your training and your tuning and all of this other fun stuff we've heard so much about, and you end up with your final model. And then you get some new data, or you realize you did something wrong, and you have your, like, actual final model. But this is not over, because models are living objects and there's still more to do, and now we have our actual final model, for real this time. So this lacks context, and it's not scalable.

We can see it gets kind of crazy for one model, so what happens when you have 5 or 10 or 50 or 100? Versioning is useful for tracking changes across time, so you're not naming things model 1, 2, 3, 17. But it's also important if you have different implementations, say a staging server and a production server. If you have multiple models, strong versioning procedures let you take them down, put them up, or roll back to old versions very easily. Really, any time you have multiple models, it's important to have a strong structure around your versioning procedures.
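
The workflow described above can be sketched with a toy registry, a hypothetical stand-in for what pins and vetiver actually provide: every write gets an immutable version id plus some metadata, and rolling back is just pointing at an old version. All names here are made up for illustration.

```python
import hashlib
import pickle
from datetime import datetime, timezone

class ToyModelRegistry:
    """A toy stand-in for a pins-style model board: every write gets
    an immutable version id, and rollback is just a lookup."""

    def __init__(self):
        self._versions = {}   # version id -> (metadata, payload)
        self._active = None   # version currently "in production"

    def write(self, name, model):
        payload = pickle.dumps(model)
        version = hashlib.sha1(payload).hexdigest()[:10]
        self._versions[version] = (
            {"name": name,
             "created": datetime.now(timezone.utc).isoformat(),
             "size_bytes": len(payload)},
            payload,
        )
        self._active = version
        return version

    def read(self, version):
        _, payload = self._versions[version]
        return pickle.loads(payload)

    def rollback(self, version):
        if version not in self._versions:
            raise KeyError(f"no such version: {version}")
        self._active = version

    @property
    def active(self):
        return self._active

registry = ToyModelRegistry()
v1 = registry.write("mpg_model", {"coef": 1.0})
v2 = registry.write("mpg_model", {"coef": 1.3})  # retrained model
registry.rollback(v1)  # new model misbehaving? roll back instantly
```

The point is that "model 1, 2, 3, 17" filenames carry none of this: no timestamps, no sizes, no guaranteed-unique identifiers, and no cheap rollback.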

So in vetiver, we actually have a secret weapon called pins. And pins is, once again, available in R and Python. So you'll load this package and you start with a model board. Boards hold, organize, and create metadata for almost any object, and when you use them in conjunction with vetiver, you get some special features. But before you get to look at those special features, you also have to make a vetiver model, which is a deployable model object. You make a vetiver model from your trained model, and you can give it some other information, like a name, or maybe an input data prototype. That's a little bit of training data, so that when you deploy your model, it knows what to expect from your incoming data. So we have our model board, board, and our vetiver model, v, and then we can call vetiver_pin_write() to write v to our board, and your model is versioned. And it's also really cool because this can live on many different platforms. You can make a board on AWS, maybe, or on RStudio Connect, so you can collaborate and bring these models into memory even when your teammate is pushing them up to your board from a different server.

So this helps with scale, but what about context? If we want to look at the model we just pinned, we can see some very descriptive information: what type of model we created, its size, its hash, its title, and we can also peek into that data prototype to see what this model is expecting, as well as how to run the model with its required packages.

Deploying your model

So we have some infrastructure to version our model, and now it is time to deploy. When we were building out vetiver, we looked at many different ways you can deploy a model, but not all of them are made equal. So we'll start with a few different flavors. One way you can deploy a model is as XML, with PMML. And if I were baking, I would tell you that this is a lot like baking over an open flame. Fires are notoriously portable, but they're not very customizable in temperature. So while XML can be put literally anywhere on the internet, PMML only gives you a few options for model types, and we've seen today that there are so many amazing ways to build a model; we don't want to restrict data scientists. So we knew PMML was not the right place.

We also peeked into, you know, deploying into databases. So this is kind of like baking cookies in a waffle iron. It works really well for some recipes, but others you kind of end up with this weird half-baked goopy mess. So deploying in databases is great for people who have a workflow based around a database, and, you know, you have easy access to it, but this isn't the best case scenario for every team, and we wanted to give as many people an option to deploy their models as possible.

So if you're a baker, you're probably not carrying around matches, and you're probably not carrying around a waffle iron either. You're probably best friends with your oven, best known for its predictable temperatures and easy-to-use interface that make baking a great experience for all skill levels. REST APIs are the same way. They are interfaces that connect to many different types of applications in a very standardized way. Any model that you can write, you can deploy into a REST API. You can interact with it in the browser to debug your API, and you can deploy and run it locally on your computer, on an on-prem server, or on AWS SageMaker. If you're interested in AWS SageMaker, there is a birds-of-a-feather session about vetiver and SageMaker. Or maybe you're interested in deploying to Connect, and there's another talk, called Yes, You Can Use Python with RStudio Team, that shows vetiver inside an amazing integrated application that's all deployed on Connect.

All right, we have one last reason to be excited about APIs. They are really accessible for all skill levels, so maybe your teammate who isn't as modeling-obsessed as present company can still discover and explore and interact with your model, and they don't even have to download R or Python.

This is a lot of hype, so how do we even deploy this model? On the R side, you start up your plumber router, pipe it into vetiver_api(), and plug in our vetiver model v from earlier. On the Python side, once again, you create a VetiverAPI and plug in your model v. And once you have run that, you get prompted with this great visual documentation that lives at your API. It gives you a little bit of metadata, like the name, the version of vetiver you're deploying with, what kind of model you're using, as well as where this API is running, so you can see this is a local server. If you scroll down, there's also a ping endpoint to see if your model is up and running. If you keep scrolling, you can see there's a predict endpoint where you can actually interact with this model, so we see I'm making a prediction, and then I can edit some things and rerun it in the browser. And I was typing slower than I'm talking right now, but we can try, and, voila, our prediction changed. We can see response headers and curl request information to make your IT people happy.
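
To make the shape of a model-aware REST API concrete, here is a minimal sketch using only the Python standard library. This is not vetiver's actual implementation (which builds on FastAPI and plumber); it only mimics the two endpoints described above, a ping for health checks and a predict that accepts JSON rows. The model coefficients are invented for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# A hypothetical "model": predicted mpg falls as vehicle weight rises.
def predict_mpg(wt):
    return 37.3 - 5.3 * wt

class ModelHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Health-check endpoint, like the ping endpoint the talk shows.
        if self.path == "/ping":
            self._reply(200, {"status": "online"})
        else:
            self._reply(404, {"error": "not found"})

    def do_POST(self):
        # Prediction endpoint: accepts JSON rows, returns predictions.
        if self.path != "/predict":
            self._reply(404, {"error": "not found"})
            return
        length = int(self.headers["Content-Length"])
        rows = json.loads(self.rfile.read(length))
        preds = [predict_mpg(row["wt"]) for row in rows]
        self._reply(200, {"predict": preds})

    def _reply(self, code, body):
        payload = json.dumps(body).encode()
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Serve on a free local port in a background thread, then call it.
server = HTTPServer(("127.0.0.1", 0), ModelHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_port}"

req = urllib.request.Request(
    f"{base}/predict",
    data=json.dumps([{"wt": 2.62}, {"wt": 3.44}]).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    predictions = json.loads(resp.read())["predict"]
server.shutdown()
```

Because the interface is just HTTP and JSON, any client, a browser, curl, or another notebook, can consume the model the same standardized way.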

But if you're not interested in looking at this endpoint too much, and you just want to interact with a model that somebody else has deployed, or that you have deployed, you can also just stay inside the notebook you're already in. You use predict, just like with all of these tidymodels packages, so it's an expected function; give it the endpoint where your model is running, and the data that you'd like to predict with. Here we can see I'm doing batch predictions on mtcars, so you don't have to leave your computational environment if you don't want to.

Monitoring your model

So, we have versioned our model, we've deployed our model, and our cookies are baking in our API oven, but we have to keep an eye on them so they don't burn. It's important to monitor a few different things. The first is data drift: does your data look the same today as it did two months ago? You also want to monitor for model drift, which is when your model's performance metrics start to decay, and this is so, so important to track. Models fail silently: a model will continue running without error even if, you know, its accuracy is zero. If you're not monitoring your model in some way, you are oblivious to model decay in production.
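
A crude data-drift check can be sketched in a few lines: compare today's feature distribution against the one the model was trained on, and flag it when the mean has moved far more than sampling noise would explain. This is a hypothetical illustration, not vetiver's method; real monitoring would use richer statistics and more data.

```python
from statistics import mean, stdev

def mean_shift_zscore(baseline, current):
    """Crude drift signal: how many baseline standard errors the
    current mean has moved away from the baseline mean."""
    standard_error = stdev(baseline) / len(baseline) ** 0.5
    return abs(mean(current) - mean(baseline)) / standard_error

# Hypothetical feature values: vehicle weights seen at training time
# versus the weights streaming in today.
training_wt = [2.6, 3.2, 2.9, 3.4, 3.1, 2.8, 3.0, 3.3]
todays_wt   = [4.1, 4.4, 3.9, 4.6, 4.2, 4.0, 4.5, 4.3]

drift_score = mean_shift_zscore(training_wt, todays_wt)
drifted = drift_score > 3.0  # flag a shift beyond ~3 standard errors
```

Here today's weights sit well outside the training distribution, so the check fires; a model scoring this data silently extrapolates unless something like this is watching.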

A good example of this is, you know, I listened to way more Jonas Brothers in 2012 than I do now, but my Spotify algorithm doesn't still suggest that I listen to Jonas Brothers because my tastes have adapted. If this, you know, recommendation algorithm hadn't kept up with my changing choices, they would have lost a customer.

Finally, it's important to know what to do when things are going wrong. If your model is declining, the answer might be retraining with new data, and this works in a lot of use cases, or you might need to try out a new model type altogether. But it all goes back to this versioning: if you have strongly versioned models, and your API is connected to a version, it makes it so much easier to take down and put up new models without much pain on your end.

So, in vetiver, we are going to start by computing metrics. This takes a data frame that has a few different columns. We want to give it the date column, which is the date the prediction was made. We also want to give the time frame we're aggregating over; for this one, we're looking at one week, so maybe this model has been running for years and years and years, but we want to look at it one week at a time. Then we want to give the actual miles per gallon, the truth value, and we want to give the predicted value as well. You can send in different metric sets. R does this secretly behind the scenes for you, but you can also customize them. On the Python side, you have to write a little bit of "I want to use RMSE" and send it in as a list.
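
The aggregation described above, bucket predictions into fixed windows by the date they were made, then score each window, can be sketched with the standard library. This is a simplified illustration in the spirit of vetiver's metric computation, not its actual code; the prediction log values are invented.

```python
import math
from datetime import date, timedelta

# Hypothetical prediction log: (date prediction was made, truth, estimate).
log = [
    (date(2022, 10, 3), 21.0, 20.1),
    (date(2022, 10, 5), 22.8, 23.5),
    (date(2022, 10, 10), 21.4, 19.0),
    (date(2022, 10, 12), 18.7, 16.0),
]

def windowed_rmse(log, period=timedelta(weeks=1)):
    """Bucket predictions into fixed windows and compute RMSE per window."""
    start = min(d for d, _, _ in log)
    buckets = {}
    for d, truth, estimate in log:
        window = start + period * ((d - start) // period)
        buckets.setdefault(window, []).append((truth, estimate))
    return {
        window: math.sqrt(sum((t - e) ** 2 for t, e in pairs) / len(pairs))
        for window, pairs in sorted(buckets.items())
    }

metrics = windowed_rmse(log)
```

With this toy log, the second week's RMSE is noticeably worse than the first's, which is exactly the kind of trend a running log of metrics is meant to surface.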

So, we've computed our metrics, and then we want to pin our metrics. This is, once again, very important: having a running log of your model's performance. With vetiver_pin_metrics(), you pass in the metrics you've just created, you give it a name for your pin, and, if you have overlapping dates, you can choose to overwrite them and let vetiver do all the heavy lifting of dealing with date columns. Finally, this is the beauty of it all: you get an out-of-the-box function to plot your metrics.

Best practices and wrap-up

So, putting it all together, some best practices for deployment are versioning your model, deploying your model, and monitoring your model. But there are more best practices to keep in mind. One of them is model cards. These are partially automated R Markdown templates that come inside the vetiver package, where you can help other people on your team better understand your model. You can document the environmental, technical, or ethical factors that you thought so hard about when you were developing your model.

There are also things to help with data validation. There's something called an input data prototype, or ptype, in vetiver that will validate your data at prediction time, at the API level. Because sometimes the world messes up your data, and as it streams into your API, it does not look the way your model thought it was going to. And vetiver is able to give you helpful error messages to make sure you can identify these points of failure very easily.
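
The idea of prototype-based validation can be sketched as follows: capture the expected field names and types from training data, then check every incoming row against them and fail with an error that names the exact point of failure. The prototype and field names here are hypothetical, and this is a simplified stand-in for what a vetiver API does.

```python
# Hypothetical input prototype: field name -> expected type,
# captured from a slice of the training data.
PROTOTYPE = {"cyl": int, "wt": float, "hp": int}

def validate(row, prototype=PROTOTYPE):
    """Check one incoming row against the prototype, raising a
    helpful error that names the exact field that failed."""
    missing = sorted(prototype.keys() - row.keys())
    if missing:
        raise ValueError(f"missing fields: {missing}")
    for field, expected in prototype.items():
        if not isinstance(row[field], expected):
            raise TypeError(
                f"field {field!r}: expected {expected.__name__}, "
                f"got {type(row[field]).__name__}"
            )
    return True

ok = validate({"cyl": 6, "wt": 2.62, "hp": 110})   # well-formed row
try:
    # The world messed up wt: it arrived as a string, not a number.
    validate({"cyl": 6, "wt": "2.62", "hp": 110})
    error = None
except TypeError as exc:
    error = str(exc)
```

Failing loudly at the API boundary like this is much friendlier than letting a malformed value flow into the model and produce a confusing error, or worse, a silent wrong prediction.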

And there are many more best practices to keep in mind. But our cookies are done baking, so it must be the end of our time. And I challenge you all to think about what you can do with your own data science workflow. Where are you or your team today, and where can you add in MLOps best practices to make it a little better? I also encourage you to check out vetiver in Python or R. And if you have any questions, Julia Silge and I and the rest of the vetiver team and tidymodels team will be right outside after my talk. Thank you.

Q&A

Thank you, Isabel. We have time for a few questions. So, you said briefly, but how well does vetiver play with RStudio or Posit Connect? So well. There are really easy one-liners, like vetiver_deploy_rsconnect(), that will ship your model off into the Connect world.

Great. Can you deploy multiple vetiver models to different routes on the same API server? It depends on what you're doing; there are a few different ways to approach this question. One, you can have multiple different APIs running at once, which might be a better solution than trying to serve multiple models from the same API at different endpoints. You can also write custom endpoints. So maybe you want your model at one level, and then some preprocessing, or, you know, multiplying by 100 to get a percentage or something like that. That is also part of vetiver.

Great. Does vetiver work with any Python packages besides scikit-learn and Torch? So, currently those are our two supported packages for deploying models out of the box. But you can deploy any model because we have kind of an escape hatch of a custom handler. So, if you know how your model needs to make predictions, there's documentation on the Python side. I think it's like advanced usage custom handlers. Click on that and it will give you all the information you need.

One last quick question. What is the backend for deploying a REST API on the Python side? That's FastAPI. I'm a big fan. Okay. Thank you, Isabel. Thank you all.