
SatRdays London 2023: Julia Silge - What is "production" anyway? MLOps for the curious
What is "production" anyway? MLOps for the curious by Julia Silge (Posit) at the SatRdays London 2023 conference, hosted by Jumping Rivers! Abstract: Many data scientists understand what goes into training a machine learning or statistical model, but creating a strategy to deploy and maintain that model can be daunting. You may have even heard that R is not appropriate for production use. In this talk, learn what the practice of machine learning operations (MLOps) is, what principles can be used to create a practical MLOps strategy, what people mean when they say “production”, and what kinds of tasks and components are involved. See how to get started with vetiver, a framework for MLOps tasks in R (and Python) that provides fluent tooling to version, deploy, and monitor your models. This event was sponsored by: - CUSP London - Jumping Rivers - Posit - R Consortium
Transcript
This transcript was generated automatically and may contain errors.
I'm going to go over here, try that, and I might take the clicker. Well, thank you so much for that introduction, and thank you to Jumping Rivers for organizing this. I have been to a couple of these SatRdays conferences, and what I love about them is the variety of people in the community you get to hear from, so I'm really happy to be here participating in a SatRdays conference.
I am hoping that together we can ask this question: what is production? What does this mean? As people who work with R, I bet you have heard that R is no good for production, or you may have run into the perception that there's a lack of production knowledge or experience in our community, that people think we can't do it. So we're going to ask and answer this question, and talk about what MLOps is.
I want to tell you a little about me and who I am, so you can understand the perspective I bring to asking and answering these kinds of questions. My academic background is in physics and astronomy. I moved around in my career and eventually landed in data science. I worked as a data scientist in tech at a few organizations, and now I work on open source software full-time.
If you think about that path, one thing I want to highlight is that I spent a lot of that time writing code, but I was always writing code to answer a question, to ask a scientific question, to do a scientific analysis. Like many of you, I don't have a formal computer science degree; that's not my background or my training. And if we think about who you all are in this room: a lot of you have titles like data scientist, I bet a lot of you are statisticians by background, maybe some of you are data analysts. You write code, but you probably don't primarily think of yourself as a software developer.
As of today, my title is actually software engineer, and there may be people here who have a title like research software engineer or plain software engineer. But a lot of us who are here because we like to use R for our data practice probably think of our identity primarily from the data point of view, and less so from the software engineering point of view.
The case for model developers owning deployment
What that means is that you are probably someone who has developed a model at some point. You know how to use code to do exploratory data analysis and to make plots, and you have probably trained a model at some point. If you take one big takeaway from this talk, it's that if you are someone who has developed a model, then you can be the person to operationalize that model.
The other thing I really want to call out and communicate is that if you are someone who knows how to develop a model, if you are the person in your org who builds, trains, and develops models, then you are the right person to do the operationalizing of that model. If you have spent time on EDA for that data, if you have chosen which kind of model to use, if you've spent time tuning and evaluating that model, then you are the one with the context, the domain expertise in the data and in the model. You know the caveats about that model. You know what makes it more or less appropriate in different situations.
If you have a dynamic in a company where one person trains the models and then kicks them over a wall to someone else, maybe to be rewritten in another language, that leads to inappropriate use of model predictions. If you, as a model developer, a model practitioner, can add these tools to your toolkit, then we move towards more reliable, more appropriately used, and fairer use of models.
A concrete example: house price prediction
We're going to look at a concrete example throughout, just to have something to hold in our heads. The data set is a house price data set: houses in Seattle in the U.S. We have the price each house sold at and information about the house, like the number of bedrooms, the number of bathrooms, how big it is, and when it was built. There's also a date of sale. You can see it's a bit of an older data set at this point; I don't think you can buy anything in Seattle for these prices anymore.
So you've got this data. Picture yourself starting with it. You start on the process of exploratory data analysis, then you start on the process of building a model, and this would be your output. This uses tidymodels right here, but you may use some other modeling framework in R, or maybe you're someone who likes to do EDA in R and then build a model in Python. Either way, think about having a trained, fitted model that you feel good about and that's ready to go.
Is your job done? In which cases is your job done, and in which is it not? In my experience, most of the time when you get to this point where the model is trained, there's a forking path for what you do next. You might need to communicate about the model: say you train a model and learn something you will tell people about, so maybe you write a report, or maybe what you learned is used to make some kind of decision. That's the communication path.
The other path is that the reason you trained the model was so that you could generate predictions on new data. A new house in Seattle comes along with a certain number of bedrooms and bathrooms, and I need to predict the price for that new house. If that is the goal, if the main purpose of training the model is predictive, then you are not done when you get to this point. Instead, it is time to think about how to put that model into production, how to deploy that model.
What is MLOps?
So we say, okay, I built this model for predictive purposes, I need to do something MLOps-y. What is it that I do? Let's talk about what people mean when they say MLOps. Maybe you go on LinkedIn and you see a lot of things about MLOps, maybe mixed in with the ChatGPT stuff right now, and you say: okay, what is MLOps? Is MLOps this sea of tools and startups that say they will do MLOps for you for money? It's just overwhelming.
Is MLOps just a word that is full of hype and does not mean anything? I really don't like things that are poorly defined or full of hype, so I'm going to say no, that is not what MLOps is. Instead, let's write out a working definition: MLOps is a set of practices to deploy and maintain machine learning models in production reliably and efficiently. MLOps is not about any specific tool, including the tool I've been working on that I'm going to talk about today. MLOps is about the set of practices we use when we have a model and we need to go down the deployment path, the production path, rather than the communication path.
The model life cycle
Let's use a different kind of visual for this process and think about a model life cycle, a cycle of modeling, if you will. We start by collecting data at the top, and as I've said, the first thing we do after collecting the data is exploratory data analysis. We understand the data and prepare it for the modeling process, and here there are great tools. I'd guess a lot of us in here, since we're largely R users, like to use the tidyverse for this; maybe you like to use data.table.
There are tools for the same kinds of tasks available in Python. Notice that these are open source tools that are widely adopted. There may be differences of opinion about exactly which tool is best, but all of them have robust user bases and books written about them. The same is largely true when it's time to train and evaluate a model, whether you use something like tidymodels or caret, or build your models in Python. Again, there are robust open source tools you can read whole books about.
On the right side of this cycle, you as a data scientist or modeling practitioner have options: tools that you probably love, that you're familiar with, that feel comfortable to you. This becomes much less true when we get over to the left side of the cycle. People might not even be sure what the tasks on that left side are.
Introducing Vetiver
I've been working on a new-ish project called vetiver, and vetiver sits on the left side of this cycle, where you already have a trained model. You trained your model in the way you decided was best, using the tools you are comfortable with and that you love, and now it is time to walk through the next tasks. This is what we're talking about as the MLOps set of tasks, and we'll dig into it just a smidge more: we need to version our model, we need to deploy our model, and we need to monitor the model that we have.
You may see this word vetiver and think, wait, I feel like I've seen that word somewhere. Vetiver, also known as the oil of tranquility, is a plant used in fancy candles and perfumery, a very good-smelling plant that serves as a stabilizing ingredient in things like perfume and candles. The metaphor here is that the volatile fragrances that are important, that people love, are your models, and vetiver is the stabilizing ingredient that helps you feel tranquil about your deployed models.
Let me show you our code for how to do this. Think back to that model predicting house prices. What we're doing here is creating a deployable model object. When we create this at the end of a modeling process or workflow, we can capture a lot of information that turns out to be very important later. You can see some of what this deployable model object contains in what gets printed out here. I don't have to tell my model bundle creation process any of this, because at the time of model training we have a lot of information about the model: we know that in this particular case we used ranger, that we're predicting a numeric value, that it's a regression, and we know the number and names of the features. We can capture all of that and store it as metadata on the model.
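As a sketch of what this step might look like in R (the fitted workflow, the `housing_train` data, and the variable names here are hypothetical stand-ins, not the exact code from the talk's slides):

```r
library(tidymodels)
library(vetiver)

# Hypothetical fitted workflow for the Seattle house price example:
# a ranger random forest predicting sale price from house features
rf_fit <-
  workflow(
    price ~ bedrooms + bathrooms + sqft_living + yr_built,
    rand_forest(mode = "regression", engine = "ranger")
  ) |>
  fit(data = housing_train)

# Wrap the fitted model as a deployable vetiver object; vetiver records
# the model type, the required packages, and a prototype of the input
# data as metadata, at the moment all that information is at hand
v <- vetiver_model(rf_fit, model_name = "seattle-housing")
v
```

Printing `v` shows the model class and the number and names of features it expects, which is the metadata being described here.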
Vetiver is specifically designed to carry metadata that supports you, or protects you, against common failure modes for deployed models. This is all found pretty much automatically, because it turns out that when you are in the process of training the model, you have all this information; we just need a little bit of support in getting it recorded in a way that is useful.
A really fundamental design goal for vetiver is to set things up so it is easy to do the right thing, to protect against common failure modes. When I talk to people who deploy models, one of the most common failure modes is this: the model gets put into production, new data comes in, and something changes about the new data. Maybe it's an error somewhere, or maybe some other team changed it on purpose. The best-case scenario in that situation is that your model errors and you know what happened. But that's not always what happens; models are very capable of just chugging along and continuing to generate predictions when the input data is now entirely wrong. It is not unheard of for this to happen, and it's the worst-case scenario, because your model is generating nonsense and you have no idea.
Another really common struggle people have when putting their models into production is tracking, and then communicating to other teams, what exactly the model needs to generate new predictions. What are the software dependencies of the model? Making it easy to do the right thing is not just about what it takes to actually put the model into production; it's also about how we document models.
In vetiver, we have a template for writing a model card, an idea taken from the paper "Model Cards for Model Reporting". It's a template, available in either R or Python, with all these different sections. Some sections can be automated, because we know some things about the model, like what metrics were used to tune it and what type of model it is. But some sections of this model reporting framework cannot be automated: they take input from you as a model developer, the process of you thinking and writing things out. We want to make it easy to do the right thing by providing both software that does the right thing and opportunities for documenting.
Versioning models
Okay, let's talk about these different tasks. First, versioning. Much like many of you have probably adopted version control practices for your code, we need some approach to managing versions of models. This starts to become important as the number of models you're dealing with grows, or when you're retraining models frequently enough that you need a process that is, for models, the equivalent of moving from emailing co-workers files named underscore-final, underscore-final-final, to using Git.
We need to manage change in models well, and we need to deploy the model. In vetiver, we adopt practices around using REST APIs for models. There are different ways to deploy a model, to put a model into production, but the machine learning community, the MLOps community, has largely identified REST APIs as the good option for the big middle of models in production. There are of course other ways to do this; if you are in a very simple situation, or a very high-scale one, you may need something that's not a REST API. But certainly if you're getting started, this is where to go.
The other big category of MLOps tasks is monitoring the model once it is deployed and in production. Is the model performing the way you expected it to? Is it time to retrain the model with new data? And at what point do you need to find out whether it's appropriate to start over from scratch with the whole model development process and make essentially a new model?
Let's dig into these a little more and look at some code for how you would do this. When we version the model, we use a metaphor, an abstraction, that comes from the pins package. The metaphor is that you have a board. Here I'm showing a board that uses Posit Connect, one of Posit's pro products, but instead of Posit Connect this could say board_s3 for an S3 bucket, or board_folder if you're using a network drive. Depending on what your organization uses for storage, the idea is that the workflow stays the same; you just change out what the board is. And the metaphor is that I come along and pin the model to my board, and say: okay, here it is.
Think of pins as a fairly lightweight storage, versioning, and sharing system. If I retrain the same model on different data, I can pin it to the board too, and then both versions are there; I can get to both if I need to, but one of them is the latest, and that latest one can be the default one I get.
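A minimal sketch of that pins workflow, assuming `v` is the deployable vetiver model object created earlier (here `board_temp()` stands in for whatever board your organization would really use):

```r
library(pins)
library(vetiver)

# Swap in board_connect(), board_s3("my-bucket"), or
# board_folder("/shared/models") depending on where you keep things;
# the rest of the workflow stays the same
board <- board_temp(versioned = TRUE)

# Pin the model to the board; retraining and pinning again under the
# same name adds a new version rather than overwriting the old one
vetiver_pin_write(board, v)

# List the stored versions, and read back the latest by default
pin_versions(board, "seattle-housing")
v_latest <- vetiver_pin_read(board, "seattle-housing")
```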
Deploying models with REST APIs
This is on our Connect demo server, and we have things set up so we can compare exactly the contents of what we have here. A lot of this information is generated automatically, so the model is stored there with its metadata. Notice it is a binary blob; if you're going to use REST APIs, binary model objects are the easiest way to go for the vast majority of use cases.
Once we have that, we can create a REST API. Here's the code you would write to set up a REST API running locally, say on your laptop or whatever your development environment is. One thing we notice when we talk to people trying to get started with MLOps is that many MLOps tools focus so much on the production environment that it's really hard to use them locally. And you have to use them locally, because you have to debug problems, set things up, and do dry runs. So as we thought about the common problems of the people we were talking to, we worked on making sure the local experience of developing, debugging, and solving problems feels fluent when you move from local to production.
If we were to run this, piping it to pr_run(), then in the RStudio IDE it pops up in a window; if you like to use something like Visual Studio Code, it will give you a URL that you can paste into a browser. Then you can see what the HTML interface to the REST API looks like.
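The local API setup being described looks roughly like this, assuming `v` is the vetiver model object from earlier (the port number is arbitrary):

```r
library(plumber)
library(vetiver)

# Serve the model as a REST API on your own machine for development
# and debugging; a /predict endpoint and the visual documentation
# are generated automatically from the model's metadata
pr() |>
  vetiver_api(v) |>
  pr_run(port = 8088)
```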
Okay, that sounds great, but I want to actually deploy the model into some new kind of computational environment. What does it mean to put something into production? The easiest way for me to think about it is that we develop a model in one computational environment. Think of this as maybe your laptop, or maybe a server environment you work on, but it's one place, and the software you need installed there is about tuning and training. Putting something into production means getting it out of that computational environment: lifting it out and successfully carrying it over to a new one. For many people, this might be a cloud computing environment, or some kind of server your organization has. We need to lift it over and have it working successfully there.
In vetiver, we have two main approaches for how you do this and where it works. The first is the pro products that Posit has. Here it's literally one function, because it is our own product, so we can make sure it works really well. This function actually deploys the model: after you run it, there is literally an API on your server that will generate predictions.
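That one function is `vetiver_deploy_rsconnect()`; a hedged sketch, where `board` is the pins board from earlier and the account name in the pin is a placeholder:

```r
library(vetiver)

# Deploy the pinned model straight to Posit Connect; after this runs,
# a live prediction endpoint exists on the server
vetiver_deploy_rsconnect(board, "user.name/seattle-housing")
```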
But vetiver is open source software; it is not only for Posit customers. For working in a different way, we make it as easy as possible to generate a Docker container that can serve your model. Notice this function has a different verb: vetiver prepare Docker. What it does is generate a Dockerfile, a renv.lock file, and an app file, a plumber file in this case, specially tailored to your model. You don't want to be known as the data scientist who makes enormous Docker containers when they don't need to be enormous, so this is tailored to include just what your model needs to run. You get this bundle of artifacts, you use Docker to build it, and then you have a container that can go wherever it needs to go. Say you use AWS: you can put that container in an ECR registry and then serve it in any of the five or so ways AWS has to serve Docker containers.
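A sketch of the Docker path, again with `board` and the pin name as placeholders:

```r
library(vetiver)

# Writes a Dockerfile, a renv lockfile, and a plumber app file into the
# working directory, tailored to just this model's dependencies
vetiver_prepare_docker(board, "user.name/seattle-housing")
```

From there, something like `docker build -t seattle-housing .` builds a container you can push to a registry such as ECR and serve however your platform serves containers.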
One way we like to think about this is that people at the beginning of their model deployment or MLOps journey have options, some of which are super fluent, like Posit's pro products, which are really built for people from the data science persona, for whom that's the main toolkit. But we also give people opportunities to go anywhere else they need to with these models, and support for growing, moving to bigger scale, or whatever else they need to do.
Let me show you what one of these looks like. This is a model deployed on our Connect demo server, and you may be thinking, interesting, you're clicking around here. I think it's waking back up; it had fallen asleep. There it goes.
This is not something built primarily for human users, the way a Shiny app is. What this is, is visual, interactive documentation for the REST API. I can go to this GET endpoint, and it's fine if you don't know a lot about APIs; think of it as two computers talking to each other, and I say, I need to get something. If I go to the get-pin-URL endpoint, I get exactly where the binary, versioned, metadata-rich model object lives. This is all self-documenting, all automatically created from the information you had about the model when you trained it, and I can actually interact with it.
The purpose of showing this is not that it's the absolute best way for a person to interact with a model, but rather that this is how your model gets documented: how to interact with your model. Say you're working with a software engineer collaborator, and you need to tell them how to make an API call to this API you made for your model. You can say: look, here are exactly the curl commands you would need; let's make a request, POST some new data for new observations, and get back the response. What this is about is being a good collaborator, showing that you are the person who can take the model the last mile and get it into the hands of the people who need to interact with it.
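For an R-using collaborator, calling the deployed API can also be done directly from R; a sketch in which the endpoint URL and the new observation are hypothetical:

```r
library(vetiver)

# Point at the /predict endpoint of the deployed model
endpoint <- vetiver_endpoint(
  "https://connect.example.com/seattle-housing/predict"
)

# New observations must match the feature prototype the model expects
new_house <- data.frame(
  bedrooms = 3, bathrooms = 2, sqft_living = 1800, yr_built = 1964
)

# Sends a POST request with the new data and returns the predictions
predict(endpoint, new_house)
```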
Monitoring deployed models
Okay, so we talked about versioning, and we talked about deploying, putting into production, lifting the model and getting it somewhere else. Now let's talk about monitoring. Vetiver has functions and code to help you get set up, starting with default metrics, but you can use custom metrics too. One thing we found when we talked to people working on monitoring problems is that monitoring is often very specific to people's business problems; it's not uncommon for people to want to monitor something that isn't, say, RMSE, but rather something related to a KPI in their organization. This really highlighted for us that a code-first approach to model monitoring is almost required; it's basically what we have to offer people. So there again is a template.
Vetiver generates code for you based on your own model, but it is code, so you take it and go with it. If you've heard discussions lately about how LLMs can make you faster, because you start with something and then edit it, that's the kind of mental model here, although to be clear there are no LLMs involved: it gives you generated code that runs and works, and shows you things. Say you have that feedback loop where you get true values and can compute some kind of statistical metric; or maybe you don't, and what you monitor is just the input data, the statistical properties of the predictors going into the model. Of course we want to be able to show this to our coworkers so they know how these things work, but this is all code that is generated and that you have access to, so you can customize it in the way that is appropriate to your work.
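The generated monitoring code is built around functions like these; a sketch assuming `new_results` is a hypothetical data frame of dated predictions (`.pred`) joined with the true sale prices, and `board` is the board from earlier:

```r
library(vetiver)
library(yardstick)

# Compute regression metrics over monthly windows of new data;
# swap in a custom yardstick metric (e.g. one tied to a KPI) as needed
metrics <- vetiver_compute_metrics(
  new_results,
  date_var = date, period = "month",
  truth = price, estimate = .pred,
  metric_set = metric_set(rmse, mae)
)

# Version the metrics on the same board, then visualize them over time
vetiver_pin_metrics(board, metrics, "seattle-housing-metrics")
vetiver_plot_metrics(metrics)
```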
Who Vetiver is built for
Vetiver is designed to be a good option for people who are just getting started with MLOps. It is designed with a user persona top of mind: someone who has never deployed a model before, whom we want to enable to deploy their first model. We want the barrier to entry to be low and the learning curve to start out shallow. At the same time, vetiver is not meant to be a toy project or a tool only for beginners. We want to give people a tool with great defaults that is easy to get started with, but that can meet more complicated needs, whatever that might mean in your case: maybe high compliance needs, maybe high scale, maybe a lot of different tiny models. Whatever is specific about your use case, it's important to choose a tool that can scale with you and your org as they grow. This is what vetiver is built to do, and it makes vetiver somewhat unique compared to other MLOps tools, which might focus much more on high scale and on a software engineer persona as the person deploying the models. We really do think that if you develop models, you can be the one to deploy them, and your skills can grow as your needs change.
I'll remind us here that the things vetiver does are to version, deploy, and monitor. This also makes vetiver a little unique, because many other tools sit in slightly different places. Some of them are more interested in being involved in the model training process, say in hyperparameter tuning, and with some frameworks you can't actually deploy a model if you didn't use that framework to train it. But a lot of people don't want to use that tool to train their model, so this puts vetiver in kind of a unique place. The other thing I think is unique about vetiver is that it was built from the ground up for R and Python models, so there's an R package and a Python package. A lot of the tools out there are almost unusable for R, and because of the way they were designed, they sometimes don't give a great experience for Python people either.
I don't know what language we'll be using in 10 or 20 years, but the design choices underlying vetiver use technologies I would be willing to bet on. We're talking about things like carefully thinking about smart blob storage; I'm pretty sure we'll still be doing that. I bet REST APIs will still be here, and HTML and dashboards are here to stay. So I think vetiver can be a good option to learn, because it makes your skills applicable in many settings.
If you're interested: I know this is an R conference, and I pretty much run Python code rather than writing a lot of it, but if you are someone who moves back and forth a little, this is one of our documentation sites that shows the functions you use to do the same kinds of tasks in R and in Python. It has been really interesting to work on a project where we focus on tasks and then build support and functions for people to approach those tasks in ways that feel comfortable. We don't want to write a Python package that no Python people actually want to use, because we have had that experience ourselves as R users. It's been a really interesting project for that reason.
So again, here's what MLOps is. I'll focus on this idea that you deploy and you maintain, and that there are things you need to do to get your model into a good place. One last thing: maybe some of you are sitting here thinking, well, this all sounds totally disconnected from the work that I do. Maybe you largely do data analysis or statistical analysis, or maybe you spend all your time writing Shiny apps, and you think: why should I care at all about MLOps or what it is?
I think the first reason it would be smart for you to learn a little about MLOps is to learn how your work can be lifted and moved: can you start to build some of those muscles around what it means to put your work into production, to deploy your work? And the other thing is that when you are the one who can take your work the last mile, that's how you can really scale the impact of your work in your organization.
You have probably seen this URL at the bottom already; the slides are available there, and you can learn more from the recommended resources, so feel free to go and click through those. And with that, I will say thank you for having me. Thank you so much, and maybe we have some time for questions.

