
Building Governable ML Models with R (Tom Shafer, Elder Research) | posit::conf(2025)
Speaker(s): Tom Shafer

Abstract: For a model to provide value in production, it must be fit for purpose, deployable, and maintainable over time. We know that R provides a host of tools and packages for building good models, but the language and ecosystem also provide tools to help us build these kinds of maintainable production systems. This talk will present techniques, adapted from software engineering, that provide a stable foundation for building models and writing all the accompanying code that's often needed to train, test, and update models over time. Those attending this talk will learn how, by centering model development on packages, writing tests, creating intuitive S3 methods, and more, we can build modularized, testable code that makes our models easier to monitor and update over time.

Materials: https://elderresearch.github.io/posit-conf-2025/index.html
Blog Post: https://tshafer.com/blog/2025/11/recap-post-conf-2025
Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
Transcript
This transcript was generated automatically and may contain errors.
Congratulations! Our model's in production. The session is over. Now what do we do? Back when we started this project, the goal was to build a model that provided some kind of value for our company, for the public sector, for the world in general, and so we've made a good start now that it's in production. Supposedly some enormous number of analytics projects never get off the ground at all, so okay, we're doing okay. But in practice, deployment isn't actually the end of putting something into production; it's just the beginning. It's the first gate.
Really what we need to do is keep this in production for an arbitrarily long time in order for it to actually effect the changes that we want to happen. Typically, all of the stuff that happens after deployment falls under the heading of model governance or something like this, which is, you know, how do we keep this thing running when things change? Because a lot of stuff does change. Companies are going to change out from under us. Sometimes production systems change; we move from one system to another. Or maybe our model is amazing, and our business customer wants us to change it: they want us to add new features, or they want us to retrain for some new use case. Anyway, we're going to have to retrain the model, and so we're going to have to work with it again in the future.
Now, a lot of times when we talk about model governance, we think about the model object itself. We're thinking about versioning, serving, monitoring for drift, these kinds of things. But in practice, a lot of what we actually end up doing involves sort of all the scaffolding around the model. This is how the model is trained, how the model is validated. If we're responsible for writing the code for inference, for working with predictions, there's a lot of code that gets involved there, too. And so this talk is coming from experiences that I've had over the last few years putting models into production, trying to keep them there, and sort of how to think about model governance in this broader context, including all of the scaffolding code.
And so I want to center the talk on just this question: what can we do now, while we're building the model and designing how it's going to work, to make our maintenance job easier later? And in practice, I think we can distill it down to a few core practices, core principles that serve as a foundation for this. Let's see if you're ready for this list. These are things like packaging, documentation, testing, writing legible code. Take it in. I know what you're thinking: wow, this is going to change my life. No one has ever told me before that I should document my code. But the trick here is that in the governance context, these go from nice things that we should maybe do to things that stack on top of each other to produce a model, or a model system, that we can then maintain over time, one that keeps providing value even as we have to retrain it, adapt it, and change it.
Packaging as a foundation
And this talk is about R. This is the context in which I was working on the project that inspired this talk. But a lot of this stuff is just as applicable to Python as anything else. I write a ton of Python code and a ton of R code, so I'd be happy to talk afterward about ways this applies in Python the same way it does in R. It's just that all the examples are going to be in R.
So let's start with packaging. And I start here because this is foundational to everything else. And to be clear, I don't mean packaging like we're going to release this on CRAN. Most of what I do is proprietary. I work in consulting. So I work with a client. It stays in their environment. It stays in their version control, right? We're not putting this anywhere. But packaging is still tremendously useful because this provides a structure to support things like automation and lots of the other sort of governance and maintenance details. I try to do this with almost every analytics project I do of reasonable complexity for this reason.
And it does a couple of things. First, packages provide structure. Literally, they provide a file structure: our metadata goes here, our R files go here, our tests go here, our documentation goes here. At the very least, this helps avoid the thing you'll get in teams, and in academia, where I've come from, where you have scripts just kind of everywhere in a folder, and that's our production model. This helps avoid that. But it also does a lot of other things. It structures your dependencies: if your model depends on other packages, it helps declare and maintain those, so as the world changes, we can redeploy and rebuild with some amount of confidence.
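As a rough sketch of what that structure looks like in practice (the package name "classifier" matches the talk's fictional example; the file and dependency names here are illustrative), the usethis package can generate all of it:

```r
# Scaffold a package named "classifier" (the talk's fictional example).
# usethis lays out the standard structure: DESCRIPTION, NAMESPACE, R/, tests/.
library(usethis)

create_package("classifier")  # new package skeleton in ./classifier
use_r("train")                # adds R/train.R for the training code
use_testthat()                # sets up tests/testthat/
use_test("train")             # adds tests/testthat/test-train.R
use_package("glmnet")         # declares a dependency in DESCRIPTION
```

Each of these calls also prints what it did, so you can walk through the setup step by step.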
The other thing it tracks is your code version, and this is important in at least two cases in governance. One, I don't just want to know the model version that we're running; I would really like to know what version of my code base trained that model. Or, if I'm responsible for inference, like I am in the project I was referring to, I also want to be able to capture what version of the inference code was used. There are other ways you can do this, but packaging just puts the number in one place, and it propagates throughout as you use your package.
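One way to sketch this version-stamping idea (the function and field names are hypothetical; in a real package you would look up your own package's version, not "stats", which is used here only so the sketch runs as-is):

```r
# Stamp the code version onto the trained model object so we can later tell
# exactly which version of the code base produced this model.
train_model <- function(data) {
  fit <- lm(Sepal.Length ~ Sepal.Width, data = data)  # stand-in model
  list(
    fit = fit,
    # In a real package this would be packageVersion("classifier");
    # "stats" stands in so the example is runnable outside a package.
    code_version = as.character(utils::packageVersion("stats"))
  )
}

model <- train_model(iris)
model$code_version  # the version of the code that trained this model
```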
So it provides a structure. The other thing it does is support a bunch of useful automations using other really cool packages you've maybe used before, like devtools. This supports things like auto-generating package documentation by just calling document(), or running all of your tests in a package using test(). It also supports some more holistic kinds of things, like check(), which runs R CMD check and a whole bunch of other stuff that will basically verify that your package is loadable and usable by all of your potential downstream users. And you can bundle this into something like continuous integration, a GitHub workflow or something, so every time I make any meaningful change to my package, this all happens automatically. People don't have to think about it, which is sort of the point: I want to be thinking about the code, not about all of this other governance stuff. I don't have to.
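Those devtools calls, run from the package directory, look like this (a workflow sketch, not something to run outside a package):

```r
# The devtools automation loop: run these from the package's root directory.
library(devtools)

document()  # regenerate roxygen2 help files and NAMESPACE
test()      # run the full testthat suite
check()     # R CMD check: can the package be built, installed, and used?
```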
And then finally, this builds on typical R patterns. This is what any R user would expect. Instead of having to write some kind of proprietary thing where we're sourcing a bunch of files and making sure they're in the right order, we can just call library() for our little fictional package here, classifier, and then we can just use its functions. We've built this into a package structure, so any R user, whether it's next month, next year, or in two years, can use this code base the way that we would expect.
Documentation and testing
So this starts as a foundation, and then we can build on it with things like documentation. This is probably the least surprising part of this framework, so I won't spend too much time on this, but it's super helpful, because it works at multiple timescales.
In the governance context, documentation helps provide a sort of contract for how things work. It starts by helping future you, or whoever's going to be maintaining your package, because in a lot of cases it probably will be you, but it's going to be you in six or twelve months, when you don't have all the context loaded up that you have right now about the problem, the domain, the stakeholders, the weird conflicting things you've got to take into account. The other thing it does is help you right now, because it gets you away from vibes-based programming and into something more resembling a contract. If, in at least the most basic form, I can write down what this function should do, what it takes in, and what it puts out, I've forced myself to think about that, and that's going to help you later in the governance process.
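In roxygen2 terms, even a minimal contract can look like this (the function itself is a made-up example, not from the talk):

```r
#' Impute missing ages with the training-set median
#'
#' @param ages Numeric vector of ages, possibly containing NAs.
#' @param median_age Single number: the median age learned at training time.
#' @return Numeric vector the same length as `ages`, with NAs filled in.
impute_age <- function(ages, median_age) {
  ages[is.na(ages)] <- median_age
  ages
}
```

Three short lines of @param and @return are enough to state what goes in and what comes out, and devtools::document() turns them into real help pages.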
The other thing is that in this ecosystem we're all swimming in, documentation offers a lot of other really nice functionality. If you're working in Positron or RStudio and you write help, even really basic help, you don't have to go crazy, you get nice help integrated into your IDE. You get auto-completion, even if you're using pipes and that kind of thing. Again, all you have to do is document your arguments; you don't have to write a treatise. Plus, in the world we're in now, all of this documentation can get picked up by language models and other tools to power other experiences. And so it's not just something that's like, oh, we should do this, you know? We get real benefits from it, both now and downstream.
And then there's testing, which for a long time for me was kind of, "I should do this, but I don't really know how to get started. I'm not a software developer. I didn't go to school for computer science." But it is really, really useful, and it's really important in this context, because testing is what's going to provide us with a safety harness once our model is in production and we need to change it. Because code's complicated, and we can make mistakes, and I want something that's going to catch me if I make a mistake.
And this helps us answer a couple of questions in particular. One: does my code work right now? Again, this is moving away from vibes-based coding, where instead of saying, yeah, this model seems to do what I want, or, yes, this transformation function seems to do what I want, it lets us write something down in code that can ensure it. But then, more importantly, later, when you can't remember how this stuff worked, this helps me know: does my code still work? And this is particularly important when you start having a larger code base for a model, or model functions that have lasted a while, where changes in part A are going to affect part F over here, and you can't keep that all in your head.
There's also all kinds of interesting stuff where you start depending on other packages, and packages change and update over time. Maybe you didn't do anything, but a dependency broke something. Or maybe I did; that happens. Tests catch you in this kind of situation.
The good news is it's getting easier and easier, I think, to start growing in this capacity. So what do we do? There are ways to start small and grow from there; we don't have to boil the ocean to do testing. We have a lot of tools that help with this now.
And what you can do to get started, and what we've done on some of our key projects, are really just two steps. One, identify and test your key functionality. In a modeling context, this might be a main training function using little bits of reproducible data. It might be inference functionality. Or it might be complicated transformation stuff where you have to smash data frames together, and any time you do that, you can have missing values and misjoins; if you're using untrusted data, you can have a bad time. This lets you test the things you think are most likely to go wrong. You can do a couple of those; you don't have to do a lot.
But then over time, as we add new work or as we fix bugs, because they happen, we can just write a test that says: can I catch this bug? Can I figure out where the bug is? Can I test it? And then fix the bug. Now the bug will never happen to you again, at least not without you knowing it. And we have lots of really good tools for this now, packages like testthat, which introduced snapshot testing in version 3 and makes it super easy to write tests. It basically works in a three-part model. You start with some known data, like the iris data set in this example. This is not random data; it's deterministic, and we know what these data are. You capture the output. And then you call expect_snapshot() with your output. On the first run, this writes a little markdown file into your package structure that stores the call, any output messages, and the actual output of the object. The next time you run it, it goes and reads that file and figures out whether or not your result is reproducible. I didn't have to write any super fancy test conditions or anything like that. As a way to get started, this is pretty good. It also helps you ensure that your code is deterministic; it will yell at you if it's not.
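A sketch of that three-part snapshot pattern as it might appear in a package's test file (the model here is a stand-in; the talk's actual training function isn't shown in the materials):

```r
# tests/testthat/test-train.R
library(testthat)

test_that("model training output is stable", {
  # 1. Start from known, deterministic data: iris, not random draws.
  model <- lm(Sepal.Length ~ Sepal.Width, data = iris)
  # 2./3. Capture the output and snapshot it. The first run records the call
  # and its output in a markdown file under tests/testthat/_snaps/; every
  # later run is compared against that file.
  expect_snapshot(coef(model))
})
```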
Writing legible code with S3
And then finally, sort of the layer on top would be writing legible code. Right? We've got packaging. We've got docs and tests. But then if we want people to be able to use our code into the future, this is where writing legible code I think really helps. Because the real world is more complicated than demos. We have data that are complicated and involve lots of merging. We have models that aren't just one little model object we trained. It's multiple model objects that have to be managed together and maybe interact. In our project before, we had a two-stage model where one flowed into the other. Both were trainable. There can also be complicated prediction logic if you have to manage state for multiple models.
One approach, and it's fine, would be to just bundle everything together in lists. I train my model, and out the end comes a list with two parts: a preparatory thing and a model thing. Then, if we want to do inference, maybe we extract the different parts of the model and call some function that uses all these pieces together. You can do that, but we already have a nicer way to do this in R, using S3 objects and S3 methods. If you use R, you're used to calling print on something and getting nice output; if you call predict, you get nice output. The good news is that getting this kind of thing is really, really easy, and adopting it has made life better for me and for us.
You can do this in two easy steps. First, instead of returning a plain list, here's our original list from before, return a list with one extra attribute attached, called class. Call it anything you want; I called it classifier_model because it matches the package. Once you have that, you can write this funny-named function, predict.classifier_model(), and put whatever complex nonsense you want in there. Once you do that, your users can just use predict() from then on. They don't have to know what's going on; you can abstract it all away, which hopefully you were going to do anyway. Now they can work in the R ecosystem the way they should be used to doing anyway. This is good for downstream.
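Those two steps, sketched in runnable form (the class name follows the talk; the model inside the list is a stand-in for the real one):

```r
# Step 1: return the same list as before, but with a class attribute.
train_model <- function(data) {
  structure(
    list(fit = lm(mpg ~ wt, data = data)),  # stand-in for the real model
    class = "classifier_model"
  )
}

# Step 2: write predict.<class>() and hide the complex logic inside it.
predict.classifier_model <- function(object, newdata, ...) {
  predict(object$fit, newdata = newdata, ...)
}

model <- train_model(mtcars)
predict(model, mtcars[1:3, ])  # downstream users just call predict()
```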
This is particularly good if you're responsible for inference. In this project, we were responsible for maintaining the R stub files that actually did inference and were then orchestrated by the DevOps team. This lets us make our stub files expressive, easy to understand, and self-contained: they just call the package and do the thing. You can also extend this to other methods if you want to define something entirely new. I found out the other day that this one was actually already defined in the generics package, but we're going to pretend that we invented it. You define it using UseMethod(). Literally, this is not even toy code; you just write this. This is all you have to do. And then you can define your S3 method this way. And again, now I have this function, explain(), that I can call on my object. You can define things very easily, so when people come back to your code later, it's more expressive and follows the conventions of the ecosystem.
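And the new generic really is this short (explain() is the name from the talk; the method body and the example object's fields are illustrative):

```r
# Define a brand-new generic: one line of dispatch via UseMethod().
explain <- function(x, ...) UseMethod("explain")

# ...and a method for our classed model object.
explain.classifier_model <- function(x, ...) {
  cat("A classifier_model with", length(coef(x$fit)), "coefficients\n")
  invisible(x)
}

# Example object carrying the matching class:
model <- structure(list(fit = lm(mpg ~ wt, data = mtcars)),
                   class = "classifier_model")
explain(model)  # prints: A classifier_model with 2 coefficients
```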
Putting it all together
And so even though none of these things is probably mind-blowing or new, what we have found is that when we combine these principles, it has made our model more maintainable over time. In this particular case, we've been in production for a few years now for a client, and it's generated a ton of value for them. We even survived a transition from an Airflow-based architecture to Databricks, after Databricks came in and sold them all the things. We were able to lift this and move it over in part because of this kind of infrastructure. So I would love to talk more about this later, and I would love to hear the ways you do this or do other things. I appreciate you letting me be here. I also have a companion website on GitHub with some more examples and the slides from this talk. Thank you so much.
Q&A
I think we have time for one question. So how would you suggest someone goes about...
Okay. So someone wants to go about starting a small test base for their existing functionality, or even starting to package their code. Do you suggest using an LLM for that? Have you had any experience with that and good results? The question was, if you want to get started with any of these directions, what do I think about using a language model to help? I have not used language models to do this, because I got into this before they existed. But if they work the same here as they do at everything else, I say, sure, take a look at them. See what they have to say. They can certainly point you in the right direction. But in our ecosystem, we have packages like usethis and devtools that make this so easy. Install usethis and then call create_package(), and it just does it for you, and you can sort of walk your way through that way. It's so easy now.
