Resources

Live Q&A following Workflow Demo - January 29th!

This is the live Q&A session for our Workflow Demo on January 29th, Model cards with vetiver for transparent, responsible reporting with Julia Silge. Join us for the demo first with Julia Silge on Jan 29th at 11am ET to learn:

1️⃣ How to get started with your first model card
2️⃣ How a model card fits in with model monitoring
3️⃣ How to use Posit Team to author and publish your model card

The demo will be here starting at 11am ET on January 29th: https://youtu.be/iNtgunGg86o
GitHub Repo: https://github.com/juliasilge/model-card-workflow-demo

Jan 30, 2025
30 min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Hey everybody, thank you for joining us. We're going to give people about one more minute to jump over to the Q&A. Everybody if you've just jumped over here now we're going to give about 30 more seconds here for people to join us in the Q&A room.

Awesome okay, I can see people are starting to make their way over. Let me pull Julia over on stage here as well. Hey Julia! Hello! How are you? Good! How are you doing? Good!

Thank you so much for leading an awesome demo for us. Thank you for having me. It was really great to get to put that together. And thank you all so much for joining us today.

Okay, I can see 30 of us have jumped over into the Q&A room, so let me get started, and people can join us as they do. This demo today actually came as a suggestion from a customer, so I want to say thank you to them as well. They had just deployed their first machine learning model with Vetiver and created a whole end-to-end workflow in five weeks. When they presented it internally, they were immediately asked: how are you going to monitor it?

And so they thought model cards were a great idea, but wanted to learn more and understand where those should be published. So thank you so much, Julia, for walking us through that. It's always really exciting hearing people's real experiences using this in their real use cases, like how they actually put these things into practice in their organizations.

And I do think it's quite interesting to see what questions come up, the sort of immediate questions. Not some theoretical thing, but in practice: what are the things that you do next? Absolutely.

And I also wanted to share that because if anybody else has suggestions for workflow demos or topics that you haven't seen from us yet, you'd like to see, let us know. So you can let us know in the chat here. You're always welcome to reach out to me directly on LinkedIn as well. I guess I should introduce myself here too. I'm Rachel Dempsey. I lead customer marketing here at Posit.

And so I host a variety of different community events, like this monthly workflow demo that we host the last Wednesday of every month. There have been over 22 different workflow demos now, I believe, so I'll share the whole playlist on YouTube with you in the chat in just a second as well. But I also host our Data Science Hangout, which we have every single Thursday at noon Eastern time. We'd love to have you join us there as well.

Julia, I know you introduced yourself at the beginning of the demo, but it might be good to introduce yourself here too. Yeah, yeah. So my name is Julia Silge and I work here at Posit as an engineering manager. I've worked on a couple of different kinds of projects, and if I think about the connection between them, I would say they're about the really applied data science process: what does it take for people to be effective in their real use cases?

So this was a fun demo to do, because what I focused on for a couple of years was really getting Vetiver off the ground, building support for people getting started versioning, deploying, and monitoring their models. What I've been working on more recently, in the past year and a half or so, is Positron, which is the new IDE I showed using in this demo. So that is what I do here at Posit. I am still the maintainer of the R package for Vetiver, and I work with Isabel, who's the maintainer of the Python package. We do releases and maintenance, new features and bug fixes, but the bulk of my time these days is focused on Positron.

Thank you. I was actually going to ask you that, because I noticed you weren't using the RStudio IDE in the demo and thought it'd be good to call that out as well. Yeah. So I was using Positron. Positron is a new data science IDE that we are building here at Posit, and there are a couple of things you might want to know about it. One thing is that it is currently available for beta testing. It is an early stage project, so it might not be the best fit for you today, depending on your particular use cases or needs.

The other big thing to know about Positron is that it is an IDE built for data science in general, not just for using R or just for using Python. So it can be a great choice if you are someone who uses more than one language, or if you collaborate with people who do. We're calling it a multilingual or polyglot IDE: an IDE that can be used with different data science languages.

Q&A: model card questions

Well, thank you so much, everybody, for starting to add your questions into the chat. I saw there were a few questions asked during the demo as well, so I thought it might be good to get started with those. One that came in from Gustavo was: would you include the educational level as a factor after looking at those accuracy results from the model? It was really out of scope for this demo, but it's the kind of question that might be raised by someone going through the model card.

Yeah. So this is a question about the process of developing the model, and I think it's a really interesting one to at least briefly talk about. Educational level was not one of the predictors: how much education one of these employees had was not used to predict attrition, whether they were going to leave or not. But then after analyzing the model, you notice: oh, we do a better job predicting attrition for the high levels of education and a worse job for the low ones.

It turns out that in situations like this, if you try to put that characteristic into the model, it often doesn't help you predict any better. It does not improve your ability to predict attrition for the people with lower levels of education. Why does it happen that a model performs worse for people with certain demographic characteristics? Usually, the most common reason is that there's less data for those people. So putting that characteristic in as a predictor doesn't actually help you, and it may actually make the model perform worse overall for everyone.

The thing that you might want to do is to try different kinds of models and see which one does the most even job, let's say, across the characteristics. And that's exactly where fairness metrics come in. So Rachel, I am going to drop you a link for anyone who wants to learn more about this. If you are in the process of developing a model, and you notice that you have this kind of differential across categories, and you want to ask, can I minimize that differential so that my model performs fairly across different characteristics? There's support for comparing models with fairness metrics, and that's exactly what that gets you.
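The differential Julia describes, a model doing better for one subgroup than another, is just per-group accuracy. Here is a minimal, stdlib-only illustration of that idea (the data is made up; this is not the fairness-metric tooling she links to, just the underlying comparison):

```python
from collections import defaultdict

def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each subgroup (e.g. education level)."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        hits[group] += int(truth == pred)
    return {g: hits[g] / totals[g] for g in totals}

# Hypothetical attrition predictions, split by education level
y_true = ["stay", "leave", "stay", "stay", "leave", "stay"]
y_pred = ["stay", "leave", "stay", "leave", "stay", "stay"]
groups = ["high", "high", "high", "low", "low", "low"]

per_group = accuracy_by_group(y_true, y_pred, groups)
print(per_group)  # accuracy is higher for "high" than for "low" in this toy data
```

Comparing several candidate models on a table like this, rather than on overall accuracy alone, is the practical shape of the model-comparison workflow Julia mentions.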

But often it does not help the situation at all to put the demographic characteristic in as a predictor. It's very common for it not to help; that's typically not a solution to the problem. So great question. Really great question.

Publishing dates and parametrizing model cards

So I'm going to jump over to one question that was on Slido and I'll use this as a chance to remind people, if you want to ask anything anonymously, you can use Slido as well. There is a question that was, would love to hear any thoughts or discourse on publishing dates on the model card. Would it be possible to parametrize the date to capture data changes?

Yeah. So you may notice that a model card is human documentation about how the model is performing at a certain time. It's about the model that you trained; it's less about the model's performance over time. Although of course, like we talked about with the dashboard, you can present those in a combined way with a fair amount of clarity.

I typically 100% parametrize the date, so that when people look at it, they know how old it is: when was the last time this model was looked at? If you click through to the GitHub repo, you can see that in the Quarto file I used for the model card, the date is defined as last-modified. And that means that when it is published, I don't have to manually think about that at all; I get that information automatically.
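In Quarto, that automatic date is a one-line front-matter setting. A minimal sketch of model-card front matter (the title here is illustrative, not from the demo repo):

```yaml
---
title: "Model card: employee attrition model"
date: last-modified   # rendered as the file's last modification date at publish time
---
```

Because the date comes from the file itself, republishing an updated card never requires editing the date by hand.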

The other thing that I think is really nice along these lines is that in the model details section, we automatically read off not only the version from the metadata, but also the date that the model was published. So we get all that quite automatically. And I think it's interesting, because when it comes to documenting models, some things can be automated: we can get that information, so let's automate it and make it easier. But some things do require human evaluation, human thought, human writing. We can set ourselves up for success there, but be realistic that it can't be automated.

Who writes the model card?

This question was: who contributes to writing a model card? The roles of the data scientists, stakeholders, privacy team? Yes. Oh, I love this question. I observe that the most common person to write the model card is the person who developed the model, and I think that's the healthiest arrangement. I think it turns out that way because that's the person who has the most context for it.

The person who developed the model knows what went into the process of deciding on one type of model versus another, has done the exploratory data analysis on the data available at training time, and is then able to make substantive, realistic claims about the training data and about the data that has been used to evaluate the model. The primary responsibility for the model card lies with the person or team who developed the model.

I do think that you're writing for an audience, right? Some stakeholder who is going to use the model to do something, whether that is a software engineer collaborator who needs to know how to integrate the model into an IT system, or maybe a business stakeholder who's making decisions based on it. Those stakeholders probably are not writing the model card, but they are the ones the model card needs to be useful to. And so bringing those stakeholders in to contribute to what needs to be in the model card is really valuable.

I think Justin's question actually ties in really well here as well: how would you manage those task assignments, track progress, and ensure smooth collaboration among the team members throughout the process? And even, like you just said, making sure that the stakeholders can provide feedback on that model card too.

Yeah, yeah. I love this question, because these zones of responsibility or zones of influence often sit at the boundaries of the kinds of tools people use. For example, the person who developed the model and is writing the model card is often using code-based, code-first tools, right? And I think one of the real challenges, one of the real tensions in doing data work, is when your stakeholders are not people who are hands-on with the data, not people who are writing code themselves. How do you elicit feedback? How do you integrate feedback? This is for sure not a solved problem at all.

My answer right here is about to be super practical. One thing I have found useful in the past is publishing things to Posit Connect, sharing them, and getting feedback that way. The other thing I have done is publish drafts to something more like Google Drive or Google Docs, or the Microsoft equivalents, and then literally ask people for feedback on an initial draft before publishing a final version. There is some real tension at this intersection: people who are hands-on with the data, and how to get feedback from the people who are stakeholders in the results.

Evaluating third-party models

So, Brendan had asked a question earlier during the demo, and it was, are there plans to extend model cards to be able to evaluate third-party models? One framework to evaluate homegrown and third-party models could help the build versus buy decision that plagues many orgs.

Yeah. So it depends; I'm going to make some assumptions about what we mean here by third-party models. For example, someone else has trained a model, and you are going to use it as a black box: you send your data in and you get a prediction out. These could be predictive models, like models as a service, and then there are the more generative AI type models, where it's very common to be in a build versus buy situation and say, we are not going to train that ourselves, right?

I am going to say that the person who is responsible for the model card is the person who knows about the training data, the person who knows what went into getting that model to the point that it can make predictions. As a customer who doesn't have that transparency, I don't even know how you would write a model card, because you don't have that information. But I think you can use model cards as a framework for asking, do I know what I need? If you're someone who has read the paper, maybe written a model card yourself, and you're in this build versus buy situation, then looking for either an explicit model card you can read, or equivalent information being shared, can really help you make a call about whether you trust that model for whatever business use case it is.

I'm going to reiterate: I think the person who is responsible for the model card is the person or team that made the model and has access to the training data. And you can use those model cards to evaluate whether a model is appropriate for your use case.

Model cards for Shiny apps and galleries

So, George asked: for data science applications like Shiny, how would you recommend going about drafting a model card for the app? Oh, yeah. I really like this idea. It's very equivalent to that dashboard I showed you at the end. There are different ways of serving predictions from models, right? If the model is meant to be integrated into the rest of an IT system, you probably want an API or something that behaves similarly. In that case, where does the model card go? Somewhere the engineers can see it and understand how to use it, right?

If you are making a model that you literally want a person to interact with, like a Shiny app where they can use sliders and dropdowns to enter the predictors and the app outputs the prediction, I would say: put the model card in the app itself. As another tab, maybe as the first tab, that they have to read before they go to use the app. You would end up with an experience that looks kind of like that dashboard I showed, where you have some model card information before you get into the other presentation of information around the model.

So, I was wondering about this one as well. Model cards are often rare gems to see in practice; curious about any sort of gallery of model cards or resources that might be out there? Yeah. Okay. So actually, where I have seen the most of them is on Hugging Face. Hugging Face is a data science platform in general, but very machine learning oriented, with lots of pre-trained models available. And they have support for model cards on Hugging Face.

And I think it's kind of interesting. You can click through and see the kind of information that people include and how they tend to talk about it. And you can also evaluate a little bit of how often people are writing them, and how often people are looking at them. What I like about Hugging Face as an example here is that these models are meant for people to use. You're supposed to use these pre-trained models, right? And so they have the support for model cards, and then you can see how and in what ways people are using them.

Positron vs RStudio

Going back to Positron for just a second, there was a question: is Positron going to replace RStudio? It's always top of mind. If you're an RStudio user or lover, you're like, wait a minute, is this going to replace it? So the short answer is no. They are quite different, quite different in some real ways.

We at Posit are committed to new features, bug fixes, and maintenance for RStudio for a very long time, right? RStudio has been around for 10, 15 years, and there is not a world or future in which we are not doing maintenance for RStudio, within the scope of any kind of plans that we are making. In fact, RStudio may well be a better choice of an IDE for many of you today, and for a little while into the future. That's because if you only write R, RStudio is likely a very good, perhaps the best, choice for you, because RStudio is built specifically for an R user.

If you want to ask, hey, is Positron for me, should I try it out? If you are an RStudio user, I think there are a couple of things that might say, yeah, maybe I should give Positron a try. One is that you are a maximizer when it comes to customization. You love to majorly bling out your IDE, you love setting up keyboard shortcuts and really customizing things. You're kind of a power user.

Positron, because of the infrastructure it's built on, gives you a much higher ability to customize it and make it exactly how you want than RStudio ever has or ever will, because of the architecture each is built on. So, number one: you are a power user, maximizer type person. Number two: you don't only use R. Maybe you build R packages that include Rust code, or you build really complex Shiny apps with a lot of custom JavaScript that you integrate into the app, or you use both R and Python for your data science projects. If you are not an only-R user, then Positron is a really great fit for you.

These are the kinds of things that I would say, in the short to medium term, would be reasons to try Positron. Whereas if you are a happy RStudio user, you should feel no pressure to change, and no stress that your IDE is going away.

Rapid-fire questions

How do we facilitate the creation of the model card? This is great. I am going to drop a quick link to you, Rachel, about what Vetiver has support for. So, the answer is no, it's not tidymodels only. To use Vetiver in R, you can have a tidymodels workflow, or a caret model, or mlr3, or raw XGBoost, ranger, etc. And in Python, there's a set of four things that we support: scikit-learn, PyTorch, XGBoost, and statsmodels. So Vetiver has support for quite a wide variety of types of models. It's not tidymodels only and it's not R only; it's pretty broad support. And that means you can make a model card with any of those. You can use our Vetiver template for a wide variety of models across R and Python.
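The template idea can be sketched framework-agnostically: one model-card skeleton, filled in from whatever metadata a given model carries. This stdlib-only sketch is purely illustrative; it is not the actual Vetiver template (which is a Quarto document), and all field names and values here are hypothetical:

```python
from string import Template

# Hypothetical model-card skeleton; real model cards carry much richer sections
CARD = Template(
    "# Model card: $name\n"
    "- Framework: $framework\n"
    "- Version: $version\n"
    "- Published: $published\n"
)

def render_card(metadata: dict) -> str:
    """Fill the card skeleton from a model's metadata dictionary."""
    return CARD.substitute(metadata)

card = render_card({
    "name": "attrition-model",
    "framework": "scikit-learn",   # could equally be xgboost, statsmodels, ...
    "version": "20250129T110000Z",
    "published": "2025-01-29",
})
print(card)
```

The point of the design is that the automatable fields (version, publish date) are read from metadata, while the human-written sections stay in the template for the author to fill in.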

One was: would it be crazy to have multiple model cards for a single model, thinking about tailoring them to different audiences? That is not crazy, actually. For example, in the demo I talked about checking something into GitHub and publishing something to Connect or to Confluence or wherever. Those are different audiences, right? Whoever comes to GitHub and reads your readme is probably a software engineer collaborator who needs to know how to integrate the output from the model API. That person needs different information than the person coming to Confluence to understand the model from a less hands-on mode. So I do not think that's crazy. In fact, I think it shows a maturity and understanding of how our data work impacts different people in our organization.

Absolutely. So, this next one might be hard to answer quickly. I do also want to give a little shout out to say that questions like these, how to deal with certain stakeholders when you have conflicting business decisions, are something we talk about a lot at our Data Science Hangout. We have that every Thursday from 12 to 1 Eastern time, and this week Joe Cheng, our CTO at Posit, is actually joining us as the featured leader for the week. So we'd love to have you join us for that, too.

But Julia, if you want to take a quick... A quick stab. Yeah. This is a really big and tough one. It is about how data people function in an organization. I think a very common mode for data people is a consultant-type role: it's very common for data practitioners not to be building the product itself, but to be consultants on how it's going building the product, who is doing well, how our customers are doing.

Let's be a little concrete with this sort of imaginary example: in an HR department, there's a data scientist in a kind of consultant role, and the people they are serving are the HR department. How can we help the HR department do well? What if there are multiple people in that HR department with conflicting ideas about what to do? What we do in these situations is very related to the kind of role we have, and how we manage that particular kind of role. So to get an answer, we have to have clarity: am I building a data product for my company to sell? Am I a consultant to make my organization work well? And then, how do I move forward from there? Tough question, kind of a big discussion. I think something that helps us get there is to be clear-minded about the kind of role we have as data practitioners in our organization.

Absolutely. Well, thank you so much, Julia. I was trying to cram in as many questions there as I could; I know we're a little bit over. I did just want to add: if there's anything we didn't get to cover today, please feel free to reach out to me directly on LinkedIn. I also just put a quick link in the chat if you ever want to schedule time to chat further with our team. Maybe you're just curious, do people at my company already use Posit, and I don't know that? You can always use that link to schedule more time with us as well. I put it in the chat, and I'll put it in the YouTube description, too. But thank you so much, Julia. I really appreciate you taking the time to join us. Thank you so much for having me. And thank you to everyone for your really thoughtful questions and reflections on these complex ideas.