Matthew McDonald @ KBRA | Data Science Hangout
We were recently joined by Matt McDonald, Senior Managing Director at Kroll Bond Rating Agency (KBRA), to chat about “punching above your weight and running a small, effective data science team.”

Speaker Bio: Matthew McDonald is a Senior Managing Director at Kroll Bond Rating Agency, responsible for managing the Quantitative Modeling team. Matt joined KBRA in 2015 to build out KBRA’s Model Risk Management framework. Before joining KBRA, Matt held various modeling roles at GE Capital, IBM Global Financing, priceline.com, and PricewaterhouseCoopers. Matt holds master’s degrees from Columbia University and the University of Connecticut, and a BA in Mathematics from Colgate University.

___________________

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.posit.co
LinkedIn: https://www.linkedin.com/company/posit-software
Twitter: https://twitter.com/posit_pbc

To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We'd love to see you!)

Thanks for hanging out with us!
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
Hi, everybody. Welcome to the Data Science Hangout. Happy December, everybody. If this is your first time joining us today, so nice to meet you. I'm Rachel. I lead Customer Marketing at Posit. If it's your first Data Science Hangout, I encourage you to say hi in the chat so we can all welcome you in here.
The Hangout is our open space to chat about data science leadership, questions you're facing, and getting to hear about what's going on in the world of data across different industries. So we're here every Thursday at the same time, same place, unless it's a holiday. But if you're watching this recording on YouTube later, the link to add it to your calendar will be in the details below as well. We're all dedicated to making this a welcoming environment for everyone. So we'd love to hear from everybody, no matter your years of experience, titles, industry, or languages that you work in. It's totally okay to just listen in here if you want, or maybe you're out on a walk or on a lunch break or whatever time of day it is for you. But there are always three ways you can jump in and ask questions or provide your own perspective too. You can raise your hand on Zoom and I'll keep my eye out here. You can put questions in the Zoom chat, and if it's something you want me to read out loud instead, just put a little star or asterisk next to it so that I know. And then lastly, we do have a Slido link where you can ask questions anonymously too.
Introducing Matt McDonald
So thanks so much, Rachel, and thanks to everybody for joining today. And thanks to Posit for hosting the Data Science Hangout. So my name is Matt McDonald and I live in Stamford, Connecticut with my wife and daughters. And I tell people I'm a statistician when they ask me what I do. And I've been involved in data science for quite some time, coming up on 20 years, I would say. I work for Kroll Bond Rating Agency. So I think we rebranded recently. So we are now known as just KBRA, I would say. And so what does KBRA do? So we provide credit opinions. So we're a financial services company providing credit opinions in the form of credit ratings. We've been around since 2010. So we are a post-financial crisis credit rating agency. And our competitors are companies like Moody's, S&P, and Fitch.
I've been at KBRA since 2015. So I just surpassed my eight-year anniversary. And when I started here, I was in charge of model risk management, which is something that, if you're in financial services or you work for a bank, might be familiar to you. Because a lot of the companies I've worked for in the past, and the one I work for now, are regulated, there's a need for controls on the models. So we have people who are dedicated not to building the models, but to evaluating the models. So I started my career here at KBRA in the model risk management space. I created that framework and rolled it out in the company. And now I'm in charge of the quantitative modeling group. So now I'm sort of on the other side of the fence, and I'm in charge of building the models.
So I run a small team of data scientists, econometricians, statisticians, and financial modelers. And our job is to provide models to credit analysts. So these are models that are used in the credit rating process. A lot of them are statistical models. You can think about classification models, logistic regression-type approaches. And other ones are stress testing models. So lots of times we'll be looking at scenarios that could happen but are bad and plausible. And we want to get a sense of what would happen to some financial portfolio or something along those lines. So we're looking to see when things get bad, how bad will things get, and how does that translate into our analysis of the credit.
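As an illustration of the kind of classification model described here, a logistic regression mapping obligor characteristics to a default probability, here is a minimal Python sketch. The feature names and coefficient values are invented for illustration, not KBRA's actual model:

```python
import math

def default_probability(debt_to_income: float, years_operating: float) -> float:
    """Score one obligor with a fitted logistic regression.

    The intercept and coefficients below are made-up illustrative values,
    not a real rating model.
    """
    intercept = -3.0
    b_dti = 2.5      # higher leverage -> higher default probability
    b_years = -0.1   # longer operating history -> lower default probability
    z = intercept + b_dti * debt_to_income + b_years * years_operating
    return 1.0 / (1.0 + math.exp(-z))

# A highly leveraged young firm scores riskier than a seasoned, low-leverage one.
risky = default_probability(debt_to_income=1.5, years_operating=2)
safe = default_probability(debt_to_income=0.3, years_operating=25)
```

In practice the coefficients come from fitting on historical default data; the logistic link is what keeps the output interpretable as a probability between 0 and 1.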
Sometimes we get engaged to sort of help people sort of develop these scenarios, like what would an adverse scenario for interest rates look like. And we get involved in that. And then there's sort of cats and dogs type stuff around data analysis. So we'll be drawn in and engaged to help people analyze sometimes large data sets, sometimes not large data sets. I'd say that we're not really in the huge data space. We're kind of dealing with data sets that fit into CSV files that we can look at. So that's kind of what we're doing. Those are the types of models we're doing.
Team building and tools
I always follow the sort of guidance of like hire people who are good and are talented and that you like to spend time with. And so far, I've been quite lucky.
And, you know, shameless plug for Posit: we use the Posit Team platform. Right. So that allows us to do our work in a common environment. And we have controls in place that help us make things repeatable. We're using Git. We're using renv if we're building in R, making sure that our environments are reproducible, or at least that we can build them back up. I mean, we're not 100 percent there, but we've been experimenting with things like vetiver and plumber for deploying models via API that are then usable by people.
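To make the deploy-a-model-via-API idea concrete, here is a toy Python sketch of the request/response contract such an endpoint might expose. In practice this logic would sit behind a framework like plumber or vetiver as mentioned above; the handler, feature names, and weights here are all hypothetical:

```python
import json

def predict_handler(request_body: str) -> str:
    """Toy request handler illustrating the shape of a model-scoring API.

    A real deployment would wrap this in a web framework; the model here
    is a hypothetical linear score, not an actual KBRA model.
    """
    payload = json.loads(request_body)
    features = payload["features"]          # e.g. {"x1": 0.4, "x2": 1.2}
    weights = {"x1": 0.8, "x2": -0.3}       # stand-in fitted coefficients
    score = sum(weights[k] * v for k, v in features.items())
    return json.dumps({"score": round(score, 4)})

response = predict_handler(json.dumps({"features": {"x1": 0.4, "x2": 1.2}}))
```

The point of the JSON-in, JSON-out contract is that any client, including an Excel spreadsheet with a bit of VBA, can call the model without knowing R or Python.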
One of the things that we've really been working on and focusing on is establishing ourselves. I mean, the conundrum that we have is that we're a small team. Right. And we kind of sit between a technology organization, which is building software, and credit analysts, who are doing the work of credit analysis. Right. And the tools those two groups use are kind of radically different. I would say the analysts are largely using Microsoft Excel for a lot of their work. The technology folks are using Python, mostly in a software development context. And, you know, I'm an advocate of using the right tool for the job. So we're using Python and R in more of a scripting context to do our analysis and build our models.
Breaking down barriers between teams
Yeah, I mean, it's a balancing act, I would say, between reaching out to people and operating on their turf, talking their language, but also working very hard to retain some line of ownership of things. Right. So, like, when we do deploy an API, we've had some success and done some experimentation with providing people with some VBA code in their Excel spreadsheet to call our API, so that they don't need to really learn a new skill to use our work.
And going the other way, you know, some of our projects, some of our models, are not really that regression-based kind of model where all I have is a set of data and I'm going to create a prediction. Some of them are really more software-type models. They're more complicated. These are simulation models, Monte Carlo simulations, which is a pretty standard approach in credit risk modeling. And so you need to build software to do that. And I don't want to build, own, and maintain software. That's sort of a job unto itself. So we do engage with technology.
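A minimal sketch of the Monte Carlo style of credit risk model described here, assuming independent defaults for simplicity (real credit models typically add correlation between obligors); all parameters are illustrative:

```python
import random
import statistics

def simulate_portfolio_losses(n_sims=10_000, n_loans=100,
                              default_prob=0.02, loss_per_default=1.0,
                              seed=42):
    """Monte Carlo sketch of portfolio credit losses.

    Each loan defaults independently with probability `default_prob`.
    All parameters are made up for illustration.
    """
    rng = random.Random(seed)
    losses = []
    for _ in range(n_sims):
        defaults = sum(rng.random() < default_prob for _ in range(n_loans))
        losses.append(defaults * loss_per_default)
    return losses

losses = simulate_portfolio_losses()
expected_loss = statistics.mean(losses)                 # near 100 * 0.02 = 2
stress_loss = sorted(losses)[int(0.99 * len(losses))]   # ~99th percentile tail loss
```

The tail percentile is the "when things get bad, how bad will things get" number: the stress loss sits well above the expected loss, and that gap is what the stress-testing analysis is after.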
I'd say that we've had a lot of success with learning the tools that are employed by technology and trying to figure out how we apply those to our work. So, like, everybody on the team has been learning Git. We really are getting very good at Git, and we find that Git is a super useful tool. Like I said, we need all of our work to be repeatable, because we could be engaged by a whole bunch of people who are asking us to explain what we've done. So Git is super helpful for that. And all of our code is transparent, inspectable. People can see all the changes that have happened over time.
But also we're using some of the project management tools that they have, things like Jira and wiki-based documentation, which we've had a lot of success with.
Explainability and model choices
Yeah, no, I think there's no regulatory need or anything dictating what kind of algorithms we can use to build our models. But I would say we definitely stay on the side of simpler, maybe slightly less performant models that are explainable, because not only do I have to explain them to the credit analysts, but they probably need to explain to a lot of other people how they arrived at their decision. So if we have just an absolute black box, that doesn't really work for us. Although we have used some of those approaches to help us benchmark our models. So, you know, there's kind of a matrix of model performance versus interpretability, a graph I've seen in the past, and we're on the interpretable end of the spectrum. But lots of times we'll ask the question, what are we leaving on the table as far as model performance when we make that decision? So we'll employ a more black-box-type approach to say, look, if we didn't care whether we could explain this to anybody, what could we achieve? And we use that as an input in our decision-making process.
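The trade-off described here, benchmarking an interpretable model against a more flexible black box to see what performance is being left on the table, can be sketched on toy data. Both models below are stand-ins invented for illustration: a simple linear rule versus a 1-nearest-neighbour "black box" on a deliberately nonlinear problem:

```python
import random

def make_data(n, seed):
    """Toy data whose true boundary is nonlinear: label = 1 iff x1 * x2 > 0."""
    rng = random.Random(seed)
    pts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    return [((x1, x2), 1 if x1 * x2 > 0 else 0) for x1, x2 in pts]

train, test = make_data(300, seed=1), make_data(100, seed=2)

def linear_predict(x):
    # An interpretable (but misspecified) linear rule: sign of x1 + x2.
    return 1 if x[0] + x[1] > 0 else 0

def nn_predict(x):
    # A 1-nearest-neighbour "black box": flexible, but hard to explain.
    nearest = min(train, key=lambda ex: (ex[0][0] - x[0]) ** 2 + (ex[0][1] - x[1]) ** 2)
    return nearest[1]

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

linear_acc = accuracy(linear_predict, test)
nn_acc = accuracy(nn_predict, test)
headroom = nn_acc - linear_acc  # performance left on the table by staying interpretable
```

On this contrived data the headroom is large; in practice the interesting case is when it is small, which is evidence that the interpretable model is not giving much up.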
Using Quarto and publishing results
Another thing that we've had a lot of success with is when we're doing more of an analysis, or helping somebody, or writing research pieces (we've actually written a few, mostly internal ones): we've been using Quarto quite a bit. So Quarto is a very, very useful tool to create documents that are very closely interwoven with the underlying data.
And that's been a super useful tool. And that kind of gets us out of the business of emailing spreadsheets and Word documents and that type of stuff. I much, much, much prefer the pattern of publishing a document, publishing it to our Connect server, and sharing a link with somebody. And then if I need to update it, I can update it there. And there are even some really nice features we've seen, like you can have the same code, the same Quarto document, generating the web page that someone's looking at and a PDF they can download. And the PDFs look really nice. So we've had some really good success there.
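A sketch of what that same-source, two-output pattern might look like in a Quarto document's front matter (illustrative, not KBRA's actual setup): listing both formats makes `quarto render` produce the web page and the typeset PDF from the one document.

```yaml
---
title: "Internal Research Note"
format:
  html: default   # the web page published to Connect and shared as a link
  pdf: default    # the same document, downloadable as a nicely typeset PDF
---
```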
Learning Git from the ground up
I would say we've been learning from the ground up. And, you know, I was thinking about this recently, that I must have, 10 different times, said, all right, I'm going to learn Git. And I'd go take some sort of online course and learn the fundamentals. And it never stuck. I just felt like, I don't really get this too much. It wasn't until we were actually getting help from the folks on our technology team on how to set up our repos. None of it really stuck until I was put in a position, or I put myself in a position, where I had to use it. Once we started collaborating across the team, that's when I really saw the power of it. Because, you know, I'm from a place before Git. Back in my day, you'd have some code, and if you needed to change it, you were just praying that it didn't break. And if it broke, you were sitting there thinking, oh boy, I hope I remember what I did so I can undo it. And it's really liberating to not be carrying around that mental luggage all the time about, what have I changed over the last two hours, and could I recover what I had before? So, I mean, I'm a huge Git advocate right now.
There's a really good paper out there, Good Enough Practices in Scientific Computing. I think Jenny Bryan, who's a Posit employee, is a co-author of that. And she also has a Git for R book, Happy Git with R. Yeah, that's the one. Even nowadays, if I'm doing something and I'm not collaborating with anyone, and I'm in RStudio starting up a new project, I'm clicking that Git repo button every time, because that's my safety net. So I'd say, I mean, we were lucky we had people in our organization that helped us get over the hump. But, you know, the best way to get familiar with it is to use it. So it's kind of a catch-22, I think.
Using Shiny for model communication
Yeah. Yeah, we do use Shiny. One of the things that I get a little bit nervous about with Shiny is it's such an amazing tool. I would say that we've kind of limited our use of that to more of the model development piece. So when we're building a model and we want to communicate to the users or the model owners or the subject matter experts, the people who know this stuff, how this model works, if I put in this input, what kind of output, what kind of sensitivities am I seeing? Shiny has been a really great tool for that.
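A sketch of the computation behind that kind of slider: a one-dimensional sensitivity sweep of a model across a range of one input, which is what a Shiny app recomputes and plots each time the slider moves. The model here is a hypothetical stand-in with made-up coefficients:

```python
def model_output(rate: float) -> float:
    """Hypothetical stand-in for a fitted model: expected loss as a
    function of an interest-rate input (made-up quadratic shape)."""
    return 2.0 + 0.5 * rate + 0.3 * rate ** 2

def sensitivity_table(lo: float, hi: float, steps: int):
    """What a slider sweep computes: model output across a grid of one
    input, everything else held fixed."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return [(round(r, 2), round(model_output(r), 4)) for r in grid]

table = sensitivity_table(0.0, 5.0, steps=6)
```

In the app itself, the slider bounds map to `lo` and `hi`, and the resulting table feeds a plot so the subject matter experts can see the input-output sensitivity directly.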
My concern about Shiny is if I build something that is super awesome, which is really accessible with Shiny, like you can build really amazing things, my concern would be that I don't view myself as somebody who writes production code. So if it were to become part of a core part of people's day-to-day lives, I'd get nervous about something along the lines of getting a call at 11 p.m. that the Shiny app's down and can you fix it, Matt? And I'm the only guy who knows how to fix it. So it's a bit of a double-edged sword. So we do try to really leverage our technology teams when it comes to building core functionality, software functionality that people need. So it's like with great power comes great responsibility type of a message I'm saying there. But we've had great success with Shiny. I think Shiny, it's a really great tool. But mostly in the prototyping and sort of early results stage, I would say.
Collaborating with Excel users
Well, we have experimented a little bit with writing VBA code that we can share with people, which gives them access to an API in a non-programmatic way. Like, if I were to publish the best API in the world, I think for probably 80 to 90% of the people I work with, it wouldn't be particularly helpful, because they don't know how to interact with that.
JD Long did a keynote speech at posit::conf recently. And, you know, he's a big advocate of being mindful and empathetic toward people, because ultimately they're just trying to do their job and they just want to get things done. And my job is to try to help them. It's definitely a challenge, though, because I do really want to maintain some sort of line between my work and their work. And, you know, I don't really want to become their Excel support line type person. So that's kind of a thing I'm always trying to balance.
LLMs and security concerns
Yeah. So I think that's in our future, and that's something I'm really eager to engage with and have that kind of assistance with. ChatGPT and GitHub Copilot I've dabbled with at home. But, you know, as an organization, I think there are some security concerns. We are sometimes dealing with non-public information. So we are very locked down.
I work at a law firm, so I feel your pain. We have our own version of ChatGPT hosted in Azure that's just for us. And I will tell you, it is so much better at writing roxygen comments for package functions than I ever could be.

Oh, I can't wait. We have not yet engaged with that. That sounds like an awesome use case of that technology.
But we haven't done that yet. And, you know, I sit right next to the head of security, and I'm talking to technology and so on. I do think that at some point, everybody's going to have that tool, because if you don't, then you're just not performing as well as your competitors. Yeah. We're just not there yet.
Establishing trust and ownership
A lot of the difficulties we've encountered, being in this role between the credit rating analysts, who are doing their job, they want to do the credit analysis, and technology, who are also doing their job, come down to people not really understanding what our role is. Right. So lots of times we really do get confused with technology. Like, I'll get a call from senior leadership, people who are very senior in the company, saying, I need this thing that does X, Y, and Z. And they're not describing a model. They're not really describing anything that I do or am really good at.
So, you know, trying to continually hammer home to people what it is that we do (what would you say you do here?) is something I've encountered a lot: that clarity, making sure people understand it. And then, because we're a smaller team and we do have limited resources, just establishing those lines of ownership, and helping people without doing their job for them, is something that can be difficult too.
Yeah, I mean, you know, I've had the opportunity to do things that have that overlap between stuff my company needs to get done and stuff I'm interested in doing. So that's been really good. And it's been nice to be able to call the shots along the way.
Building trust through demonstration
Yeah, I'd say early on, and I used a Shiny app to get it done, right? So early on, this was back when I was doing the model validation stuff, I was starting to get insights into: what are the sensitivities? What is driving this model? How does this thing actually work? And I had just learned Shiny. So I took that knowledge and built a Shiny app that allowed people to move a slider, press the button, see the graph, that type of stuff. And I would say people's socks were knocked off. They were blown away. So those types of moments are useful not just because now they have something where they can look at the sensitivity, but, from a more personal perspective, suddenly people were saying, oh, Matt kind of seems to know how to get things done. Maybe we should trust him. So I think it's about looking for those types of opportunities. And there's no playbook there. You just have to keep your head up, know what you're capable of, and deliver. And that's really useful.