Resources

Data Science Hangout | Unity Health Toronto | Deploying & Monitoring Models Across a Hospital

We were joined by three leaders from Unity Health Toronto: Derek Beaton, Jamie Beverly, and Sebnem Sahin Kuzulugil (surprise special guest! We will be updating the hangout image!) ► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu Follow Us Here: Website: https://www.posit.com LinkedIn: https://www.linkedin.com/company/rstudio Twitter: https://twitter.com/posit To join future data science hangouts, add to your calendar here: rstd.io/datasciencehangout (All are welcome! We'd love to see you!)

Oct 27, 2022
1h 4min


Transcript

This transcript was generated automatically and may contain errors.

Hi friends, welcome to the Data Science Hangout. If you're joining us for the first time today, it's nice to meet you. I see a lot of familiar faces, so I know a lot of you have been here before. The Data Science Hangout is an open space for the whole data science community to connect and chat about data science leadership, questions you're facing, and what's going on in the world of data science. The sessions are always recorded and shared to YouTube as well as the RStudio Data Science Hangout site, which we can share in the chat here too. So you can always go back and re-watch or find helpful resources too. We do also have a LinkedIn group for the Hangout if you ever want to continue a discussion with someone, ask for feedback, or see a summary of the past week's sessions.

I will encourage people to post in there and start conversations so it's not just me talking there, but together we're all dedicated to creating a welcoming environment for everybody. So we love when everyone can participate in these sessions and we can hear from everyone, no matter your level of experience or the area of work that you focus on. There's always three ways that you can ask questions. So you could jump in by raising your hand on Zoom. You can put questions into the Zoom chat and feel free to just put a little star next to it if you want me to read it out loud instead, or I can call on you to introduce yourself and add some context too. We also have a Slido link, which I'm sure Hannah is sharing here in the chat, where you can ask questions anonymously too.

I am so excited to be joined by two co-hosts for today for the first time ever. Today we are joined by Derek Beaton, Director of Advanced Analytics, and Jamie Beverley, Director of Product Development at Unity Health Toronto. I would love to have you both introduce yourself and maybe tell us a little bit about each of your roles, the organization, and maybe also something you like to do in your free time outside of work. Derek, do you want to start first?

Introducing the team

Sure, sounds good. So I'm Derek, the Director of Advanced Analytics in a group here called Data Science and Advanced Analytics. The overall group, and I'm sure Jamie and I will share different details on this, it's a data science unit in Unity Health Toronto, which is three different hospitals here in the city. We have four different teams across the Data Science and Advanced Analytics team. Advanced Analytics is the team I'm part of, Jamie's is the Product Development. We have a Data Engineering team and a Project Management team. For Advanced Analytics, my team uses lots of exciting tools, less exciting tools, to understand data, bringing predictive analyses to different clinical problems or different resource problems in the hospital, scheduling assignments, alerting, a whole bunch of different things. And we're diving into kind of new domains, including medical imaging and a few others that we can chat about in a little bit. Yeah, and then I guess for me, stuff I like to do, I like to go running, I like to go hiking. My internet right now is probably very bad, because I'm actually at a cottage and I'm about to go hiking after this.

Sure, yeah, thanks. Yeah, I probably don't have a ton to add to the broader structure of our team. So I'll talk to my particular team, the Product Development team, which is in Data Science and Advanced Analytics. So our team is more focused on the software engineering and design side of things, so not so much data science. But once we have a model that's developed by Derek's team, how do we get that model into production and into a frontend that our end users use? So for doing that, we work a lot with sort of HTTP APIs, like wrapping models in Plumber or Flask or FastAPI, and then building frontends, Shiny or React or some of the other platforms that are supported on RStudio Connect. And then we also do some design work, so creating mockups and working with our end users to converge towards the design, if that makes sense. About my sort of like background and interests, I was more sort of on the humanities and cultural studies, science and technology studies side of things in school, and then pursued a master's in computing to get more oriented towards the software side of things. And then in my free time, I spend a lot of time with music stuff. I have a lot of synthesizers beside me here that are quite visible.
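As a rough illustration of the pattern Jamie describes, wrapping a model behind an HTTP API that a front end can call, here is a stdlib-only Python sketch. The "model" is a hypothetical stand-in (a fixed logistic score over made-up vital-sign features), and in practice this role would be played by Plumber, Flask, or FastAPI as he mentions.

```python
import json
import math
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def score(features):
    # Hypothetical stand-in model: a fixed logistic score over two vitals.
    z = (0.8 * features.get("heart_rate", 0) / 100
         + 0.5 * features.get("resp_rate", 0) / 20 - 1.0)
    return 1 / (1 + math.exp(-z))

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON feature payload and answer with a JSON risk score.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        payload = json.dumps({"risk": score(json.loads(body))}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # silence per-request logging

def serve(port=0):
    # port=0 lets the OS pick a free port; the caller reads it back.
    server = HTTPServer(("127.0.0.1", port), PredictHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A client can then POST JSON features to the endpoint and get back a risk score; the front end (Shiny, React, or otherwise) only needs to speak HTTP.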

Cool, thank you both. So I know this is the first time we've ever had two leaders on together, so I thought it might be good to ask how both of your teams work together and how you started to have this great relationship, and why it's so important that both of you are on the Hangout together.

We actually have three of us here on the call. We've got someone in the audience here as well. So Derek mentioned we're kind of comprised of four teams, which starts with our data integration and governance team, which is led by Sebnem here. So composed of data engineers, ETL developers, data governance specialists, all the folks who know how to get the data from very messy legacy source systems into a format that's accessible for both modeling and for real-time applications. And then once we have data in those formats, that's where Derek's team does all the work on the modeling side of things, so training models on top of that, and analytics and reporting. And then once we have a model that's ready for deployment in a production application in one of our hospitals, that's where my team comes in to build those front-end applications. And then there's a fourth team that's composed of project managers that keep us moving on all pieces.

No, that covers it. I guess one point to add to the project management team, our VP Mohamed is frequently reminding us that data scientists aren't good at the management part. So the project management team are fundamental to making sure we can get done what we do.

That's great. And Sebnem, I didn't know you were joining us as well. A special welcome to you. Thank you for joining from the team as well. If you want to jump in and introduce yourself.

Yeah. Well, I'm not one of the speakers. I'm actually here to listen, but I was pulled. Hi, my name is Sebnem. I'm the Director for Data Integration and Governance at Data Science and Advanced Analytics. I work with Jamie, Derek, and our indispensable project management team. Yes, Derek is right, and Mohamed is right. We need somebody to keep us in sync. Otherwise, we would probably fall apart. What I want to add to Jamie's really good summary is that the things he talked about, like getting the data, modeling, and then front-end development, it kind of looks like a linear process, but it's not actually. There is a lot of overlap. So as soon as we start looking at the data, Derek's team starts looking at the modeling possibilities, while Jamie's team starts looking at the deployment options. So we kind of work in parallel, and it's the project management team that actually kind of makes the pieces click together. So that's why we have to keep in touch all the time with great communication.

MLOps and model monitoring

That's great. Niall, you asked a question in the chat. Do you want to jump in and ask that?

Sure. It's pretty simple, but maybe you can go into some more depth. I'm just curious about how your teams are managing MLOps. You've got these models in production. How are you monitoring them? And then do the user development and product development teams play a role in creating that monitoring process, or is that fully with the advanced analytics group?

I'll jump in for part of this. So on a lot of the monitoring side, we have, I think a lot of it is on our side for now, where we have a lot of reports that come out to us. We're building a dashboard to actually centralize all of our different deployments, so we can watch what's happening with data, model performance, any sort of drift or changes. I would say for a lot of the MLOps type things, we're mostly focused on the practice as opposed to any particular tool sets right now, but we are trying to rally behind quite a few tool sets. I think a lot of these are largely out of some of the movements we've seen recently in Vetiver and the other RStudio packages that are coming out from Julia, where monitoring models, getting model cards, and having these quickly accessible to us is where we're moving. And we're moving in this direction because I think there's been a significant expansion in a lot of what we do across a lot of different data sources in the hospital with a lot of different problems.

Yeah, I would echo, I think, what their sentiment was there. I think we're kind of growing in that domain still and figuring out what that looks like for us. And I think simultaneously, partly overwhelmed, skeptical, and excited by all the things out there. Lots of things that run on a Kubernetes cluster that we don't really have the use case to have a massive Kubernetes cluster. That said, I'd say we have a pretty core set of requirements for most of our projects where we have a model, we want to version that model, we want to save predictions, all the predictions that that model generates. We typically want to save some data alongside that model. We want to have monitoring on all the predictions that model generates and saves to a database. And we want to host those models behind some HTTP API, typically. So those set of criteria, I think, are good criteria for engineering a system that can handle our use cases.
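The core requirements Jamie lists (version the model, save every prediction it generates, and keep data alongside for later monitoring) can be sketched in a few lines. This is a minimal illustration using SQLite; the model name, version, and schema here are hypothetical, not Unity Health's actual setup.

```python
import json
import sqlite3
from datetime import datetime, timezone

def init_store(conn):
    # One row per prediction, tagged with the model name and version that
    # produced it, plus the input features for drift monitoring later.
    conn.execute("""CREATE TABLE IF NOT EXISTS predictions (
        model_name TEXT, model_version TEXT, ts TEXT,
        features TEXT, prediction REAL)""")

def log_prediction(conn, model_name, model_version, features, prediction):
    conn.execute(
        "INSERT INTO predictions VALUES (?, ?, ?, ?, ?)",
        (model_name, model_version,
         datetime.now(timezone.utc).isoformat(),
         json.dumps(features), prediction))

conn = sqlite3.connect(":memory:")
init_store(conn)
log_prediction(conn, "chartwatch", "1.2.0", {"heart_rate": 110}, 0.62)
rows = conn.execute(
    "SELECT model_version, prediction FROM predictions").fetchall()
```

Because every prediction carries its model version, a monitoring dashboard can later slice performance by version and watch for drift after each deployment.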


Use cases across the hospitals

I think it might be helpful to all put us in the mindset of the problems that your team is working on. What is an example use case across the hospitals?

Let's start with one of our flagships, ChartWatch. So it's an early warning system for deterioration. From the advanced analytics side, the models for that one were developed a while ago. They were wrapped up for Jamie's team to help deliver alerts. We get our pipelines from Sebnem's team. And then on the monitoring side, we have fairly frequently updated reports on what is happening, when alerts are sent out, or what kind of model performance we're looking at. So a bunch of different metrics to see how this early warning system is actually working on a regular basis.

And can you just expand on it for me? What does the early warning system mean?

So if individual patients are potentially deteriorating, this will send an alert off to clinical teams that are set up in the alert system, or those that are there that day, to get an alert that a patient needs more attention.

We've got a few other cases, too. So I'd say there's a suite of tools we have that are around this kind of alerting-about-a-patient use case. We also have a few that are more focused on optimization models, like generating schedules for nurses. Very complicated and tool-laden task. And we optimize that to reduce the number of times a nurse is assigned to the same location, as an example. So optimization models, and working on medical imaging as well. So detecting intracranial hemorrhage in DICOM images, traumatic brain injuries, other projects.

Yeah, and a kind of suite of other ones that work more on the chart record.

That's really interesting. I see there was a Slido question about the early warning flagship product. What kind of models do you use? Are these some calculated metrics, or some kind of predictive models?

They are predictive. There were a whole bunch of candidate ones a long time ago, and then we landed on one of the best performing ones. And fairly recently, this early warning system has been moved away from us and to a company that we're collaborating with. But it's predictive modeling.

Legacy systems and data integration

Mark, I see you just put a question into the chat. Do you want to jump in?

Sure, yeah. I was just wondering, I know from personal experience that healthcare is pretty slow to update to more modern tools. So we end up with a lot of technical debt. I'm wondering how you, can you go into more detail about how you integrate some of those legacy systems with some of the more modern tools that you're using? Is it like workarounds, or is it kind of like pipelines into? Just kind of wondering what your approach is there.

I think I'd better jump in at this point. So yeah, our hospital systems are very fragmented, and no, we are not on Epic. So we don't know if these things play nice or not. We're going to find out in a couple of years, I hope. We are looking for a new EHR. We don't have one yet. So it may or may not be Epic. I'm not really sure. Right now, what we have is pretty old systems. And we actually started the journey by having coded pipelines, where code was attached to the modeling code. So the first piece of code goes into the systems, pulls data, merges it, cleans it up, and then passes it to the model, which was not really good, honestly. Because the same kind of information, say the patient's temperature, is relevant to like 10 different models. So we would be pulling that same temperature 10 times for each patient, which overwhelmed our already old and overwhelmed systems. So a year and a half ago, I guess, we invested in a logical data warehouse, a data federation system, where we pull the data and cache it for a little while until the next scheduled pull. And all of the models are getting their integrated, cleaned up, harmonized data from the logical data warehouse, the federation system. Does that answer your question?

Yeah, so it's kind of, just from understanding, it's kind of like a staging warehouse in between.

Sort of, yes. So it's actually a smart, AI-supported federated system. So it optimizes queries on the go, we can pull data on the fly for not so frequently used systems, or cache the data for the systems that we use frequently. And we can do all kinds of data wrangling in there. So yeah, you can call it a staging layer between the actual data and the models.
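The caching idea Sebnem describes, pull a value once and let many models read it until the next scheduled refresh, can be sketched with a simple time-to-live cache. This is a hypothetical, stdlib-only illustration of the principle, not their federation system.

```python
import time

class TTLCache:
    """Cache values from a slow source system for a fixed time window."""
    def __init__(self, fetch, ttl_seconds):
        self.fetch, self.ttl = fetch, ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        now = time.monotonic()
        hit = self._store.get(key)
        if hit and hit[0] > now:
            return hit[1]          # fresh: serve from cache, no source hit
        value = self.fetch(key)    # stale or missing: pull from the source
        self._store[key] = (now + self.ttl, value)
        return value

calls = []
def fetch_vital(key):
    # Stand-in for an expensive query against a legacy source system.
    calls.append(key)
    return 37.2  # hypothetical temperature reading

cache = TTLCache(fetch_vital, ttl_seconds=60)
# Ten models asking for the same vital produce only one source query.
readings = [cache.get("patient-42/temperature") for _ in range(10)]
```

This is the shape of the win they describe: the legacy system is queried once per refresh window instead of once per model.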

Travis, I see you had a similar question or a follow up question. I know the part about Epic was answered, but do you want to jump in?

Yeah, like Mark, asking from a place of experience with a couple of large hospital systems where Epic or Cerner type products don't tend to integrate well with more modern applications, like RStudio products, let's say, or even, you know, Flask type Python things. And so we've had ML teams trying to produce outputs. It sounds like you solved the input problem well enough with this sort of staging area. That's a cool solution, because that was another thing. You can't really hit the live database of the EHR system, because that's needed for inpatient care. And if you hit it with a large model, things will fall over and bad things happen. That's cool, staging works. But then once you produce the model, how do you get it back in front of the care team, like the physician or the coordinators? If it's not like a pop-up firing inside of Epic or another EHR system, they tend not to see it. And they definitely tend not to want to go to a second browser window or somewhere else to find information, because they simply don't have time for those things. So how do you solve the accessibility portion on the clinical side?

A few thoughts to add to that. I think it's an unsolved issue for us as well. We try a lot of different things for different projects, and a lot of it is kind of project dependent. So yeah, with our older legacy systems, it's typically pretty hard to get any kind of data into the systems. So there are times we do have to provide a separate website that, in some cases, we can link from the EMR. In some cases, we can actually embed kind of actions within the EMR. We're exploring this for one of our medical imaging projects, actually, where if a radiologist is in a viewing tool, looking at a patient's CT scan, they can hit a shortcut on their keyboard, or right-click and click a button, that will open our website with context. And then that's a domain we can control. But it's, I think, an uphill battle constantly.

Alerting is especially challenging, perhaps, because there isn't really a consistent alerting approach across the organization. Some units of the hospital have, you know, a designated cell phone that they provide to the lead nurse for a shift. On another unit, they don't have that. Or, you know, the resident has an application on their personal phone that's rated for, you know, secure information, and we could target that. But it's very heterogeneous.

I guess one other tool we've been developing over the last year and a half, it's called our Operation Center, which is kind of an in-house hospital command center type tool, similar to some of the market products that are out there. So it's a place that folks in the hospital can go to see patient flow information where patients are being admitted, discharged, transferred, census of current patients and their needs, which is all fed from the EMR systems, but it's our own front end. So we've kind of developed that as a front end platform where we're hoping to centralize a lot of our analytics outputs. So it's not two different dashboards you have to go to for your, for our like data science outputs and predictions, but can be at least just kind of one centralized tool.

To add like a little bit more to that too, we work really closely with the individual clinicians or clinician groups, or other staff, on understanding what might work. So there is a lot of customization that will come from Jamie's team. And in other cases, sometimes it's really tricky to get, say, some other website or some other alerting mechanism that's outside of the EHR. So we call in for help from other teams that will help handle what this means for individuals with these deployed tools. So how do you get everyone up to speed? How do you get them trained? How do you get that information back out there? And we also rely on basically kind of leaders that are on the other side of these projects. So the people that come to us with these projects kind of have to be the champions to then get everyone in their units or their groups to be on board with what will actually go in place.

Project intake and prioritization

Alan, you had a question, and I agree with you. Yes, the range of these cases is really fascinating. Do you want to jump in and ask that?

Yeah, sure. Hey, everybody. I think I'm mostly just reacting to the breadth of things that you're describing that the team works on from stuff that's operational to like really like much more deeply clinical stuff and then super technical stuff like imaging, and it's a really impressive range. And so I wonder, you know, I imagine that within the team, you probably don't have the like domain expertise to be really deep in a lot of those things. So how do you get the partnership that you need? And this may stem from, this may sort of bounce from, you know, the last comment that Travis was making about, you know, champions of the work and stuff. So what does that partnership look like? And what does, because there's, it's something to have so many customers, understanding like who to work with and when given limited resources, I've got to think is a challenge sometimes. So this probably gets to the project management team that you work with, but also speaks, I think, to like, how are you situated in the organization such that you're the place where the work comes to, like at the right time, you know, strategically, how does that work? You know, thinking just all those aspects of like how do you operationalize projects coming in and doing work that people need?

So I'll speak a little bit about some of the data and analytics part where, so on the team, actually across like several of the teams, we do have some like subject matter experts that have experience on the other side of data science. So we have like a really diverse group of people that have a lot of experience in this domain. And I guess going back to the previous point about the partnerships, when clinicians or staff or other individuals in the hospital, in the network come to us, we require really like a lot of time and we spend quite a bit of time like working out the early parts of the project, frequent working group meetings to make sure that both sides are in very frequent communication to ensure that the technical, the clinical, the operational sides are all in touch and actually understand the problems with the data, the problems that we're trying to solve and say some of the challenges of bringing some of these things to deployment.

And like another small note here too, I guess, building off of like the breadth of stuff here, we all kind of have like little mini teams inside our teams too. I think it's fair to say sometimes the mini teams are like one person, but we have like little mini teams of expertise within each of the teams.

I'll just maybe add to that briefly. So just to call it out explicitly, all the projects we work on originate from our end user groups, from those clinical champions. So we have an intake process, an intake form. So our end users come to us with this form filled out where we ask, you know, what's the current state? Like what's the problem you're trying to solve? What's the desired future state? What kind of metrics would you measure to say whether we've improved on something or if we've gone the opposite direction and made it worse? And then where data science fits into that, because not all projects are solved with data science. I think that's kind of a bit of a luxury to have, because we have very engaged clinical end users and we don't sort of have to do that sales pitch at the end of a project. As Derek said, there's lots of time invested from the clinical side, so there's inherent interest in seeing a successful deployment as we get there.

I may add a few points about the prioritization part you asked. So we are a strategic part of the hospital. We are not research. So being in the hospital as a part of the hospital puts us under the mandate of the strategic objectives of the hospital. So when we're deciding on prioritizing the pipeline of work that is coming to us, as Jamie and Derek mentioned, we take into consideration stuff like feasibility and things, but then we also evaluate them based on the strategic goals of the hospital. And then we have two criteria. Are we improving patient outcomes or are we improving hospital efficiency? If a project does none of these, we just don't touch it.

Are we improving patient outcomes or are we improving hospital efficiency? If a project does none of these, we just don't touch it.

That's great. Thanks for all of those. I really appreciate all those different lenses on that process and hearing more about the organization and how things work. It gives me a bunch to think about with respect to our teams here. Thanks.

Yeah, I thought that the outline there is really helpful, too, like how you get the projects from the clinical champions, like what's the problem you're trying to solve? What's your desired state? I see there was a Slido question from a bit earlier, anonymous one. Could you give us some idea about the scale of your operation? So like how many models and how many data apps do you currently have in production?

So our team in total is about 30 right now, that's including the four teams. And in terms of applications, we have in production, I think, like 30 or 40. And projects on the go right now, I would say 8 to 10.

React, RStudio Connect, and the tech stack

And I know something that we talked about previously was one of the eye-opening moments for you was when your team discovered you could publish React applications. And I was just curious of the use cases that you described, which is the one that's using that framework.

Yeah, I can speak to that. So I kind of came from a more software engineering, like full stack background where React is pretty ubiquitous. We were developing this operation center project. And we were originally developing it in R Shiny, and then it felt like we wanted to leverage some of the React ecosystem for one particular component. We used, I forget what it is, is it Shiny widgets or something similar? An R package that lets you author React components and bind them as Shiny widgets to use in Shiny applications. And we were using that. And then that was really doing the bulk of our UI front-end complexity. And we decided, OK, might as well just write this in React. So yeah, I'm not sure why it took me so long to realize you could do this. But we discovered you could deploy React applications to RStudio Connect and have them talk to APIs just with fetch requests. So our front-end stack now is a TypeScript React front-end that makes API requests. And we find it to be nice and quick.

I want to check to see some of the, for one of the Slido questions there was, for legacy systems integration, do you depend mostly on custom scripts or third-party platforms and add-on? For scripting, what's your preferred language and why?

For legacy systems, as I mentioned earlier, we are using this data federation tool. And I think we managed to connect to almost all of our legacy systems other than one that was very badly set up. So even the vendors themselves couldn't figure it out. So for these one-off cases where we can't actually use the legacy system, we either go with R or Python pipelines: scripts that run on a server and then feed into either the model or the logical data warehouse itself.

Cybersecurity and IT relationships

Travis, I see you just put a great question into the chat there. Do you want to ask that?

Sure. I think we all have these. I was asking, can you tell us a good cybersecurity says no story and how you got around it? This is particularly prevalent in hospital systems.

I'll toss over to both Sebnem and Jamie in a minute, but it's not many no's. Because we do a lot of stuff completely internal. So we're in the data center. We're on this infrastructure that is approved by IT and security. Where there's some no is some cloud stuff. And it helps to have a liaison or spy, someone from our team, tightly connected to IT and security to make sure that we know what's going on. And we have frequent communication with them.

Yeah, actually, I have system administrators in my team, apart from IT, because the hospital is mostly a vendor shop, but our team inside is a Linux island. We have our own sysadmins, and one of our sysadmins has the task of keeping our relationships with the rest of the hospital at a very pleasant level. And you may have noticed that my title is data integration and governance. So we actually took over the data governance portion in the hospital as well. So we are managing that. And that kind of helps us do what we need to do in a safe and secure way, while also helping us get fewer no's. It's extra work for extra permissions, if you want.

That's great. I see Richard, you said you do that exact same thing. I'd be curious to hear a little bit more from you too.

Yeah, it's very similar, actually, strikingly similar. I'm in a large home care agency in New York City that is integrated with a health plan and a hospice and a long-term care system that's managed at home. And a lot of the no's that we have from cybersecurity have to do with cloud. Having data having to cross that boundary, getting translated from one virtual environment to another. But yeah, so our cloud engineer on our team sits in on all of the cloud engineering conferences for the larger IT department. I work inside of a data science department in this organization. So he sits in both roles as the liaison and keeps us on board with what their concerns are and vice versa. And that way, we in data science don't try to do a lot of things that we know they're not going to go for. And so that keeps us focused on the things we can do and kind of keeps that friction to a minimum.

Public-facing deployments and infrastructure

I had a question as a follow-up from when you were talking about the whole conversation of on-premise or in the cloud. Are there any use cases from anybody on the call, as well, where you have both internal applications that are behind a firewall, but then also other applications that may be exposed to the public or other stakeholders?

I think we have one such deployment. We developed a tool a while ago for predicting ED volumes, the number of patients waiting in the ED, which we deployed at one of our hospitals—actually, two of our hospitals, and then also created a deployment for a hospital outside of our network. That's the one that was running in a Docker container in a special part of our infrastructure that we could open up to the internet to be accessible from this other site. I don't know much of the specifics about how all that was configured, though. I think we had something on Shiny apps at one point, too, which was more public-facing.

Because, Rachel, that's a great question, and it's definitely one of the cyber-says-no areas that I've never found a good solution to in a hospital system. They usually say, even if there is no data in this application, we can't make it outwardly-facing because it's running on our servers, which then do connect at some point to our data. So, they want to stand up a whole new infrastructure to host, in some cases, what is an entirely computational simulated calculator or something like that. So, is that what you all's strategy is, or is it different?

Correct me if I'm wrong, Jamie, but I think our public-facing or external-facing applications are running on a VP LAN. I think that's correct, yeah. So, it's a virtual private LAN. So, IT and security manages it to their own satisfaction, and we make use of it. So, reducing the friction between us.

Dependency management and tooling

I see somebody asked on Slido, I'm assuming you use the RStudio platform, but before that, how did you manage dependencies, for example, package versions?

I think that was before my time. Everyone was pretty good at it when I got here. We're getting better.

Yeah, I guess, I mean, all of our R projects use renv very heavily, and Conda and stuff for Python, and Yarn for JavaScript. And yeah, RStudio Package Manager as well. I'm not sure about before RStudio Package Manager. Sorry, I'm the oldest on the team. And yes, we invested in RStudio very early on, so I don't think we have anything that's not deployed through RStudio of one version or another. We have one Docker deployment. We also use GitLab, like on-premise, which is kind of important for us, too, because we can't really afford the risk of accidentally pushing sensitive things to GitHub or other cloud-based providers.

Model monitoring and drift

The question on Slido is: what does model monitoring, such as data and model drift, look like in your work? What tools do you use, and what kinds of metrics do you look at?

COVID ruined everything. So anything that was from before COVID, COVID ruined. We see different types of drift there; we see it with waves. What we're monitoring in terms of metrics is more project-specific, so we don't have a singular metric. In many cases, for a lot of the predictive algorithms, it's very important to have good performance at avoiding false negatives, or at making sure you're capturing true positives. So we'll focus on PPV and NPV, or false positive, true positive, and false negative rates. But there's no singular way of doing this. It really does depend on what the problem dictates. So we do have different ways of monitoring different models, which is why we're trying to bring it all into one dashboard right now. That's some work we're doing behind the scenes, where we try to bring a lot of these into one spot so that it's a little more centralized, instead of living inside each project individually.
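As a rough sketch of the metrics mentioned here (PPV, NPV, and true/false positive rates), assuming raw confusion-matrix counts are available for a monitoring window; this is illustrative only, not the team's actual monitoring code:

```python
# Illustrative: common clinical classification metrics from confusion-matrix
# counts (tp/fp/tn/fn), as mentioned in the discussion above.

def confusion_metrics(tp, fp, tn, fn):
    """Compute PPV, NPV, TPR, and FPR; None where a rate is undefined."""
    return {
        "ppv": tp / (tp + fp) if (tp + fp) else None,  # positive predictive value
        "npv": tn / (tn + fn) if (tn + fn) else None,  # negative predictive value
        "tpr": tp / (tp + fn) if (tp + fn) else None,  # sensitivity / recall
        "fpr": fp / (fp + tn) if (fp + tn) else None,  # false positive rate
    }
```

Computing these per time window (e.g., weekly) and plotting them on a shared dashboard is one straightforward way to make drift across COVID waves visible.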

Team structure and evolution

I see, Niall, you had asked a really great question earlier that I missed, and I'd love to go back to that on the team structure, if you want to ask it live.

Sure. I'm interested in some of the benefits and drawbacks of having the teams broken out the way you do. Yeah, your thoughts on it. And also, whether this is something any of you helped create and were part of that process. We sort of have a one-stop-shop team structure, and it sometimes works, and sometimes I'm asked to do things I have no idea how to do. So yeah, curious about your take.

I can add a few comments, maybe, about how it formed, and Sebnem can check me if I say anything that's wrong. The team started, I guess, roughly four years ago. It was primarily, or entirely, data scientists. I believe the original team was three or four data scientists, positioned more within research. And then the second team to grow was Sebnem's team: data integration and governance, building out the data warehouse. Following that, I believe, we realized the need for project management, and that formed kind of organically. And then the last to join was my team, focused on design and software engineering.

Yeah, I'd say that the structure works fairly well for us. I'd say there are places at the edges of our teams where I think we're still kind of figuring out what that looks like. There's some questions about MLOps earlier, and I think that's sort of one space that touches modeling, it touches data engineering, it touches software engineering. And we're not really sure where to pass that ball.

Yeah, I just want to add that until... How long have you been with us, Jamie? Derek? I think until that time, probably two or three years ago, we were actually one big happy team, with no solid distinction between modeling and data engineering and product development. But then we had a lot fewer projects. Almost all of us were jacks-of-all-trades. But then the number of projects just exploded once we moved from research into the hospital, as Jamie mentioned. And that's when we had to delineate some of the responsibilities, because being a jack-of-all-trades means you actually have to context-switch every three minutes or so, which is not productive at all.

Just to follow up, Sebnem, you said it was one big happy team before, but what was the process for splitting out the teams?

Yeah, we mainly divided ourselves based on our strengths. And our project manager was with us from day one, which is a really big comfort.

Yeah, absolutely. I see Alan just commented on that too and said it was really interesting to hear how the PM team was relatively early in the formation.

Derek, sorry, I just cut you off. No, I was going to say that I was brought into the team with this structure already in place. I've liked it, compared to how I used to do things before, when, like Sebnem was saying, I'd have to do lots of things all over the place. It's nice to have that focus. But as Jamie said, especially around MLOps, where it feels like we may be missing a little something is bridges and hooks between the teams for those critical points in development or monitoring or maintenance, where it's good to have people who know a little bit of, say, two sides, instead of trying to have someone who understands everything. But we have quite a few people who are interested in crossing different domains and expertise across our teams.

Docker and versioning

There's one other question. I know we're just at the top of the hour, but it's okay if I ask one more. Someone asked: curious, I think you mentioned Docker. Do you use Docker, or do you depend solely on RStudio's versioning? In which cases do you use Docker?

I can probably speak to that and start off. I probably use Docker as a catch-all for application containers; we may be using Singularity or Apptainer or something other than that. Where we're interested and see use in Docker, I think, has a lot to do with reproducibility of environments: having your dev, staging, and prod all consistent and built from the same image. Also some of the flexibility it allows with regard to stack and technology, so it's pretty quick and easy to spin up a new service through a Docker container. What we're figuring out how to do is orchestrate containers; that becomes the next piece, so we don't just have a mess of statically coded ports on the various servers. In terms of versioning, I'm not sure Docker has a big impact. We've been trying to do everything through source control, where I think we're trying to get closer and closer to infrastructure as code. So anything such as a Dockerfile that you can keep under version control, and that defines how your environment is created from those configurations, is sort of what we're aiming for.
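The "Dockerfile under version control" idea can be sketched with a minimal, hypothetical example; the base image, packages, and paths below are illustrative, not the team's actual setup:

```dockerfile
# Hypothetical example of an environment defined as code: everything the
# container needs is declared here and versioned in Git, so dev, staging,
# and prod can all be built from the same image.
FROM rocker/r-ver:4.2.1

# Pin system dependencies, then restore R packages from a committed lockfile.
RUN apt-get update && apt-get install -y --no-install-recommends \
        libcurl4-openssl-dev \
    && rm -rf /var/lib/apt/lists/*
COPY renv.lock renv.lock
RUN R -e "install.packages('renv'); renv::restore()"

# Copy the application and run it on a fixed, documented port.
COPY app/ /app/
CMD ["R", "-e", "shiny::runApp('/app', host = '0.0.0.0', port = 3838)"]
```

Because the Dockerfile and `renv.lock` both live in source control, rebuilding the image reproduces the same environment on any server, which is the infrastructure-as-code direction described above.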

Great, thank you. I just want to double check that I didn't miss anybody's questions. I think I got to everything that was on Slido. But again, feel free to raise your hand if there was something that we missed too.

Thank you so much, Sebnem, Jamie, and Derek, for sharing your insights and all your experience with us. It's awesome to see how all the teams work together too. If people wanted to get in touch or reach out to you, what's the best way to do so? Is it LinkedIn or Twitter?

Either, I'll drop an email here too. Okay, awesome. Yeah, LinkedIn is great. I think it's on the website already. Yes, it will be there too. We will update it with Sebnem's information too. Thank you so much for joining as well. Appreciate all the great questions too. Hope everybody has a great rest of the day. Thank you so much for joining us.