Chris Bumgardner, Children’s Wisconsin || Healthcare Meetup || Posit

Cultivating an R-based Analytic Practice in Healthcare

Supporting the advanced analytic needs of an active academic healthcare organization requires tools and practices that enhance the application of statistical and algorithmic approaches. To positively impact care, system operations, or even well-being at the community level, these tools need to support solutions that can be rapidly deployed and communicated as well as reproduced when studying longitudinal trends. At Children’s Wisconsin, we use R and Posit's suite of tools to enable forecasting, modeling, and data mining among other data science activities. We communicate the results of our efforts using interactive applications built with Shiny as well as reports and push analytics created using RMarkdown. This talk will discuss how we have developed this capability and provide a few examples of the applications that have been created to support our vision that the kids of Wisconsin will be the healthiest in the nation.

Agenda
1) Children’s Wisconsin Introduction
2) Data Science Tools and Supporting Infrastructure
3) Example R-based Projects [Community: Missing Youth, System-wide: COVID-19 Response, Operational: Patient Placement Planning and Optimization]
4) Challenges and Future Plans

Speaker Bio: Chris Bumgardner leads the data science efforts at Children’s Wisconsin and works with teams across the health system to improve decision-making. He is focused on applying statistical methods to data sets large and small to discover and visualize insights that will help ensure Wisconsin’s kids are healthy, happy, and safe. Chris can often be found awake far too early thanks to an insubordinate rescue dog named Dutch.

R in Healthcare Slack Group: https://join.slack.com/t/rinhealthcare/shared_invite/zt-sc7lc4k6-K9zb~kX826dOXMcaj~Wt~w

RStudio Enterprise Community Meetup for future events: https://www.meetup.com/RStudio-Enterprise-Community-Meetup

Jul 1, 2021
53 min

Transcript

This transcript was generated automatically and may contain errors.

I will turn it over to you, Chris. Well, thank you very much. I will share. Can everybody see everything okay? Looks perfect. Looks good. Great. So yes, thanks again to RStudio for hosting this. I think it's really exciting that we can kind of get together as a healthcare community and start to share some of the work we're doing with R and with Shiny and some of the things that we'll see today. Hopefully I can spark some conversation going forward among all of us.

So just a little background on Children's. Children's Wisconsin has been around for about 125 years, a little bit more. We had our anniversary a few years ago. It was originally founded just by seven philanthropists in Milwaukee. They rented a house and it had 10 beds, so it was these very humble beginnings. And then throughout the 125 years we've grown and grown, and we've had a couple of moves. We were in a building on Marquette University's campus for a while, and now we've moved to the Milwaukee Regional Medical Center, MRMC, which is in the background of this slide. And there we are also joined by Froedtert Health, which is an adult system, as well as the Medical College of Wisconsin.

So Children's is a learning hospital. And so their attitude, I think really helps foster a lot of the innovation that we get to explore using R and Shiny currently. And so we're definitely well supported and curiosity and innovation are two of our core values. So hopefully that'll shine through today as I start to show you some of the things we're working on.

As such, we get to, you know, we get to work with everybody across the system. And so I think this is one of my favorite quotes. John Tukey not only gave us box plots, he gave us this good quote where, as a statistician or a data scientist or an analyst, we get to work in all different areas, on things you would never expect in other disciplines. I think Children's lets us do that. Especially as part of the enterprise analytics team that my data science program is part of, we get to work with quite a few different people across the system. And you can see Children's is really more than just a hospital for acute things or chronic conditions. We do have two hospitals. We have a level one trauma center and emergency department. But we also do research with the medical college.

Children's Service Society of Wisconsin came under the Children's Wisconsin umbrella in 2004. So we also have foster care and child and family well-being. And part of what I'm going to show today is one of the projects that I worked on with regard to youth placed in group homes. And we also have primary care and specialty care clinics. We have interesting programs with school nurses where our nurses will be in Milwaukee public schools to make sure kids are getting the care they need, when we can have access to them. And then we also have a health plan. So we actually have a part of Children's that provides a health plan and makes sure kids are covered and get the preventative care that they need.

So it really is more than just a hospital. And this gives us all the different backyards to kind of play in. And so it's kind of an exciting thing. This slide just gives a quick background. We have over 90 locations, all generating tons of data that we try and collect as best we can in our enterprise data warehouse. So we do have quite an impact on the state of Wisconsin, especially for kids, because we know that they're unique. You know, you can't put a kid in an adult-size bed; they have different needs. And so we really try and cater to that.

The missing out-of-home care youth project

So I thought I would jump in and give kind of a quick example of some of the work we've done with R and Shiny and then kind of come back around and say, well, how did we get here? Because this certainly isn't the first thing we attempted to do in R, and we kind of grew our R practice as we went along. It's definitely been probably a seven or eight year journey, as people commonly say, for us to get to this point where we actually can rapidly deliver some really interesting insights using R and Shiny.

So the missing OHC youth: OHC is out-of-home care. These are youth who are generally removed from their homes in some kind of dangerous situation. Maybe there are safety concerns, maybe there's something going on in the house that isn't safe. They have case managers assigned to them, and then we place them the best we can, either in foster care or sometimes in group homes. So this project originally kicked off where we were trying to understand youth who are leaving those placements, when they go missing. So these youth are generally 12 to 18 years old, they're active in the community, they have friends across the community, and they go missing and we're responsible for them, but yet we don't know where they are. And so that's not a good situation to be in, of course.

So as we started to work with our social services partners, we looked at reasons why kids would go missing. But we kind of put a twist on it. And we started to look at reasons why they stay: the ones who stayed, why are they staying? And we were trying to do some analysis with respect to that. Our case management team serves about 1200 youth annually. And that's just in Milwaukee County. Milwaukee County is a very unique area in that the county itself doesn't handle the case management services. There are two organizations contracted to provide that, and Children's is one of those. So we each get about half of those kids. And then we also look for permanency, of course, for these kids as we're placing them in these homes, but hopefully it's in a safer situation, which is what we're going after.

So what does it really mean to be missing? It's anyone younger than 18, that's the federal law, right, whose whereabouts are unknown to their legal guardian. For DCF, the Department of Children and Families for the state of Wisconsin, if they're missing more than eight hours and we don't know where they are, it has to be reported. The good thing for us then is that gets recorded in the state's child welfare information system, the SACWIS. And so we can then actually use that data to try and understand when they went missing. We know where kids are staying for both our agency and the other agency that's managing care in this county. And then we can actually kind of have rosters of kids so we know who was staying together at a time, who went missing at the same time, and we can build out timelines. And so that's part of the interesting work that you'll see here in a second.
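To make the reporting rule and the roster timelines concrete, here is a minimal sketch in Python. It is not the team's code (their work was done in R). The under-18 cutoff and the eight-hour threshold come from the talk; the function names, the stay format, and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta

# Threshold described in the talk: reportable when missing more than eight hours.
REPORT_AFTER = timedelta(hours=8)


def is_reportable(age_years, last_seen, now):
    """True when a youth under 18 has been unaccounted for more than 8 hours."""
    return age_years < 18 and (now - last_seen) > REPORT_AFTER


def overlap_days(stay_a, stay_b):
    """Whole days two (start, end) placement stays overlap, or 0 if they don't."""
    start = max(stay_a[0], stay_b[0])
    end = min(stay_a[1], stay_b[1])
    return max((end - start).days, 0)


now = datetime(2021, 7, 1, 18, 0)
print(is_reportable(15, datetime(2021, 7, 1, 8, 0), now))   # last seen 10 hours ago
print(is_reportable(15, datetime(2021, 7, 1, 12, 0), now))  # last seen 6 hours ago
```

Comparing every pair of stays at the same placement with `overlap_days` is one simple way to recover who was staying together at a time, which is the raw material for a roster or timeline view.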

What you'll notice here in this plot is sometimes there can be up to 40 kids who are missing. And so that's quite a bit, right? We're going, oh my gosh, there's 40 kids that we're responsible for as an organization, and we don't know where they are. They really needed to have us be involved to help understand what the trends were, because it's been increasing. So as a good statistician or good data scientist, the first thing we did was try to look at some of the factors that were involved. This here is an example of a survival analysis we did, looking at whether a child is placed into a family setting, either with a relative or in a foster home where it's a smaller setting, not a group home. They have much more success in not going missing, as well as staying in that setting for a longer period of time.
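The talk does not say which survival method was used. As one common option, the Kaplan-Meier estimator can be sketched in a few lines of Python; the function name and the sample durations below are made up for illustration and are not the team's data or code.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time an event occurred.

    durations: days in a placement.
    events: True if the youth went missing at that time, False if the
            observation simply ended (censored).
    """
    pairs = sorted(zip(durations, events))
    at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        n_events = 0   # youth who went missing at time t
        n_leaving = 0  # everyone leaving the risk set at time t
        while i < len(pairs) and pairs[i][0] == t:
            n_events += int(pairs[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical placements: days observed and whether the youth went missing.
curve = kaplan_meier([5, 10, 10, 20], [True, True, False, True])
for day, prob in curve:
    print(day, round(prob, 2))
```

Running the estimator separately for family settings and group homes and comparing the two curves is the kind of comparison the plot in the talk describes.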

Once we found these kind of interesting factors, that got us some credibility to keep working on this problem. And it certainly led to more questions as we went along. And so the extra questions that we had, there were too many and it was too dynamic, right? We were kind of going, well, what if this, what's going to happen if someone does go missing? How many are going missing? Where are they leaving or going missing from? And where were they possibly removed from to where they were placed? And so we started to think about, well, what could we do to answer these questions in a more dynamic fashion? And that's where Shiny came in for sure.

Shiny app demo: the missing OHC youth explorer

So this is called, we called it the missing out-of-home care youth explorer. And so we really were trying to explore this population of kids who have been removed and then placed under our care. And so we pulled the information from our state's child welfare system, which is that SACWIS I mentioned. We also pulled a bunch of other data together from the Census Bureau to understand the environment we're placing the kids into. And then we started to think of ways to visualize that.

So just from this dashboard view, we gave a quick view of the kids who were missing. So people, these are case managers, will use this application now to look at kids. And I should mention before I go any further that this is all anonymized data. So everything you're seeing here, we've truncated some of the data and we've anonymized names. So none of these are real children. It represents the same relationships though underneath.

So these are the kids that could be currently missing. So they can look on a dashboard. We can also look at placement providers who may be having a lot of kids go missing from them. Maybe those aren't the right places for us to be putting kids who we deem at risk. And at risk means maybe they aren't a known trafficker. In this case, a lot of the times we're looking at kids who maybe are involved in human trafficking. If you've seen the news and heard about Jeffrey Epstein, you know, trafficking is rampant. Milwaukee is one of the areas where it really has had a big impact. So we actually start to look at where we can safely place kids, somewhere that maybe isn't as risky of a situation.

So the first thing we thought of is, well, why don't we look at kids when they're placed together? So co-locations. We know when kids are together, and then we know when they go missing. So let's look at, like, a social network of this. And so let's try and build out what it might mean for these kids to be missing. And when they leave together, how does that visualize? So if you look here, this is using igraph and visNetwork, if you know some of the packages that are involved. And it's an interactive graph then of all the kids and the relationships. When have they been placed together? Thicker lines mean that they've been co-located together more often. So if I highlight a line, for instance, you'll see that there's been nine co-locations between Toshika and Rialta. And then bigger dots mean that they have more connections in the network.
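The co-location idea can be illustrated with a small sketch. The app itself was built in R with igraph and visNetwork; the Python below only shows the underlying counting, and the roster format, names, and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical roster rows: (placement, period, youth).
roster = [
    ("home_a", "2018-01", "Youth 1"),
    ("home_a", "2018-01", "Youth 2"),
    ("home_a", "2018-01", "Youth 3"),
    ("home_b", "2018-02", "Youth 1"),
    ("home_b", "2018-02", "Youth 2"),
]


def colocation_edges(rows):
    """Count how often each pair of youth shared a placement in the same period."""
    groups = {}
    for placement, period, youth in rows:
        groups.setdefault((placement, period), set()).add(youth)
    edges = Counter()
    for members in groups.values():
        for pair in combinations(sorted(members), 2):
            edges[pair] += 1  # edge weight, drawn as line thickness
    return edges


def degrees(edges):
    """Number of distinct connections per youth, drawn as dot size."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


edges = colocation_edges(roster)
print(edges[("Youth 1", "Youth 2")])  # co-locations between this pair
print(degrees(edges)["Youth 3"])      # connections for this youth
```

An edge list with weights plus a per-node degree is the kind of input a graph library needs to draw thicker lines for repeated co-locations and bigger dots for well-connected youth.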

And this is just the network of kids who are either currently missing or are related one step away from someone missing. Like if we looked at the overall network of placements, you'll see it's much busier. It's like this is just a big web of movement. But what we're really interested right now is in these kids who have gone missing and trying to investigate. So if I come back here and take a look at some of the kids, I'm going to click on one right here. So I clicked on a name and see I get information about them. We started to create a score of influential kids in the network. And so I clicked on a blue link. And this is someone who's at risk of being trafficked. They also have a lot of connections in this network. So it gives us an ability to kind of score them and say, yeah, there's some risk here.

But what I can do then is if we pick someone interesting in the network, we can actually go look at their individual network. So this is regardless if missing or not or how many steps away, this is all the people they've ever been placed with or associated with. And you'll see, again, I can kind of zoom in. And then if I look at the color coding of this network, suspected means that we actually have had someone, a case manager, and we have liaisons with law enforcement. We somehow have tagged them as it's suspected that they're involved in trafficking.

So again, we can look at that. If I click on a person here, like Magali Nibbs, or if we click, you can click away here and click on another one, you'll see it'll change here, Anna Lee Mody. So they'll have their information of where they're currently placed. And if someone goes missing, that might be a person that we might want to reach out to, or we might want to investigate where is Anna Lee. And so we can also add her. So if she's suspected of trafficking, we can add her to this network. And now you'll see, we'll start to get a picture of how do they know each other? How have they been placed together? When were they placed together? We can actually look at timeline views as well of this. And so these are some of the people they know together. So if someone is missing, and then like Enid was, is there someone else missing in this group? And this might be a reason why they've gone missing, or maybe we could help find people. And now you can also see, we're starting to get some of the red dots. And these are confirmed traffickers who we're still responsible for. They're placed in a group home in the city right now. And you can see they're one, two, three steps away as far as relationships go. So there definitely could be some trafficking involved.

So this is one view, I guess, of what we had done to understand what's going on and to be helping the case managers in making better placements. You may not want to place someone who's at risk if there's a confirmed trafficker at a location, and you have choices of where to put them in a group home, for instance, different group homes, you could place them in a different location. So it's really helping to inform that case management and placement services.

Geographic and census data views

Our second view then, so that was one view is a social network. If we look at it from a different perspective, we could also think about it from a geographic perspective. And this is where we're thinking about, where are kids getting placed? And where have they been removed from? Where are their support structures? As we're starting to think about placing kids in safe places, but also in places that they still have friends, you can't just take them out of their environment, plunk them down somewhere else and expect them to be successful, put them in a new school.

So as we collaborate with the social workers and the case managers, they help us to understand that there are some effects that maybe we could help inform them on, again, of where the best places are to place these kids. So here we're using leaflet, and I'll kind of do a recap at the end of all the technology behind this. But we're using leaflet markers and marker groups to kind of hold together all the placement providers. So this is a list of all the placement providers in the city. So this is Milwaukee, and you can see there's kind of a geofence around the different markers of where these placement providers are.

And then you can kind of see Milwaukee is kind of bisected by an east-west freeway; I-94 is right here. So we can kind of see north of 94, there's a lot more placement providers. And as we scroll, you'll see color-coded markers that are either a foster home, maybe a relative, or a group home. As we get to some of the green dots, you'll see these are group homes that are in the city. The names are changed, but these are actually different group homes across the city. And we can track who is there right now, if they've ever had a trafficker, and those pop-up bubbles on the markers will explain that.

So now we can start to get a good look at the city. And one thing the case management team noticed right away is how many more placement providers are to the north of I-94 versus the south. In a second, I can show you the removal map. And you can see we remove equally from both sides of the highway, but it seems like there's more placement providers being licensed, more group homes being licensed to the north of the city. So we're taking kids out of the south side and putting them on the north side, and then they're going to go missing. And it's just this kind of circular event that we know is going to happen. So now we're trying to think about how we can creatively put more of those group homes on the south side if neighborhoods can approve them, certainly.

One thing we did, so first we started thinking, okay, geographic placement. Then we started to think about what other factors are involved, what's important to know. So we actually brought in data from the Gun Violence Archive, because Milwaukee also has, you know, as many places do, violence. Gun violence is a big issue today. We brought in gun violence, and we can actually lay the gun violence events on top of this. So this is the gun violence for the last six months on the map. And here you can see we're putting these group homes right into such a dangerous area, right? All the group homes that these kids are being placed at are going into the same area of the city where a lot of our gun violence is occurring. So this is also very informative, because we do have people who advocate for us at the state and federal level. Showing information like this is very helpful to us from an advocacy perspective.

Q&A: data collection, org structure, and tool choices

Pat asked a question. How is all the data collected, cleaned, stored, and retrieved that fuels this great work? Yeah, that's a great question. So certainly the data science team is a small team. We don't have a ton of data scientists. We sit within the analytics group. And so we actually have data engineers in the analytics team who get shared. And so we use them to actually bring in a lot of this data, and we store it. We have a data warehouse that it gets stored in. And then we pull what we need into like an R data file for Shiny to use. But the data engineers do that cleaning and scrubbing and bringing the data together for us. I think the data scientists a lot of times will take that first pass and just take a rough cut of the data, think about what we need and look for quality issues and distributional issues. And then they'll automate that and productionalize it.

Jake asks, what are you using to link igraph to the selected youth info panel? Is that crosstalk or something else? Yeah, when you click on a point, it then populates that info chart. How you were like doing the selected point and having Shiny. So crosstalk is a package that will help link things together. But I was curious how you're getting that. I think we're capturing, I would have to look at the code because, I should have said at the beginning, this is something I worked on two years ago. This is one of our first Shiny apps. But I believe I'm just capturing the point and then I'm popping it up. I don't think I even use crosstalk. There are events that come off of that igraph, off this network. And I believe I'm capturing that and then looking at the panel. But I could follow up with you after for sure to show you what I did or talk about it.

The neighborhood stress model, it's a scoring model that looks at stressful areas of the city. It was developed at the University of Massachusetts. This was the first use case we did of this. And it actually proved very useful. So it was interesting, again, to see that where we have our homes, our group homes and our foster homes, mostly placed is in very stressful areas of the city. And what does it mean to be stressful? If you look at this, it looks at poverty, it looks at unemployment, public assistance, and then education factors from the American Community Survey and the census. And then we lay that back on that map for them, again, to understand that as we start to license more group homes and foster homes, we maybe don't want to always put them in the stressful areas of the city, right? I think it was an interesting way to illuminate that. And this is down to the census tract level.

So we've run that stress score now. And that's part of our ongoing modeling. Every year when the ACS gets updated in December, we rerun the stress scores for the city. And then we use this in multiple apps and analyses as we go as we look at population health inside Milwaukee and Milwaukee County.
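The talk names the inputs to the stress score (poverty, unemployment, public assistance, education) but not the formula. A generic way to build such a composite is to standardize each indicator across tracts and average the z-scores. The Python sketch below does that. The equal weighting, the indicator names, and the tract values are all assumptions for illustration, not the published model or the team's R code.

```python
from statistics import mean, pstdev

# Hypothetical tract-level percentages; higher means more stress.
tracts = {
    "tract_a": {"poverty": 38.0, "unemployment": 14.0, "public_assistance": 22.0, "no_diploma": 27.0},
    "tract_b": {"poverty": 21.0, "unemployment": 8.0, "public_assistance": 11.0, "no_diploma": 15.0},
    "tract_c": {"poverty": 7.0, "unemployment": 3.0, "public_assistance": 2.0, "no_diploma": 5.0},
}


def stress_scores(data):
    """Average of per-indicator z-scores for each tract (equal weights assumed)."""
    indicators = sorted(next(iter(data.values())))
    z_by_tract = {tract: [] for tract in data}
    for indicator in indicators:
        values = [row[indicator] for row in data.values()]
        mu, sigma = mean(values), pstdev(values)
        for tract, row in data.items():
            # An indicator with no variation contributes nothing to the score.
            z = (row[indicator] - mu) / sigma if sigma else 0.0
            z_by_tract[tract].append(z)
    return {tract: mean(zs) for tract, zs in z_by_tract.items()}


scores = stress_scores(tracts)
for tract, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{tract}: {score:+.2f}")
```

Because the scores are relative to the tracts in the input, rerunning them whenever the survey data is refreshed, as described above, keeps the ranking current.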

But this was definitely, as I was mentioning when I was answering some of the questions, this was really our first big Shiny app as a data science team that we developed. And so it was really over the course of, after we did that initial factor analysis, it was kind of during the course of 2018 that we worked on this. So it really came together well, and it gave us a foothold to kind of work with other teams throughout the organization. Once they saw this, we had a data symposium as an analytics team, and we were able to demo this and talk about the work we've done. And it opened up a lot of doors across the organization to partner with other people. So one success kind of just rolls into the next.

Growing the R practice at Children's Wisconsin

Kind of our analytic journey as an analytics team. I report to our director of analytics, and so this is our development as an analytic team. I thought it was interesting to kind of lay this out and then also think about our growth, how we started to use R during that same time frame, which is when we started to really bring R along in the organization. So it started back in 2013 before we could even think about using Shiny or anything else. I know Shiny was very much in its early stages at that point, but it was really just a limited number of people who kind of brought it forward and maybe evangelized it a little in the organization, got some small wins, with people thinking, oh, that looks cool, that ggplot looks cool, how did you do that? And then we started to kind of grow that.

At the same time, you know, our SAS and SPSS licenses come up for renewal, and we start to think about, well, could we use R for this? Why wouldn't we use R for this? Or we see something cool like the Shiny app, and you think, oh, my gosh, look at how you can really stand on the shoulders of giants as far as all the packages that are available. And then it just kind of kept snowballing to the point now where, as we start to look forward, we think, well, is this a scriptable-first solution? Is this something we're going to have to redo? And then, you know, we actually are bringing along other analysts in our analytics team to understand and be able to use R and apply it to the same requests that they would normally possibly use Excel or standard SQL reporting tools for as well.

We, you know, push out Office documents as we need to, generating them with rmarkdown for Word, or openxlsx for generating spreadsheets if we need to, and then schedule that all in RStudio Connect and push it out. So looking back, it feels like it's a short amount of time. I know it's probably six to eight years, but we really have come a long way as far as using R. And I think the whole community has as well.

Just a quick snapshot, I thought, just to kind of level set everyone on where we're at. We are really at this point now, in 2021, integrating ourselves into the IS environment. So we're actually using our Active Directory for logging in either to RStudio Workbench or to Connect. Our users use it. We use all the data warehouse and we have the secure links. So we don't have to kind of sit to the side as we were when we were first kind of doing the prototypes and proofs of concept. We actually have a full-blown kind of environment that's integrated. Again, this is just for the data science group; we still use like Tableau and QlikView and things like that, SSRS for reporting, as well as the line of business applications and their internal reporting.

COVID-19 response work

The COVID ones, everybody has a COVID dashboard. I was joking yesterday with Rachel and Shannon. And we were no different, right? We were trying to get all the information we could because there wasn't really good information at first. I know Johns Hopkins and their dashboard, it became the standard, the gold standard people kind of lived by or went to and referred to. But at the early parts of it, we were scraping the same kind of data they were and trying to get a picture of what's going on in Wisconsin, especially with kids, because at that point, we didn't know what the impact would be on kids. So we started to build out what I was thinking was going to be just a public or a population health kind of dashboard. Once we started to show this, then people became interested in adding in internal information.

And so a couple of use cases popped up during our response, including that dashboard you just saw the screenshot for. We also started to think about how, during lockdown, we still had to serve the kids who had acute cases or needed surgeries. So we wanted to make sure that they got the care they needed. And we had a very, very restricted list of people who could come to campus. So we generated Excel spreadsheets every day and pushed them out to the people. These are things that we couldn't quickly do in our EHR, but we were able to do that using R and RStudio and just kind of create and push that spreadsheet and give them exactly what they wanted. You know, instead of saying, well, here's a report you can run from a reporting server, we generated it. And there are other ways to do this certainly, but it kind of fit well with what we were doing already within RStudio Connect to extend that.

Same with surgical cases, elective cases were being canceled. The concern then became, are we, are we sure all the kids who had an elective case canceled were getting the care they needed? You know, at some point, did they reschedule? And so we started to think about how do we track that recovery? And then we also were monitoring patients being tested. When they came to campus, they had to have a test 24 hours before and also employees as they were working, we were doing testing for that too. We wanted to understand and inform our leadership that this was occurring and that the rates were low enough to feel comfortable. And then we did a bunch of recovery work. Also, we had to update our, you know, our time series models. And we start to think about how we were projecting volumes that come into different clinics and things like that. Obviously the volumes crashed during lockdown. So we updated those models, tried to predict best we could, because we didn't know what the new environment was going to look like.

Telehealth was a new thing for us also. We didn't have a lot of telehealth presence before COVID and the pandemic. So we started to monitor that and we gave them a couple of views of interactive charting, either with highcharter here or else ggplot just wrapped in plotly. And then you can see the huge spike we had in telephone encounters, for instance, right away. Those stopped being reimbursed, so they dropped off, but we could still continue to monitor that.

Testing. This was another nice add we did. We already had lab testing and lab reporting, but here we extended it quickly with some existing ggplots. And then we just pushed it out with blastula on RStudio Connect again, informing all of our hospital leadership of the testing rates, again, for patients as well as employees. Here's an example of that surgical case recovery email that was going out every day. At first, it was very scary, right? We were saying how many cases have been canceled and how many have been completed. And you can see it definitely took a while to tip back over past 50%, into late last summer, where we started to actually recover that casework and make sure that the kids had the surgery or any kind of procedure they needed performed. And here again, we pushed out information by service as well at the bottom of this email. And this went to the entire surgical services leadership, so they could say, well, this service is performing well, but maybe GI, for instance, wasn't getting all of their cases recovered that they should have, or they should have some follow-ups. So we would offer resources to help do that follow-up and contact patients and their families. And for this, we used, again, rmarkdown and blastula, pushing it from Connect. And then this was one of the first uses of gt, that new table package, which I think is a real nice package to use for creating tables if you're used to using kable.
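The recovery figure in that daily email is, in essence, the share of canceled elective cases that have since been completed, overall and by service. A minimal Python sketch of that calculation follows. The record format, service names, and function name are hypothetical, and the team's actual report was produced in R with rmarkdown, blastula, and gt.

```python
from collections import defaultdict

# Hypothetical records: one row per canceled elective case,
# with a flag for whether it has since been completed.
canceled_cases = [
    ("GI", False),
    ("GI", True),
    ("GI", False),
    ("Orthopedics", True),
    ("Orthopedics", True),
]


def recovery_rates(cases):
    """Percent of canceled cases since completed, per service and overall."""
    if not cases:
        return {}
    totals = defaultdict(int)
    completed = defaultdict(int)
    for service, is_completed in cases:
        totals[service] += 1
        completed[service] += int(is_completed)
    rates = {s: 100.0 * completed[s] / totals[s] for s in totals}
    rates["All services"] = 100.0 * sum(completed.values()) / sum(totals.values())
    return rates


for service, pct in recovery_rates(canceled_cases).items():
    print(f"{service}: {pct:.1f}% recovered")
```

A per-service breakdown like this is what lets leadership spot a service that is lagging on rescheduling and direct follow-up resources there.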

And then finally, we built another Shiny app, again, to kind of update and look at ways we could think about modeling what our response looks like. Here, orange is what we were expecting for volumes, and the black line is kind of that true crash. And then we were trying to model what it looked like in the days or months after the pandemic, when we started to come out of lockdown. And we used external data. So it was an interesting use of, like, Google mobility data and Apple, you know, we had their mobility data. So we could start to see movement in Milwaukee and Southern Wisconsin, Southeastern Wisconsin. We also got consumer spending data from Womply and then job postings from Burning Glass, I believe it was. So we could look at activity in the environment and try and look for correlations or see if there was any kind of variation we could tease out of that to see if that maybe could help us inform our models.

Q&A: buy-in, tool selection, and data access

One of the most upvoted questions right now is more around the org structure of your data science program and how you've gotten buy-in to build R into the analytics infrastructure.

Yeah, that's a great question. And I regret that I don't have a slide on that. So, our analytics team: back in 2012, 2013, we brought the majority of the analysts together into an enterprise analytics group. So instead of having them dispersed among the departments within Children's, we brought them together. It was at the same time we were bringing in our new EHR. And so it was a perfect time to kind of bring everybody together and get them all level set on data and data sources. And then at some point we would maybe spin them back out, right? But we used an analytics center of excellence instead, where we would reach out to any kind of power users to maybe help share that load. But most of the analysts still sit within this enterprise team. And that's the team that data science is part of. And so we have plenty of analysts. And then we have like two to three data scientists, including myself, who work on kind of these more strategic or longer term projects.

To get buy-in, I think, again, I touched on it a little bit, but I really think it's those small wins, where you can actually get in, kind of deep dive, learn all you can, and then produce something that is like, wow, or find some insight. And I know it's hard to predict that that's always possible. But I think we've been lucky in that we've had some very curious counterparts on the business side at the hospital that worked with us to help us get that success. And then once we did, they wanted to share it, right? And so we used the open source Shiny Server at first. But then we got to a point where we needed to be able to have that tied into Active Directory and share securely, because not everybody can see every application or analysis we do. And that's where I think we started to get that toehold to bring R in more fully.

The next most upvoted question is, how did you decide when to use Shiny, as well as like R versus Tableau or other UI tools?

Yeah, that's a great question, too. I always questioned that myself, right? It's like, how much do we want to take on as a data science team when I'm thinking about the program as a whole, because we need to do these on an ongoing basis. And we do have BI developers who are specialized in QlikView. And so I think the choice usually comes down to this: at first, we'll do mostly static and ad hoc kind of analysis. And then as we start to see if there's going to be maybe a little bit longer term need to have some interactivity, we'll bring in Shiny. Usually, though, it starts out with maybe R Markdown, maybe something interactive like a flexdashboard that's a little lighter weight. If it's something that's truly line of business, counts, volumes, things like that, or something we have an existing QlikView app for, we'll push the work towards that side. So that's kind of our decision point, I think.

Inpatient unit modeling with Monte Carlo simulation

So I'm going to spend maybe just five more minutes talking about something we're currently working on. We're thinking about, in the hospital, adding or kind of moving around some of the units and what kind of cohorts of patients they're responsible for. And this is coming out of the need for, as we know, there's kind of a mental health crisis maybe coming our way. And it's certainly true for children as well. So we'd like to think about maybe, do we add a mental health or behavioral health unit at the hospital, for instance, or do we partner or what do we do?

And so they had a couple ideas about what maybe we could do with groups of patients and placing them on different units in the hospital. But we didn't really have a good way to evaluate that. We know that outpatient volumes are increasing and inpatient volumes are probably decreasing. That's a standard trend in healthcare. But we didn't know what percentage of maybe a base year we would see. Like, say we take 2019 volumes and we think, well, what percentage of 2019 volumes will we recover in 2021 because of the pandemic? And so we created a tool that used kind of a Monte Carlo simulation of different ways that we could look at slicing up these volumes. And we did it by cohorts of kids. And then we placed cohorts of kids on units and projected out a year, basically using the different parameters that I was just referring to as far as sampling percentages.

Let's just say 2019 volumes: we know that maybe we're going to see 90% of the volume we saw in 2019 in 2021, or maybe in 2021 and 2022. But we know that that volume isn't going to be exactly the same kinds of kids. It isn't all going to be cardiac cases; it's going to be different kinds of cases, or maybe broken bones or whatever. So we wanted to simulate a bunch of times, just using Monte Carlo, of course, and look at what would be a likely scenario. And then, will it fit with our new placement model, knowing that some of the beds are going to be carved out for this new unit? So that's what this app was for.
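The resampling idea described here can be sketched in a few lines. This is only an illustration in Python, not the team's R implementation: the cohort labels, the base-year counts, the 85% to 95% recovery range, and the function name `simulate_year` are all assumptions made up for the example.

```python
import random
from collections import Counter

def simulate_year(base_year_cohorts, recovery_low, recovery_high, n_runs, seed=0):
    """Resample base-year admissions to simulate future yearly volume by cohort."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        # Draw the share of base-year volume recovered in this run (assumed uniform).
        recovery = rng.uniform(recovery_low, recovery_high)
        n_admissions = round(recovery * len(base_year_cohorts))
        # Sample with replacement so the case mix varies from run to run.
        sample = rng.choices(base_year_cohorts, k=n_admissions)
        runs.append(Counter(sample))
    return runs

# Hypothetical base year: one cohort label per admission.
base_year = ["cardiac"] * 300 + ["orthopedic"] * 500 + ["behavioral"] * 200
runs = simulate_year(base_year, recovery_low=0.85, recovery_high=0.95, n_runs=1000)

# Summarize total simulated admissions across runs.
totals = sorted(sum(run.values()) for run in runs)
print("median simulated admissions:", totals[len(totals) // 2])
```

Each run yields a different case mix at a different overall volume, which is the property the talk describes: the recovered volume is not assumed to be the same kinds of kids as the base year.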

And this is one that's still ongoing, because we're working with the hospital leadership to understand the different models they're suggesting and to see if it'll even make sense, looking at it with historical data and this modeling. So the very first step we did, of course, was show them what's going on with the hospital, right? Where do kids get admitted to? Where do they discharge from? And we actually track this more granularly, to look at all the units they might stop at, and if they go to imaging or lab, because we want to understand the movement throughout the hospital. And this is just kind of a retrospective look at that, using Plotly with a Sankey diagram here.

At the same time, we also looked at all the historical data. And you'll see now it's kind of an interesting chart, because we're showing what we call tarp levels. So if we get up to like 17 or 18, here we go yellow, and that might mean we need to change staffing or resource planning, or we may think about having to move kids to a different unit, possibly. And this distribution here is an example of all the hourly census values that we've generated for the hospital. And we've done this by unit, we did it by service, and we did it by kind of cohort of kids based on like diagnosis and attending providers. And again, we're looking at how we can find a representative way to look at the populations before we ever talk about doing any kind of simulation.
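As a rough illustration of the hourly census and tarp levels just mentioned, the sketch below counts patients present on a unit at the top of each hour and maps each count to a level. It is written in Python for illustration only. The yellow threshold of 17 follows the number mentioned in the talk; the red threshold of 20, the stay times, and the function names are assumptions.

```python
from datetime import datetime, timedelta

def hourly_census(stays, start, end):
    """Count patients present on a unit at the top of each hour in [start, end)."""
    census = []
    hour = start
    while hour < end:
        # A patient is present if admitted at or before this hour and not yet discharged.
        present = sum(1 for admit, discharge in stays if admit <= hour < discharge)
        census.append(present)
        hour += timedelta(hours=1)
    return census

def tarp_level(count, yellow_at=17, red_at=20):
    """Map a census count to a level; red_at is a hypothetical threshold."""
    if count >= red_at:
        return "red"
    if count >= yellow_at:
        return "yellow"
    return "green"

# Hypothetical stays on one unit: (admit time, discharge time).
day = datetime(2021, 1, 1)
stays = [
    (day + timedelta(minutes=30), day + timedelta(hours=2, minutes=30)),
    (day + timedelta(hours=1), day + timedelta(hours=3)),
]
census = hourly_census(stays, start=day, end=day + timedelta(hours=4))
levels = [tarp_level(c) for c in census]
print(list(zip(census, levels)))
```

The same census list can be built per unit, per service, or per cohort by filtering the stays before counting.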

So again, this is a retrospective look at that. And then we decided, all right, now that we have all this data, and we have these distributions, and we understand the distribution of kids and where they fit, let's kind of do this Monte Carlo simulation, right? And then project these cohorts going to planned units and look at kind of the activity map to say, when are we going to be at that red tarp level, where we know that we won't have enough staff or enough beds for a given unit? And then we can kind of score the different units and the different models to understand whether this model is going to fit or not. And here's your likelihood. And here's your confidence interval, saying this is what we think it might look like for a given year. And so we have a couple of different models created. And then we can kind of just run these scenarios.
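One way to turn simulation runs into a unit score with an interval is sketched below, in Python for illustration. The talk does not say how the likelihood or confidence interval was computed, so everything here is an assumption: the score is the share of simulated hours at or above a hypothetical red threshold, the interval is a simple 5th to 95th percentile range across runs, and the simulated census values come from a made-up normal distribution.

```python
import random
import statistics

def red_hour_share(census, red_at):
    """Fraction of hours in one simulated run at or above the red threshold."""
    return sum(1 for c in census if c >= red_at) / len(census)

def score_unit(runs, red_at):
    """Summarize red-hour share across runs as a mean and a percentile range."""
    shares = [red_hour_share(run, red_at) for run in runs]
    # quantiles(n=20) returns 19 cut points; the first and last are the
    # 5th and 95th percentiles.
    cuts = statistics.quantiles(shares, n=20)
    return {"mean": statistics.mean(shares), "low": cuts[0], "high": cuts[-1]}

# Hypothetical simulated hourly census for one unit: 200 runs of 30 days each.
rng = random.Random(1)
runs = [
    [max(0, round(rng.gauss(15, 3))) for _ in range(24 * 30)]
    for _ in range(200)
]
score = score_unit(runs, red_at=20)
print(score)
```

Running this for each unit under each proposed placement model would give comparable numbers that could be rolled up into a per-model summary.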

And so we've been helping them in this kind of a unique way, I think, of doing this simulation just to understand, does this even make sense to suggest a model like this, which is what we ended up doing. So we're working on this right now. This is kind of just a work in progress. But I thought it was interesting: at first we did the initial heat maps where we just looked at volumes, but there wasn't a lot of meaning until we applied that tarp level on top to say, yeah, when it gets to be this dark, it actually is a red tarp. And so we should be concerned at the end of the year. And this is like flu season, right, where it's starting to be worse.

And so we started to think about different ways to visualize that same thing. Instead of a heat map, we also wanted to say how green is green and how red is red. And so we actually put that just across a histogram like this. So we could kind of look at how deep into the red we were going, or how green we really were, to know if we had capacity, or if a unit could maybe take patients from a different unit as we start to think about kind of pop-off units for when we get to high volume situations.

There's definitely more work. We did some work on patient churn. I think maybe that would be a great segue for another talk at a different time. But we definitely are using Shiny heavily. And this one is, like I said, one of the most recent apps we've done. And so a lot of our learning has kind of fallen into this one. We've kind of switched over to Plotly now for most of our interactive charts, away from highcharter. I think the licensing is a little nicer, and so is the flexibility we're getting with Plotly, where we can do the Sankey diagrams, we can do these kinds of charts, and we can do a lot more manipulation of what we show in the visualization.

Closing Q&A

I did have just one final slide, probably just talking about scoring and measurement. When we think about data science, I'm always thinking about how to score things or how to put things in context for people. And so we actually do look at scoring the individual units and then providing a rolled up score for the entire facility, which is our Milwaukee campus hospital. So we actually do that same kind of level of scoring here. And then we actually will build up some kind of report cards for the different models that get suggested.

Thank you so much, Chris. That was an amazing presentation and awesome just to see all the different ways that you're using R. I know a lot of people would probably love to be able to use some of these Shiny apps as well and fit them to their own use cases too.

Matt had asked, how long does it take to go from concept to production for a project like this? Does the engineering work occur before or alongside your build?

Yeah, I would say it's very iterative. I mean, the COVID app we brought up within a week or two, because that could build on the other dashboards. So once we have something scripted, we kind of have these templates or skeletal apps that we can use to quickly pull something together. We definitely are leveraging our great analysts and our data engineers, who pull a lot of the data together for us before we ever have to work on a problem. So that inpatient modeling I just showed you, that work started for me at the end of April. So really, that app and the simulation runs have come together since April. So I think it's very rapid. And I think that's the power of R, right? We just can be so iterative and responsive in a short amount of time once you're, you know, fluent in it, of course.

Mara had asked, what type or amount of clinician involvement do you have on your team?

Yeah, hopefully I made that clear. I mean, for sure, I don't know all this stuff. We definitely are partnering every step of the way. We have a CMIO who works very closely with analytics. Like I said, our enterprise analytics is not in the IS group. We're outside of IS. We're within our health management group, which is similar to a population health team. And so we actually have nurses, RNs, on that team as well. But then for any kind of request that comes in, we're always working with providers side by side, which is so enjoyable. I mean, that's one great thing about Children's. We get a lot of involvement, and that makes us successful, I think. It's also kind of a metric we use as we start to look at new projects: how invested the business side is in it before we go too far with it.

Yeah, so we have created an R in Healthcare Slack channel so that, you know, the conversation can continue after this session and beyond. I know that there are a lot of great questions for Chris, and I think some of those questions might be good topics for broader discussion as well. So please feel free to join that Slack group. And we're also going to be sending out a Google form if anyone wants to suggest speakers or volunteer to speak. We'd love to do these more regularly. I think this was an awesome talk, and I'd love to see more from members of the community as well.

Thank you, Shannon. And I will echo everything that Shannon just said. We'd love to have these more regularly as well. So thank you all so much for joining, and thank you so much, Chris. This was amazing.

Oh, thank you very much. Thank you, everyone.