
Brendan Graham | Using R Bridge the Gap in Electronic Medical Record Reporting | Posit (2020)
Electronic Medical Records (EMRs) are a treasure trove of information, but tend to fall disappointingly short when it comes to visualizing and reporting data in a user friendly and intuitive manner. Building reports in an EMR can be a frustrating experience; the developer is at the mercy of how the data is stored within the EMR and the available EMR reporting tools can be bland and uninspiring. But reporting on data in the EMR doesn't have to be this way! Combining the data-rich EMR with R's robust reporting capabilities benefits both developers and consumers of data. This talk will describe how a cross-departmental project team uses an internal R package, RMarkdown reports scheduled via R Studio Connect, and an interactive flexdashboard app to quickly implement solutions to gaps in the reporting capabilities of the EMR. The flexibility of R relative to EMR reporting tools facilitates a design thinking approach to reporting allowing for more user input, customization and quick iteration. Furthermore, the web-based app we developed is able to be embedded within the EMR itself allowing for a more streamlined workflow
image: thumbnail.jpg
Transcript#
This transcript was generated automatically and may contain errors.
Let's welcome our final speaker, Brendan Graham, from Children's Hospital in Philadelphia, who's going to talk about shiny new things and using R to bridge the gap in EMR reporting.
So hi, I'm Brendan Graham, and I'm a clinical data analyst at CHOP. Today I'm going to be talking about a project kind of centered on scheduling hospital admissions and how I used R not only to kind of explore and analyze the data, but to develop new and useful kind of workflow tools that are being used every day in the hospital that kind of go above and beyond like the reporting capabilities of the electronic medical record software.
So before I get started, this is the medical room, so I'm going to quiz everybody. Like why do we need to focus on hospital admissions? So like kind of by show of hands, who here kind of thinks that hospitals kind of are operating optimally around like 60% capacity? Show of hands. 70% occupancy? 80? 90? All right, I see a few hands. What about like 100% occupancy is like the ideal occupancy?
But that's not, is that optimal? So really the kind of the range is really like we think of as like 80 to 85% optimal. You know, you always want some excess capacity to be able to handle like an influx of patients if you need to.
So with that in mind, here's kind of what CHOP's hospital occupancy is kind of looking like over the last few years at the monthly level with that optimal range kind of shaded in. So this is kind of why we're focusing on projects around hospital admissions. And you know, nine out of the last 12 months we've been operating at or above 90% occupancy in the hospital. And so what the data is showing us and what people are really feeling is like this new, this high occupancy is really like our new norm, and we kind of need to do a better job of like managing our capacity given this like sustained increase in the demand for the services.
Understanding elective surgical admissions
So as you know, operating at such a high occupancy really impacts lots of areas around the hospital. I kind of don't need to go through all of them as we're all probably familiar with these. But to address these, we have a whole series of projects under our capacity management portfolio in terms of focusing on operational ways that we can help manage our capacity. And the one I'll be talking about today is really focused on understanding our elective surgical admissions. And the goal of this project is really to find ways that we can use data right now to help operational leaders plan for these high occupancy days ahead of time and act if they need to.
So just in case anyone's not familiar with sort of hospital lingo here. So a hospital admission is really just like a visit where a patient spends one or more night in the hospital. And so by, you can probably guess by extension, a surgical admission is a visit where a patient comes to the hospital for surgery and then spends one or more night. And we do have sort of non-surgical admissions as well, but those are out of the scope of this work.
So if you have a surgical admission, we can classify that as sort of an emergency admission, meaning that a patient had to have emergency surgery that required an admission. So think of getting your appendix removed or fixing a broken arm, things like that. But we also have the other classification of these elective or scheduled procedures. So these would be things like scheduling a knee surgery for next month or scheduling your child to get their tonsils removed in the summer when they're out of school.
And so we're kind of focusing on these elective surgical admissions because we think since these are being scheduled and not as urgent, we should be able to do a better job of scheduling when these are happening given our capacity constraints that I mentioned earlier.
And so we kind of dove into the data and what we saw was that the surgical admissions are highly variable. And this high variability really can exacerbate a lot of the issues I mentioned before. Looking at August of 2019, we had one day where we had 10 surgical admissions and a different day in that month where we had almost three times that amount. And so you can imagine if you're an operational leader, we have flow facilitators that are managing how patients are moving through the hospital. A day with 10 admissions feels a lot different than a day with 30, especially when the units are already really full.
And so we kind of made this our mission to understand the causes of this surgical admission variability and try to find ways to maybe contain or minimize it or make it a little bit more smooth, a little bit more predictable so that we can really help with some of the capacity constraints.
The scheduled admissions app
So since we're looking at scheduled admissions, we thought we should jump into how things are being scheduled and the scheduling process. So I'm not sure if this is familiar to some of you at your institutions, but at CHOP, surgical scheduling is highly decentralized. So what that means is within the Department of Surgery, we have several subspecialties. So general surgery, orthopedics, urology, neurosurgery, so on. Each of those subspecialties has a group of schedulers who are scheduling for those surgeons and those surgeons only. And their real concern is feeling that that surgeon's OR time. And there's not a real good mechanism to kind of calibrate what people are scheduling in terms of admissions across all of these surgical divisions.
Kind of even if they wanted to, that tool didn't exist in the medical record or outside of the medical record. And so as part of this work, we were thinking, you know, we have access to all of this scheduled data. Like we could make an app that shows across all the divisions what's being scheduled and present it to people in a way that can highlight potentially problematic days in the future. And so to that end, we developed our scheduled admissions app.
So now for the first time in the institution, our scheduled surgical admission volume is now exposed across the hospital and is available for anyone to see. So how this is sort of oriented is like each facet you're seeing there, each kind of group is a month. And each row is a different unit. So the top row there is our main surgery inpatient unit. And the lower row is our PICU, our intensive care unit. Each cell and every cluster is a calendar. So each cell is a day. And the number in each cell, if you can see, I know it's kind of hard to see, but each cell is numbered. And that number represents the number of scheduled surgical admissions for that day for the next four months. And we were able to color code the cells based on a combination of kind of the historical average for that month and also kind of a relevant operational threshold.
So our PICU is like a special case where it's like very full. And so we wanted to be alerted earlier for potentially problematic days. And so that threshold is actually much lower than like what sort of the average would be. So the idea is a scheduler can use this tool when they're scheduling to be able to see maybe a good or a bad day to schedule a patient. And remember, the goal here is to really minimize that variability. So in an ideal state, we would never have like a green day and then a red day and then a green day. Like we'd kind of all want it to be as smooth as possible.
And so the app is interactive. You can kind of hover over each day to see the actual dates that you're looking at. And if you're scheduling for a surgeon who only operates like on Wednesdays, you can filter just to see the Wednesdays and all of the other days sort of disappear and all the noise is gone. And you can see like what do the Wednesdays in the next four months look like. You can see like where there might be gaps in the schedule that you maybe want to fill in or red days that you'd maybe want to avoid.
Building the app with R
So I'll get kind of into the impact and kind of how this all played out. First I want to kind of talk about how we built this app, why we built this app in R, and kind of how we use the electronic medical record data at CHOP to go from the medical record itself into R.
So every day, we have the schedulers who are scheduling in the medical record itself. And then every night, all of that data goes into our data warehouse. Not only the scheduled data, but all of like the hospital data for that day goes into our data warehouse. And as an analyst, we're writing SQL queries against that data warehouse to build up sort of the data set that you need for an analysis. In my case, I need to know the scheduled surgical admissions for the next four months by unit and by day.
And so our data engineers at CHOP have set up Apache Airflow, which we kind of use as like analyst-owned ETL. And this lets analysts at CHOP, who are sort of the content experts, create and own data marts in the data warehouse with little to no data engineer involvement. And so Airflow kind of runs my scheduled admissions SQL query every morning and then creates a table in the data warehouse of just those results for me. So every day, this is updating for the next four months. It's sort of like a rolling table. And so that table now exists in the data warehouse as its own like standalone table. And that's the table that I bring into R. It's like almost set up. It has all the raw data that I need and just the data I need.
But I need to bring it into R to kind of organize it a little bit and get it ready to present. So I bring it into RStudio. And I use a combination of some tidyverse packages, namely like dplyr and luberdate, as well as the Rocky package, which none of you have ever heard of. And I'll get into that in a sec, to kind of organize and get the data ready. The app I showed earlier was actually a flexdashboard, which is kind of just like a fancy R markdown document with some shiny under the hood powering the reactivity. And it kind of provides us like a nice simple layout for everyone to use. All of the charts that you saw were made with the high charter package. It's a really great package for creating interactive charts in R. It's like they're very sharp and very snappy. And they look great, like right out of the box. And that allows the reactivity with that hovering tool tip I showed, you can kind of customize that. And they're actually heat maps, just designed to look like calendars. And we're able to kind of discreetly color code in the days kind of using that heat map style. And then finally, it's published on RStudio Connect, which allows anyone in the institution with like the link to go to the app and see what this is.
The Rocky internal package
And so to get into Rocky, I couldn't come to RStudio and not talk about Rocky, which is CHOP's internal package that we developed to help us gather, explore, analyze, and present data. And it's called Rocky because the office that I'm in used to be called the Office of Clinical Quality Improvement, or OCQI. And then we had R, OCQI, and then Rocky. And then we're based in Philly. And then the Rocky thing just kind of snowballed. And it's called Rocky. And we have a nice little hex sticker there with our Rocky holding up the Q.
So like I said, it's available to anyone in the organization. And it's like really easy to install. We have it hosted on a CRAN-like repository within CHOP so that we don't have to install it from GitHub. We just have, we edited everyone's R profiles to kind of point to this repository to install, which is great.
So how did I use that in this specific instance? Well, I really just used it to get the data from our CDW to R. So before, if you're kind of familiar with getting data from databases, you kind of have to do a few steps, right? You have to set your connection. You have to reference your SQL file. You have to execute the SQL file. You have to disconnect from the database. And then you can start doing all your stuff. With Rocky, and specifically the run SQL file that we implemented, all of that sort of abstracted away from the analysts. And I just have to remember one function and where I save the file.
And this is great for a couple reasons. As a new hire coming in, you don't have to have all of these steps as you're learning everything else. And then you have to learn these little steps to connect the database. We kind of abstract all that away. And secondly, there's no more passing around this sacred connection string from user to user or from yourself copying and pasting it from document to document, right? It's like just one function to remember. It's very easy. So this is the end of my rant. But if you have the time, I definitely recommend developing an R package internally to help solve some of these institutional or administrative problems.
if you have the time, I definitely recommend developing an R package internally to help solve some of these institutional or administrative problems.
Why R instead of the medical records software
So kind of getting back to the app. So why did we use R for this and not the medical records software itself? So the medical records software might be kind of preferred because it's in the scheduler's workflow. They're in that software doing their work. But it's kind of a closed environment. And we can't develop in there easily or quickly. It would require, like, outside resources of the other developer's time. But with R, you know, we can do this quick iterative test of change and invite the users to the table and do this sort of design thinking approach where we're kind of asking how do you guys want to see it?
And the other great thing is if it turns out not to be useful, we don't mind just walking away from it. Because it was very quick and easy to start up. And you can imagine starting a whole big development project with outside resources and then halfway through saying, actually, never mind. That's not going to feel great. But with this, because our startup costs are so low, we can get things up and running and then, you know, quickly pivot if it's not working out.
So the other good reasons we did this in R is that we have full control. You know, we can have control on RStudio Connect in terms of who can access it. What it looks like. So the app in its final version that I showed you, it looked nothing like that at the start. But because we invited the schedulers to come and look at it, we kind of got their feedback and we were able to make some changes. And finally, it kind of serves now as a single source of information. We don't have any more conflicting reports that people may be generating themselves in the medical record. You know, you could inadvertently be filtering for things you're not aware of and data doesn't quite match up. And it kind of solves that problem.
Impact and dynamic email alerts
So as far as the impacts of the app. So remember, before this app, there wasn't really a place to see this information in one place. And now there is. And while our initial audience really was the surgical schedulers, and at the point of scheduling, we had other operational leaders, like medical directors, nurse managers, kind of picking up on this. And finding it useful. These were leaders who were used to seeing things maybe 24 to 48 hours in advance. This very near-term planning. But now they're able to see at least the surgical admissions up to four months in advance. And it kind of took on life as this new early warning system.
So now, rather than trying to maybe influence how things are being scheduled, we're showing the decision makers how things are already scheduled and having them react to it to be able to kind of plan for potentially high admission or high occupancy days ahead of time.
And this became kind of apparent one day last May. I started to get emails from nurse managers and medical directors, the Perriott medical directors, saying, you know, we have a day two weeks from now where we have 22 admissions to the floor. Which is way too high, and we need to start thinking about rescheduling or maybe canceling some cases. Which is a big deal. We never want to do that. You never want to cancel an elective case, especially for capacity reasons. That's not even a medical reason to reschedule something. Which is, we never want to do that. But if we have to do that, we should probably be doing it as far in advance as we possibly can. It's better to do it two weeks in advance than two days in advance. Even though obviously we never want to do it.
So I was like, sure, I can get you this data. Here's a patient list. Here are those 22 patients. Here's surgeons doing it. You can now start the conversations, do stuff. And so they were able to kind of tweak that day. And then I started thinking, well, that's happened once. It's probably going to keep happening. And I should set up a system so that I'm not sort of responding to these as these days are coming up.
So to kind of get around that and to do that, I scheduled these dynamic emails with R. And so basically, based on a threshold, an email gets sent off to these operational leaders with a patient list for these problematic high-occupancy days so that if they need to take action, they can. And so it kind of gives them the information when they need it and only when they need it.
So there's an example here with some of the patient information blacked out. But it's very simple. It's just a PDF. It's an HTML document that's attached to an email that says, here's the patients for this day. And it just gives them the information to act if they need to. And so kind of between this email and the app I showed, we have these operational leaders really making daily operational decisions using these workflow tools that we made with in R. And I think it really highlights the fact that R is more than just an analysis kind of language for modeling. You can really make workflow tools that are used every day. And it's been really kind of rewarding to have people use this stuff.
I think it really highlights the fact that R is more than just an analysis kind of language for modeling. You can really make workflow tools that are used every day.
So why did we use R for the emails? Well, really, it's all about the conditional sending of the email. So there's a code chunk on the screen. The very end of my script, there's just a simple if statement that checks if there's data in a table. And that table above basically says, if a threshold is met, show me the patients. If it wasn't met, show me nothing. So if that table has data in it, we send the email. And if we don't, we just suppress sending the email. And that's it. It's really simple to set up. And with Connect, it's really simple to kind of schedule this.
So this is an example of the scheduling pane. So I just schedule the report to refresh daily. And I send the email after the update. And that's sort of it. And so I hear that people are using this a lot. And there was a day where we had eight admissions to the PICU, which wasn't going to happen. So they were able to use this email to identify who those patients were. And they were able to successfully reschedule two of them ahead of time, which kind of made that day flow a lot easier.
Takeaways and next steps
So kind of for big takeaways here, I think, as we all know, but it's really highlighted in some of this work, is that R really allowed us to have this fast iteration and user testing. And while the initial audience was those surgical schedulers, and we were really trying to narrow down that variability, we didn't really have sort of the tools or a parallel process outside of that app to really provide a lot of governance to that and make that really stick. And this project really highlighted that.
Who's in charge of saying no? That wasn't obvious before this. And this kind of brought that to light. If we have more admissions than we want, who makes that call? Is it surgery? Is it anesthesia? Is it hospital operations? What do we do here? So that was kind of a bad and good thing that came out of this app. And we're actually circling back to address that now.
These workflow tools are really allowing people to take action and make decisions. Because for the first time, all of this information is kind of exposed throughout the hospital. And what I'm finding is seeing this data in this new way, this long time horizon, is now leading to kind of new questions being asked of the data. We're seeing the schedule now four months in advance, and we're thinking, how can we take what we're seeing and schedule new admissions in a way that really does smooth that variability out?
And then to that end, and to the next steps here, really about intentionally scheduling things. And to do that, we think we need to be able to try to be able to predict some sense of like have some sense of the future hospital occupancy, both the scheduled and the emergent population. We need to align all the scheduling processes, which are very different between surgical and non-surgical. And then we think if we can do all of that, we can almost provide like a recommendation system to the schedulers for like the quote best day to schedule that admission from an operations point of view. And then there's obviously lots of other things to consider. Is the patient available? Is the surgeon available? Does it need to happen within a certain window? But we think as a starting point, like we definitely have the opportunity to provide some decision support to the schedulers using all of this data that we now have access to and have been showing people. So fingers crossed, but maybe next year I'll be back here talking about the results of this project.
Thank you for coming. And if you have any questions, and also we're hiring. So come find me afterwards if you are interested in joining the team. Thank you.
Q&A
So we do have a number of questions. First from Matt, given that the occupancy is highly variable, does the historical data provide a relatively consistent picture year over year?
I mean, there's definitely some seasonal patterns in terms of like we have the surgical and the medical populations are kind of flipped. So surgical is like very high in the summertime because we're a children's hospital, so kids are off school. They're having all their elective procedures then and it kind of dips in the winter. But at the flip side, the flu season is kind of peaking in the winter. So there is definitely some seasonal and like year over year, like larger, longer term trends. But day to day, it is very variable.
So Nick asks, is high occupancy important in the PICU? And are there departments where you don't want to see high occupancy? For example, it could be an indicator of something else that's wrong.
I don't know. I think we never really want a unit to be fully maxed out. You always want some wiggle room. As far as something else going on, I think there's definitely some opportunity to look at has there been changes in like the length of stay for any certain population or procedure in each unit. And that's something we're considering too when we're thinking about like scheduling patients in the future is like thinking about length of stay. So that's something we'll definitely have to look into.
So Steven asked, given that surgical scheduling is highly decentralized, how do you incentivize or force surgical schedulers to use your app in their workflows?
Good question. You have to get somebody really important involved. It is hard. But I think for better or worse, we never really had to think about it until now because now we're so packed, we have to start making these decisions. So the lack of governance was sort of hidden behind the fact that we never kind of needed to worry about it in the past. But now that we do, it's kind of forcing us to deal with it for sure.
So Gabriel asked, do you adjust for length of surgery? An orthosurgery can last 12 plus hours. Others can be much shorter. We need to know how many hours of surgery are scheduled.
That's a good point. I think when it comes to suggesting a day to have the case, you have to make sure that there is the OR time available for sure. So like, I don't know, like ear tubes take very short, but orthopedic surgeries take very long. So you have to factor in not only the patient availability and the surgeon availability, but like that time and the projected length of stay. So there's lots of definitely lots of things we need to think about for sure.
Tiffany has a technical question here. Is there a way to have a Connect RStudio plus Jupyter Authenticate via LMS Canvas via LTI, for example, similar to JupyterHub? I don't know many of those words, so I don't know. I'm sorry.
All right. So last question. Does rescheduling happen very often for surgeries?
Not often. And when it does, it's usually because of like a medical reason, the patient's sick, they maybe ate food too close to the surgery and we have to reschedule it. So we do track the reasons why cases are being rescheduled. And so that's actually one of the things we want to use as sort of like a metric here is like how often are we canceling or rescheduling cases due to capacity reasons specifically. So luckily we do have the ability to track that. So it doesn't happen often, but when it does, we are able to at least know why. Well, thank you. Well, let's thank Brandon and all of our speakers.
