Kolbi Parrish & Andy Pham | R Markdown + RStudio Connect + R Shiny

Transcript

This transcript was generated automatically and may contain errors.

Welcome to our talk on R Markdown, RStudio Connect, and R Shiny. We're going to serve you a three-course meal today on how we automated our data processing, error logging, and alerts.

So it's time to meet the chefs. My name is Kolbi Parrish, and I'm an informatics specialist at the California Department of Public Health. I'm working there through UCSF.

Hi, my name is Andy Pham. I'm also working at the California Department of Public Health, also for UCSF.

So now that we have our log set up and our logging is happening, I'm going to cover a little bit about how we implemented error alerts. It was relatively easy to do: we just added a little extra code at the end of each R Markdown chunk, outside of the tryCatch function, and in the event of an error, we essentially used blastula to send out an email.
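The alert pattern described here can be sketched roughly as follows. This is not the speakers' actual code. The helper names run_step and send_alert, the email addresses, and the credentials file are hypothetical, and the email call is left commented out so the sketch runs without an SMTP server. The blastula functions used (compose_email, md, smtp_send, creds_file) are part of that package.

```r
# Hypothetical helper: run one step inside tryCatch and record the outcome
# instead of letting the error stop the whole R Markdown document.
run_step <- function(step_name, expr) {
  tryCatch(
    {
      force(expr)  # expr is evaluated lazily, so any error is raised here
      list(step = step_name, ok = TRUE, message = NA_character_)
    },
    error = function(e) {
      list(step = step_name, ok = FALSE, message = conditionMessage(e))
    }
  )
}

# Hypothetical helper: email the error details with blastula.
send_alert <- function(result, to, from, credentials) {
  email <- blastula::compose_email(
    body = blastula::md(
      sprintf("Step **%s** failed with error: %s", result$step, result$message)
    )
  )
  blastula::smtp_send(
    email,
    to = to,
    from = from,
    subject = paste("Process error in", result$step),
    credentials = credentials
  )
}

# Simulated failing step (the error text is made up for illustration).
result <- run_step("load_data", stop("input file not found"))

# At the end of the chunk, outside tryCatch: alert only if the step failed.
if (!result$ok) {
  message("Alert needed for step '", result$step, "': ", result$message)
  # send_alert(
  #   result,
  #   to = "team@example.org",                          # placeholder
  #   from = "alerts@example.org",                      # placeholder
  #   credentials = blastula::creds_file("smtp_creds")  # placeholder file
  # )
}
```

Keeping the alert outside tryCatch means the chunk always finishes and records its status first, and the email step only has to look at that status.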

Process monitoring app

So I'm going to turn the floor over to Andy to cover how we took those log files that were in JSON-based format and turned them into something a little more digestible.

All right. Thank you, Kolbi. So now you have your automated processes set up and you have your email alerts for error logging. Now let's say we get an email saying that there's an error. The next step is to troubleshoot, right?

However, errors rarely occur in isolation. In order to fix an error efficiently and to prevent it from occurring again in the future, you want to be able to find out whether this error is occurring in both your development and production environments, and whether it has occurred multiple times throughout your processes.

And so before the process monitoring app, what we would have to do is go through the error and process logs. And as Kolbi mentioned before, these are in JSON format, and there are probably hundreds to thousands of lines of this. So you'd be pretty much abusing your Control-F to try to find what errors occurred and where they occurred. And that's not to mention that since we're logging almost every day, we'd have to go through all these files to see if there's some kind of historic pattern in these errors.

And so going through all this, I would equate it to trying to eat a melting ice cream sundae. Everything's mixed together, different ice cream flavors are mixing, and you're tasting two or three different flavors, and you're not sure if you like it or not. It's very interesting. However, our approach was to season in some R magic and make a more beautiful ice cream sundae where all the components are easy to see and you can get to where you need to be quickly and efficiently.

And so this was our process for our process monitoring app. We took a bunch of messy logs, applied some loggit, R Shiny, Plotly, data tables, and other R visualization packages, and turned them into a dashboard where we could view both how our processes were doing and the errors we got within one single dashboard.

However, we first needed to be able to read all of these logs into something that was R-readable. Fortunately for us, loggit provides this function already. There's a function called read_logs where, if you pass the path of your error file, it'll read in that error file and return an R data table that's easily filtered. So now we could filter for errors, then by a specific process, and then within a specific time frame.
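The filtering step might look something like the sketch below. It assumes the logging package mentioned here is loggit, whose read_logs function returns a data frame with timestamp, log_lvl, and log_msg columns. The process column, the file path, the helper name filter_logs, and all the sample values are hypothetical, and a stand-in data frame is used so the sketch runs without any log files.

```r
# Hypothetical helper: keep entries of one level, for one process, within a
# date window. Assumes columns timestamp, log_lvl, log_msg, plus a custom
# `process` field (an assumption about how the logs were written).
filter_logs <- function(logs, level, process, from, to) {
  log_date <- as.Date(substr(logs$timestamp, 1, 10))
  keep <- logs$log_lvl == level &
    logs$process == process &
    log_date >= as.Date(from) &
    log_date <= as.Date(to)
  logs[keep, , drop = FALSE]
}

# With real logs this would be something like:
#   logs <- loggit::read_logs("logs/errors.log")   # hypothetical path
# Stand-in data so the sketch runs on its own (all values are made up):
logs <- data.frame(
  timestamp = c("2022-07-01T08:00:00-0700", "2022-07-01T08:05:00-0700",
                "2022-07-02T08:00:00-0700", "2022-07-09T08:00:00-0700"),
  log_lvl = c("INFO", "ERROR", "ERROR", "ERROR"),
  log_msg = c("started", "step failed", "step failed", "timeout"),
  process = c("daily_load", "daily_load", "weekly_report", "daily_load"),
  stringsAsFactors = FALSE
)

# Errors from one process during the first week of July.
errors <- filter_logs(logs, "ERROR", "daily_load", "2022-07-01", "2022-07-07")
print(errors)
```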

So this was very useful for putting up the dashboard initially. And so we fed this data into a data table call and built error data tables where we could view all of our errors in relation to each other, when they occurred, and if they occurred within any other processes. We additionally fed our process logs into another data table. And because we had set up our loggit calls within each R Markdown chunk, we could now see how long each R Markdown chunk took within the process and identify any bottlenecks as needed.
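One way to get per-chunk timings out of such logs is sketched below. It assumes each R Markdown chunk writes a log entry when it starts and another when it ends, tagged with a custom chunk field. That layout, the helper name chunk_durations, and the sample values are assumptions made for illustration, not details given in the talk.

```r
# Hypothetical helper: seconds between the first and last log entry per chunk,
# sorted so the slowest chunk (the likely bottleneck) comes first.
chunk_durations <- function(logs) {
  times <- as.POSIXct(substr(logs$timestamp, 1, 19),
                      format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
  per_chunk <- split(times, logs$chunk)
  secs <- vapply(
    per_chunk,
    function(x) as.numeric(difftime(max(x), min(x), units = "secs")),
    numeric(1)
  )
  out <- data.frame(chunk = names(secs), seconds = unname(secs),
                    stringsAsFactors = FALSE)
  out <- out[order(-out$seconds), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Stand-in process log (made-up values): a start and end entry per chunk.
process_log <- data.frame(
  timestamp = c("2022-07-01T08:00:00-0700", "2022-07-01T08:02:00-0700",
                "2022-07-01T08:02:00-0700", "2022-07-01T08:12:30-0700",
                "2022-07-01T08:12:30-0700", "2022-07-01T08:13:00-0700"),
  log_msg = rep(c("chunk started", "chunk finished"), 3),
  chunk = rep(c("import", "clean", "export"), each = 2),
  stringsAsFactors = FALSE
)

durations <- chunk_durations(process_log)
print(durations)

# The result could then be shown as an interactive table, for example with:
#   DT::datatable(durations)
```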

We also fed some of the process log data into Plotly. And so we created this graph where we could visually see how our processes were doing in relation to each other and in relation to our development and production environments. We could also see if any processes were running much, much longer than we expected so that we could get in there and fix them before anything broke.
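A rough sketch of that kind of graph follows. The run history, the column names, and the rule for what counts as "much longer than expected" (more than twice the median run time for the same process) are all assumptions chosen for illustration. plot_ly is the standard plotly entry point, and the plot is only drawn in an interactive session where plotly is installed.

```r
# Made-up run history: one row per process run.
runs <- data.frame(
  process = rep("daily_load", 6),
  environment = rep(c("development", "production"), each = 3),
  run_date = as.Date(rep(c("2022-07-01", "2022-07-02", "2022-07-03"), 2)),
  minutes = c(10, 11, 12, 10, 12, 45),
  stringsAsFactors = FALSE
)

# Hypothetical rule: a run is slow if it took more than `multiple` times the
# median run time for the same process.
flag_slow_runs <- function(runs, multiple = 2) {
  typical <- ave(runs$minutes, runs$process, FUN = median)
  runs$slow <- runs$minutes > multiple * typical
  runs
}

# Line chart of run time per day, one line per environment.
plot_runs <- function(runs) {
  plotly::plot_ly(
    runs,
    x = ~run_date, y = ~minutes, color = ~environment,
    type = "scatter", mode = "lines+markers"
  )
}

runs <- flag_slow_runs(runs)
print(runs[runs$slow, ])

if (interactive() && requireNamespace("plotly", quietly = TRUE)) {
  print(plot_runs(runs))
}
```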

And so we took all of these different pieces, the error data tables and the graphs, and we used R Shiny to put it all into a dashboard. And so now we could get where we needed to be really quickly and see both our development and production errors and processes in one setting.
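Putting the pieces together in Shiny could look roughly like this. The layout, tab names, and the function name build_monitor_app are hypothetical, and the data frames are tiny made-up stand-ins. The actual dashboard from the talk is not shown in the transcript, so this is only a sketch of the general shape, using standard shiny, DT, and plotly functions.

```r
# Hypothetical app builder: one tab per view, all in a single dashboard.
build_monitor_app <- function(error_logs, durations, runs) {
  ui <- shiny::fluidPage(
    shiny::titlePanel("Process monitoring"),
    shiny::tabsetPanel(
      shiny::tabPanel("Errors", DT::DTOutput("errors")),
      shiny::tabPanel("Chunk timings", DT::DTOutput("timings")),
      shiny::tabPanel("Run history", plotly::plotlyOutput("history"))
    )
  )

  server <- function(input, output, session) {
    output$errors <- DT::renderDT(DT::datatable(error_logs))
    output$timings <- DT::renderDT(DT::datatable(durations))
    output$history <- plotly::renderPlotly(
      plotly::plot_ly(
        runs,
        x = ~run_date, y = ~minutes, color = ~environment,
        type = "scatter", mode = "lines+markers"
      )
    )
  }

  shiny::shinyApp(ui, server)
}

# Stand-in inputs (made-up values) so the app can be launched on its own.
error_logs <- data.frame(timestamp = "2022-07-01T08:05:00-0700",
                         log_lvl = "ERROR", log_msg = "step failed",
                         process = "daily_load", stringsAsFactors = FALSE)
durations <- data.frame(chunk = c("clean", "import"), seconds = c(630, 120),
                        stringsAsFactors = FALSE)
runs <- data.frame(environment = c("development", "production"),
                   run_date = as.Date(c("2022-07-01", "2022-07-01")),
                   minutes = c(10, 12), stringsAsFactors = FALSE)

# Launch only in an interactive session with the three packages installed.
needed <- c("shiny", "DT", "plotly")
have_all <- all(vapply(needed, requireNamespace, logical(1), quietly = TRUE))
if (interactive() && have_all) {
  shiny::runApp(build_monitor_app(error_logs, durations, runs))
}
```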

And so the outcome of this is that we had all of our error viewing, troubleshooting, and process monitoring in one place, and we could get to where we needed to be quickly. In addition, because of R Shiny's user-friendly interface, new users on our team could dive into fixing our errors without having to know the intricacies of our monitoring system, because we already had an interface built over that layer. And so new hires to our team could also start contributing to this process monitoring app, again, without having to know too much or spend too much time learning about how our error logging system works.

And so errors happen. We're all human. We hope the tools and recipes that we have made available to you today help you create a full three-course, error-friendly meal that's easily digestible and worthy of praise. Thank you.

Kolbi Parrish & Andy Pham | R Markdown + RStudio Connect + R Shiny | Posit (2022)

Featured software

blastula

rstudio

Shiny