Dr. Carson Sievert | Reproducible Shiny apps with shinymeta | RStudio (2020)

Shiny makes it easy to take domain logic from an existing R script and wrap some reactive logic around it to produce an interactive webpage where others can quickly explore different variables, parameter values, models/algorithms, etc. Although the interactivity is great for many reasons, once an interesting result is found, it’s more difficult to prove the correctness of the result since: (1) the result can only be (easily) reproduced via the Shiny app and (2) the relevant domain logic which produced the result is obscured by Shiny’s reactive logic. The R package shinymeta provides tools for capturing and exporting domain logic for execution outside of a Shiny runtime, so that others can reproduce Shiny-based results from a new R session.


Transcript

This transcript was generated automatically and may contain errors.

Cool. Thank you, Winston. And, yeah, thanks, everyone, for being here. I'm excited to show you another project that Joe and I have been working on this year, shinymeta. And I want to start off by first sharing that it's pretty crazy for me to be up here doing this talk and standing in front of you as an RStudio software engineer, because it seems like not that long ago I was a grad student struggling with learning R and really, you know, not having a ton of fun with it, and I certainly didn't expect to become a software engineer at any point within the next, say, ten years. And in my second year of grad school, I discovered ggplot2 and the things that became the Tidyverse. And right around that same time, you know, I was becoming more productive with my R programming. And I then discovered this thing called Shiny, and it just kind of blew me away, how magical it was to take my analysis and, with very minimal changes, turn it into an interactive web app.

So that was really the point where I became really into R and really into programming. I was just blown away at how I could enable others to tease apart the data that I was working with, and I also found it really useful for teaching when I was a grad student. So this interactive power that Shiny brings to the table is really, really great. But we should acknowledge that whenever we, as data scientists and statisticians, make the decision to use an interactive tool over reproducible code, that decision is going to come at the cost of reproducibility.

The reproducibility problem with Shiny

So if we focus in on Shiny, really the core problem here is that in order to do its thing, Shiny depends on user events to drive the application logic. And it's this Shiny runtime that determines what R code to execute in response to which user events. For a number of years, we've been able to mostly work around this dependence on user events through Shiny bookmarking. This at least allows us to save the state of the application at any point in time and, in theory, come back to that state, assuming that the app is still available to restore it.

But especially if we don't have control over the application, if somebody else is hosting it for us, and the app is no longer available at the bookmarking URL that we go to to get back to that state, we're kind of out of luck. So in that sense, we still lack some sense of permanence that we get from reproducible R code. And even if the app is working and we can replicate the user events and get back to the state of the application, we still lack some sense of transparency, in the sense that those outputs are still locked behind a graphical user interface. And even if I could get at the source code behind the application, the core R domain logic that I might be interested in verifying is still intertwined with Shiny's runtime engine, and it's hard to know exactly what R code is being executed at any point in time.


So the goal here is we sort of want to enable you to eliminate that Shiny runtime dependency in some sense, where we want to enable you to write Shiny apps that can generate R code to essentially mimic the code that it's executing so that you can run that code without Shiny and without having to, like, duplicate that logic yourself in multiple places.

So the project that we're working on to enable this is the shinymeta R package. It provides tools for you as the Shiny app developer to capture the core domain logic that is powering your application and expose it as code that users can run outside of the Shiny app.

A basic example app

So to demonstrate how shinymeta works, we're going to work through this really basic Shiny app example. It just has a text input box where I can type in an R package name. And as I type in different packages, I get a new time series of the number of downloads over the past year for that R package.

So the goal is to modify this Shiny application and add a code output that updates every time I enter a new package, giving me back reproducible R code that I can copy and paste into a fresh R session to essentially mimic the execution of this Shiny application.

So here is the entire implementation of that basic Shiny app. As with any Shiny app, there are two main components. The UI object at the top is for the user interface. It just has the text input where I put the package name, and then the plot output container for the time series plot.

And then the server logic below that is going to determine how to take that input from the text input box and create the ggplot time series for the plot output.
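The slide itself is not part of this transcript. As a rough sketch, the app described here and in the next few paragraphs might look like the following. The input ID `package`, the reactive names, the default package, and the exact arguments passed to `cranlogs::cran_downloads()` and `zoo::rollapply()` are assumptions reconstructed from the spoken description, not the speaker's exact code.

```r
library(shiny)
library(dplyr)
library(ggplot2)

# UI: a text input for the package name and a container for the plot
ui <- fluidPage(
  textInput("package", "Package name", value = "ggplot2"),
  plotOutput("plot")
)

server <- function(input, output, session) {
  # Daily download counts for the last year (needs the cranlogs package)
  downloads <- reactive({
    cranlogs::cran_downloads(input$package, from = Sys.Date() - 365, to = Sys.Date())
  })

  # Weekly rolling average, mainly for visualization (needs the zoo package)
  downloads_rolling <- reactive({
    # Interactive nicety: show a message instead of an empty plot
    validate(need(sum(downloads()$count) > 0, "Input a valid package name"))
    downloads() %>%
      mutate(count = zoo::rollapply(count, 7, mean, fill = "extend"))
  })

  output$plot <- renderPlot({
    ggplot(downloads_rolling(), aes(date, count)) + geom_line()
  })
}

app <- shinyApp(ui, server)

# Only launch the app when run interactively
if (interactive()) runApp(app)
```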

Adding shinymeta integration

So we're going to zoom in on the server functionality. And the first step in adding shinymeta integration is to determine where in the logic of my Shiny app the core domain logic is that I want to provide to my user.

And we're going to walk through these three different steps here. So let's focus in on step one, this downloads reactive. Essentially, it's taking the current value of the text input, referenced as input$package, and feeding that into this cran_downloads function to give me the downloads for the last year for this package.

And then in the downloads rolling reactive, we take the return value of the downloads reactive. And if it's a sensible value, we use dplyr and the zoo package to compute a weekly rolling average, mainly just for visualization purposes.

And notice that I haven't highlighted this particular part of the reactive expression. This part is mainly just here to improve the interactive experience of the Shiny app and doesn't really have any domain context in terms of actually producing the plot output. Assuming we have a valid R package name, it's not really necessary for it to be there.

So we'll talk a little bit about that in the next step. But in the last step here, we'll just take the return value of the downloads rolling and feed that into ggplot.

So the domain logic here is identified in blue.

Then the next step is to use shinymeta's meta reactive building blocks and capture that domain logic that we've highlighted in blue. And those shinymeta reactive building blocks are highlighted in pink.

So the downloads reactive has just changed from reactive to metaReactive. metaReactive will assume that the entire expression is domain logic that you want to export to the user.

But in the case of downloads rolling, where we have just part of the reactive expression that we want to capture as domain logic, we use metaReactive2 instead, and then we can target a particular part of that reactive expression with this metaExpr function and tag just that as the domain logic that we want to export.

And then finally, for rendering functions, since these are a little bit different in the sense that anybody can implement their own rendering function, metaRender is kind of like a decorator function, where it takes the rendering function as the first argument. Here it's renderPlot.

And then the second argument is the reactive expression.

Right. So now we've in some sense captured the domain logic with these meta reactive equivalents to Shiny reactive building blocks.
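A sketch of what the server might look like at this point, with the Shiny building blocks swapped for their shinymeta equivalents (`metaReactive()`, `metaReactive2()` with `metaExpr()`, and `metaRender()`). As before, the input ID, reactive names, and the arguments to `cranlogs::cran_downloads()` and `zoo::rollapply()` are assumptions. The reactive reads are deliberately left unmarked here, since the talk treats marking them as a separate step.

```r
library(shiny)
library(shinymeta)
library(dplyr)
library(ggplot2)

ui <- fluidPage(
  textInput("package", "Package name", value = "ggplot2"),
  plotOutput("plot")
)

server <- function(input, output, session) {
  # The whole expression is domain logic, so reactive() becomes metaReactive()
  downloads <- metaReactive({
    cranlogs::cran_downloads(input$package, from = Sys.Date() - 365, to = Sys.Date())
  })

  # Only part of the expression is domain logic: metaReactive2() + metaExpr()
  downloads_rolling <- metaReactive2({
    # Interactive-only check; it stays outside metaExpr(), so it is not exported
    validate(need(sum(downloads()$count) > 0, "Input a valid package name"))

    metaExpr({
      downloads() %>%
        mutate(count = zoo::rollapply(count, 7, mean, fill = "extend"))
    })
  })

  # metaRender() takes the rendering function first, then the expression
  output$plot <- metaRender(renderPlot, {
    ggplot(downloads_rolling(), aes(date, count)) + geom_line()
  })
}

app <- shinyApp(ui, server)

# Only launch the app when run interactively
if (interactive()) runApp(app)
```

In normal execution this should behave like the plain Shiny version; the meta building blocks only change what is available when code is generated.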

Marking reactive reads with the dot dot operator

And the next part is to, within sort of the domain logic, identify any reactive reads. So at the very top here, I'm reading the current value of the text input.

And then down below, I'm reading the return value of the downloads reactive expression and reading the return value of the downloads rolling reactive.

So this part is probably going to be the hardest part to see, but once we've identified the reactive reads in blue, we're going to mark those reactive reads with a dot dot operator. And this essentially provides a signal to shinymeta's code generation mode: when we go to generate code from these reactive components, each marked read should be replaced with either the static value that it holds or a variable name that stands for that value.

So we'll see that in a second.

I also want to point out that this dot dot operator, you can think of it in some sense like the bang bang from rlang, in the sense that you can use it like an unquoting operator, where if you wrap any sort of R code that's not reactive, it will just evaluate that code instead of returning the expression.

So we'll see how this can be used to essentially like hard code a dynamic value.
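To make the unquoting analogy concrete, here is a small sketch using rlang's `!!` rather than shinymeta itself. It only illustrates the idea of inlining a current value, or a variable name, into captured code; `pkg`, the date, and the function names inside `expr()` are placeholders, and nothing here is evaluated against CRAN.

```r
library(rlang)

# Stand-in for the current value of a text input (hypothetical value)
pkg <- "ggplot2"

# Unquoting a value inlines it into the captured expression;
# format() is evaluated first, so a readable date string ends up in the code
inlined <- expr(cran_downloads(!!pkg, to = !!format(as.Date("2020-01-30"))))

# Unquoting a symbol leaves a variable name in the code instead of a value
by_name <- expr(head(!!sym("downloads")))

print(inlined)
print(by_name)
```

The dot dot operator plays a similar role inside shinymeta's meta building blocks, as described above: marked code is evaluated or substituted at code generation time rather than kept as written.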

Using expand chain to generate code

And then really the last step to actually generating the code, to mimic what the Shiny app is executing when it goes to fill in the outputs of the app, is to use this expandChain function. And here I'm just providing it the plot output object. But if you had multiple outputs, you could provide all of them to this function. And not only will it return the domain logic in a way that can run outside of the Shiny app, but it will also figure out the minimal amount of code necessary to do so.

So when we actually call expandChain, it will return all of the logic that we've captured with the metaReactive building blocks. And before returning, it's going to walk through this code and look for these dot dot operators.

And it's going to find this first dot dot operator and recognize that it's a marked input value. And in this case, what it will do is replace it with the current value that's in the text input box when we go and actually call this expandChain function.

And then this part, there's nothing really special in a reactive sense about this R code inside this dot dot operator. I'm essentially just putting in the date as it stands today, the day that I'm generating this code, in a human readable format. So this dot dot operator can in some sense be used not only to hard code a dynamic value, but also, if you've mastered metaprogramming to some degree and you're familiar with bang bang from rlang, in ways that make the code that you're generating a little more human readable.

So now it's going to go to the next dot dot operator and it's going to recognize, oh, this downloads reactive read. This actually represents a value that now actually exists in this code that I'm generating. Because it's going to recognize that, you know, the downloads reactive that we now have static code for, this is a variable that now exists. So I can just simply replace this reactive read with the variable name.

And then it will do the same thing for the downloads rolling. Now that we're essentially creating a new variable, downloads rolling, I can reuse that in the ggplot visualization call.
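The expansion described above can be tried at the console with a small stand-alone chain. This is a sketch under a few assumptions: it swaps the CRAN download logic for the built-in `cars` data so that it runs offline, uses a plain list as a stand-in for Shiny's `input`, sets `shiny.suppressMissingContextError` so reactives can be read outside an app (for demonstration only), and passes `varname` explicitly in case the variable names cannot be inferred from source references.

```r
library(shiny)
library(shinymeta)

# Demonstration only: allow reactives to be read at the console
options(shiny.suppressMissingContextError = TRUE)

# Stand-in for Shiny's input object (hypothetical value)
input <- list(n = 5)

# Marked input read: replaced by its current value in generated code
first_rows <- metaReactive({
  head(cars, ..(input$n))
}, varname = "first_rows")

# Marked reactive read: replaced by the variable name `first_rows`
rows_summary <- metaReactive({
  summary(..(first_rows()))
}, varname = "rows_summary")

# Generate the code needed to reproduce rows_summary outside of Shiny
code <- expandChain(rows_summary())
print(code)
```

The printed code should contain an assignment for each meta-reactive in the chain, with the input value inlined and the upstream reactive read replaced by its variable name.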

Setup code and the RMD bundle

So for most serious applications, your reactive code is going to rely on some setup code, maybe some subset of the setup code that you have above the launching of the Shiny app to do things like loading libraries and reading in data. And at least one way that you can approach this today is to quote that subset of code and provide it to expandChain, and it will just return it verbatim. And that's all we really need to get this application spitting out R code that I can then copy and paste into a fresh R session to mimic the code that the reactives are actually executing.

But I don't actually recommend just inlining code like this in your Shiny app. We should really think about providing a better user experience. There are a couple of ways to do this with shinymeta, but I'm going to show you the more powerful approach, which is to have something like a download button and have this button call buildRmdBundle from shinymeta, which allows you to provide a zip bundle to your user with both an Rmd source file, containing the code that's influenced by the user input, and the rendered output of that R Markdown source.

So if we start with our expandChain call that's generating the code we need for the time series visualization, we can slip that inside a download handler, take the result of that expandChain call, and feed it into buildRmdBundle. But buildRmdBundle will also need an R Markdown template, and here is where you can specify where in your R Markdown template you actually want to place the code.

And this can be pretty useful when you have, say, lots of different outputs in your Shiny app. You can place them in different R markdown chunks and also like specify the output format and customize to your heart's content.
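Putting the pieces together, a sketch of the download button approach might look like this. The template contents, the output ID `report`, the file name `report.zip`, and the date expressions are assumptions for illustration; the template is written to a temporary file only to keep the example self-contained, and `{{code}}` marks where `buildRmdBundle()` places the generated code passed via `vars`.

```r
library(shiny)
library(shinymeta)
library(dplyr)
library(ggplot2)

# Hypothetical R Markdown template with a {{code}} placeholder in one chunk
fence <- strrep("`", 3)
template <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: Package downloads",
  "output: html_document",
  "---",
  "",
  paste0(fence, "{r}"),
  "{{code}}",
  fence
), template)

ui <- fluidPage(
  textInput("package", "Package name", value = "ggplot2"),
  downloadButton("report", "Download report"),
  plotOutput("plot")
)

server <- function(input, output, session) {
  downloads <- metaReactive({
    cranlogs::cran_downloads(
      ..(input$package),
      # Non-reactive code inside ..() is evaluated, hard coding readable dates
      from = ..(format(Sys.Date() - 365)),
      to = ..(format(Sys.Date()))
    )
  })

  downloads_rolling <- metaReactive2({
    validate(need(sum(downloads()$count) > 0, "Input a valid package name"))
    metaExpr({
      ..(downloads()) %>%
        mutate(count = zoo::rollapply(count, 7, mean, fill = "extend"))
    })
  })

  output$plot <- metaRender(renderPlot, {
    ggplot(..(downloads_rolling()), aes(date, count)) + geom_line()
  })

  output$report <- downloadHandler(
    filename = "report.zip",
    content = function(file) {
      # Quoted setup code is returned verbatim ahead of the generated code
      code <- expandChain(
        quote(library(dplyr)),
        quote(library(ggplot2)),
        output$plot()
      )
      # Zip up the filled-in Rmd source together with its rendered output
      buildRmdBundle(template, file, vars = list(code = code))
    }
  )
}

app <- shinyApp(ui, server)

# Only launch the app when run interactively
if (interactive()) runApp(app)
```

With several outputs, each could get its own placeholder and chunk in the template, with one entry per placeholder in `vars`.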

What's possible with shinymeta

So just to give you like a sense of what you can do with this sort of thing, here is basically the same data but a better interface where I can go in and select different R packages and compare them in the same visualization. I can choose different transformations to the data and then finally click a download report to get that RMD source and the output or like the rendered output of that RMD source.

And now that we've enabled our user to interact with the Shiny app, find something that they're interested in, and get a permanent artifact generated by the Shiny app, in some sense what we can do is have fully programmable dynamic reports without any programming necessary.


So here what I've done, this is especially interesting if you work with dynamically updating data, I can go into my Shiny app, like find the packages that I'm interested in and then I can use something like RStudio Connect to schedule this thing to run, say, every day or every month or every year and then I have like a historical permanent artifact.

To give you another sense of what's possible here, here is a Shiny app where I can upload some data and run an ANOVA analysis on that data. So I've uploaded a data set, choosing a response and some predictor variables, checking that the assumptions of my ANOVA analysis are okay so that I can actually run this analysis, choosing some different predictor variables, actually running the model, getting the results of the test statistics, and then finally, at the end of the day, downloading a report with both the source and the output that I see in the application. That's all driven by the data that I've uploaded and the inputs that I've chosen in the Shiny app.

Summary

All right, so in summary, interactivity is great, but it comes at the cost of reproducibility. With shinymeta, you can, with some effort, at least allow your users to get some reproducible code back out of that interactive application. To add the integration to your Shiny app, you identify and capture the domain logic, use this dot dot operator to mark reactive reads, and then use expandChain to get the code back out for the outputs that you're interested in in the Shiny app. And then optionally, to distribute that code to users, you can provide it as a zip bundle with the Rmd source and results via buildRmdBundle. So thanks.

Q&A

How does the use of shinymeta affect performance, since it seems to be doing a lot of extra evaluation?

So it really shouldn't affect the performance at all, in the sense that there are two modes of execution in shinymeta. When you call expandChain, it opts into this meta execution mode, and when that happens, it's doing some computation to generate code, but it's not necessarily evaluating expensive code. And that evaluation is totally separate from the normal execution mode. The normal execution is what allows the output, like the time series, to actually be generated in the Shiny app, and that's a totally separate execution model from the generation of the code.
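The two modes can be seen side by side in a small console sketch. The assumptions are the same as for any stand-alone use of shinymeta: a plain list stands in for Shiny's `input`, `shiny.suppressMissingContextError` is set for demonstration only, and `withMetaMode()` is used here to opt into meta mode directly for a single call.

```r
library(shiny)
library(shinymeta)

# Demonstration only: allow reactives to be read at the console
options(shiny.suppressMissingContextError = TRUE)

# Stand-in for Shiny's input object (hypothetical value)
input <- list(n = 3)

total <- metaReactive({
  sum(seq_len(..(input$n)))
}, varname = "total")

# Normal mode: the expression is evaluated, as it would be to fill an output
value <- total()

# Meta mode: the expression is returned as code instead of being evaluated
code <- withMetaMode(total())

print(value)
print(code)
```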

Okay, I think that's all we have time for questions. Thanks a lot, Carson. Thank you, everybody, for coming, and have a great break.

Featured software