
Shiny New Tools for Scaling your Shiny Apps - posit::conf(2023)
Presented by Joe Kirincic

So you have a Shiny app your org loves, but as adoption grows, performance starts getting sluggish. Profiling reveals your cool interactive plots are the culprit. What can you do to make things snappy again? We can increase the number of app instances, sure, but suppose that isn't an option for us. Another approach is to shift the plotting work from the server onto the client. In this talk, we'll learn how to leverage two JavaScript projects, DuckDB-WASM and Observable's Plot.js, in our Shiny app to create fast, flexible interactive visualizations in the browser without burdening our app's server function. The end result is an app that can scale to more users without needing to increase compute resources.

Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.

Talk Track: The future is Shiny. Session Code: TALK-1088
Transcript
This transcript was generated automatically and may contain errors.
So, thank you all for having me out today and joining me for my talk. Again, my name is still Joe Kirincic, and I'm here to talk to you all about scaling your Shiny apps and some cool tools to do that.
But before we get into that, I'm going to talk about something a little different, which is rec league sports. In these low-stakes competitive arenas, there's a number of player archetypes that emerge. But one that shows up often is the overextender. This person is putting it all on the line for their team. They're doing everything, everywhere, all at once, and with an unrelenting fervor to boot.
We love this player. We appreciate what they do. But what becomes apparent very quickly is that they reach a point where they're doing too much, and their performance just starts to get weird. It starts to lag.
And truth be told, our Shiny apps are very much the same way. You know, the process goes: you develop a prototype, you share it with a particular team. Team loves it. Word starts getting around. Other teams begin to leverage your app as well. Adoption continues to increase. But you reach a certain critical mass where your Shiny app is serving a lot of these teams, but the performance starts to lag. You know, the complaints start to roll in. My plots are lagging. Or my report's taking too long to run. Or you just get that spinning wheel of death that seems to be taking just a little too long.
The problem with scaling Shiny
So this talk, okay, this talk is going to be a story about two things. One is going to be about a toy app that struggled to scale to a large number of users. And then we're also going to talk about how we use JavaScript to augment Shiny to overcome that struggle.
So this is the app that we start with. It's a very simple Shiny application with two interactive hexbin plots. Simple, easy, and nice. So with an app like this, you know, what seems to be the issue? The issue happens when we go from one concurrent user, which is me operating this app locally, to when we have 100 concurrent users accessing this app at the same time.
And that problem is underscored by this plot that I'm going to show you all right here. Now for some of you that aren't familiar with kind of like load testing tools for Shiny, this plot may be a little bit opaque. So let's try to break it down a little bit. So this is what's known as a session duration plot. And it's showing 100 simulated users using the Shiny app.
The ticks along the Y axis basically represent your simulated users as they go through the workflow on the previous slide. And each of the segments stretching across the X axis is the amount of time a particular step in that user's workflow took. And then we have, sorry, it might be thin for those in the back, but there's a red line here that's a reference point indicating how long the total workflow took for a single concurrent user. And here, for this workflow, that's about 93 seconds.
So there's two things that stand out in this plot. One is that we're noticing some inconsistency in when our users are finishing the workflow. From a UX perspective, the same workflow should take people the same amount of time if they're accessing the app at the same time. That's just fair.
Another thing to notice is that the amount of time it takes to go through this workflow has gotten longer. And in the worst case, we're seeing some users finishing close to 150 seconds, as opposed to the 93 seconds for the one user. That's no bueno.
So we have this problem, why is it happening? To understand why it's happening, we want to quickly just run through some basics about how Shiny works. Whenever we update one of the dropdowns in those hex bin plots, the browser is going to send a signal over to R and say, hey, can you please generate another plot for me? And then R is going to be like, sure, buddy, I got you. And it's going to generate another PNG, which then the browser goes to render. So we have this nice bidirectional communication going.
And that's nice when you have just one user, like when you're developing locally and whatnot. But this becomes a problem when you have many browsers connecting to your app simultaneously. Most of the time, when you have your Shiny app deployed, there's ultimately a single R process that's underlying that app. And when all of these browsers are connecting to it, that single R process is doing the brunt of the work to serve up your plots, tables, et cetera.
Now, R is a very fast language. But R is still a single-threaded language. And so because of that, every Shiny app out of the box is going to reach a certain critical point where the R process becomes saturated. And it develops a backlog of requests that it has to mow through. And that backlog is what causes that increase in the session duration that we saw in the plot earlier.
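To make the backlog concrete, here's some illustrative queueing arithmetic. The per-plot render time below is an assumption for the sake of the sketch, not a measurement from the talk; the point is only that a single serial process makes wait time grow linearly with the number of simultaneous requests.

```javascript
// Illustrative numbers (assumed, not measured): a single-threaded R
// process rendering PNGs serially develops a backlog as soon as
// requests arrive faster than it can draw them.
const renderSeconds = 0.5; // assumed time for R to draw one hexbin PNG
const users = 100;         // concurrent browsers, as in the load test

// If all users request a plot at once, R serves them one at a time,
// so the last user in line waits for everyone ahead of them.
const worstCaseWait = users * renderSeconds;
console.log(worstCaseWait); // 50 seconds before the 100th user's plot appears
```

This is why the session duration plot fans out: users at the back of the queue pay for everyone in front of them.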
Using JavaScript to fix the problem
So that's, you know, we have a sense for what the problem is. So what are we going to do about this? Now, there's been plenty of talks about scaling Shiny. And there's a lot of good strategies that have come from those. We can cache different assets. That's a good idea. We should be doing that, if we're not already. There's also asynchronous operations, you know, writing out files, like large I/O-type stuff. But that's not what we're here to talk about. We're here to talk about something a little weirder. We're going to talk about using JavaScript to fix this problem.
So what's the, like, use JavaScript, what are we going to do? The big idea here is that we can scale our Shiny applications by taking work from that R process, that R session, and moving it into the browser.
If we think back to this previous diagram where all of the browsers are hitting this R process at the same time, like, please give me a plot, please give me a plot, please give me a plot, we're going to handle it a little differently now. We're going to have R take the data set that's to be plotted, and we're going to have R kick it out to every connecting browser. And then from there, the browser will become responsible for regenerating the plots as you update those dropdowns. So it's all happening within those individual browsers.
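In Shiny terms, this "send the data once" handoff is typically done with custom messages: the R server calls `session$sendCustomMessage()` and the browser registers a handler with `Shiny.addCustomMessageHandler()` (both real Shiny APIs). The sketch below stands in for Shiny's message bus with a tiny stub so the pattern can run outside a browser; the `"plot-data"` message name is illustrative.

```javascript
// Stub of Shiny's client-side message bus, so the pattern runs in
// Node. In a real app, the R server would do:
//   session$sendCustomMessage("plot-data", dataset)
// and the global `Shiny` object would deliver it to the handler.
const handlers = {};
const Shiny = {
  addCustomMessageHandler(type, fn) { handlers[type] = fn; },
};

let cached = null; // the browser-side copy of the data set
Shiny.addCustomMessageHandler("plot-data", (rows) => {
  cached = rows; // store the data once...
  // ...then every dropdown change re-renders from `cached` locally,
  // with no further round trips to the R process.
});

// Simulate R pushing the data set to this browser:
handlers["plot-data"]([{ x: 1, y: 2 }, { x: 3, y: 4 }]);
console.log(cached.length); // 2
```

After this one push, the R process is out of the rendering loop entirely.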
DuckDB-WASM and Observable Plot
So that's the idea, cool, but how do we pull this off? So much like we use R packages to accomplish different needs that we have, there's some JavaScript libraries that we need to get the job done. And so we're going to need, what are we going to need? We're going to need a data manipulation library, something to select columns, filter data, aggregate if needed, and then we're going to need a data visualization library as well.
So for the data manipulation library, I chose DuckDB-WASM. So you might be thinking, like, DuckDB-WASM, you know, like, what does all that mean? Well, DuckDB, if you haven't heard already, it's a database, but it's not just any database. I think of DuckDB as, like, SQLite, but for analytics workloads. It's designed from the ground up to munch through the sort of queries that our Shiny apps love to send.
But then what's the Wasm part? Wasm is a shorthand for something called WebAssembly, which you can roughly think of as, you know, assembly but tailor-made for the browser. And if those are still kind of, like, opaque descriptors, the important thing about, to take away from this slide, is that the synthesis of these two technologies, you take DuckDB and WebAssembly and you put that in the browser, you get stupid fast queries. You're not going to lose any speed in manipulating data and whatnot if we move to the browser, which is good.
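In the refactored app, the browser would hand DuckDB-WASM a SQL aggregate over the cached rows, something like `conn.query("SELECT cyl, count(*) AS n FROM cars GROUP BY cyl")` (the query and table name here are illustrative, not from the talk). As a dependency-free stand-in, the same GROUP BY / count in plain JavaScript shows the kind of work being moved client-side:

```javascript
// Plain-JS equivalent of a GROUP BY / count(*) aggregate, standing
// in for what DuckDB-WASM would compute in the browser.
function countBy(rows, key) {
  const out = new Map();
  for (const row of rows) {
    // Increment the count for this row's group.
    out.set(row[key], (out.get(row[key]) ?? 0) + 1);
  }
  return out;
}

const cars = [{ cyl: 4 }, { cyl: 4 }, { cyl: 6 }];
const counts = countBy(cars, "cyl");
console.log(counts.get(4)); // 2
```

The practical difference is that DuckDB-WASM runs this kind of aggregation as compiled, vectorized code over columnar data, which is where the "stupid fast" part comes from.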
For data visualization, there's a lot of different JavaScript libraries for visualizing data, so I chose something called Observable Plot. It's a project built on top of a more mature project called D3.js, which is known for being able to create any sort of bespoke data visualization you could think of. Now, I ultimately chose this one as opposed to something like D3 for two reasons. One is that, look, I'm not really a JavaScript developer. I'm an R developer primarily, and so I don't want my JavaScript library to have a very steep learning curve, right?
So the nice thing about Observable Plot is that it's very similar in spirit to ggplot2. If we look at the code samples here, on the left we have, you know, some garden variety ggplot2 code in R, and on the right is the equivalent code using the Observable Plot API. And you can see that there's a lot of good similarities here, right? The whole idea being that in both cases, we have this plot that we're going to be composing by mapping different, you know, attributes of our data set to different geoms that come together to build this final, you know, composition that's the plot that we're after.
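The slide's side-by-side isn't in the transcript, so here's a hedged reconstruction of the correspondence (data set and column names are assumed for illustration). `Plot.plot()` and `Plot.dot()` are Observable Plot's documented entry points, but they need a DOM to produce an SVG, so the runnable part below only exercises the channel mapping both specs share:

```javascript
// ggplot2 (R):
//   ggplot(cars, aes(x = wt, y = mpg)) + geom_point()
//
// Observable Plot (JS, in the browser):
//   Plot.plot({ marks: [Plot.dot(cars, { x: "wt", y: "mpg" })] })
//
// In both, you map data columns to visual channels, and marks (Plot)
// play the role of geoms (ggplot2). Below, the same channel mapping
// applied by hand:
const aes = { x: "wt", y: "mpg" };
const cars = [{ wt: 2.6, mpg: 21 }, { wt: 3.4, mpg: 15 }];
const points = cars.map((d) => [d[aes.x], d[aes.y]]);
console.log(points[0]); // [2.6, 21]
```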
So the other thing is that it is flexible and expressive. We all love ggplot2 here, right? We love how it doesn't limit us in its ability to create visualizations and whatnot. And I didn't want to lose that when I moved from R into the browser. What's fortunate is that Observable Plot has a very expressive API that makes it easy to develop all the sorts of plots that we're used to making with ggplot2. We can have our scatter plots. We can have our column charts. We can have our difference plots that are annotated every which way. And just recently, tooltips have made their way into the project. So that adds another layer of interactivity that you can inject into your visualizations.
Results of the refactored app
So we've refactored our app now, okay? So now we're using DuckDB Wasm and Observable Plot under the hood, and this is what the result is. The main difference is that we hit this send button, and what that's doing is sending the data set that's in R over into the browser. And then from there, these hex bin plots that you're seeing that get generated, this is all happening inside the browser now. R isn't doing that work.
So we've made these changes. Did it do anything, right? Well, this is a story that has a happy ending, fortunately, so we did improve on things. And we're going to see that in a new session duration plot here.
So this is the one that we started with, with the first app, okay? And we noticed that we have our reference duration of 93 seconds, and a lot of users are finishing well past that. We don't like that. That's not good. Well, here's where we're at now. You see this red line on the far side? That's that same reference duration, so that's 93 seconds, okay?
But then it's kind of confusing, because what are all these little tiny bars on the left-hand side? Well, the important thing is that those are basically the operations needed to kick the data out into each of the individual browsers. From there on in, everything's happening inside the browser. So that line, that backlog of requests that forms when you just have, like, a typical Shiny app out of the box, that line's gone, because everyone's first in line inside their own browser.
So this has a number of nice properties for us, right? We've achieved upwards of a 40% speed improvement. If you consider that in the worst case, users were finishing close to about 150 seconds before, we've brought that back down to 93 seconds, which is good. Very good. We've also achieved consistency, because now that all of this is happening in the browser, everyone's going to be finishing their workflow in around the same time. And again, we're talking about going from one to a hundred users, and now everybody's finishing in around the same time, with relatively, you know, small changes to the code base.
Benefits and caveats
So we've already talked a little bit about why this is awesome, but to kind of, like, put a bow on it, right? One thing that we've done is we've found a way to clear the prototype purgatory hurdle, right? And we all know what this is as Shiny developers. You build an app, it's great, people like using it, but then there's this "does it scale?" question that we all have to deal with. Well, this is a nice solution to that.
In the case here, we were able to successfully scale to a hundred users. But there's another thing that I haven't really touched on yet, which is reduced infrastructure costs. And so what do I mean by that? A lot of the time, when we want to scale Shiny applications, we usually resort to what's called horizontal scaling, right? Basically, you have your app, you make however many copies of it, and then you load balance across each instance of that application. And that is a good strategy, it works, but the consequence is that it usually involves increased costs on your end to spin up all that extra compute. We don't have to do that here, because we're making strategic use of all the connecting browsers that are visiting our application, which is sick.
So this is all awesome. All well and good. But there's caveats, right? So there's always going to be caveats. First is that if you're going to use a strategy like this, browsers are not servers. Don't use them like that. If you start hucking petabytes of data at people's browsers, you're going to crash them and they're going to be mad at you. So maybe don't do that.
Another thing is that security still matters, especially if you're in the healthcare space or anything else that has, you know, sensitive data and things like that. You don't want to necessarily just be throwing data and exposing it out in the browser where bad people can do bad things with it.
And then another is that, right now, this does involve some minor hackery that I did to get it done. I have a link to the GitHub repo where you can find all of this. It's mostly just minor changes to the httpuv package, like adding a MIME type so it accepts WASM files and things like that. But it's still kind of annoying to set up up front.
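The MIME-type detail matters because browsers will only compile a `.wasm` file via `WebAssembly.instantiateStreaming()` when the server labels it `application/wasm`, which is presumably why httpuv needed that entry. A common client-side fallback when the server mislabels the file (sketch; the `loadWasm` helper is mine, not from the talk):

```javascript
// Load a WebAssembly module, tolerating a server that serves .wasm
// with the wrong Content-Type header.
async function loadWasm(url, imports = {}) {
  try {
    // Fast path: streaming compilation. Requires the response to be
    // served as `application/wasm`.
    return await WebAssembly.instantiateStreaming(fetch(url), imports);
  } catch {
    // Fallback: compile from raw bytes, which ignores Content-Type.
    const bytes = await (await fetch(url)).arrayBuffer();
    return WebAssembly.instantiate(bytes, imports);
  }
}

// The 8-byte header "\0asm" + version 1 is the smallest valid module:
const emptyModule = Uint8Array.from([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
console.log(WebAssembly.validate(emptyModule)); // true
```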
But those are really much the only caveats I have. Go ahead and try it. It's fun. And then let me know what you think. Those are my contact details, if you'd like to talk to me more about this kind of stuff or other data-related things. And then there's the link to the GitHub repo as well. But that's all I have for you guys. Thanks again for joining me.
Q&A
We have time for a couple of questions for Joe. The first one is kind of a cheeky question. Once you're using JS libraries, DuckDB, and WASM, what value is Shiny still providing as a web framework over something like React?
So I love that question, right? So I think that what a strategy like this shows is that with all of these technologies, it's important to start thinking about the strategy behind your application. A lot of the ideas that I brought up, like, browsers aren't servers and stuff like that. That's part of the thing that I would respond to with why I still think that some things should still be happening on the R side. Certain data manipulation and things like that may need to stay on-prem or in your own servers and stuff like that, not happening in the browser. But I think that it's more like, we're not looking to replace one thing outright with the other. It's a good strategic tool to have in your belt.
Your next question. Did you measure how long does it take to render on the client? And can you share how large your data set was?
Yeah, so as far as how long it took to render the plots, since it's all happening inside the browser, and I could just jump back to the slide real fast, however long it took to see the refactored app generate the plots there, that's how fast it's going to be. Because it's all happening in individual browsers. There's no more competing for a single process. As far as how big the data set was, here it was about 56k rows, I believe.
And your last question is, are there caveats to keep in mind when working with different browsers? Almost certainly. But I'm not smart enough to speak to that. I'm sorry.
