
Joe Cheng | Managing long-running operations in Shiny | Posit
It’s been years since Shiny evolved to allow asynchronous operations within applications, improving scalability. The introduction of the {promises} package enabled concurrent processing between multiple Shiny sessions, a significant step forward in handling background tasks. However, this did not address the need for intra-session concurrency, where users expect to interact with the application while long-running calculations are executed in the background. Recently, we added a new ExtendedTask feature to Shiny to allow for such intra-session concurrency. This new feature provides a different approach for developers to incorporate asynchronous tasks, enabling smoother user interactions during intensive computations. Alongside ExtendedTask, this talk will also discuss newer methods for launching asynchronous tasks, besides the usual {future} package. The focus will be on the practical application and integration of these features into Shiny applications.
Links mentioned in the video:
Shiny in Production: Principles, practices, and tools, https://youtu.be/Wy3TY0gOmJw?feature=shared
Timestamps:
0:20 Make your slow code fast
1:43 Long-running operations are a problem
3:28 Inter-session concurrency and intra-session concurrency
4:24 Introducing ExtendedTask
5:17 Demo of a slow API using ExtendedTask
6:13 Slow code example (R)
7:16 Fix slow code with ExtendedTask (R)
8:55 Slow code example (Python)
7:16 Fix slow code with ExtendedTask (Python)
10:46 Links to get started
11:06 ExtendedTask backstory intro
11:28 ExtendedTask vs. Shiny Async
15:50 How reactive programming works in Shiny
21:31 How ExtendedTask works in the reactive process
25:38 What we’re still working on
26:35 {future} alternatives
31:47 Wrapping up
Transcript
This transcript was generated automatically and may contain errors.
Hi. In this video, I want to talk to you about a new feature that we've launched in Shiny for both R and Python that has to do with taking really slow, long-running operations and making them a lot nicer for your users. So, to talk about this topic, we have to talk about slow code. This topic is not interesting unless your app has some part of it that is running slowly. Ideally, you wouldn't have any slow code. Ideally, any code you put into your Shiny app should be fast and responsive. And that's where I would start. If you have something, some code in your app, some operation that is slow, the first thing you should try to do is make it fast.
And there's lots of tips and tricks that you can use for making your Shiny apps fast that I have talked about in a talk I gave in 2019, and I'll put the link right there. But despite our best efforts, sometimes things are just going to be slow. We might need to call a slow API, for example. Or maybe you're training a large model directly within your Shiny app. Or you're compiling a huge dynamic report that's driven from your Shiny app that you then want to present to the user to download. All these things, they might just be slow and there's no way to make them faster. And when that happens, that can be a really big problem.
Demonstrating the problem with slow code
To illustrate that, I want to show you an app that I wrote that builds on an example app by Veerle van Leemput, who has a really nice repository on GitHub showing some different ways of tackling slow code in Shiny. And the link is there.
So here is an app that lets you add stock tickers. And you can put in one after another and it will add individual plots onto this page. Now, this code is actually not that slow. So we are going to make it a little bit slower with this text, this numeric input here. I can just tell it how slow to make it. Just making it simulate doing some operation. That's very slow. So if I do this, it's now going to take 10 seconds for this plot to appear. Now, unfortunately, because R is single-threaded, there is no way for us to do anything else while we're busy calculating this, you know, whatever this operation is that shows this plot, that fetches the data and shows the plot.
So if I were to do that again, if I set this delay to 10 seconds and add this while I'm waiting, I might want to add another. So clicking this doesn't seem to be doing anything. Oh, and no, it actually did register my mouse clicks. It's just that it was so busy coming up with whatever is behind this plot that all of my mouse clicks couldn't respond at all until suddenly R was free and then it responded to all of them. So now I'm in a pretty bad state. I've got all these things that are going to take 10 seconds each to load. I'd like to reload. So I'm going to hit the reload key and that's not doing anything. So R is so busy that it actually can't even process a reload. So this is not a great state of affairs for us.
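The queued-up clicks can be reproduced outside Shiny. The following is a minimal sketch in Python, using only the standard library's asyncio as a stand-in for a single-threaded server process; it is not the demo app's code, and the names `slow_plot` and `click` are made up for illustration. One handler blocks the thread, and three clicks that arrive right behind it are only handled once it returns.

```python
import asyncio
import time


async def slow_plot(delay: float) -> None:
    # time.sleep never yields to the event loop, so nothing else can run
    # on this thread until it returns (a stand-in for slow, blocking work).
    time.sleep(delay)


async def click(sent_at: float, latencies: list) -> None:
    # Record how long this click waited before it was handled.
    latencies.append(time.perf_counter() - sent_at)


async def main(delay: float) -> list:
    latencies: list = []
    sent_at = time.perf_counter()
    # The slow handler is queued first; three clicks arrive right behind it.
    tasks = [asyncio.create_task(slow_plot(delay))]
    tasks += [asyncio.create_task(click(sent_at, latencies)) for _ in range(3)]
    await asyncio.gather(*tasks)
    return latencies


latencies = asyncio.run(main(delay=0.3))
# Every click waited roughly as long as the slow handler took.
print([round(x, 2) for x in latencies])
```

None of the clicks is lost. They are all registered and then handled in a burst once the thread is free, which matches what the demo showed.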
Clearly, long-running operations can be a problem with Shiny apps. If you have something that's going to take 10 seconds, it's going to result in a miserable experience, not just for your own user, but for other users. This second piece, blocking other users from connecting or interacting with the app when they don't even know that somebody else is doing some long-running operation, is a failure of what we call inter-session concurrency: the ability to have two users using the app at the same time, even while one of them is doing a long-running operation. And it also, as we saw, stops the user from doing other things in the same app. While this long-running operation is going, they want to do something that feels like it should be a quick thing that can happen on the side, and yet it can't. What's missing there is called intra-session concurrency.
Introducing extended task
With the latest version of Shiny, we have now introduced a feature called ExtendedTask that will make this experience much, much nicer for your end users. This is a new feature in Shiny for R version 1.8.1 and Shiny for Python 0.7.0. And both of those versions were released in the first quarter of 2024. Extended task allows you to run long-running tasks for a user while preserving both inter- and intra-session concurrency, which is exactly what we want. Unlike some previous asynchronous programming stuff you may have seen for Shiny for R especially, this extended task feature does not have some crazy steep learning curve. I'll get back to that in a second.
So I've taken that original app and I've now rejiggered it a little bit to use extended task. And let's see what a better experience we can have. So I'm going to change this delay again to 10 seconds to make it really painful. And I can add one. And then now when I try to add my second one, it happens right away. And I can do this to a third. And eventually these begin to load. And what's nice is that even while this is loading, I can play with some of these controls and you can see it's changing my plots very quickly. And it also means that if I am stuck doing a long-running operation, it doesn't stop a reload from working. Meaning it doesn't stop other sessions from arriving and doing their own thing while this long-running operation is happening for a particular user.
Code walkthrough: R
So a much much nicer experience. Let's take a look at what this code looks like. What it's like in the code to use extended task. And I'll show examples in both R and in Python. Now I'm not going to show the example that I just demonstrated. That's a more complicated app. So I'll use the simplest possible code snippet to show how you can implement this yourself.
Here is the code for not using extended task. Just doing a normal long-running task. We have an event reactive. It calls sleep because it's not an actual long-running task. We're just simulating. And then it returns a simple message. And an event reactive, like all reactives in R, you call it like a function to get the value out. So this is without extended task. This is just what it looks like when you do a normal shiny app that has something slow in it. And this will exhibit all of the problematic user experience behaviors that we saw with that first app.
Now porting that to extended task looks like this. First we take that long-running operation and instead of having it in a regular reactive or event reactive, we put it inside this new extended task R6 object, which we hand a function to, which does the slow operation. And the slow operation will also use the future package so that the task is not performed in this R process. Instead, future launches a separate R process so that this R process can move on and do other things. So those two pieces are very important: creating the extended task wrapper, and then inside of your logic using future to wrap the long-running task.
That creates the task. It doesn't actually run it. In order to run the task, you need to take this object that's returned, this task object, and you need to call invoke on it. And if you want to, you can pass arguments in, especially reactive arguments, into the function. And then elsewhere in your code where you want to show the result of the calculation, instead of calling this object directly like a function as you would with a reactive expression or an event reactive, you call this result method on it. And this will do the right thing regardless of whether the task has even started running yet, if it's in the process of running, if it's completed successfully or completed with an error. Result will handle all of those cases the correct way for you.
Code walkthrough: Python
Now let's take a look at the same exact logic in Python. So again, this is without extended task. This is a naive long-running task. In Python, instead of using a reactive expression, we have this decorator called reactive.calc. And instead of event reactive, you add a separate decorator called reactive.event. So this is exactly the same idea as that R example. So we have a long-running task. It sleeps for five seconds. Just imagine, again, doing something actually long-running there. And then returns a simple expression. And again, elsewhere in your code, if you want to access that value, you call it like a function. And again, this is going to be really slow and janky and not feel great.
With extended task, we take that same long-running logic and this time we put it, number one, in an async function instead of a regular Python function. And we use this decorator, reactive.extended_task, instead of reactive.calc. So that creates an extended task object called message task. This doesn't result in a function called message task. It is actually a full extended task object that we can call methods on. Once again, just the definition of this does not cause it to start running. We have to explicitly invoke it from somewhere else in our code. And you do that using a .invoke method. And you can pass in arguments as you see here. And finally, elsewhere in the code, if you want to do something with the results of this long-running task, we call .result and that gives us back whatever value is returned from here.
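To make the define, invoke, result life cycle concrete without needing Shiny installed, here is a small standard-library sketch. `MiniTask` is a hypothetical stand-in written for this transcript, not Shiny's ExtendedTask implementation. In particular, returning `None` while the task is not finished is a simplification of how Shiny handles those states for you.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class MiniTask:
    """Hypothetical stand-in for the invoke/result pattern (not Shiny's code)."""

    def __init__(self, func: Callable[..., Awaitable[Any]]) -> None:
        self._func = func
        self.status = "initial"  # initial -> running -> success | error
        self._value: Any = None
        self._error: Optional[BaseException] = None
        self._task: Optional[asyncio.Task] = None

    def invoke(self, *args: Any) -> None:
        # Start the work and return immediately (needs a running event loop).
        self.status = "running"
        self._task = asyncio.create_task(self._run(*args))

    async def _run(self, *args: Any) -> None:
        try:
            self._value = await self._func(*args)
            self.status = "success"
        except Exception as exc:
            self._error = exc
            self.status = "error"

    def result(self) -> Any:
        if self.status == "success":
            return self._value
        if self.status == "error":
            raise self._error
        return None  # simplification: not started yet, or still running


async def slow_message(name: str) -> str:
    await asyncio.sleep(0.05)  # stand-in for the long-running operation
    if not name:
        raise ValueError("name is required")
    return f"Hello, {name}"


async def demo() -> list:
    seen = []
    task = MiniTask(slow_message)              # defining it does not run it
    seen.append((task.status, task.result()))
    task.invoke("Shiny")                       # returns right away
    seen.append((task.status, task.result()))
    await asyncio.sleep(0.2)                   # other work could happen here
    seen.append((task.status, task.result()))
    failing = MiniTask(slow_message)
    failing.invoke("")                         # this one will raise
    await asyncio.sleep(0.2)
    seen.append((failing.status, None))
    return seen


seen = asyncio.run(demo())
print(seen)
```

The point of the sketch is that the caller never has to await anything: `invoke` starts the work and returns, and `result` can be called at any time, whatever state the task is in.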
So I hope that helps illustrate why this feature is necessary and whether it might apply to the Shiny apps that you're writing. And if so, you can get started right away by clicking on the links in the description below that lead you to our documentation for both Shiny for R and Shiny for Python. But if you're interested, we can talk a little bit more about how we got here. How did we end up with an extended task feature in Shiny? And what are some of the attempts we made along the way to address the same problem that have some different trade-offs versus this new extended task feature?
History of async in Shiny
Back in 2017 to 2018, we, the Shiny team, introduced async programming to R and then to Shiny. This was an incredibly technically challenging feature to build, and it resulted in something that I think was pretty conceptually elegant. It was very satisfying that we were able to take these two pretty complicated concepts, async programming and reactive programming, and merge them in a way that felt consistent and logical and held together pretty well. That approach was to use the future package by Henrik Bengtsson to perform long-running operations in background R processes, not in the current R process. Future is a package that existed before async programming existed in Shiny. We didn't write it. We just took advantage of the fact that Henrik had already created this great package. But what we did need to create was a new package called promises to handle the results of these long-running operations back in the original R process in a way that wouldn't block that host R process.
This wasn't something that was really possible to do back in those days. And after creating that promises package and integrating it with future, then we had to rewrite big chunks of the Shiny internals to support async programming. And this was a process that took months and involved me coding at the very edges of my ability to get this working. And at the end of it, we ended up with a feature that worked, but there was always something a little bit unsatisfying about it.
Maybe more than a little bit. Number one, and most importantly, it had this incredibly steep learning curve. Learning to use Shiny async circa 2018, it was just a really difficult learning journey you'd have to go on. First of all, you'd have to be a pretty expert Shiny user to make sense of it. Number two, you'd have to get to know future pretty well. And this idea of executing code that was going to happen in a background process. And then number three, by far the worst, is promises was just really complicated to work with. Promises in general are a complicated concept. And one of the most difficult things about promises is that they're infectious.
Once you have a function that uses promises, that function now has to return a promise in order to really work correctly. And that means that any function that calls a function that returns a promise, those functions also need to return a promise. And you end up in this world where you want to introduce this sort of asynchronicity at this particular point in your code where something slow is happening. And then these promises sort of ripple through the whole rest of your code, anywhere that directly or indirectly relies on that long-running operation's result. They all become this promise-oriented syntax that frankly is a pretty weird syntax to begin with. So that was the biggest problem, I think, is that it was hard to use because of this sort of infectious property.
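The talk describes this infectiousness for R's promises. The same ripple effect is easy to see with Python's async/await, which is used here purely as an analogy. The function names and the price data below are invented for illustration. Once the innermost function is async, every function that needs its value has to become async too.

```python
import asyncio


async def fetch_prices(ticker: str) -> list:
    await asyncio.sleep(0.01)  # stand-in for a slow API call
    return [101.0, 102.5, 99.5]


# average_price needs the prices, so it must be async and await them...
async def average_price(ticker: str) -> float:
    prices = await fetch_prices(ticker)
    return sum(prices) / len(prices)


# ...and so must anything that needs the average, and so on up the chain.
async def summary(ticker: str) -> str:
    avg = await average_price(ticker)
    return f"{ticker}: {avg:.2f}"


# Calling an async function without awaiting gives a pending object, not a value.
pending = average_price("MSFT")
is_value = isinstance(pending, float)
pending.close()  # tidy up the coroutine that was never awaited

text = asyncio.run(summary("MSFT"))
print(is_value, text)
```

Only the top of the chain can turn the pending result back into a plain value, which is why one slow call deep in an app can end up forcing changes to every function above it.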
But also, even if you went through all of that, if you learned how to use this syntax and learned how to deal with this infectiousness that happens, all of that still didn't solve the problem of intra-session concurrency. The way Shiny async was designed, it really only helped you with inter-session concurrency. So your app, your experience using an app, was going to still be sort of slow and blocked. But other people could connect and have their own experience that was not going to be blocked by you. That was the best that Shiny async could do.
How extended task changes the reactive graph
And to explain a little further how that worked and how Shiny extended tasks solved this problem, I'll need to use some visuals. And for that, I'll use some diagrams from the amazing book Mastering Shiny by Hadley Wickham, which is the best source for understanding reactive programming in Shiny that we have today. Mastering Shiny is only for R. We don't have a Python version right now. But hopefully you still get the idea from reading the chapter on reactive tracking.
So this is a diagram that we use to illustrate how reactive programming works in Shiny. On the left, these shapes here are reactive values or inputs. They are pointed on the right to show that they have a value that can be read by someone coming in from the right. On the right, we have the opposite. So these are outputs or reactive observers, or reactive effects if you're in Python. So these are either outputs or code that is going to execute just for side effects. And in the middle, we have what are called either reactive expressions or reactive calcs, depending on whether you're using R or Python. And these are things that can both read values. They can read reactive inputs or reactive values, or they can read other reactive expressions, and they can be read. So they have this sort of shape on both sides. They can both read and be read.
So this is the view that Shiny has of your app when a user first connects. It knows that there are these reactive expressions or reactive calcs, and it knows that there are these outputs, and it even knows that there are these reactive values or inputs, but it doesn't know how they are related. And what happens is Shiny will look for the first output or observer or effect that it can find, and it just starts executing it. This is denoted in this diagram by turning orange. So something orange is executing. So this thing starts executing, and pretty soon it makes a call to this reactive expression here. That has not executed yet, so it needs to execute. So it turns orange, and that is going to read from this reactive input. It's also going to read from this reactive expression, which reads this input, and pretty soon it's done executing, and it turns green. Everything to the left here has turned green, and next this output is done executing, so it turns green. Repeat that for the next output and the next output, and pretty soon everything is green. Everything has finished executing, and this is what we call being at equilibrium. So Shiny is done doing all the reactive things it knows to do right now, and is just waiting for you to do something at this point. And everything from the first diagram to this one is called a single tick, a reactive tick. Tick meaning like a tick of the clock.
And that's because somewhere in the depths of Shiny there's some source code that looks like this. There's a loop, an endless loop, that's sitting there waiting for input to appear from the user, taking that input and recomputing anything that's reactive that needs to be recomputed, and then taking any changed outputs that result from that and sending it to the browser. So all of that is called a reactive tick. One trip through this while loop is called a reactive tick. And notice that only at the beginning of this tick do we check for input changes, and only at the end do we send outputs to the client. So in between we cannot respond to the client or the browser in any way.
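The loop on the slide is not captured in this transcript, so what follows is only a sketch of the three steps just described, not Shiny's actual source. All names are hypothetical, and the loop stops when the queue is empty so that the example terminates, whereas the real loop waits forever.

```python
from collections import deque


def recompute(state: dict, changes: dict) -> dict:
    # Stand-in for re-running every reactive affected by the changed inputs.
    state.update(changes)
    return {"plot_title": f"{state['ticker']} over {state['days']} days"}


def run_ticks(pending: deque, state: dict, sent: list) -> int:
    ticks = 0
    while pending:                           # the real loop would wait forever
        changes = pending.popleft()          # 1. pick up input changes
        outputs = recompute(state, changes)  # 2. recompute what is affected
        sent.append(outputs)                 # 3. send changed outputs to the browser
        ticks += 1                           # one trip through the loop = one tick
    return ticks


state = {"ticker": "AAPL", "days": 30}
sent: list = []
pending = deque([{"ticker": "MSFT"}, {"days": 90}])
ticks = run_ticks(pending, state, sent)
print(ticks, sent)
```

Nothing is read from `pending` and nothing is appended to `sent` while `recompute` is running, so a slow `recompute` stalls both ends of the loop.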
That means that our long-running tasks, like in this case I've drawn one of the reactive expressions really large to indicate like it's going to take a really long time, that code executing and taking a super long time is going to block this part of the reactive tick. It's going to block the recomputation, meaning we can't get to the part of the reactive tick where we wait for more input changes or we send outputs. We are stuck in the center of this reactive tick. So in order to fix this problem we need this while loop to be able to keep turning over and over and over without getting hung up in this recompute all affected things step, and yet still have the operation occur, still be able to use the results reactively. So how do we separate the task from the tick?
So Shiny async solves this by running multiple graphs concurrently. So they each have their own while loop going, essentially. But you can't run multiple tasks within a single graph because the shape of the graph is unchanged. Like without changing the shape of this graph we will never escape this inability to do intra-session concurrency. So extended task changes the shape of this graph and it does it by taking this long-running operation and actually splitting it into two parts. There's the part on the left which is an observer that is going to launch the operation and a totally separate but related piece that is going to hold the result value that can be read. And I've put a little emoji here to represent a background R process that's going to actually do the work for us.
So when the Shiny app starts the background process is sleeping. It hasn't been told to do any work yet and none of these relationships again have been established. So remember what Shiny does is when it first loads up and it sees all these observers and effects and expressions and things, it picks the first output it finds and starts executing it. And in this case it's going to kick off the extended task and the emoji changes to the sweaty guy because he is now working and super tired. He's working very hard. While that background R process is executing, the rest of the reactive graph is able to proceed as normal and we quickly get to this equilibrium. Everything is done executing as far as the reactive graph is concerned. Although this background task continues to execute in a separate R process. So this is great. This means we've completed the tick. Any outputs that are ready can be sent to the browser and we can start waiting for the next input from the user.
So let's say the user touches some unrelated slider or input or button or something like that. That's fine. That part of the graph can respond, can re-execute, can get back to equilibrium, can send results back to the user and then start waiting for the next input. So this is like when we added a new Microsoft stock quote to the application that was working with extended task while it was still busy. This still works which is exactly what we wanted.
Pretty soon this background task is finished so it changes to this smiling angel emoji and takes its result and brings it back into the reactive graph via this second piece of the extended task, this reactive value. And that causes reactivity to trigger in all the right ways and the outputs that depended on that result are now updated and everything is now at true equilibrium. Not only is the reactive graph at equilibrium but we have no more background task running. So now we truly are just waiting for the user to do something else.
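The sequence above, launch, keep ticking, then bring the result back in, can be sketched with the standard library. In this illustration a background thread stands in for the background R process, and the event names (`add_ticker`, `slider`, `task_result`) are invented. It shows the shape of the idea, not how Shiny implements it.

```python
import queue
import threading
import time

events = queue.Queue()  # everything the main loop reacts to arrives here
log = []


def slow_work(ticker: str) -> None:
    time.sleep(0.3)  # the long-running part, off the main thread
    # The finished result re-enters the loop as just another event.
    events.put(("task_result", f"data for {ticker}"))


def handle(kind: str, value) -> None:
    if kind == "add_ticker":
        # Launcher half: start the work and return immediately.
        threading.Thread(target=slow_work, args=(value,), daemon=True).start()
        log.append(f"launched {value}")
    elif kind == "slider":
        log.append(f"slider -> {value}")       # unrelated input, handled at once
    elif kind == "task_result":
        log.append(f"plot ready: {value}")     # result half: value is now readable


events.put(("add_ticker", "MSFT"))
events.put(("slider", 5))
events.put(("slider", 7))

while len(log) < 4:
    kind, value = events.get(timeout=2)  # wait for the next input
    handle(kind, value)                  # each pass stays short

print(log)
```

Both slider changes are handled while the background work is still running, and the result shows up later as one more short pass through the loop.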
Summary and alternatives
So to recap, with this approach we're able to achieve both inter- and intra-session concurrency, unlike the previous approach of Shiny async. Extended task doesn't require you to learn a strange syntax like Shiny async, so we sincerely hope that more people will be able to adopt this strategy for their long-running tasks than ever did with Shiny async, and it will not be as invasive a refactor as Shiny async would sometimes force you to do.
Now keep in mind that for R, extended task still relies on the future package to put the task in the background. If the task is not happening off of the main R thread for the Shiny R process, then nothing we do is going to achieve inter- or intra-session concurrency. We're just fundamentally limited to doing one thing at a time, so we really need to use that future. There are a couple of limitations you should be aware of also if you're going to use this extended task feature.
In R there is currently no support for canceling a long-running task. Once you have invoked an extended task, there's no built-in way to tell it to stop executing. There's also no built-in progress reporting. You can use a button that you click to start the operation, and that button will say something like "processing", but a progress bar that shows you how far along you are is not currently supported for either R or Python. We would really like to add both of these features. On the R side it's going to involve working with future to make that possible.
Now one more thing. I said you have to use future to launch your long-running task, and that's sort of true. You do need to use future or something like future. There are a couple of other alternative or complementary ways to run R code in the background. Now future is great. It's very convenient and it has a very magical API, I would say. You just say future and then put some code in it, and even if that code refers to variables or packages that are outside of your future code block, it'll just sort of figure out how to bring in everything it needs, or try to, based on some crawling around in your environment, and it will work pretty hard to make sure all those things are automatically available in the background R process that it launches. That's pretty cool. And the other really great thing about future is that it's popular and has been used for years by many people, so there are pitfalls, but those pitfalls are somewhat known.
Now the downsides of future are that it has pretty high runtime overhead, so you do lose some performance whenever you invoke a future. The other thing about future is that it's quite ambitious as a project. There are tons and tons of options. There are tons of extensions. There are lots of different policies you can use with future for how it schedules. So it's a lot, and it can be its own thing to learn, which can make it a little harder to get started with. And finally, it's pretty complex because of that automagicness, because it automatically teleports the data and packages you need to your background R process. That's a little scary, and it can be doing things that you weren't aware of, maybe copying more data than you thought it would or accidentally depending on an object that really shouldn't be transported across. So when it comes to that kind of automagic stuff, sometimes it's hard to form a mental model for what's actually happening.
So one promising alternative slash complement to future is called mirai, M-I-R-A-I, which I believe is just Japanese for future. It's a new package by Charlie Gao that is similar to future in that it can run code in a background R process. The good thing about this mirai package is that it's super low overhead compared to future. It's very, very fast, and it's designed to be very simple and easy to understand, so there's very little magic that it does for you. It doesn't magically slurp in whatever variables you happen to be using inside of your code chunk. It won't automatically load packages for you that it can see that you need. Instead, you basically give it an expression that's going to be evaluated in some R process, and if that expression uses variables or even functions that are not just going to be provided by R, then you need to provide them yourself. You need to tell mirai: here are the variables and functions that you're going to need, please make them available over on the other side.
The downsides of mirai are that it's still relatively new, so it is not as battle-tested as future. There might be failure modes that we don't know about, and it might have bugs. And the fact that it has the simpler model and doesn't do all this magic for you is kind of a double-edged sword. It's nice in that it's fast and simple to understand, but it can be less convenient if you do have a lot of functions that you're using or a lot of values that you need to transport to the background R process. You then have to do that all explicitly.
And finally, there's a package called crew by Will Landau, whom you might know from the targets package. crew builds on mirai, so it uses mirai underneath, and it adds basically a convenient way to launch multiple or many tasks both locally and on standard HPC clusters that you might have, especially in pharma environments. I haven't used crew that much, but it is designed now to integrate with extended tasks somewhat, and it has some examples in its documentation about how you can do that. So if you have access to a big HPC cluster, or you have many, many tasks that you want to launch and manage together, crew might be the way to do it, and you can do it using extended task as well.
So to wrap up: avoid long-running tasks in your Shiny app if you can. If you have slow code, first try to make it fast or eliminate it altogether, and if you can't, then look to tools like the new extended task feature in Shiny for R and Shiny for Python, which will let you run these long-running tasks in the background and provide you with inter-session concurrency and intra-session concurrency, which will be a much nicer user experience for your Shiny app users.
If you have any questions about this or anything else with Shiny please drop by our Discord or our forum and we are always happy to meet users and hear about how Shiny is or isn't working for you. All right till next time!

