Resources

How does Shiny render things? | Gordon Shotwell

Discussion on the Shiny Python Discord channel from Gordon Shotwell. Gordon talks about how Shiny renders things with reactive programming, how other frameworks work, and how Shiny scales for complex applications.

0:00 - How Shiny renders things - reactivity
2:31 - How do other frameworks work
3:51 - Event driven programming, what's a better way?
6:53 - Runtime tracing
8:25 - Declarative programming
9:20 - Drawing a graph
12:05 - How reactivity scales
13:30 - Reactive calculus
18:45 - Question and answer

Shiny for Python: https://shiny.posit.co/py/
Shiny for Python Discord server: https://discord.com/invite/yMGCamUMnS

Transcript

This transcript was generated automatically and may contain errors.

OK. So the plan today was just going to be to talk about kind of how Shiny actually renders things, which is, I think, often we've sort of thought about it as maybe not exactly an advanced topic, but not something you would sort of start with right away.

And I think overall, it's actually really useful to have just a better understanding of how things actually get rendered and how Shiny does its tricks. And this is more important for Python in a lot of ways. Like, what I'm going to talk about is totally applicable to both R and Python. They work exactly the same way.

But on the R side of things, like when I was starting as a Shiny developer, it was just kind of like the way that you developed web applications. I didn't really think about it being a weird way to develop applications or a magical or great way of developing them, but it really is.

So these are slides that I'm just going to use from the workshop that I gave at PositConf. And so this is a little Shiny app here. And if you've kind of used Shiny at all, you might be familiar with the experience of things re-rendering. So I have these two plots that are both driven by the slider. So there's the slider, and whenever I make a change, both of these plots re-render.

But if I change this slider, the only plot that re-renders is this one. So nothing actually happens with this plot. In the same way, if you have a multi-page Shiny app, the only parts that will render are the ones that are actually active.

And the question is, how does Shiny do that? How actually is it working? In particular, if you look at, this is the Python code for that application. This is the sort of total amount of code for how those plots are rendered.

There's this kind of weird quality here, if you're thinking about this from a traditional web development framework, which is that we told Shiny what to do, but we didn't tell it when to do it. So there's no callbacks here. There's nothing saying, when this changes in this way, fire this function, or anything like that. It just kind of knows. And it actually seems to usually do a really, really good job at that knowledge.

How other frameworks work

So how does it work? And maybe we're thinking about how other frameworks work. So Streamlit has this approach of re-rendering everything all the time. Some of these other Python frameworks, like Dash, Panel, and Gradio, focus more on defining a callback function, which is often referred to as event-driven programming.

So what event-driven programming means is that you're the developer responsible for telling things when to fire, when to re-render. So you're responding to an event. And it's great in a lot of ways, because it's a very natural way to think. You sort of are thinking about connecting the wire from one particular trigger to one particular output. And if you do that enough times, you can get really wonderful interactions.

But there's a few different problems with it. One is you have to do it. It's a little bit more work, I think. It's pretty easy to get wrong, especially if you have something like two elements that both should be updated by one callback, and you forget to do one of them. You can end up with your application in this kind of weird state. And it's easy to make those types of mistakes.

And most importantly, it's sometimes hard to tell that you've gotten it wrong. So with event-driven programming, a big failure state is just that one of the numbers in your application is incorrect, or one of the values is incorrect, rather than getting a big, loud failure. So those things are some of the classic problems with it.
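The failure mode described above can be sketched in a few lines of plain Python. This is a toy model, not code from Dash, Panel, or Gradio: the `Slider` class, its `on_change` method, and the two update functions are hypothetical names made up for illustration. The point it tries to show is that a forgotten callback produces a stale value rather than an error.

```python
class Slider:
    """Hypothetical input widget that calls registered callbacks on change."""

    def __init__(self, value):
        self.value = value
        self._callbacks = []

    def on_change(self, callback):
        self._callbacks.append(callback)

    def set(self, value):
        self.value = value
        for callback in self._callbacks:
            callback(value)


# Two outputs that are both supposed to follow the slider.
outputs = {"table_rows": 10, "plot_points": 10}


def update_table(n):
    outputs["table_rows"] = n


def update_plot(n):
    outputs["plot_points"] = n


slider = Slider(10)
slider.on_change(update_table)
# The developer forgot to write: slider.on_change(update_plot)

slider.set(50)
print(outputs)  # {'table_rows': 50, 'plot_points': 10}
```

Nothing crashes here. The plot simply keeps showing the old value, which is the quiet kind of bug the talk is describing.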

Runtime tracing

So a better way, and this is how Shiny does it, is to infer the relationships between components, and then to build a computation graph to figure out how those things should be rendered. So if I knew the relationship between all the different components, I could draw basically a directed acyclic graph, kind of like Airflow or something, or make, and just update the downstream elements whenever the upstream elements change.

And I have a really good model for being able to render things super efficiently, and render large things super efficiently. So if you ever hear somebody tell you that I have this magic way of inferring some sort of thing from your application, like check your wallet or something, it's often a little bit of a fool's gold. If you find this suspicious or don't trust that, that's probably a good sign.

Because for it to work, it has to work all the time. It's not good enough to just have something where 80% of the time it gets the right relationship, but sometimes it misses one. It needs to be really reliable. And we know this works because it's worked so well on the R side for so long. It's a long time of people using this exact pattern to build very, very complicated applications that have lasted a long time.

And the underlying mechanism, once you get a handle on it, is obvious in some ways, and it's easy to work with. But one of the things is, it's not static code analysis. So this is one of the common misconceptions about Shiny is that Shiny is actually reading your source code and drawing those reactive links that way. And that's not how it's working. Instead, it uses something called runtime tracing.

I just want to give you an example of why static code analysis wouldn't work. And the reason is, there's a bunch of reasons. One is it's hard, and there's a lot of ways you can get it wrong. But let's say we did a really, really good static code analyzer to generate stuff. There's still some cases in Shiny that it just is impossible.

So this is an example where the user is actually writing text that is Python code to generate output. So this is a select thing, but you could easily imagine this being a text input. And Shiny is able to write an application that responds to that text. One way of thinking about this is that the way this application is running actually depends not just on what's in the source code, but also on how the person is using that source code.

So this is the code that generates that. And if you're an R user, you could do the same thing with eval as well. Like if you're doing any kind of metaprogramming or anything in R, you would run into that same situation. So you see here, if we did add a static code analyzer, this is what we would analyze. We wouldn't be analyzing what's in the user input because we don't know what that is. But this application, for it to run properly, actually needs to respond to that user input. And there's a lot of kind of dynamic user interfaces that you could build with Shiny that static code analysis would fail on.
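To make the limitation concrete, here is a small sketch using Python's standard `ast` module. It is not how any real tool analyzes Shiny apps. The source string, the input names `expr` and `n`, and the helper `statically_visible_inputs` are all assumptions invented for this illustration.

```python
import ast

# Hypothetical renderer source: it evaluates whatever text the user typed.
RENDER_SOURCE = '''
def result():
    return eval(input.expr())
'''


def statically_visible_inputs(source):
    """Collect every `input.<name>` attribute access found in the source text."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "input"
        ):
            names.add(node.attr)
    return names


# All a static analyzer can see in the app's own source:
print(statically_visible_inputs(RENDER_SOURCE))  # {'expr'}

# What the user typed is only known while the app is running:
user_text = "input.n() * 2"
print(statically_visible_inputs(user_text))  # {'n'}
```

The dependency on `n` exists only in text supplied at runtime, so an analyzer that reads the app's source ahead of time has no way to find it.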

So runtime tracing is a different strategy. And basically what it is is you just want to watch what all the components ask for and keep track of that. So whenever a component runs, it's going to ask for some things. If you keep track of those things, you know that when those things change, that component needs to re-render. So basically, you just watch for those components, keep track of those relationships, and then build this computation graph.
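As a rough sketch of the idea, and not Shiny's actual implementation, runtime tracing can be modelled with a stack of "who is running right now" and inputs that note their reader whenever they are called. The `Input` and `Renderer` classes and the `asked_for` attribute below are hypothetical names for this toy.

```python
# Toy model of runtime tracing; not Shiny's real implementation.
_running = []  # stack of components that are currently executing


class Input:
    """A value that notes which component reads it."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __call__(self):
        if _running:
            _running[-1].asked_for.add(self.name)
        return self.value


class Renderer:
    """Runs a function and records every input it asked for while running."""

    def __init__(self, func):
        self.func = func
        self.asked_for = set()

    def run(self):
        self.asked_for = set()
        _running.append(self)
        try:
            return self.func()
        finally:
            _running.pop()


n = Input("n", 20)
unused = Input("unused", "never read")

text = Renderer(lambda: f"n*2 is {n() * 2}")

print(text.run())      # n*2 is 40
print(text.asked_for)  # {'n'}
```

The dependency set comes from watching the function run, so an input that is never read (`unused` here) never shows up in it.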

Drawing a graph

So here's a little example of some of this. So we have this kind of highlight with a little code. So you can sort of think of these as like recipes. So there's like a connection between this output and this rendering function. So this is like text and text. And there's also a connection between the inputs and the input elements.

So this is basically this renderer is saying like, so we know that basically when the user opens up this app and it's asking for the output of this text output, we know that that's connected. They're sort of asking for this text renderer. So we can draw one link there. And then we know that when this text renderer runs, it's asking for this input n. And this can be dynamic, right? So we don't need to actually look at the code for that. We can sort of keep track of the component when it's running to see what it actually is asking for.

So this can give you a little reactive graph like this, a slider recipe text. And then I'm just going to shorten that to be input to output.

So that's kind of like the overall strategy. And the term for this type of programming is declarative programming, which is what Hadley calls it in the Shiny book, which I think is really good. And so what you're doing is you're like telling Shiny what should be generated, but you're trusting that the framework is going to keep everything up to date.

And I just watched the Bear, which is this great TV show about menus. And so this is like, you're setting the menu and you're not cooking. And if you've ever seen that show, there's like when the head chef is like in the weeds, cooking, yelling, yelling, very stressful. And then at some point later in the show, he becomes, he like gets the restaurant in a good enough state that he can kind of like just handle setting the menu and not worry so much about the sort of details, right? So in this case, like you want this to be you, you want to be this person, not this person, right? And that's how Shiny works. It's expecting you to just tell us what to do and then trust it to re-render things properly.

So this is that application again. And if we think about this, there's like two inputs, there's this slider and this checkbox, and then there's two outputs: a scatter plot and a distribution plot. And if you look at it, it would look something like this. So you have two outputs, two inputs.

And then at the beginning, Shiny doesn't know anything about the relationships between these components. It just knows that they exist. And the first thing it does is it just picks one of these outputs and it says, let's calculate this output. And when it calculates the output, this is gonna ask for checkbox and slider. And Shiny is gonna watch that and say, okay, last time you ran, you asked for checkbox and slider. So I know that you have a dependency on those two things right now.

And it's gonna go over to the dist plot and see that the dist plot only asks for slider, doesn't ask for checkbox. And here we have the DAG for this application. It's a very, very simple DAG, but the same process scales up to kind of an arbitrarily large number of nodes.

Okay, so what happens when a slider changes? So when the slider changes, we know that there's a dependency here. So we can follow those down and say, these two things are what's called invalidated. So they need to be recalculated. And then it forgets all the dependencies that go into them. It says, okay, whatever I knew in the past might not be true anymore. So I'm just gonna totally forget about it. And then it recalculates. In this case, it generates the same graph, but it doesn't have to. It could generate a different graph depending on your application. Like at runtime, it's sort of on the fly figuring out all those relationships based on what things have asked for what.

Okay, so then everything's updated. If we change the checkbox, it does the same thing. It follows this down, says, oh, there's a scatterplot there. Last time that ran, it asked for the checkbox. So that's gonna be out of date. Need to rerun the scatterplot renderer. So I'm gonna forget the dependencies from the scatterplot, but since this dist plot is fine, I don't need to change any of those things. I don't need to do anything with this one. So I'm gonna leave that part of the graph intact. Then it's gonna recalculate. It's gonna get both of its values. And then we're up to date.
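The walkthrough above (discover dependencies on first run, invalidate on change, forget the old links, re-run) can be sketched as a toy in plain Python. This is an illustration under simplifying assumptions, not Shiny's real machinery: the `Input` and `Output` classes, `render_count`, and the string "plots" are all hypothetical.

```python
_running = []  # stack of outputs that are currently rendering


class Input:
    def __init__(self, name, value):
        self.name = name
        self._value = value
        self.dependents = set()  # outputs that read this input on their last run

    def __call__(self):
        if _running:
            reader = _running[-1]
            self.dependents.add(reader)
            reader.deps.add(self)
        return self._value

    def set(self, value):
        self._value = value
        stale = list(self.dependents)
        for out in stale:
            out.invalidate()  # forget old links
        for out in stale:
            out.render()      # rediscover them by running again


class Output:
    def __init__(self, name, func):
        self.name = name
        self.func = func
        self.deps = set()
        self.value = None
        self.render_count = 0

    def invalidate(self):
        for dep in self.deps:
            dep.dependents.discard(self)
        self.deps = set()

    def render(self):
        _running.append(self)
        try:
            self.value = self.func()
        finally:
            _running.pop()
        self.render_count += 1


slider = Input("slider", 50)
checkbox = Input("checkbox", True)


def scatter_plot():
    label = f"scatter of {slider()} points"
    return (label + " + trend line") if checkbox() else label


scatter = Output("scatter_plot", scatter_plot)
dist = Output("dist_plot", lambda: f"histogram of {slider()} points")

for out in (scatter, dist):
    out.render()  # the first run is what draws the graph

checkbox.set(False)
print(scatter.render_count, dist.render_count)  # 2 1

slider.set(10)
print(scatter.render_count, dist.render_count)  # 3 2
```

Changing the checkbox re-renders only the scatter plot, while changing the slider re-renders both, matching the graph described in the talk.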

How reactivity scales

So the great thing about this is that this is a pattern that scales really well. So a lot of the other Python web application frameworks, they actually usually have like several patterns embedded in them for different levels of app complexity. So you write, you know, a Dash app one way for when you're doing a simple application, and then another way when you're doing a more complicated one, you might be using much more caching or something like that. You write a Streamlit app in one way at the beginning, and then over time, you need to kind of manage state yourself.

And I think similarly for all of them, Panel has three different, basically like APIs for updating different components. But for Shiny, every app uses the same exact pattern. So when you kind of understand how this works at this little toy level that we were talking about, you actually are really far along to understanding it at a much larger level.

It works for dynamic UIs, and it's also lazy. So that's kind of one of the reasons why some Shiny apps can be so efficient because if we're back here, right, this is only because the user is trying to ask for this. So if the user actually is not asking for scatterplot, like it's not on the screen, it's on a background tab or something like that, Shiny just doesn't get around to calculating it, never bothers to. It only does that when it actually needs to, or if you, the developer, say it needs to recalculate all the time.

Reactive calculus

So that kind of leads to these types of efficient things. But where we're at now is like, just with what you call like flat reactive graphs, where they all go input to output. And there's another pattern that adds a lot of depth to this thing, which is reactive calculations or, in R, reactive expressions.

So the decorator in Python is reactive.calc, and it basically like creates calculations whose results can be used by other renderers. So it's not rendering itself, it's something that you're gonna be able to pass on. You can think of it just like a reactive variable.

So here's an example. So this is an application, this is again a kind of silly toy application where we have one table and we have a plot of some random data. And up here, this kind of looks fine, but if I just sort of test it a little bit and say I produce like three values, you know, maybe like a few more values, you see we have like some of these are not actually, oh, maybe I've actually, yeah, so this is a good example. So here I have like 0.13 and 0.12, but over here on the plot, I have 0.8 and basically zero. So like, what's going on? Like why are these different? And that's like a very confusing user experience.

And the reason they're different is because I'm actually taking the sample twice. So this is the code for that application. And I have basically in the renderer, this np.random.rand. So this is a NumPy function that takes a random sample and this is being produced here and it's being produced here. So these are gonna be different, right? Because every time I do this thing, it's gonna take a new, the whole point of that function is it takes a new sample. And this is kind of what that application looks like. I have this one slider that's changing and I have sample and table over here and sample and plot over here and they're not gonna be the same.

So there's a lot of problems with this code. For one, it's defined in multiple places. So I have to change it in multiple places. The big problem is it's taking the sample twice. So it's not the same, right? And that's gonna cause me a lot of headaches if I'm trying to like build up intuition about how this sample works. And they're not using the same sample.

So if you use a reactive calculation, or reactive expression in R, what you do is you take that code and you define it in a function and you decorate that function with reactive.calc. And then when you're using it in your renderers, you refer to that function down here. So you're not actually taking the random sample there, you're referring to this bundle.

And if this was a regular function that would just result in the same behavior, right? It would run through these things every time it was called, but a reactive function, the first time it runs, it caches its value and then returns that cached value instead of rerunning.
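The difference between a regular function and a cached one can be sketched without Shiny at all. The `CachedCalc` class below is a hypothetical stand-in for the caching half of a reactive calculation. It leaves out dependency tracking entirely and uses the standard `random` module rather than NumPy, purely to keep the sketch self-contained.

```python
import random


def plain_sample():
    """A regular function: every call draws a brand new sample."""
    return [random.random() for _ in range(3)]


class CachedCalc:
    """Toy cache: run the function once, then reuse the result until invalidated."""

    def __init__(self, func):
        self._func = func
        self._valid = False
        self._value = None
        self.run_count = 0

    def __call__(self):
        if not self._valid:
            self._value = self._func()
            self._valid = True
            self.run_count += 1
        return self._value

    def invalidate(self):
        self._valid = False


@CachedCalc
def sample():
    return [random.random() for _ in range(3)]


# Two calls to the plain function give two different samples (almost certainly).
table_plain, plot_plain = plain_sample(), plain_sample()

# Two calls to the cached calculation share one sample.
table_data, plot_data = sample(), sample()
print(table_data is plot_data)  # True
print(sample.run_count)         # 1

# Once something upstream changes, the next call draws a fresh sample.
sample.invalidate()
new_data = sample()
print(sample.run_count)         # 2
```

In this toy, `invalidate()` has to be called by hand. In the talk's description, that step is triggered automatically when an upstream input changes.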

And the main sort of thing that it does is adds nodes to this reactive graph. So I'll just show this to you in a picture cause it's a lot easier. So this is kind of like, again, Shiny starts out here with this table, with this reactive calculation, and it doesn't know the relationship between any of these things. I'm putting this hierarchical, but it doesn't even know that it's hierarchical. It just knows they're things.

So table runs and when table runs, it calls the sample reactive calculation. And then you run through the sample reactive calculation and it says, oh, I need slider. It goes out and gets slider. And then Shiny has the relationship between these three things. When the plot tries to render, it goes and tries to get sample, but since sample is a valid reactive, it hasn't been invalidated. Shiny knows that it's correct, right? Slider hasn't changed since the last time sample ran. It just returns the old value for plot.

So this makes it really good for some of these situations with sampling where you need to make sure that it's only happening once, but it's also great for just reducing computation. So this could be a long running model step or a database query or something like that. And it would sort of automatically perfectly cache the results for you.

So then we have a deep reactive graph and these things can be stacked on top of each other. You don't need to have just one layer of reactives, you could have as many as you want. So when the slider changes, it does the exact same process. So it follows that down. It says, okay, sample's invalidated. And since we're now in like a deep reactive graph, sample has to do one more thing, which is like tell all the downstream things to invalidate as well. And so it does that and then everything's invalidated. And then it does it again. So it goes to the table, gets the sample, sample runs, goes and gets the slider, plot runs and so forth. And then we have our updated graph.
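Putting tracing, caching, and downstream invalidation together gives a toy version of the deep graph described above. This is a sketch under stated assumptions and not Shiny's implementation: `Node`, `Input`, and `Calc` are hypothetical classes, outputs are modelled as `Calc` nodes for brevity, and re-running after invalidation is triggered by calling the outputs by hand.

```python
import random

_running = []  # stack of nodes that are currently executing


class Node:
    """Shared bookkeeping: who I read (deps) and who read me (dependents)."""

    def __init__(self):
        self.deps = set()
        self.dependents = set()

    def record_read(self):
        if _running:
            reader = _running[-1]
            reader.deps.add(self)
            self.dependents.add(reader)

    def invalidate_dependents(self):
        for node in list(self.dependents):
            node.invalidate()


class Input(Node):
    def __init__(self, value):
        super().__init__()
        self._value = value

    def __call__(self):
        self.record_read()
        return self._value

    def set(self, value):
        self._value = value
        self.invalidate_dependents()


class Calc(Node):
    """Cached node; reruns only when it has been invalidated."""

    def __init__(self, func):
        super().__init__()
        self.func = func
        self.valid = False
        self.value = None
        self.runs = 0

    def __call__(self):
        self.record_read()
        if not self.valid:
            _running.append(self)
            try:
                self.value = self.func()
            finally:
                _running.pop()
            self.valid = True
            self.runs += 1
        return self.value

    def invalidate(self):
        if not self.valid:
            return
        self.valid = False
        for dep in self.deps:          # forget what this node asked for
            dep.dependents.discard(self)
        self.deps = set()
        self.invalidate_dependents()   # tell everything downstream


slider = Input(3)


@Calc
def sample():
    return [random.random() for _ in range(slider())]


table = Calc(lambda: [round(x, 2) for x in sample()])
plot = Calc(lambda: max(sample()))

table()
plot()
print(sample.runs)  # 1

slider.set(5)
print(sample.valid, table.valid, plot.valid)  # False False False

table()
plot()
print(sample.runs)  # 2
```

Both consumers share a single run of `sample` each time, and a change to the slider marks the whole chain stale in one pass. Because nothing recomputes until it is asked for, the toy also happens to be lazy in the sense discussed earlier.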

And that's the end of what I was going to talk about today. So I think that's kind of like the basic idea of how reactivity works for both Shiny for R and Shiny for Python. And I think once you kind of master those little small things, as your app grows, you can kind of run into some bugs where really what's happening is, usually, you're trying to use an event-driven pattern when you should be using a reactive one. But it also just helps you debug and figure out, like, why is that not rendering? You know, it's probably because one of these things is being detected in an unexpected way.

Question and answer

So a question from Ahmed, are there limitations on the number of intermediate calculations allowed by Shiny? No. And this is one of the great things about it is that you can have as many of these as you want. I would say that, I mean, there are limitations, just the more calculations you do, the slower your app is going to be. But for the most part, so long as you're following a do-not-repeat-yourself approach, you can have as many of them as you want. Like, having something in a reactive calculation is the most efficient way that Shiny can render it. So that's your best strategy rather than having logic sort of split across renderers or anything like that.

So the question from Paul, what happens when your graph contains cycles? It causes a crash, right? Because it doesn't know how to update things. There's a function, let me show the screen again.

So this is kind of like the basics of reactivity, but there's a lot of ways of managing it manually. So one of them is reactive.isolate. So reactive.isolate is basically a way of getting the value of an input or a reactive without drawing that reactive link. So if you ever end up in a place where you do have a cyclical graph, usually adding a reactive.isolate just breaks one part of that to get out of that pattern.

The other thing that's handy is this decorator called reactive.event, which basically is a way of manually drawing one of those links and saying, like, only react to this one thing and not anything else that I might be calling. So we do have patterns for avoiding that, but yeah, you're not allowed to have a cycle. It'll cause a crash whenever you have a circular dependency like that.
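Shiny for Python exposes these two tools as `reactive.isolate()` and `@reactive.event(...)`. The sketch below does not use them. It is a toy re-creation in plain Python, with hypothetical `Input`, `isolate`, `event`, and `traced` helpers, meant only to show what "read a value without drawing the link" and "react only to this one thing" mean in terms of recorded dependencies.

```python
from contextlib import contextmanager

_running = []  # stack of dependency sets; None means "do not record"


class Input:
    def __init__(self, name, value):
        self.name = name
        self.value = value

    def __call__(self):
        if _running and _running[-1] is not None:
            _running[-1].add(self.name)
        return self.value


@contextmanager
def isolate():
    """Reads inside this block are not recorded as dependencies."""
    _running.append(None)
    try:
        yield
    finally:
        _running.pop()


def event(*triggers):
    """Depend only on the trigger inputs; isolate everything the body reads."""
    def decorator(func):
        def wrapper():
            for trigger in triggers:
                trigger()  # recorded: these become the only dependencies
            with isolate():
                return func()
        return wrapper
    return decorator


def traced(func):
    """Run func and return (result, names of the inputs it depends on)."""
    deps = set()
    _running.append(deps)
    try:
        result = func()
    finally:
        _running.pop()
    return result, deps


n = Input("n", 10)
button = Input("button", 0)


def with_isolate():
    with isolate():
        count = n()  # value is used, but no link to n is drawn
    return f"{count} rows after {button()} clicks"


@event(button)
def with_event():
    return f"{n()} rows"


print(traced(with_isolate))  # ('10 rows after 0 clicks', {'button'})
print(traced(with_event))    # ('10 rows', {'button'})
```

In both cases the value of `n` is read but only `button` ends up as a dependency. Dropping a link in this way is how isolating one read can break a cycle in the graph.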

Ah, okay. I do have the answer. This is from Calvin. When will the Shiny package get its next big update and what will be in it? Should be soon. I don't know exactly when we're doing the release, maybe this week. This week or next week. The main thing that's happening there is we've had these components in this experimental section over here, just while we were kind of sorting some things out. One of our goals is we want to have the R and Python packages reuse as much of the same code as we can, so that we can make sure that when we're developing something for R, it comes to Python quickly and vice versa.

So we had a lot of conversations about how to do that. And we're done with those. We're able to bring over basically all of these experimental things into the main package, which makes it a lot easier to use them. So that's the main thing that's coming next in this release. I think that's mostly what's in there. And then, yeah, we have some other things around, I think tables is our next main thing we're going to keep working on, as well as making the really early user experience of building Shiny apps easier, to kind of simplify some of this stuff.

We've sort of gotten some feedback where it's like, they like Shiny for complicated things, but for simple things, there's a little bit too much boilerplate, too many decorators. We're sort of trying to figure out how to simplify that, but that'll probably not be next release, but the one after. But we should have an October release up soon.