Resources

Barret Schloerke || {reactlog} Rundown || RStudio

00:00 Introduction to Reactlog 00:44 Viewing Reactlog using an Old Faithful Shiny app 02:07 The Reactlog interface 04:31 Walking through a reactive graph with Reactlog 05:14 Downstream dependency invalidation in Shiny 06:43 How Shiny "grabs" data 09:41 How the Reactlog timeline works 10:46 Switching between idle states in Reactlog 11:58 Reactlog interactivity - clicking a single item 13:21 Reactlog with the Pythagoras Theorem app 15:45 Adding a UI and server value to add Reactlog to your Shiny app 18:05 Walking through the reactive graph using the Pythagorean Theorem app 21:07 Append-only behavior of Reactlog 21:18 Marking a time point in Reactlog 23:17 Using Reactlog to debug reactivity 26:55 Resetting our app and testing logic changes 28:01 Reactlog with a large Shiny app, CRANwhales 34:10 Freezing reactive values 36:19 Calculating click count in a Shiny app 37:10 Click the button, render the plot is bad - see why Shiny is an R package from RStudio that makes it incredibly easy to build interactive web applications with R. Behind the scenes, Shiny builds a reactive graph that can quickly become intertwined and difficult to debug. reactlog provides a visual insight into that black box of Shiny reactivity. After logging the reactive interactions of a Shiny application, reactlog constructs a directed dependency graph of the Shiny’s reactive state at any time point in the record. The reactlog dependency graph provides users with the ability to visually see if reactive elements are: - Not utilized (never retrieved) - Over utilized (called independently many times) - Interacting with unexpected elements - Invalidating all expected dependencies - Freezing (and thawing), preventing triggering of future reactivity There are many subtle features hidden throughout reactlog. Here is a short list quickly describing what is possible within reactlog: - Display the reactivity dependency graph of your Shiny applications - Navigate throughout your reactive history to replay element interactions - Highlight reactive family trees - Filter on reactive family trees - Search for reactive elements You can read more about reactlog here: https://rstudio.github.io/reactlog/articles/reactlog.html And you can learn more about Shiny here: https://shiny.rstudio.com/ Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/ Content: Barret Schloerke (@schloerke) Design & editing: Jesse Mostipak (@kierisi)

image: thumbnail.jpg

Transcript#

This transcript was generated automatically and may contain errors.

So, reactlog is a memory leak by definition, so we have to opt into it, and we can opt into it using reactlog, reactlog enable, and so I'll go ahead and enable that now, and then we can go ahead and run our Shiny application, you know, no problem, just like we're used to doing, and this one is the, you know, the old faithful example that we're used to seeing of all of the waiting times, and so we can change the bin count and the plot updates, great.

We can then view the reactlog by doing command FN3 for me, and it might be like option or control FN3 for Windows, I can't quite remember, but your reactlog will show up in a separate window, and there's lots going on here, but for just a quick look, we can see our input bins that we had on the left, and on the right we can see our output of our distribution plot, our dist plot from the application, and let's take a look.

The reactlog interface

So, the way that reactlog works is it flows from left to right, and I kind of made the pieces so that they kind of fit in like a puzzle, so you can have your inputs here on the left, and they can join into your reactive values in the middle, so it joins in there and needs to be finished off with an output value on the far right. In this case, there is a theme counter here, but that's not having any outputs listen to it, so that's why it's all the way on the right, so we kind of align right on all the values, but theme counter we can ignore, we're mainly just interested in the output of the dist plot.

The reactlog, if we look, let's just kind of walk through the top of this status bar. So, we have the eight navigation buttons at the top. The first and last are traversing between marked steps. We haven't done that yet, and we'll come back to that later. The next ones is a tall green bar. That's this tall green bar here, which is when Shiny was idle for at least a millisecond. It declared that it was idle, and then the short green bar is an output, so an output calculation, and you can jump between prior and previous outputs, so you may calculate five outputs, and then Shiny becomes idle, so you may want to jump in between to see how your reactive updates, but the ones that I use a lot are these middle two, which is previous step and next step, and I don't like to use the mouse that often, so we can actually use the right arrow and left arrow, and you can hold it, and you can hold it back.

Reactlog defaults to the first time that Shiny was idle, because it doesn't actually know where you want to start. I find starting at the end was where I launched Reactlog was too far away, because most likely I want to build up into the unexpected error, or build up into what I'm trying to learn about the application, so I started it at the first stable location, and stable is kind of when most of the outputs, or in this case all of the outputs and reactive values are green, they're in the ready state. If they were not in the ready state, and they're actually performing some calculations, they'll be yellow.

It's kind of useful to see these different states. In the future, I'd actually like to show more states, like a red for an error occurred, or something like a rec false, but I was just trying to show currently what Shiny is aware of, because technically a rec false is a valid result, and so it's not calculating anything, it is actually ready to move on, even though the value is not what we want. So it'd be better in the future if we could differentiate those.

Walking through a reactive graph

So now we're in idle, and let's take a peek at what happens when, remember I changed bins in the plot. So in the sidebar, I changed the width actually changed probably due to the width of the plot actually was changing probably due to this little showcase mode on the side, and so two times the plot was redrawn because the output width changed. So let's walk actually through when I changed the int of the bins, and so I changed it from 30 to 21, and the way Shiny works is all downstream dependencies will be invalidated, and they will remove all connections both up and downstream from their connections.

So it's not fully invasive, but it is invasive for everything downstream of this input value. So let's kind of walk through it. This will tell, so the input bins will say, hey, all my downstream listeners, you are now should start invalidating, so plot output, or plot obj will start invalidating. He's going to tell of his downstream dependencies, which is going to be the output dist plot. Output dist plot says, okay, I will start invalidating. They will start invalidating by removing all upstream dependencies, so the black lines are now becoming thin little lines. The thin lines I kept in there to say a connection has occurred in this recording, but it is currently not there, where if it's a thicker line, then that means it's actively a dependency.

Okay, this plot has removed all connections, and it is invalidated, so now we are backing out of our invalidating process, and plot obj will remove its last dependency and is now finally invalidated, and then the input value is done, and it will just automatically jump over into the next step, which is the output starts calculating.

How Shiny "grabs" data

Shiny works in a model where it finds all the data that it needs to render. It is not necessarily a push model, so I did not push input bins into the plot object. It's actually the plot object is reaching up and finding input bins. It's a little different than the way we think about it. It's because of the two different steps, where the first part was input bins tells all downstream to invalidate, but then it's a free for all. Finally, now output dist plot can then work its way upstream until it's satisfied.

Shiny works in a model where it finds all the data that it needs to render. It is not necessarily a push model, so I did not push input bins into the plot object. It's actually the plot object is reaching up and finding input bins.

So output dist plot starts calculating, says, hey, I need the plot object. Plot object says, I haven't calculated, so I need to start calculating. Then it says, well, I don't have everything I need. I need a connection too. In this case, it's these unexpected three items here. The client data output dist plot width, client data output dist plot height, and also client data pixel ratio. These three pieces of information are necessary to draw an image in Shiny. And I think it's really neat because if you squeeze the plot, we should redraw it because it doesn't really look like it's going to fill the proper area.

If you, you know, make the plot taller, we will want to redraw it. So your output height and width will change. And then your pixel ratio is if you actually do like command plus or command minus, and you know, you go to retina display or non retina display, the quality of the image will need to be adjusted. And so in this case, plot object is going to be retrieving those values via an isolate. That's why it has a dashed line. So it's isolating on the value and then it removes the connection because an isolation does not have any reactivity. And then we get the pixel ratio value and the input bins. At this point, plot object said, Hey, I got all my values. I'm calculated.

So we can go back to the dist plot and continue the calculations. In this case, dist plot does have a hard dependency on the, um, uh, output dist plot width, the height and the pixel ratio. Uh, mentally thinking, I thought it would actually be inside the plot object, but it just happens to be in the output dist plot. Dist plot is now satisfied and it is done calculating. All of the outputs have now been calculated. So therefore Shiny is idle.

How the reactlog timeline works

Uh, yes it was. It happened at 11.45 and it's finished at 11.51. So it's six hundreds of a second. Yeah. So the timeline across the top does not represent actual time. It represents steps taken. And so you can have many steps taken in a very short amount of time and you can have hardly any steps taken in a very long amount of actual time. The time up here represents the time since recording or starting the application. And then the session represents what Shiny session it belongs to. Typically when you open up a reactlog, that session will be isolated to any global sessions such as Shiny idle or any session that I am running in. So in this case I'm running in the Shiny session two D seven, blah, blah, blah. And between these two steps, about 10 seconds had elapsed so that Shiny was idle at the one second mark. And then we started changing the value 10 seconds later.

Um, and just a quick to, uh, to show about switching between idols. So if we click the button here, we can go back and forth between idle sessions. Most likely that's something where you know that you've done something within the application. Uh, the graph doesn't necessarily change because the app is fairly small.

Reactlog interactivity

And the last one that I want to show in this, uh, is the searching. So you can search for these labels here. Um, so in this case I can search for, let's say, um, plots and all the things that show up or even capital plot. Um, these three items all contain a capital P and plot. And this is a smaller app, so the searching isn't quite as effective. But if I did the lowercase plot, um, it's still the same graph cause the ancestors and descendants are the same. Um, another way to go around this, uh, I like to hit escape to escape that searching that was done.

Uh, another fun one is that you can double, you can single click an item so it kind of shows up, uh, while you move around and do things so you can see that one specifically, uh, escape to get rid of it. But what I like to do is to double click and it will kind of make it bigger. It'll highlight it and it will find all of its ancestors and descendants, its whole family tree. And when you move back and forth, it will only be for that tree that is displayed. So if there are other reactives, you have a thousand other reactives above or below that aren't shown, we'll skip over all of them. So we'll only look at the reactivity for the sub tree that is displayed.

And that is really useful because when you have an app like gradient, um, I think on load it has about 2,500 reactive, uh, IDs created. Yeah, it's, it's big. It's a very big app. Um, it's wonderful for testing by the way. Um, and, um, has a lot of different corner cases, but, uh, yeah, it's a very large graph, very shallow graph. Actually I was very surprised about that. Not very many, uh, you know, dependencies, you know, something like this, like three layers deep. Um, but useful nonetheless.

Reactlog with the Pythagorean theorem app

So let's look at a more direct example. Um, looking at, uh, Pythagoras, um, app. And I like to look at this one, um, as an app that people aren't necessarily familiar with because we know the Pythagoras theorem, or we know how to calculate it, but we can do this through reactivity because if I change a or B, then my output C should also update, you know, just, that's how the definition of the triangle is done.

And let's look at the, um, how we would calculate Pythagoras theorem. So to calculate that, we need to first calculate a squared. We can label it with a squared. And I highly recommend you giving anything that you can a label. Um, because most likely you'll have a multi-line expression and possibly your first few lines might be very similar to other reactive expressions. And so your label default label is not going to be that useful. Instead, we know this is a squared, so I can give it a nice value of a squared. Um, this is underutilized, but it's very effective when viewing the reactlog. So future you will thank you for your work, uh, that you do up front.

So we have a squared calculate. We then have B squared and we calculate it. Great. Given these two reactive values, we can retrieve them, add them together to make C squared. And then finally, given C squared, we can take the square root of it and have our answer C. Um, this value is then displayed via render text and also a render print just so that we have more outputs to look at, but it could just be done with a single render printer, render text.

Adding reactlog to your Shiny app

Okay. So let's run this application in the top. I have a reactlog, reactlog enabled to enable the reactivity. And then finally I will call command FN F3 just to open up the viewer. So let's run the application. Cool. Ah, so this is a fun surprise. I skipped over it in the code. Um, you don't necessarily need to always open up the viewer. Instead, what you can do is add a UI value and a server value. The UI value is reactlog, reactlog module UI. So it's a UI module that you can add into your application just so it's inserted it. The reactlog is inserted in and you don't need to open up in multiple pages. But to pair with this, we need to have the reactlog module server so that we can update those values in the UI. It won't magically just occur. Shiny needs to be told that these things can be done.

So that's where that's coming from. So in our application we have our a, we have our B and we have our C, you know, output C and output C two different outputs and the reactlog looks like this. So we have a to a squared B to B squared a squared and B squared makes C squared. C squared can be square rooted into C and C is then stored in the output C verbatim and C text. And that matched exactly how we had in our application. A input a to a squared input B to B squared a squared and B squared point into C squared and the square root of C squared makes C and that matches our labels. Everything is nice and tidy and we can see our images right in the app.

Very exciting. Um, you'll notice that there's this extra item of a theme counter. Uh, that's for things like a bootstrap lib where we update the theme and do things like that. You can feel free to ignore it. It's not a big deal. And then this extra line here is the, um, reactlog module UI and the iframe because this is actually an iframe to the reactlog information. Uh, so we can just kind of feel free to ignore that.

Walking through the reactive graph

Okay. So let's look at what happens when we change this to five and 12. We get the five, 12, 13 triangle. Great. Not too big of a surprise. Let's refresh our reactlog because that more reactivity has occurred and it only loads it up to the first time when we did, um, the application when it was idle. So we can walk through this application, um, using our arrow keys. And when we change the value, a went from three to five. So all the downstreams get invalidated and they're invalidating by removing all of their dependencies and marking themselves as an invalidated.

And then they work their way back up. And B squared was not a direct dependent or descendant of a. So therefore B squared will keep its value, but it will just remove the connection from B squared to C squared. C squared is actually initiating that removal of the connection. And so B squared is going to stay calculated. And if you were to ask its value, it would return immediately. It would not need to recalculate because it was not invalidated.

And then once all of those invalidations occur, Shiny sits there and goes, okay, what outputs need to be calculated? And we'll walk through that again. So C verbatim says, Hey, I need a value. It cat captures C C's needs to work its way up because it doesn't have a value. A squared works its way up, gets a calculate its value. C squared is now happy with a squared, but it also needs to be squared. B squared in this case has already been calculated. So it returns its value immediately. No time is spent calculating B squared up the tree back down to C verbatim.

C verbatim is happy, but we still have more outputs to calculate. So we'll move on to C text. C text will then look at C and say, Hey, I need your value. C will be like, great, here's my value. We're done. There's no more calculations needed. Shiny has this automatic kind of caching value already done. And so if a reactive value has not been invalidated, it will return its value immediately. And no time is spent calculating that. And I think that's just great. Like it's a neat little trick hidden underneath. And if you were to add an output dynamically like right now, and all it needed was the C value and B squared, it would be instantaneous. You wouldn't have no extra overhead. And that is great.

Shiny has this automatic kind of caching value already done. And so if a reactive value has not been invalidated, it will return its value immediately. And no time is spent calculating that.

And then we're done. Shiny is idle. But we also then updated the value to 12. And we ran through the same process again, but we went up the B branch instead of the A branch. Done. Shiny is idle.

Append-only behavior and marking time points

And then the next steps actually are me clicking the refresh button, and then the iframe updating. When you click the refresh button, though, there, it will always go back to that first idle location. And the reactlog is append only. I don't cut it. I don't do anything special. I just depend on because I don't want to hide the truth from what's there from you. So what you can do is actually mark a time point. And I find this really useful because most likely in your Shiny application, you will have to do many steps to possibly get to your bad reactivity. And then you'll say, hey, I need to like mark this time point because I know the next steps that I do may be a little wonky.

So to mark a time point, you can do command FN F4, or a little bit more reliable, given your browser, you can do command shift FN F3. So very similar to command FN F3, but you add in the shift. And what this will do is it'll mark a time point. So when we refresh our application, it won't load to where it's first idle, it'll load to the left to the last marked time point. So you can have many marked time points, but it'll load the last one. And this is really nice because if we go back to like a three, four triangle, right, so three, four, five, and then I refresh, it was loaded up before I hit that before refresh. And so we can go, I'm sitting in a five, 12, you know, 13 triangle. And if I were to move forward, this is now running through and converting it back to the three, four. So right where I last marked my time point.

So for completeness, I'm going to do another marked time point and I'm going to refresh the reactlog. So you can see the marked time point down here and I can actually jump between them using the black bar because that is a marked time point.

Debugging with reactlog: the broken Pythagorean app

Okay, so let's look at a situation where reactlog will help us debug our reactivity. We had our Pythagoras application here and now let's look at a broken one. So the difference between Pythagoras and Pythagoras broken is not that much. The app will run and no errors will be produced. This is actually terrifying because this means it's a logic error and it is not actually a Shiny error. So let's go ahead and run the application with our reactlog enabled.

All right, so we're running our application. Great. So what is going on here? Our C value is 4.2. It is not a 3, 4, 5 triangle and C squared is listening to B squared sometime because there's a connection being made, but it's not solid. And what is B squared is listening to A? This is weird.

So let's change it to a 5 and a 12. Still 7. Okay. So odd. All right, so let's look at this application. So did, did actually it change? Yeah. If I change B, nothing happens. So this isn't good either. So walking through the reactlog, let's kind of see what happens here. So this is when I was changing it into that 3, 4, 5 triangle. So A changed into, or a 5, 12, 13. So A changed into five. Great. A is telling B to invalidate. That seems odd.

So, Oh, there it is. Input A is B squared. We should switch that. So that is now B. So personally, that's a better logic change. Let's just, uh, rerun the application here. Um, so we'll run app again. Okay. Things are looking better. Let's switch this to a five and then we switched this to a 12 and B is still broken. That's not good, but we can see that B squared is taking an input from B. So that seems better.

So we're doing better there. Um, so let's, um, walk through that adjustment. So given that adjustment three turned, A turned into five, A squared, C squared. Okay. Everything's good. Seems right. Output C verbatim is recalculating, getting it's A squared and C squared is isolating on B squared odd and moving on. Okay. So the, so there is that, that's how that connection was made at some point, but it's not connected right now. And I guess that's why it makes sense that when B changes its value all the way up to anything like 20, C is not changing.

So, ah, isolate value here. We removed that B squared. Great. So we'll save it. Let's kill the application because we made a, a known logic change. That's better. And things I believe will be good. So five and 12, five, 12, 13 triangle. And if we refresh our reactlog, we can see that A makes its updates and we can also see that we change B and it'll make its updates and C was actually updated twice, which is what we want for all the changes in B and Shiny is idle. So that's how we can use it to help debug and it's helped debug logic changes. And this works really well when your application is very large because this can work with a couple of thousand of these different reactives. No problem.

cranwhales demo app

Just to show a bigger application, I'd like to look at the cranwhales demo app that Joe showed at one of his keynotes. So again, we're going to make sure that, um, uh, we're going to make sure that reactlog is enabled. We're going to run the get hub one, uh, application and this application shows the top end downloaders of all of a crayon that's hosted by RStudio. And it's usually the 80 20 rule where like six people are responsible for a lot more than 20% or 80% of the traffic, you know, depending on the day. Uh, so we're thinking about it here and downloading that data.

Great. All right. So the data has shown up. We have some numbers up at the top of the bandwidth consumed the files downloaded the unique downloaders. So out of the 120,000 down unique downloaders, the green bars are represented by six people. So these are the whales. And so it's the cranwhales. And, um, yeah, that's, that's, uh, almost 50% by those six people. Um, the biggest whales, uh, are given random names. And so the full puppy, uh, he's the biggest whale. And by hours, actually very interesting. There's different behaviors. Um, full puppy looks kind of like internet traffic. Um, the glamorous parakeet pretty consistent and, uh, the glamorous fish, very spiked behavior and, uh, immense zebra looks very constant.

So, um, moving on now. Uh, so let's look at the reactlog. So the command FN F3, this is a lot bigger. Um, we have a, so this is the first time that Shiny was idle and we are looking at the, remember on that home screen, we had three numbers up at the top and a plot. So we have our outputs that have calculated. We have the, um, we have the total size. Okay. We have the total count and we have the total number of downloaders. Great. And then finally there's this plot output called all hour. And I'm recognizing this as a plot because it's listening to the client data pixel ratio, the height and the width. And there's a plot object as well. So that's kind of like the shape of an output plot. And these all listen to a data set called whales, which listens to the top end or the top count and also what date. And I think that's really kind of neat that you can see your data. Everything's kind of pointing to these whales and then things come from there.

Um, so we can actually look, um, yeah, so it's neat. Then the next thing that I did tab. So the second tab will start and, um, let's see here. And this will calculate output downloaders. So output downloaders then says, Hey, I now exist. So let's make a plot because the output downloaders wanted to look at the biggest whales. It's just the single plot and it runs through, gets the plot object, plot object, gets all of its information and it pulls out the whales. Whales is already calculated, so we don't need to recalculate it. So it just takes the value right away and then the output downloaders is done.

And then this happens again for the next two or three tabs. And then the, uh, detail size, detail count, detail unique because whales by hour detail view. That was a very big plot. Took a, took a little bit of time to show up, but we had the given full puppy. This is the traffic that they were doing. Very funny. Uh, not very many 1800 unique packages, but downloading them constantly. So someone had a wild true download file.

Um, yeah, so, so we're going to go through here. So that's how the, the plot information will show up, uh, for that one piece of information. And if you noticed there was actually a blue state of frozen state, and that's what happened with one of the input values that we froze. Um, so let's zoom back, see if we can find it quick. There it is frozen. This input details was frozen because, uh, we're trying to avoid a situation where if you ha if you open up a new view, but you haven't had all of your selected values like actually selected, then you may want to freeze that input value so that your plot doesn't error out. Um, you can do this with maybe like a rec false instead. Um, that would cause it to update once the input value does, but you still may need this special case of frozen. So if this does like, if you have this funny situation with plots and you can see it just for a quick second, uh, maybe check out net, uh, something about freeze reactive values and you can see more information in the docs there. It's a rare situation, but it does happen.

Uh, and then last I want to show that, uh, uh, uh, ability to do the subsets. So we were looking at all hour to begin with. That was that home screen. So we can double click on all hour and all of the right and left keyboards, you know, so I'm actually going to jump many, many steps in the reactlog because nothing occurred on the subset. But if I back up just a little bit, then Hey, everything starts updating quickly. Um, white, uh, just to see these here, white values are reactive values that will exist eventually, but do not currently exist. So input count has not been touched up to this time point. Um, and now it finally has at this time. Uh, that's why I kind of like to show it when it's first idle because everything's kind of stable just for a little bit.

The observer anti-pattern

One thing to look for with reactlog because you'll find some shocking behaviors every once in a while. And this is what, uh, Joe referred to is in one of his talks as an anti-pattern. And basically what happens is if you observe on something like an input button and you use a render method inside an observer, this is typically bad. There are, there are use cases like you add a new tab where like it has completely independent reactivity. Cool. You should do this. But in this case, if I just want to render a plot to bad pattern, something that's a static name and this expression will actually be evaluated many times that is actually making new render plots over and over and over again. And that is bad.

Instead, what we should do is calculate some data such as a click count and call render plot and wait until the input button has been clicked. So like I require the input button to be not zero and then I'll give you the input button. Like that was the goal of this thing, such as when the button is clicked, render the plot. So instead we can say make a reactive on input button that says you require it to be something truthy or non-zero. You'll have click count and then we render the plot given the click count. If you really wanted to be dynamic and random, you could do it using a dynamic output ID. That's okay. That's kind of like inserting a new tab or doing something that has completely independent reactive logic. But the typical use case of like click the button, render the plot, this is bad. And we'll see it in reactlog plain as day.

So let's click the button. Okay. Click the button. Okay. We got some new plots. Let's actually take a look at this reactlog and you can see that the reactlog will get taller and taller and taller. And this is because we will have more and more outputs of output dollar bad pattern. So let's move, move forward in the reactlog. And so we can see that output, this output bad pattern will be calculated at some point. And then it's finally calculated there. Great. You know, you don't really notice it off of one click, but what if we clicked it one, two, three, four, five times.

And now let's refresh. Let's mark this time point marked. Great. And let's refresh our application. Now our reactlog is a lot taller. And if we zoom in, we can see output, bad pattern, output, bad pattern, output, bad pattern. There's four of these, five of these. And so I can search and they're all making different plot objects and they're all listening to the same inputs. So this is not good. And none of the output bad patterns exist at any one time. Like they're not coexisting at any one time. And so this is an anti-pattern and something to avoid if you see it. Um, it's not the worst, but it does, um, it will cause issues after hours and hours and hours.