
Shiny Programming Practices || Joe Cheng || Posit
Have you ever wanted to sit down and talk with Joe Cheng, the creator of Shiny and CTO of Posit (RStudio) and ask him how he approaches programming? Look no further - we've got that conversation for you right here! Shiny makes it easy to build interactive web apps straight from R or Python. You can host standalone apps on a webpage or embed them in Markdown-style documents or build dashboards. You can also extend your Shiny apps with CSS themes, htmlwidgets, and JavaScript actions. Learn more about Shiny: https://shiny.rstudio.com/ Check out Shiny for Python: https://shiny.rstudio.com/py/ Explore our interactive Shiny for Python examples: https://shinylive.io/py/examples/ Content: Joe Cheng (@jcheng) Producer: Jesse Mostipak (@kierisi) Editing and Motion Design: Tony Pelleriti (@TonyPelleriti)
Transcript
This transcript was generated automatically and may contain errors.
This is like my favorite subject. You know, I spent like 15 years of my life building UIs in the traditional event handler way. It just felt so, so hard that it was almost offensive to me, you know, like that I should need to use every single ounce of my brainpower to hold in my head this pretty small dialogue in a Windows app, you know, like it's not the most complicated thing in the world. And yet I am straining all of my, you know, brain cells to make sure that all these relationships are correct.
It seems like everything you add makes it exponentially more complicated. That's exactly right. All non-trivial software. This is the number one problem. How do we manage the complexity? Because everything we add has the potential to interact with everything else.
Reasoning locally about the reactive graph
So what I want to say about this reactive graph issue, where you start out with an input and an output, right? Not even a reactive in the middle, just input, output, multiple inputs to one output, multiple outputs to one input. Like all of that is pretty understandable. And then you add one or two reactive calculations in the middle, but it doesn't take much for that to start becoming very crazy looking, especially when you have multiple levels of reactive calculations.
I think the most important thing to understand is that you're not supposed to understand the whole graph. The goal is not to create a graph that you look at and say, oh, that's definitely right. Think of this as the same exact thing with functions in a normal piece of software, a normal package, which functions call which other functions. And you could draw a graph with that, right? You can draw a function call graph. And people do. There's software that does that for you, helps you make those connections. The goal is never to be able to look at that entire graph of functions pointing at each other and be like, oh, yep, it's right.
That's not the point. If your software is well-written, the point is to be able to look at any one node, any one node in that graph and prove to yourself that that node is correct. To look at one output and just look at the things that are calling into that output. Just one level, right? Like don't go beyond the arrows that are going directly into that output and say, are those correct? And if the answer is yes, then you can move on. You can move on to look at each of the other nodes in the graph.
So the point is not to be able to reason globally. The point is to be able to reason locally. This is exactly the same thing that we say about functions, right? When you write a function, the goal is not to be able to take the entire rest of your code base that might indirectly or directly call or be called by the function in question. The goal is to be able to look at this function and to be able to confidently say, if the functions that this calls are written correctly, then this logic is correct.
So you need to be able to look at this function and reason about just this function in isolation. Now you might be wrong. Like you might be assuming that those other functions are written correctly and that might not be true. But that problem is a much better problem to have to tackle than the possibility that all those functions are correctly written and you still don't know if your code is correct. Because not only do you have to know that they're correct, but you have to know about really subtle interactions that those functions have with each other.
With reactivity, you get that same quality: I can look at a small piece of code, and if I just assume that the reactives it calls are going to execute at the right time, then this is going to execute at the right time. If there's something that I'm calling and I don't want to update when it updates, then I need to know to call isolate around that. So not calling isolate when I don't want a reactive relationship, that would be a bug in this code, right?
But if I don't have that bug, if my relationship with the immediate things that I depend on is correct, then I'm satisfied that this piece of code is correct. And now I can go look at another piece of code and make sure that it's correct. The bottom line being, if you have an app that's written out of 10 such pieces of code, then you have to make sure that 10 pieces of code are correct. And if you add an 11th piece of code, you have to make sure that 11th piece of code is correct. As opposed to having to make sure of each piece of code and all of the possible interactions it could ever have with any of the rest of the pieces.
Like now that's not 10 things you need to check. It's like 10 factorial or something like that, right? Or I don't know what the math is, but it is something that's certainly non-linear in terms of each additional thing you add, you now have to check all the possible things that it could interact with. Oh my gosh, I'm going to add a new thing and it's going to be so complicated to know whether this is going to blow up something that, you know, I already have. That was the world that we lived in before reactivity, when it came to writing interactive UIs.
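To put rough numbers on that contrast, here is a small sketch. The exact growth rate is left open in the conversation, so the formulas are assumptions picked for illustration: the count of distinct pairs of pieces as a conservative measure of possible interactions, and the factorial as the figure mentioned in passing. The function names are hypothetical.

```python
import math


def local_checks(n: int) -> int:
    """Pieces to verify when each one can be checked in isolation."""
    return n


def pairwise_interactions(n: int) -> int:
    """Distinct pairs of pieces that could interact (n choose 2)."""
    return math.comb(n, 2)


for n in (10, 11):
    print(n, local_checks(n), pairwise_interactions(n), math.factorial(n))
# 10 10 45 3628800
# 11 11 55 39916800
```

Going from 10 pieces to 11 adds one local check, but ten new pairs to worry about, and the factorial figure grows elevenfold.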
Modules and the puzzle piece analogy
This might go nowhere, but I think when people think about the Lego analogy, which like that comes up a lot when you talk about components in software, right? Like a lot of people think of them as like Legos. At a high level, you tend to think, oh, I want the ability to snap anything. Anything should be able to snap to anything else, right? I just want to take two software components and snap them together.
I don't get that much pushback on reactivity, like, oh, why didn't you do reactivity this way? There's not been that much of it. I think because it's sufficiently complicated that it's not the most natural thing to bikeshed on, but modules are like the opposite. Like everybody's got an opinion on modules and I mean, for most of the life of Shiny, people have had opinions on why I didn't do it the right way, you know, and the right way meaning something so different to every person.
One really common refrain was basically, people don't like that you have to pass in, like if you have something like a module that takes some kind of input that needs to be provided from somewhere else in the app, right? Like you've got a module that does something with a data frame, but it doesn't itself want to have any opinion about what that data frame is or where it came from or how it's specified, right? It doesn't specialize just on some output or whatever. How do you communicate with that module? How do you tell it, this is the data frame you should use and here's when it changes.
And the way that works in Shiny modules is when you define the module, you take that data frame as an input. You take a reactive calculation as an input. A lot of people push back on that and they're like, I don't want to have all these things that I'm passing into, you know, this Lego brick, right? Like why can't all Lego bricks just look the same, right? I don't want to pass in these four arguments that it needs. I just want to give it a dictionary. How about that? Every module just takes a dictionary and then you just have to put the right things in the dictionary and then like, look how clean the code is for snapping together all these Legos.
I just make a dictionary, put some stuff in it and then everybody gets the dictionary and oh, I can use the dictionary for both input and output. Not only can I get data frames out of it or other, you know, reactive things, I can also push values into it and then other modules can consume it. And look how easy it is to snap these Legos together, right?
This is where it's like, it's not enough that the Legos go together. They have to go together correctly, right? So Lego minifigs have heads and torsos and legs, right? You can't put a head on the legs directly. Like they have to, they only connect one way and those connections are obvious. With a lot of components, it's actually more like that. You don't want Legos that will snap together whether they belong together or not. You actually want it to be very obvious that this thing needs to connect with something that's like this. It's more like puzzle pieces.
So people arguing for, oh, let me just have a dictionary that I pass in. What they're really saying is like, oh, it's so annoying putting together this puzzle where the pieces are all jagged. Wouldn't it be easier if they were all just squares? Like let me put together this 500 piece puzzle, but it'll be so much easier if they're all just squares, you know?
You're actually making it so much harder, right? Because that problem still exists. Like you still need to make sure that that data frame that you need is provided to you. It's just now provided in a very hidden way that you'd have to go seek out that's not obvious. So you've removed the signal of how these things connect. You've removed these sort of weirdly shaped borders and replaced them with flat borders, even though they can really still only connect in one way, if that makes sense.
So again, just another example of that principle of small pieces that we can locally reason about and then combine them in reliable ways. And when we have just like these dictionaries that we pass around, it's removing that part of like, we need to be able to combine these reliably.
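The puzzle piece argument can be sketched in plain Python. This is not the Shiny module API: ordinary zero-argument functions stand in for reactive calculations, and every name here (`summary_module`, `summary_module_dict`, `load_rows`, the `mpg` column) is hypothetical and only for illustration.

```python
from typing import Callable, Dict, List

Rows = List[Dict[str, float]]


def summary_module(data: Callable[[], Rows], column: str) -> Callable[[], float]:
    """Puzzle-piece style: the signature shows exactly what must be plugged in."""
    def mean() -> float:
        rows = data()
        return sum(row[column] for row in rows) / len(rows)
    return mean


def summary_module_dict(shared: dict) -> Callable[[], float]:
    """Square-piece style: any dictionary fits, so nothing signals what is required."""
    def mean() -> float:
        rows = shared["data"]()  # hidden requirement, invisible in the signature
        return sum(row[shared["column"]] for row in rows) / len(rows)
    return mean


def load_rows() -> Rows:
    """Made-up sample data standing in for a data frame."""
    return [{"mpg": 21.0}, {"mpg": 23.0}]


explicit = summary_module(load_rows, "mpg")
print(explicit())  # 22.0

hidden = summary_module_dict({"column": "mpg"})  # snaps together without complaint
try:
    hidden()
except KeyError as err:
    print("missing piece discovered only at run time:", err)
```

Both versions still need the data to be supplied. The explicit one refuses to be wired up without it, while the dictionary one accepts anything and fails later, somewhere far from the mistake.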
Building Shiny apps: top down vs. bottom up
There's so many different ways I want to answer this question. Just between building the front end and the back end, I don't think there's a right order. If what is going to get you excited about the project enough to sort of build that momentum is that you have this cool UI concept in mind, or it just motivates you to look at something concrete and see it sort of one piece after another light up, absolutely go crazy with the UI first. You know, I don't think there's anything wrong with that.
If instead like you really have some kind of R code on a server that you're really excited to see come to life for the first time and that is going to be what excites you and propels you through the rest of the project, I don't think that there is a wrong way to do it.
If you want to end up with a bunch of well-named, well-reasoned, nicely factored reactives and outputs and whatever, is that something you sort of do on a whiteboard first and then sort of stub each one out and make it so? Or is it more something that you just start coding and then extract out the things that you need? Again, I don't think there's a wrong way to do it. But I would draw a direct analogy to when you're writing either a package or maybe a complicated script. When and how do you introduce functions? Because it's exactly the same process that applies here as well, I think.
Maybe it's a little bit more urgent because it's not going to react unless you find somewhere to park this code that is, you know, reactive. So in that sense, yeah, you are going to be forced to confront it. But some people might write straight line code that's, you know, I'm just going to get this thing working and get the answer I want and then let me take a step back and see where are opportunities to refactor and turn into functions. I think that's probably the most common, right? And for that reason, you might go a long time without ever learning how to write functions if your job is more about finding that answer the one time and then moving on.
So I would call that sort of like a bottom up approach, right? You've got the logic and then now you're trying to extract structure out of it. I think that's probably the more common way people think their brains work. I happen to be the other way, where I will start with: if I could wave a magic wand and have whatever functions I need exist, what little piece of code would I write that would solve the problem that I have here?
So for some really simple example, I might say, I'm going to wave my magic wand and say load data set, remove outliers and clean. Maybe that's another function. And then let me, you know, generate a set of outputs using these two functions that I wish existed. And I will just call them even though they don't exist. And then I will stub them out. So that's working exactly the opposite way. Like I'm calling functions that don't exist. And each of those functions is potentially going to call functions that don't exist until eventually they start calling functions that exist, right? And then you basically create a skeleton for the structure first and then fill out, you know, go back and fill out all the pieces.
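A minimal sketch of that top down workflow, assuming plain Python functions. The names `load_dataset`, `remove_outliers_and_clean`, and `generate_outputs` are hypothetical, and the sample data and outlier rule are made up for illustration.

```python
from typing import Dict, List


def load_dataset() -> List[float]:
    """Stub for a function we wish existed."""
    raise NotImplementedError("load_dataset is not written yet")


def remove_outliers_and_clean(values: List[float]) -> List[float]:
    """Stub for a function we wish existed."""
    raise NotImplementedError("remove_outliers_and_clean is not written yet")


def generate_outputs() -> Dict[str, float]:
    """Top-level logic, written first against the functions we wish existed."""
    values = remove_outliers_and_clean(load_dataset())
    return {"n": len(values), "mean": sum(values) / len(values)}


# The skeleton already runs; the first stub it hits says what to write next.
try:
    generate_outputs()
except NotImplementedError as todo:
    print("next piece to write:", todo)


# Later, go back and fill in the pieces. generate_outputs() does not change.
def load_dataset() -> List[float]:
    return [1.0, 2.0, 3.0, 250.0]  # made-up sample data


def remove_outliers_and_clean(values: List[float]) -> List[float]:
    return [v for v in values if v < 100.0]  # made-up outlier rule


print(generate_outputs())  # {'n': 3, 'mean': 2.0}
```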
If you can get to this place either way of small components that you can individually reason about and combine them reliably, that's ultimately what matters. So yeah, in this second, like the top down way where you write, you call the functions that you wish existed, making sure that they're functions that you want to be, well, functional, right? You tend to want them not to use global state if you can help it. I guess in the Python world, it might be classes that you wish existed. You new up an object that doesn't exist yet. You call methods that don't exist. And then you can go back and fill those things out.
It is harder to do the top down if it's an unfamiliar domain. Like I think the more speculative and the more like, oh my gosh, how is this even going to work? The harder it is to go that sort of top down approach. So I'll sometimes go the top down approach until I realize, oh, I'm way out over my skis here. I don't actually know how any of this technology works. Let me go back and do a bunch of bottom up experiments and sort of just work with these libraries I've never worked with or this kind of data that I've never seen before.
And once you start to get a sense of confidence about how the problem is going to be solved, then I might switch over to, okay, let me go back to this approach of top down. So the same thing I think for Shiny apps, I tend to think top down. So if I'm looking at a dashboard that I want to build and it's going to have these four inputs, I think, what data am I going to need for that? Right? Oh, I'm going to have to do a bunch of aggregations in order to get that data. Do I also need to access the unaggregated data for other things in the app? Well, if so, then I really want to make sure that those are both things that are exposed as not functions in this case, but reactive calculations.
And then I feel like the relationship between the inputs and the reactive calculations, I tend not to even think about that at all. Like that just sort of happens. I call the things that I need and then they end up depending on them, unless there's some very specific reason, like, you know, I'm going to use this slider input to inform this calculation, but not till this button is pressed. Okay, fine. Then you use an event, you know, for that.
I can't relate sometimes when people say to me like, oh, there's all these like problems with my reactivity and I don't, I have a really hard time keeping track of what inputs are being referenced by what reactives. I'm sort of like, why do you need to keep track? Like it's not about keeping track of the global graph. So it's really about each individual reactive only accessing the things that it needs. That's the way I approach it.
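To make the idea of each reactive depending only on what it reads concrete, here is a toy model in plain Python. It is not how Shiny is implemented, and the names `Value`, `Calc`, `slider`, `button`, and `doubled` are hypothetical; `isolate` borrows the name used earlier in the conversation. The sketch assumes a simplified design in which dependencies are recorded at the moment a value is read, and it does not clean up stale dependencies the way a real reactive framework would.

```python
from contextlib import contextmanager

_stack = []  # calcs currently executing; None marks an isolated scope


def _record_reader(dependents):
    """Register the currently executing calc, if any, as a dependent."""
    if _stack and _stack[-1] is not None:
        dependents.add(_stack[-1])


class Value:
    """A settable reactive value (toy stand-in for an input)."""

    def __init__(self, value):
        self._value = value
        self._dependents = set()

    def get(self):
        _record_reader(self._dependents)
        return self._value

    def set(self, value):
        self._value = value
        dependents, self._dependents = self._dependents, set()
        for calc in dependents:
            calc.invalidate()


class Calc:
    """A cached calculation that reruns only after something it read changes."""

    def __init__(self, fn):
        self._fn = fn
        self._valid = False
        self._cached = None
        self._dependents = set()
        self.runs = 0

    def invalidate(self):
        if self._valid:
            self._valid = False
            dependents, self._dependents = self._dependents, set()
            for calc in dependents:
                calc.invalidate()

    def __call__(self):
        _record_reader(self._dependents)
        if not self._valid:
            _stack.append(self)
            try:
                self._cached = self._fn()
            finally:
                _stack.pop()
            self._valid = True
            self.runs += 1
        return self._cached


@contextmanager
def isolate():
    """Read reactive values without taking a dependency on them."""
    _stack.append(None)
    try:
        yield
    finally:
        _stack.pop()


slider = Value(10)
button = Value(0)  # click counter


@Calc
def doubled():
    button.get()  # dependency: rerun when the button changes
    with isolate():
        return slider.get() * 2  # read the slider without reacting to it


print(doubled(), doubled.runs)  # 20 1
slider.set(50)
print(doubled(), doubled.runs)  # 20 1
button.set(1)
print(doubled(), doubled.runs)  # 100 2
```

Nobody had to draw the graph. `doubled` ends up depending on the button because it reads the button, and it ignores the slider because that read is wrapped in `isolate`, which is the "not till this button is pressed" case described above.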

