Daniel Chen - Shiny for Python: Building Production-Ready Dashboards in Python

And so if you learn anything from today, it really does become showing you where in the documentation page to look, and just have it open in the command tab, alt tab, really readily available.

And so what we can do is run inline equals true, and it'll just shift the order of that... Shift the direction of that... Of those choices. So if we go and do that, and hit save, and again, it'll auto reload. You'll see that now our bits, our radio buttons are now in there. And that's cool. You can now imagine, let's just take that plot that we just made and shove it down there and connect it to things, and we have our first app. So that's essentially the steps that we're going to be doing.

Adding output components

So we want to go and add in the figure. I just showed you all of the code that we need to add in that figure. But the thing is, all of that code that we just had from that first example, if I just copy and paste it in there, it won't work. Why is it not going to work? Part of it is, it's a dashboard. A dashboard is like a website. If we just say like, hey, here's a whole bunch of Python code website, go make this. It's not really going to be able to do that. So we have the code, which is good. We just now need to modify this little code just so we put in the little dashboarding components so it knows how to go and render the actual figure in our dashboard.

So we now need to talk about output components. We had input components. We're going to talk about output components. We can imagine the page of the input components that I just showed you. If you scroll a little bit further down, it is the same type of formatting, but now it says if you're working with matplotlib or iplyleaflet or plotly or alter, what other things? Like a data frame. What other outputs are there? Tables, figures. Any arbitrary text. All of those components have an output version to it.

So plot 9, if you've never worked with plot 9 before, I'm just going to tell you, plot 9 is built on top of matplotlib. So we're going to use the same type of output format that matplotlib uses to insert our plot 9 figure. Seaborn also is built on top of matplotlib, so you'll use the same type of output component for that as well. For many of you, I know in SciPy, there's always a workshop around iPy widgets and having all of those components, because to have those interactive visualizations in your notebook, there is ways to input your iPy widgets as well. It is a separate component.

So what we really do is we go on that page, the components gallery, we scroll down to the output section, and then we just find, like, my paper that I drew has this thing. This thing is a plot. Let me go find a component that renders a plot and click on it, and then you'll notice that what we have is a render plot decorator. So what this looks like is in Shiny, any piece of input or output that sort of reacts to one another in the application, you wrap it around a function. So we create a normal Python function. So a lot of Shiny apps really does look like a file with a bunch of Python functions. It's different from a module if you've worked with Python and imported other stuff before. You don't really import the... It's not like you're saving all of these functions into a file and you're importing them. It is Shiny's way of encapsulating, like, here's the code I need to run to render this plot, and that's why it's important to have everything as a function. So at the end of the day, all your output types will be wrapped around a function.

Typically it will match the ID of the input that you're trying to work with. So I'm writing, like, a species plot. I might just call this species. You can also name it as, like, plot underscore species. So in your code, what the inputs are in this plot are all together. The other thing that is important is make sure you return your plot. So a lot of times, especially if you're working with matplotlib or in Seaborn, you'll create an axes figure, like, the AX figure, and sometimes if you're used to notebooks, it's like... Once the AX figure is the last thing in the notebook, it just renders on its own and you never really have to work with the AX object, but do make sure that if you create an axes or a figure object, return the actual object. That's really important. Otherwise Shiny has no idea what you're doing. You created a thing, you just never gave it to me. So another important thing, make sure you return your output.

The other component, and this is the thing that's going to be very weird, new to people who... For me who solely works in data science, I don't really use decorator functions a lot, but they're used all over the place in Shiny. You don't have to know too much about what decorators are, other than it's the at symbol before a function. And the short answer between what a decorator is, is there's a special function called render.plot. It takes in a function that generates the plot, and then returns another thing that can run that Shiny understands, like, okay, I'm going to render this plot. I need to create the HTML for this plot and shove it into the website. You'll notice that you don't have to write any HTML or CSS or JavaScript to make that happen. That's all handled by this little magical decorator at the top. You also don't really need to write your own custom decorators as far as if you're taking this workshop. You don't need to worry about that either. Just know that in order for you to place a figure somewhere, you need to find the correct rendering output object. In our case, it's render.plot, and then here's the code that renders that plot.

Connecting inputs and outputs with reactivity

So we want to go and add in the figure. Now here's the magical part of what really makes Shiny different from all the other plotting libraries. Right now, this kind of, I would say like, it looks very Streamlit-like, if you've used Streamlit before, it's kind of like Dash but without the two components to it that you have to keep track of. But each one of those other tools do execute their code slightly differently. Shiny uses this idea of reactivity, and we'll talk more about reactivity in the second hour. But reactivity is, again, like the secret sauce of Shiny, it's what makes it unique, it's what makes, it's how my brain works now when I'm thinking about dashboarding.

The way this works is we need the ability to say you have all of these input components, it has an ID called species, and I want to capture whatever that input component returns. So if it's a radio button, it would make sense that the radio button returns you a string, and that's actually what happens. If it's a slider, you would hope that it returns you an integer, and that's what happens. So there's some logical sense to this thing. But what we want to do is, hey, we have all of these inputs, I want to get that information out of a particular input, how do we do that? In Shiny, you have the input parameter available to you. So in Shiny Express, you'll import the input object. And this input object actually keeps track of all the inputs you have on your application. That's why that ID is so important. And the way you say, hey, I have this radio button, its ID is species, and I want to get the value of what's selected, is you say input.species. This species is literally the same text from the ID. And then you wrap it in, you pass in, you call it like a function. And what that's doing is it's finding the input and calling it, and then getting the value out. And then when the user makes changes, it's going to rerun this particular function, and then get that new updated value. And that new updated value plays a key part in how our figure gets rendered. Specifically, we are using that input to change the subset of our data set.

So what I like to do, especially when you're starting out, is, yes, save the variable of the input that you're using, called species, and then passing it back in right here. So before, species was equal to Gintu or Addadale, like the hard-coded string. Now that we created those radio buttons, I just need to say, hey, just replace that with input.whatever the ID is. Don't forget the round parentheses. So it knows to get the actual value. And then the rest of your code doesn't change. So that's why it's really important to get that first step right. Get your code working, refactor it a little bit so all of your inputs become separate variables. And then when you get to this point, it's input.species. And then now you have everything you need to connect those two parts together.

This is my favorite part of working with Shiny. If you've worked with any other dashboarding library before, this mechanic that I just showed you is handled differently, depending on what framework that you're using. But this is how Shiny does it. There's an input object, and it keeps track of all your inputs.

So let's go and run that code in our little dashboard. So we still have the DAT. And again, all I'm doing is essentially, I'm pretty sure I can live code this, just to prove it to you. I'm putting my code in here. Why am I putting this code in here? It's because I know that the subset's going to change when the input changes. So it makes sense to call this code, this plotting code, that subsetting code, every time the plot regenerates. So I'm moving that subsetting code into the plotting function. And then up here, we now need to say input. And then over here, all I need to say is input.species. Why is it species? It's because this radio button and this ID is called species, and that's how it ties those two bits together. My app still hasn't crashed. And now I get a little reactive application. You can kind of see the blue kind of changing as I'm moving. And that, again, when I make a change to this radio button, and this is the reactivity part, it knows that it needs to rerun this plotting function because I've tied the input to this function.

Like, we'll talk more about how reactivity works, but it kind of just figures that out for you. So if you've taken... If you've worked with JavaScript before, there is no concept of a callback here. You don't have to manually keep track of all the inputs to generate this function. You just go and use the thing. And Shiny is really good at just figuring out what it needs to... What information it needs to get for that interactivity. And so this line right here, as magical, I find it magical, as it is. That is the core component of Shiny and reactivity and what makes it really nice. So we get that string from the input. We use that same species. The rest of the code doesn't really change. And now we have our first little application.

And so this line right here, as magical, I find it magical, as it is. That is the core component of Shiny and reactivity and what makes it really nice.

All right. I believe that is our first break time. If you haven't, and you want to follow along, this is the actual application that we just created. If you want, stand up, walk around, stretch out your legs, or you can go and get this application working. At the very least, get the code on your computer and try to actually run it on your computer. And we have a break until 2.30. But take about five minutes to see if that works, and that way everyone at least is able to stand up and give this a try.

Ow! I am making it worse. Okay. Hold on. There we go.

So like I said, if you have any questions about anything that I'm talking about, you can raise your hand. If you have a really specific question about internal implementation, or a question you want us to address at the last 30 minutes of today's workshop, there's a Slido, type in your question, and then we'll try to answer them all at the end.

And I'll also try to take a... I'll also try to take notes during the Q&A section, and then I'll upload all of those questions and answers to that website. So we're all keeping track of links and answers to all of that as well.

But yes, give this a try, or come back at 2.30, and then we'll go over the solution. I'll show you running this again, and then we'll go over to the next section.

Running Shiny apps and the reload flag

Because one of the questions is actually relevant to getting stuff running. If you're using VS Code or Positron, you can install the Shiny extension, and you'll get a little play button.

Otherwise, shiny run dash dash reload, and then point it to the application file. That will go and run your application. When you run things this way, it really doesn't matter what your app... What the naming of that application is. You can call it penguins.py, and it will know to run it like a Shiny app. The app prefix is really just making this button work.

So there's quality of life things that you get, as long as you follow the rules. Otherwise, if you can type the command, as long as you know what you're typing, you can totally do that. Reload is what is giving me the ability of... Every time I save the file, the app kind of reloads as I go.

Q&A: internet requirements and Shiny Live

There's actually some questions I know how to answer, so we'll talk about some of them. This is recorded, right? So we have proof that I can... Not everything needs to be punted to the Shiny team.

The short answer to this last one is my computer doesn't have internet right now, so I think that's proof that you don't need internet to run this stuff.

Okay, so it is bottom of the hour and yes, the short answer is if you're trying to work in an environment that's like actually air -gapped, you still need internet to like install Shiny. That's something you still need the internet for, but no, having a app file and using Shiny run to run that app file, you do not need the internet for that to work.

There are certain things that might need the internet, you know, a common thing that I run into all the time is like math jacks for like late tech and like formula rendering, like that just needs the internet to run for some reason or the other. So but at the end of the day, you don't really need a live internet connection to like click the play button or run Shiny live, Shiny run, yep.

So the question is like for certain CDN things, can you download the actual JavaScript bundle and then just point it and just include it in the HTML? The answer is yes, you can do that. That's going to be a Shiny team for like show you the code to do that, but at the end of the day, all Shiny and all of these tools are doing is letting you write Python and it writes the HTML for you.

So essentially you're, there's always a mechanism that says, what is it like in the header, you can have like an include or a link tag. You put that in there with that link and it's just a link to the actual file you just downloaded. So you can totally do that. Yes, things that might download on the fly, you can totally just hard download that.

In Shiny there's, in Shiny UI, there's a, it's like shiny.ui.tags and then you have access to the Python function version of all your HTML tags. So if you know how to write like raw HTML, you call that particular function and just write your thing. And then when it creates the HTML for the actual application, it's essentially copy and paste thing. You're just, you know, it's, I wouldn't be surprised if it's an F string that just dumps whatever you put in there, in there, right? So totally can do that.

So I think that's answered. One question is Shiny Live. Shiny Live is a really cool service. It is not like a gist.

So Shiny Live is really built on top of the last couple, last like few years, the last couple of years around just web assembly, which is run Python in the browser without having someone installed the thing.

So Shiny Live is a tool or it's a website that allows you to, here's a Shiny app. It's clearly running Python, but it's in the browser. I didn't need to install anything. So Shiny Live is this service that allows you to put in a Shiny app here and then give you Shiny app on the right-hand side without having anything installed.

There are a few limitations. The syntax is a little bit weird, but yes, you can have a requirements.txt file. So certain things can be installed. It's all built on top of PyIodide. So if it's installable by MicroPython, I believe, then you can install it in here. But not everything is pip installable in Shiny Live, which is really on top of PyIodide, which is part of WebAssembly. But a lot of the PyData stack is already available to you.

So Pandas, Seaborn, Matplotlib, scikit-learn, I think, is all part of just PyIodide. It's not even Shiny Live specific. It's just what you get as being able to use Python in the WebAssembly ecosystem. That is just available to you.

What's really useful about Shiny Live is sometimes I'm actually debugging apps in here. You can do that. What's also really nice is you can click on the little share button, and I can give you this URL, and you will have the code and the application. It's really nice if you're trying to prototype something and you don't want the other end user to install all of this Python stuff. You can do that.

The short answer is all the code is in the URL. So the Shiny Live code ends up being super, super long. I don't know how it's encoded, but it's not the actual raw string of your code. So it's saved with a little bit of space. But I can copy this URL, open a private browser, paste it, and it will give me this application.

So the question was, is it like a gist? It kind of is like a gist, except you don't see the code until you run the actual link. This is another tool that's available to you in debugging or sharing Shiny applications. This is really cool for prototyping certain things where you're like, I want to prototype this, and I want to show off a prototype without having the end user install anything. This is totally free to use. And you have the code kind of bundled with it as well.

The short answer is all the code is in the URL. So the Shiny Live code ends up being super, super long. I don't know how it's encoded, but it's not the actual raw string of your code. So it's saved with a little bit of space. But I can copy this URL, open a private browser, paste it, and it will give me this application.

It's running on my laptop. Clearly the internet connection is working. But yes, that's also... But it's not... The code running on my laptop is not a feature of Shiny Live. It is a feature of WebAssembly itself. That's why you don't have to install anything. The whole Python bundle and all of that is in your browser right now. So I am pretty sure if I look at the network tab, it's like a pretty big bundle that sort of just got loaded on my computer.

The website, yes, is hosted somewhere. Nothing you type in here is what we see on the other side. Unless the site itself has logging enabled. But if you put in a dataset in here, you're not supposed to, yeah, technically it's on your local laptop, but probably wouldn't want to do that to begin with.

You can sort of upload stuff in here. It uploads onto your side of the browser. That's all. That's what's happening.

It's a really cool tool. There aren't as many examples on Shiny Live, like that initial page. But nothing stops you from copying and pasting this code and pasting it here. And I believe this works. I might have to set up a requirements TXT because of Palmer Penguins. But I think it's working.

So this is on the web browser. It's not really running on... It's running on my computer in the web browser. But it doesn't require me to have Python and Shiny and all of that installed. So this is, again, a really cool way of prototyping things or showing off other features of things in Shiny.

And again, this is all built on top of the last couple of years, the whole WebAssembly toolkit. And the Shiny team sort of just leveraged that to make sharing Shiny apps a lot easier. And again, I can click on share and I can pick one of these links and give you the link and then you'll have this version of this application you can embed in a slide deck or just show or send it off to Slack. You can totally do that. We actually do that very often when we're trying to debug stuff. But yeah. That's Shiny Live. And it's a really cool service.

Express vs. core mode

And one of the other questions is... This is the one that I wanted to talk about before we go into the next session. There is two different ways you can write Shiny code. One is called core and the other is called express. I am today showing you express mode, which is a lot less code. It looks a little bit more like Streamlit, less like Dash.

The core mode is the more traditional route where you have a variable that defines all of the inputs in a separate section that defines all of the reactive calculations and they're two separate things. In general, in my day to day, I haven't really found the big limit of express yet. With Shiny core, you do have potentially more flexibility to refactor things. Again, personally, I haven't really hit limitations with express building single page dashboards... Yet.

Even in scenarios where I'm going to go through this exercise doing something in express, and then it's like... Oh, it works with all of this LLM stuff with multi-pages. And it's like... So it does get you really far. The other part of this question is... Can you mix the two together? The answer to that is no. You do have to make an initial decision of express mode or core mode.

It's in some sense also similar, I would say, to Plotly. I believe 90% of you, when you use Plotly, you're using Plotly.express as PX and you're just plotting Plotly code that way. Very rarely are you using Plotly, the core version. Even I've rarely used the Plotly core version for certain things.

There are two modes available. Part of it is also historic. Shiny was available as a dashboarding framework in the R community, and they just ported it all to Python one-to-one. So that's why something like core mode exists. And then to reduce the boilerplate and code you have to type in Shiny, they created express mode.

I'll leave that question up so the engineering team can also get a crack on it. And yes, it does support Matplotlib and Seaborn and Plotly and Altair. Not just Plot9 and ggplot. The answer to that is yes. I'm just showing it because that's how my brain works. My brain does not work in Matplotlib plotting.

Reactivity in Shiny

And I wanted to talk about the core thing that makes Shiny special, which is reactivity. There's a couple of links to other presentations that have also talked about reactivity. Shiny for Python has been around for a while. So yes, all of those talks are in 2023. They're all still very, very relevant to this particular topic on reactivity.

I literally actually just took the diagrams you're about to see from Gordon Shotwell's talk. But we'll talk about how reactivity works. And for any of you who have worked with any other plotting framework before, this is the part that will be very different with how Shiny goes and figures out how things are connected to one another and how things need to be executed or reexecuted.

So again, reactivity is the key word when you're talking about Shiny. It is essentially how it all works. And essentially it is a mechanism that allows you to connect the output to an input and Shiny goes and figures out how those things are linked to one another.

The benefit of reactivity is you don't have to write a callback. So in Dash, all of your output components, you have to say, like, I need to track what the input is so that any time the input changes, it knows that that's what the output is. Writing callbacks, if you've written JavaScript apps before, one of the pitfalls is it's a lot of manual accounting, which means that you could put an input somewhere and if you never explicitly write the callback, those changes you're making aren't updating the application.

One of the things that makes Shiny different is you don't have to write a callback. The way it handles that process is through reactivity. And I think it's pretty intuitive. You say input, the name of the ID, parentheses, maybe the parentheses part is a little bit weird, but that's how you link the reactive parts together.

So how does this work? Let's say we have a fairly complicated application. All of these rectangles are inputs. We haven't talked about reactive calculations yet, but you'll see it in the next app example that we're building. And then we have two outputs. One is a model score and the other is some API response.

So this is the application. When you run Shiny, it finds and realizes, hey, you have a whole bunch of stuff in here and it just knows about them. When the application loads up, it's going to be, hey, what needs to be displayed on the page? I need to show you whatever this model score is.

Maybe it's a piece of text that gives you the R squared number. Let's just use that as the example here. It's a piece of text on the screen and it's like, okay, you told me to show you this piece of text. That's one of those output components that I talked about. In our previous example, it was a plot. It could be a data frame table. Pretend that this is a piece of a rendered text element.

So the way this works is it goes and runs that output function. In the output function, you saw that it's like input dot something parenthesis. When it runs that, it goes, oh, you told me to look up something called filtered. Filtered is the name of that ID. Filtered is something that is in this case also a function. When it goes and calculates, it's going to be like, oh, you told me to go look up this input called account and now it knows to look at the account input to see what gets filtered.

The filtered also depends on a larger data set. So it's like, okay, I need the sample and the account. But sample, when it runs its code, it's going to see another input call in some sense. And then it's going to look up and it's like, okay, in order to get sample, I need dates and sample size. And so the arrows are drawing in a downward fashion, but it's really looking in reverse in how things are calculated.

So that's how it goes and generates the model score. Now it knows how everything is connected. It's now going to run everything and then you get your R squared number on your website. And then it goes and says, oh, wait, you need this other thing displayed on this page, this API response. What does it need? It runs that function. Finds that it needs filtered. Filter's already been calculated. It's already there. Essentially it's already been cached. So giving you the API response is really fast. It doesn't need to go and work its way all the way back and recalculate everything. It's already there. So let me just feed it to you.

So reactivity, one of the other things you get from it is it tries to do the most efficient computes. Like what needs to get computed or recomputed. It tries to figure that out in the most efficient way. So that's how reactivity works.

So if we go through another example of let's say the user changes the account. Needs a radio button, a dropdown, piece of text. When you go and change one of the inputs, because this diagram, like what you see visually, has already been created and being tracked of in Shiny, it's going to say, okay, everything downstream of this change is now invalidated. So it will go and invalidate everything downstream. And then you go back to the same process as before. You asked me to show you a piece of text. I'm going to run that function. It depends on filtered. Filtered needs to get recalculated. The account number changed. So that's how you get a new filtered data set. But the sample didn't change, because those inputs didn't change. So we can still use the old cache value of sample. So we don't have to recompute that, right?

You can take this idea to the extreme of like if each one of these steps took 10 seconds, changing account only caused a 10 second change instead of a 30 second change. And then because filtered change, it invalidated the API response, you also told me to show you the API response on the site. It's going to go and be like, okay, filter's already been calculated. Let me just give you that cache result again.

So now if we go through the example, like what happens if sample size change? So it goes through the same process. Everything downstream now gets invalidated. So now everything is blank. And then you're back to your initial state. And it says, what needs to be drawn on the screen? And you go back through that very first example again, right?

So just by saying input.id parentheses, that's all it takes from you as the end user to keep track of this graph. You don't need to, again, write a callback, because all of these arrows essentially is what you have to do, depending on the framework you're working with, like when you hear callback, you have to keep track of all of this. You have to write it in your actual code. In Shiny, I have this thing, I need it right now, and you just let the graph figure itself out.

In Shiny, I have this thing, I need it right now, and you just let the graph figure itself out.

So that's the really cool thing about it. You never really have to worry about, oh, I never connected dates properly. As long as you're using the date input, it's up to you, if you mess that up, like you never used it and you're still using the hard-coded date, which I've done in the past, yes, that is a mistake. But you don't have to keep track of, like, let's say you're thinking dates, this input was being used in like five other places. You don't have to keep track of that anywhere. You just go and use the thing, and the reactive framework will go and calculate that for you. And it will go and figure out what is the minimum amount of extra computations or executions it needs.

So that's how Shiny works. By magic of input.paren, that's kind of what you need. Which is kind of cool.

Building a more complex dashboard

So let's go and for the next... Might be over time, now that I'm looking at my own schedule. Let's work on a more complicated application. Let's build an actual dashboard, and we'll use some of the techniques that I just talked about in the first example to go and build this dashboard.

So we will go and build something that looks like this. Anybody have any ideas of what do they see? I'll take any volunteer. What do you see? You can call out an input, you can call out an output, you can call out a visual component, if you want. Slider. Yes, yes.

I'm also a university teacher, I'm okay with just standing here until someone says something. But yes. You have a slider. That is on the left-hand side. One of the other things that's really hard to tell, also the fact that this is a static image, is there's a little arrow here. And so these inputs are actually on a collapsible sidebar as well.

There's another part in that dropdown on the site that said components. There's another page that says layouts and layouts. A sidebar is a particular type of layout. So you can get the boilerplate code on how to make this little sidebar thing go away. Each one of these things are cards or value boxes, so we can have really important metrics sort of take up a little bit more space, have a nice little bounding box around it to give it a little bit more visual distinction.

And we can put information in here, or actual icons. So these are really good for those big top-level metrics that you're interested in. In this particular example, all of these are going to be different types of output. This is going to be dependent on a render text output, because when the sliders change, I expect the numbers to change. So it's going to be render text, compared to before, which is render plot.

So this right here, this was also another question. This is render data frame. And the other question is, yes, I believe Shiny works with NOR walls, which means like you can put in any data frame-like object, and Shiny will be able to render it in here. Your Python code itself can handle Polar's code, for example, and Shiny will do the Polar's thing. And if you need to display the data frame or the data, it's right here. Or if you just need the output from Polar's, then you have a little calculation box there.

We'll work with Plotly. So here's a little Plotly figure. So instead of a static figure, this is a more interactive figure. This is using the IPyWidget system to get this into Shiny. So all of those other features with maps or interactive plots, you can do that with Shiny. And then down here is a visualization that is built on top of Plotly called RidgePlot. So you can have some other plotting library object down here as well.

The trick is, you may need to know where this thing is coming from. So looking at the RidgePlot documentation, I think they said something like, yeah, it's a Plotly thing. So OK, we know it's a Plotly thing. Once you know it's a Plotly thing, you know it's an IPyWidget thing, and then OK, at render IPyWidget, and essentially that's what's going to happen.

So let's talk about all the different parts on this page. You have actually a title, which I also totally forgot. But you have a title. Here's the little component that you can use. There's a title parameter to set it.

We have a sidebar. So I'll show you how we can put together a sidebar. At the very top, we had three cards or three value boxes, and it took up the entire width of the dashboard.

So we can put in these different value boxes. We can put in placeholder text at our initial go. And then we can put them in a little bounding box, which is what's given to you when you use a value box.

We also have a full-width column underneath that has two cards, one for a data frame, one for the scatter plot, and then the last thing is we have a whole thing at the bottom. So those are the five big components on this particular application, and we'll go and draft this application in the next hour.

Planning the layout

If you have access to the slide, here are all of the documentation pages for all of the components that I just talked about. But what I want to do is give you about five minutes to work on to work on this particular app.

Again, when in the first hour I said, like, get a piece of paper out and draw out all of the components, you almost do the same thing when you're building a Shiny application as well. You'll notice that, like, my very first step is to identify, like, translate the thing I drew on the paper to just, like, have them all as placeholders in the application.

So I want everyone to at least give a try one of them. Even if you don't get the sidebar thing working and you just have a card, just put that into an application and run it. Or just get three of them and we can talk about the layout parts. But that's your main goal, is to get all of the big pieces laid out onto your application.

Now you have, like, one step closer in terms of, like, if I only had this much space on the screen right now and I'm planning to put a map there, a map is not going to fit well in here, right? So you now have your code that you are running with all of the outputs and all your figures and tables and everything you want to show.

On a separate step, lay out your application just like you did on your piece of paper. And you'll get a good sense of, like, is this enough space? Is this too little space? Do I need to move stuff around? Typically maps need a lot of space.

And my students also find, like, they realize that, like, yes, if I made this a map and it's only this much in height, that's really not enough space for a map to feel like a substantial way to interact with it.

And my students also find, like, they realize that, like, yes, if I made this a map and it's only this much in height, that's really not enough space for a map to feel like a substantial way to interact with it.

And again, because of the screen limit I have, this is also an application that scrolls up and down. And then we have our little sidebar component right here where we have a placeholder for all of our inputs.

So this is one of the things that I always do, which is lay everything out first and then go and see if that's enough space for what you need. So you don't end up working with dashboarding code when you're just copy and pasting giant blocks.

But take five minutes and look at, like, the UI card and just put in three UI cards somewhere and we'll talk more about the sidebar. Or look up the sidebar and then see if you can get a placeholder on the left-hand side and then three UI cards on the right.

And then I will, in the meantime, give everyone the code in our Slack group on, like, the actual solution. Just so you have this as a template. Because we're just going to be building on this application moving forward.

Reviewing the documentation site

Hopefully at the very... If you're trying to follow along, copy and paste what you have in Slack, try to run Shiny run on your end, and then I'm going to do the same thing as well.

I want to talk a little bit about the documentation site, since I think it's working now. Everyone's downloaded all the LLM models that they need. And at the very top, under components, like, I think in the next... If you're watching the recording, things might be moving around in the next couple of months, or by the time the recording's up, but the components page is what we just talked about.

You have all the inputs. If you scroll down, here are all the different outputs. Two different ways to visualize a data frame. If you just need, like, a random image, you can do that. You'll see that... Yes, matplotlib. We separate matplotlib and Seaborn, because they're such... The libraries are so popular that we're just gonna be super explicit, like, this is what you're using.

You can totally do that. Plotly, as a placeholder for all of the IPy widgets, same thing with mapping. Arbitrary text.

You can also write code that also returns another UI component, so that's... If you're trying to have applications that react to other parts of the application, and like, the tools and things change, that's also something you can do with Shiny. We're not gonna talk too much about it today, but that is also a thing that you can do.

Value boxes is what we're going to be using for those cards at the top, and then you also have the ability to just say, like, hey, just echo back whatever text I give you. So that's the output components portion.

The other thing with Shiny is you can have... We'll talk about chat stuff later. But you can have, like, pop-up messages, like, if something is...

folks where, you know, if any of you have to put reports together for like your boss or something, I've heard people say like, yeah, if my boss doesn't get the report in like Times New Roman, like he physically cannot like get past that and that's like the only thing that you have. Like they can't read anything else.

Sometimes I work with other people and it's like if the spacing is not right, like they're so fixated on that spacing, that like they can't oversee it to like look at anything else. So yes, the visual aspect of a lot of dashboards and reports like really do matter. And we try to make it a little bit easier for you with all of these little components.

The other thing I want to show you is on a reference tab. We're talking about Shiny Express. So you have it here. You can also look at Shiny Core. They pretty much match one-to-one. But this is sort of one of the things that I think, you know, the positive folks picked up from the R world is we sort of like namescape things, namespace things within the module. So all your UI components are also right here.

So if you need a quick way to figure out like what all my input components are, look for UI dot input underscore. And this is everything that you have available to you. We're working with a checkbox. Not a checkbox. We are working with radio buttons. So you can also click on the radio buttons. This is the actual documentation site of the radio button instead of the component prettier with examples view. So this is always a lot more in-depth.

You also have all of these output components. Render dot something. All of these layout components, you'll see a couple of them in the next couple of minutes. And then we'll talk about chat interfaces in the next hour. But like this documentation site really does have everything that you need.

And I think even though there's a lot of parts, it's sectioned off pretty well in terms of you can generally find what you need. And I do this with the pandas documentation all the time where I open the API page and I just scroll and I look for stuff. And sometimes things are like, oh, I never learned about that before. And you kind of click in. This is slowly how you learn a little bit more and more about a particular tool.

Building the dashboard app

All right, so let's talk about this particular app.

I'm going to create a new app. We'll call this app dashboard.py. We'll paste it. I'll save it. I'll close this application and I'll run it.

All right, cool. So you can see a little bit of the wrapping that is going on. The cards are sort of like stacked on top of one another instead of looking at them like you do in that screenshot. So if I wanted to, I can look at the the application in my web browser and you'll see that like, yes, it actually doesn't just look like what I have in VS Code. So that's also something that's nice.

Okay, so let's talk about this code line by line. Right now, we're going to still import Shiny Express. You'll notice that like, yes, my ID is telling me I'm not using input anything, everything's a UI part right now, and that's that's okay. Again, that is my tip to everyone is just work out the UI when you're first putting together a dashboard. Separate if you're already like given a dashboard, but if you're building one, just start with the UI, have your little notebook to the side of everything you want.

The first thing we're going to do is we're going to set that title, right? Remember the five things that we have in this application. First thing's the title. So we're going to set the page option to restaurant tipping. And then I also wanted to just take up as much horizontal space, just fill the page for the dashboard so I can say fillable equals true, and I'll be smart enough to be like, okay, you don't really need like margins on the side. I'll make my application auto grow to the bounding space.

So here's the first part that we haven't seen before. In the first example, we just said, here's the input, put it right here, here's the output, dump it right here. In Shiny Express, this is different from in Shiny core since someone asked about it. Instead of having separate, a separate area or a variable that is controlling each individual input and a separate area of your code base that's controlling all of the output, you're putting the inputs and out, I'm sorry, the layout, the layout UI components, instead of putting all the layout components in separate areas, we are using the width context manager to specify this area has this particular layout component to it.

So the actual function is UI.sidebar, but we're going to say, hey, we're going to create a sidebar, and this is what I want you to put into it. So I'm going to say sidebar components to text, and so my sidebar has sidebar inputs to text. Everything else, so outside of that, is now going to be the body of the application. But if we look at our app example, our body, that first one, if you tried to put in multiple cards in the example when I said try it on your own, you probably got three cards that just took up the entire width, like one on top of the other instead of one next to the other.

So there's another type of layout called layout underscore columns because we want to lay something out in a few columns. So we're going to use another context manager for that, so there's another width statement, and in each of those columns, we have three value boxes.

So we have a value box. Value boxes are a special entity in Shiny because if we look at a value box or if you look at the original dashboard, there's a label here, there's a value here, and then there's also an image off to the side. So a value box is a special type of UI element that sort of accentuates, like, here's a thing that you can visualize.

So value boxes technically come with their own context manager. Not every single UI element will have its own context manager like a value box. But again, that is also why I have said come to the documentation site and you will, in general, have the code to get you started with that particular application.

And again, we talked about Shiny Live. This is running in Shiny Live. You can literally copy and paste this into Shiny Live and then run it in there. So again, Shiny Live could also be another area where you're prototyping code, and then when you get it working, like, cut and paste that code back into your application as well.

The Bootstrap 12-grid system

Then we have our second row of cards, so it's another set of columns. And here's another scenario that we can use. I'm going to slowly introduce web technologies to you. I'm not an expert with all of the web technologies, but Shiny is built on top of Bootstrap, and Bootstrap has this 12-grid system, which is essentially saying every single website that uses Bootstrap's grid system is the width of the site, or that part, is divided up into 12 boxes.

And you specify the width by saying how many of those 12 boxes you want it to take up. And you can subset it. So if I'm saying I'm only using half the site, it is a width of 6, but within that 6, there's another 12. So you can always subdivide in, but the magic number is 12.

I like to believe that they picked 12 because it has a lot of factors in there, and I believe it was like, if we followed the Mesopotamians, we would be on a base-12 system. If you have 12 segments on your fingers, you can actually count by 12s. But I'm assuming that's why they picked 12. It's actually a very nice round number, because of how many ways you can break it up.

So here I'm saying, hey, I want a column, and I want two things in this column. So I'm saying 6 and 6. That's why 6 plus 6 equals 12. So if you see numbers like that in Shiny, it's typically referring to that 12-grid system. And really, where that's all coming from is Bootstrap.

So you might see it in Shiny, yes. You might see it in other libraries that deal with anything web-related when you're just working with spacing. The benefit of this is it helps reflow things when you have to go from screen to smaller screen to phone. Everything's always 12 things wide. You don't have to specify 300 pixels, because now 300 pixels looks huge on a phone and tiny on a computer screen. So that's why people try to use some other metric than putting in hardcode widths.

The benefit of this is it helps reflow things when you have to go from screen to smaller screen to phone. Everything's always 12 things wide. You don't have to specify 300 pixels, because now 300 pixels looks huge on a phone and tiny on a computer screen.

So in here, we have a few cards. And in the card, I'm going to say, I want a placeholder for my dataset. I want a placeholder for my scatterplot. And then we're going to have another column at the bottom that says, give me that ridge plot at the bottom.

And that's it. That is my first step. I will typically go in here and visually see, yeah, I know the data frame's going to be a little bigger. Is that okay? You just get a general sense of making progress with your dashboard and then slowly, oh, I actually have figures to work on. So if my boss is asking me about stuff, I can at least give them the figure and then tell them, hey, just imagine this in this section here. So that's also my tip and recommendation to a lot of people when you're building up an application like that.

Adding inputs and data

Okay. The next part is I'm going to give you all the code, but the short answer is let's go and put in those input components, right? So really similar to what we just did in that very first application.

This is a slider with two sides. A two-sided slider is really a slider with one side with, like, you pass in a list instead of a single value and it knows to make a two-sided slider. I believe that's how that works. Here, instead of a radio button, we're using a checkbox group, and it's because maybe you want the ability to select multiple. Could I have made this a selectize that selects multiple? I totally could.

My general rule is if you have, like, three things, you can probably show it all on the screen, because a drop-down with three is kind of weird sometimes. Again, depending on the screen real estate that you have.

And at the very bottom, we're not going to talk too much about the reset button yet, but you can also put in an actual action button that does other stuff, like reset these values if we need.

So I'm going to give everyone the code for this, and then I'm going to run it on my computer. And again, the code is in the repository. If you navigate to the actual slide deck, it's actually in the file for the slide deck.

So all I did was copy-paste. I put in a few more values, like placeholder stuff in here, and then I add in a couple of placeholder bits of text for those cards on the bottom. And then the main takeaway is, yes, all I did was in that sidebar, we're using input slider instead of input checkbox.

This is, you know, kind of how you define a slider. There's always an ID. That's the thing to keep in mind. There's always an ID, and then minimum, maximum value. That is going to be dependent on what type of component you're using. I'm using a checkbox group, and so there's, again, always an ID, and then everything else is going to be dependent on the object that you're using.

And then right here, it's still UI input something, but I have an action button, and I'll call it action button, and it has a title called reset filter. So I might actually call this, like, reset button if I wanted to be, if I know that's the only thing in there. But the documentation is called an action button.

And then everything else was left the same for our particular dashboard. So I'll give everyone the code for this in Slack. And then we can go on to slowly building up the next section.

So the next thing is let's go and add some data. Here's some code that will take in two placeholders that represent the index, that we read from the index, and then we, at the end of the day, get this filter data set at the bottom. So this is regular pandas code that we're going to say, hey, I know I'm going to have a placeholder for the lower bound of that slider, the upper bound of that slider, and then what is selected from that checkbox group.

So I'm doing the same technique that I showed before, which is refactor some of your code such that you try to get it into variable to input, and then write the code down here that will go and subset this.

There's many different techniques on if you have multiple ways to filter, how do you chain filter something? I believe this is what ChatGPT or Claude gave me. I was like, wait, that's actually way better than what I was going to do. So I ended up sticking with it. So you created one index of true false value, you create another one, and then you just intersect them all together and you filter it. I was like, that is definitely way better than what I was going to come up with.

So that is our code for filtering our data. The other thing, when you're slowly building up your notebook that accompanies your dashboard, actually go and compute the things that you want to show up in there. So I have the total number of tippers. What is the code that generates that? Well, it's the shape and then get the first value that represents the number of rows.

The average tip, I'm going to take whatever my filter data set is and divide it by tip by total bill. And then maybe I do some fancy string formatting to get a percentage because that's what I want as a rendered text output. Same thing with average bill. I'm going to write my regular pandas, polars, whatever code to calculate the average. And then maybe if I need some string formatting, like I need two decimal points. Yes, just write the code in your little notebook. And now this code is ready to be put into the dashboard. It's just not in the dashboard yet.

Same thing with the scatterplot. Here's my Plotly express plot. And there you go. Scatterplot. Great.

The ridge plot itself is a little bit more complicated. This is not a plotting class. Ridge plot itself, you can use other libraries for this as well. But here's just the plot for a ridge plot. And I didn't even bother trying to fit it on the page or anything. But you can... It's a ridge plot.

But again, the main takeaway is you'll see that I'm setting the orientation. I'm setting where the... What is this called? The legend. Where the legend shows up. Do a lot of that configuring before you dump things into the dashboard. Such that this just becomes a cut and paste when you get to that point.

So we will talk about this part. I'll take a little break. I think it's okay that we are sort of running on time. Because the next part is about LLMs. And that's mostly just me talking at you.

But the next thing is... We'll break here. But I'll talk about the sort of problem. And then we'll address the problem and solve it. Is we'll use the same techniques to get all of those values into our value box.

So we have our little slider. So this is all of our UI components. And then once we get past the UI, we have the body of our application. We have our layout. And then we have our value box. We have our title called total tippers. And then in here, we're gonna say, like, hey, I actually want you to calculate something. To show up in that value box, right? So that's why that slide that I had before, that just pre-calculated all the values, is really important.

And so if I look at the code for the first two value boxes right here, you might notice that, yes, I am subsetting the data. Even syntax highlights.

I'm subsetting the data set multiple times, right? That means that each one of these cards, when it's showing that display, is rereading that input from the input and doing its own subset and calculation that way, right?

So I'll leave this code right here just so we can have time for a break. And then we'll talk about how to address this problem.

We saw how we addressed this problem in that reactivity graph, where I said, like, hey, there's this intermediate sample that's being calculated. So we'll show you how we actually end up doing that in our code base. But if you didn't know how, like, that was a feature, this is kind of, like, what you can do. And you can see the app still works. It's just not as efficient.

And so you can always make something that works and slowly work and refactor things to make things more efficient. So if this is the first way you write your application, totally fine. You can always go back and, like, replace it with a variable.

So we'll take a slightly shorter break to make up on time. But come back at the bottom of the hour at 3.15 local time. And then we'll keep going with this example and just build out this entire application.

Resuming after the break

All right. Bottom of the hour. And we'll talk about it. So we left off talking about, yes, each of those cards or those value boxes. Not a card. Each of those value boxes, when we were asking it to show the whatever value you're interested in showing, it was doing this get the value for the slider, get the value from the checkbox, go and filter the dataset and then go and do your calculation. That was happening three times in our code.

If you think about the little reactive graph, what's happening is you have three outputs that you're asking the model to calculate, and it's pulling in from, like, each one of those outputs is pulling in from the input. So it looks even visually as a graph, it looks really messy. It would be nice if we said, hey, if we took this tips filtered and we just saved it into a data frame and then used that filter data frame in all of the other downstream outputs, that's one way where, one, we can definitely reduce the amount of copy and pasting in our code.

And then, two, if you think about what I showed you in that reactive graph part, it doesn't have to go and refilter your dataset three different times for each of those three value boxes.

Introducing reactive.calc

So the way we handle that is in Shiny there is another thing, this is another new thing that we're going to talk about, is you have this reactive calculation object that you can create. So you have inputs, you have outputs, and then if there's, like, some type of thing in the middle you kind of just want to save as, like, an intermediate value, you can use reactive.calc as a reactive calculation.

One of the things is reactive is a part of, like, the core of Shiny, so you do have to say from Shiny import reactive. That was a very conscious decision by the engineers that, like, actually tell you it is the same reactive framework in Shiny core or Express or just Shiny in general. So the import, we have to add another thing through the import.

But all we do is we create another object called reactive.calc and then this object, because it's not an input, it's not an output, it's, like, this thing in the middle, you'll notice that, like, I don't, like, this thing, this part here as an intermediate calculation doesn't really show up on the website, on the application anywhere. It's an intermediate thing. You didn't tell it how to render it.

So here's an example of an application. It's got an input slider at the top and we have a render text, so instead of render plot or widget or something like that, I'm just saying pull in the value from input X and then just show me that value. So, like, if I scroll this thing, yes, it's, like, reading that number.

What you can also do is if you have a more complicated calculation, you can save this as a reactive calc. So right here I'm using reactive.calc. I have X squared and then this object, right now it's going to be a I'm assuming it's an integer. It might come in as a float, but I don't think so. There's no decimal point. But this integer is going to be saved as X squared and then the way I use it is just saying X squared parenthesis. Go and call that function.

And, again, by calling this function, it is building that reactive graph. This is the magic of Shiny. It's, like, once you told me to show you this number, it's going to run this code and it says go run X squared and I'm going to go and run X squared. X squared, it says go look up X and then it's going to look up this value from this slider and then this value is saved as X squared. So it only needs to calculate it once. If I need to use it multiple times in my application, as long as the input doesn't change, it's not invalidated and so it will just return that cached value.

This is the magic of Shiny. It's, like, once you told me to show you this number, it's going to run this code and it says go run X squared and I'm going to go and run X squared. So it only needs to calculate it once. If I need to use it multiple times in my application, as long as the input doesn't change, it's not invalidated and so it will just return that cached value.

So same thing applies. We can do a reactive calc and instead of returning an integer, we can say return a data frame and now any part of the application that is reliant on that filtered data frame, we can just call it with the don't forget the round parenthesis. I keep bringing that up because that is, like, also I forget that all the time. The error message you'll get will be something along the lines of, like, I can't do this with this reactive object. The parenthesis actually returns you the calculated result.

So that's essentially how we're going to modify our application by saying, hey, we're going to just take that filtered data frame, make it a reactive calculation and then everything else will be based off of that intermediate value.

You'll notice here that, like, even though that I'm squaring something three times, it only shows up twice because this reactive calc, even though it shows up in between these two text boxes, it's not a render anything. There's nothing there that says, like, show the output to the page. You can write print statements and it will show up in, like, the console log or something like that, but it's not going to be rendered onto the page. You have to make an explicit render call.

So this is something that also happens a lot when I'm trying to debug something where, like, I have a render calc somewhere and then I just, like, what is happening with my dashboard? I'm either printing it or I'm, like, creating a temporary render output that just dumps me what I see. Something that I've done in the past is in my dashboard application I'll have like a, what is it, like a log panel, like I'll use the UI panel and make a log panel on the side and a lot of those intermediate calculations I'll just use that as a placeholder to dump out like all of my reactive calc outputs so as I'm building my app I can actually visually see that things are changing as I need so that's something that I've done in the past where you just have like a developer console view of what's happening.

But again, that's not really causing extra calculations because it's already visualized on the screen once, it's just pulling that cached value.

Walking through the full application

So I have the rich plot put in there, I have the plotly stuff put in there, so like Seaborn is for the data set, we have Shiny Express, we are doing some reactive stuff. This is the magic for any time you're working with any of those IPY widget outputs. There's render plotly, I think there's render Altair, but there's also in general, there's a render widget option. So if you're pulling some IPY leaflet , I think that's just render widget or something like that. So it's pretty flexible in that sense.

And Carson from the Shiny team really worked hard to make sure that it's compatible with all of the other Jupyter tools as well for those visualizations.

So let's talk about this app. We have our title, we have our UI sidebar, it has the bill amount, the check box group, and the action button, so none of that has changed. And now you see we have a reactive calc. And this index one, index two, union, is it union? Intersect. Intersect those things together to filter, to return tips filtered. This return type is tips filtered, but the function is called filter underscore data. So that means that any time I want to use this reactive filter data set, I call filtered underscore data.

And so right here, where this all became index one, index two, intersect, return, I can just say just take that data set, give me the first value of the shape, which is the number of rows. So not only have I reduced the amount of lines in this part of the code, I've completely gotten rid of the duplicate of that subsetting feature in my code base.

So that was card one. Card two, yes, I needed to calculate tip divided by total bill to get the percentage and then some string formatting. But again, all of that index one, index two, intersect is no longer part of that value box anymore. Same with calculating average bill. All I'm doing is using that reactive calc and moving forward from there.

Other things you can do is maybe you want to reactively render text for the title. You can totally save the bill to another reactive calc and maybe that bill value shows up elsewhere in your code. That's totally a valid thing to do. So there's nothing wrong with saying I'm just going to have a reactive calc and it's just one value. That's totally okay. The whole premise is Shiny will update that value, it will invalidate and recalculate as needed.

And then you use your normal good Python coding hygiene of let's not repeat our code with copying and pasting things over and over and over again. So it really does, reactive calc does solve that problem of reducing the duplicate in your code, which also means your dashboard will run a little bit faster because it's not recomputing something over and over again.

For other parts, so right here, this is a different render output that you've seen. But it's the same thing. Render instead of text, instead of render figure, instead of render plotly, we're saying render data frame and all I'm doing is here's a data frame, go and render it. I'm not doing anything fancy. You could totally do something fancy with the render data frame type, but I'm just saying in this particular example, let's just give you the data frame.

So again, there's also nothing wrong with a one liner that says reactive calc, show the reactive calc. That's literally what I'm doing right here. For our scatter plot, the code that I showed you in the slide deck is exactly the same. I'm just changing the variable for that data frame instead of like tips filtered, I'm using the reactive value for filter data set now. And then the rest of the plotting code, again, is exactly the same.

Same with this ridge plot, I'm using filtered underscore data and making sure I'm replacing it in all the places. But then this was just, that ridge plot just had a lot of code into it. And don't forget to return the actual plotting value. The ridge plot itself is using render underscore widget. And then you can also use render underscore plotly. I believe plotly would also work for render widget. But in this example, it's mostly just to show you that if you don't know what type of IPy widget it is, you can always use render widget.

And also the syntax is a little bit different because it's coming from a different plotting, different Python package with all of the IPy widget support. So it's render underscore plotly, not render dot plotly. So there might be some typos that you make. This is a pretty subtle one.

And let me give you a nice bigger view of it. Here's our dashboard. We have our render table, we have our scatter plot, we got our ridge plot down here. We got our three values at the top. And then now if the slider changes, you'll see things kind of jumping around. And again, all this is doing is actually just changing the value here, this data frame. And then everything else is really dependent on this data frame input.

And think about your own data sets. It's not limited to one data frame. You can have multiple data frames. Maybe you have one slider filter one data frame, another slider filter another. And then you have a reactive calc that does the joining. That's totally a valid technique as well. And that's a type of data cleaning that you have to do in the dashboard, right? Like you're asking the user to interactively filter so that you can interactively or reactively filter or join those two tables. That's something you would have to do in a dashboard.

So, that's our example of a more complicated application. It brought out a lot of new things. So, you have some idea of a layout. You have cards. We talked about column layouts. We talked about the CSS bootstrap grid system. We talked about reactive calculations so you're not just one to one. And all of these are using different cards so we can have a little bit more white space.

If you don't like it, you don't have to use a card. But that's sort of one of the things that I guess the team has decided is like the more you can wrap things around cards, you have a nice little drop shadow. It just helps pop out certain elements. That's really up to you and what you need. Because sometimes it does do a little bit too much white space padding so that's really up to you if you need that in your application or not. But those types of layout components are all available to you.

We haven't talked about the reset filter. Not in this particular class. After this course, I'll give you the answer for this reset filter button. It's using a totally separate... Not a totally separate. There's another mechanism. You use the ID of the component and there's a... I think it's like... Like UI.update. There's an update function that you use. And so this is now a special button that when you click on a button, it creates a side effect. It's just doing something else. It's not necessarily an input or an output. It's just changing an existing input. So that's also another feature that you can do.

But that's our first application. It's building on top of the same concepts that we talked about in our penguins example. But a little bit more worked out. So hopefully this gives you a little bit more inspiration for whatever dashboard that you're looking into.

LLMs and chatbot anatomy

So next, let us talk about some LLM stuff. And that's kind of like the thing everyone's talking about these days. But Joe Chang, he's giving a talk here about some of the scientific implications and how can we make LLMs work better for us as researchers and scientists. He's giving a talk on... Tomorrow. There's a talk tomorrow that Joe is giving. He also gave, like, a 40-minute webinar talk about what I'm talking about today, what he's talking about tomorrow. There's, like, a way more worked out version that you can click on that YouTube link. But we'll talk about some of these questions.

Who here has just used ChatGPT? If you're, like, my students, it's really everyone. And if you didn't raise your hand, you're probably a liar. But who here is, like, pretty skeptical about them? I think I was as well.

I'm assuming it's because your interaction with it has been primarily around, like, the desktop app of, like, ChatGPT or, like, Claude Code or something. Or Claude. Has anyone actually worked with an LLM, like, through code, like, in Python? For example. Like, coding against it with the API. A few? Okay. So we'll talk about coding against it with the API. And then literally show you how you can incorporate that into a dashboard and then put in, like, some guardrails around that as well.

And again, with LLMs, like, unless you're using a local LLM model, don't just dump stuff. Your IT folks are probably not gonna be very happy if you go and do that. So don't do that.

So we are going to first talk about the anatomy of a conversation. How does a conversation between us and a chatbot actually work? A conversation is really a sequence of HTTP requests. There's a server. And it is entirely stateless. That is a keyword. And I'll show you... I'll talk to you about the ramifications from that in a little bit. But you talk to a server, and it gives you back a response. And then you re-ask another question, and you talk to a server, and it gives you back another response. And it's entirely stateless in that the server, like, when you make the second request, technically doesn't know that you had a previous, like, line of conversation with it.

A conversation is really a sequence of HTTP requests. There's a server. And it is entirely stateless. And it's entirely stateless in that the server, like, when you make the second request, technically doesn't know that you had a previous, like, line of conversation with it.

So if we have a conversation and we say... What is the capital of the moon? You get a response. There isn't one. And if you say... Are you sure? The response will be something like... Yes, I'm sure. Right? And like... This seems like there's this concept of memory. But I just said... It's stateless. There's actually no concept of memory going on.

You have some role. There's a system prompt. Right? And this is a prompt that a lot of people talk about that is specific to the LLM. And you don't necessarily, when you're talking with it, have the ability to change it. But then you are able to ask a question. What is the capital of the moon? And you'll notice that this comes from the actual user. So it's able to keep track of who is saying what, what the rules are, what the model you have is, and then your API key or stuff like that. So you can specify the model. Here's the components. And then you get a response back. The response, you'll have... There's a specific type of role called assistant. And it'll be like... The moon doesn't have a capital. And that's the answer that you actually get back.

Other things that you'll say is... Why did the answer stop? It's because... Well, it just stopped. That's it. One of the reasons why it would stop is... You ran out of money. And I ran out of tokens. So if you want me to give you more words, give me more money is sort of another type of reason. Other things... LLMs are highly dependent on what's known as a token. It's almost like a word. And essentially you're paying for the number of token transactions.

When you go and ask a follow-up question... The assistant goes like... No, it doesn't really have a thing. And you say... Are you sure? You'll notice that when you type in all you... Are you sure? This is the entire payload. Like, the conversation history actually went with it. And what that means is... If you actually look at the response and it's like...

The assistant is this. And you look at the number of tokens. It went from a really small number to a fairly large number. And it doesn't seem like it made... Clearly that number jumped higher than what you would expect if it was remembering stuff along the way.

And it's because when you go ask the follow-up question, literally all the history went with the follow-up question. That's why the token number sort of... I don't know if it's exponential growth. But it grows the longer and longer you have your conversation. And that's in general how all LLMs work.

There's techniques that people are coming up with that try to cache things. But as far as this class goes, that's how it works. Which is kind of scary. You're going from 12 and you ask the follow-up question, and it went from 67 and you're like... That doesn't make sense. It's a literal scam. That's what's happening.

Tokens and context windows

Tokens. The fundamental unit of pricing and how things work are on tokens. So for example... What is the capital of the moon? It gets broken up into a sequence of numbers, which represent a token. Internally it's all doing math. Like transformer math. So that's eight pieces of tokens. Other words, like counter-revolutionary, will get broken up into four tokens. And then you're being paid on that.

Everything is based off of pricing. So for Cloud Sonnet, you are given, like... Whatever tier that you're paying for, at that mid-tier level, you're allowed about three million input tokens, $3 for a million tokens, $15 for a million output tokens.

And there's also what's known as a context window. So when you are having that back-and-forth conversation, and that entire payload of conversation is happening, that is the context window. So models have... That's why sometimes when you have a conversation with ChatGPT, it just says, like, I can't, I have no more space. It's because your conversation lasted too long. It's either you asked it to give you too many things, and it's part of the context window. Or your conversation just lasted way too long.

So those are sort of the key vocabulary words. Different models have different context windows. So you can, like... For example, this is a general sense of how large context windows can be. I'm in the very slow process of reading Gerscher-Bach, and that's about 67,000 words. So it's about maybe 80,000 tokens. And that is, like, the context window. So when you're talking about giving it... Like feeding it the documentation. So I can ask more questions about the documentation. That is all part of the context window. You're just feeding it that information.

That is how chats work. Again, the key part is everything is stateless.

Providers and the chatlas package

So let's talk about getting our first application working. There's a bunch of different providers. So we have OpenAI, which makes ChatGPT, Anthropic, Google X, Meta, which makes Llama. The Llama models are really popular, because it is completely local on your computer. So as long as you have the disk space somewhere, you can have that model, and it's not connecting to the internet somewhere.

So that's sort of why a lot of talks... I think Eric Ma gave a workshop yesterday about LLMs. That is why his workshop, I think, is using Llama models. It's because it's local, and it's not going anywhere. And Eric works for Pharma, so they don't really want their stuff going out into the public for no reason.

So here's some code. I can run it for you, but I just want to just generally talk about how do we actually code against this stuff? So for OpenAI, and you can take this code and run it if you want, assuming you have the API keys. For OpenAI, you create a client. And then here is literally the code that sets up, like, what is your system prompt? And then here is the user prompt. I type this in and say, what is the capital of the moon? And it will give me a response back, and I can go and print this response if I want to.

And if I want to ask a follow-up question, here's why sometimes working with this stuff is a bit cumbersome. You have to, when you go and ask a follow-up response, because everything is stateless, you have to construct that history together to go and ask that follow-up response, right? So when I ask you, are you sure, it comes with the, what is the capital of the moon? No, the moon doesn't have a capital. Like, that text needs to be in that payload. That HTTP request has to be in there. What that means is you as the Python programmer, you have to sort of handle and stitch everything together.

Another thing that you can do, if part of the setup instructions was around setting up a GitHub token, and it's because you can use GitHub models to actually use and connect to OpenAI, and have a free way to access the latest models, which is a really nice thing when you're prototyping small applications. There are rate limits to them, but it's better than putting in a credit card to do anything.

You can just do something right now, and if you're like, okay, I've ran out of credits for this hour, this seems like it works, maybe I can now go and use this as a demo and show it off and get money from other people that are willing to pay for this thing. So GitHub models, I would say, it's not new, but I recently discovered it, and it sort of made a lot of these workshops and tutorials a lot more useful, because anyone here, as long as you set the GitHub underscore token environment variable, you have access to all of those other models that GitHub has.

So don't let the cost of using any of these models be any barrier, and I very much implore everyone to try this stuff out.

So the whole point of this is the folks at Posit and Carson, who's going to be here in an hour, he wrote a package called chatlas, and one of the main premise of chatlas is when you go and ask for that follow-up response and you just say, chat, are you sure, as a second follow-up response, internally it's appending the message for you, just for you. So yes, you have all of the control with OpenAI and LangChain, and those are all frameworks on how you can talk with a chat bot or an LLM or the API, but for the most part, many of our use cases is I just have a follow-up question, like, I don't need all of that manual control infrastructure, and so chatlas, one of the ways that you can work with chatlas is you're saying, well, this chat object will keep track of the history for you, and when you ask for a follow-up question, it will just append it and resubmit it for you, which is really cool.

So yes, you have all of the control with OpenAI and LangChain, and those are all frameworks on how you can talk with a chat bot or an LLM or the API, but for the most part, many of our use cases is I just have a follow-up question, like, I don't need all of that manual control infrastructure, and so chatlas, one of the ways that you can work with chatlas is you're saying, well, this chat object will keep track of the history for you, and when you ask for a follow-up question, it will just append it and resubmit it for you, which is really cool.

So I am using ChatGPT. The model is 4.1, and then the system prompt is you are a terse assistant, and I can say what is the capital of the moon, and it will give me that answer, and I can just say are you sure, and it just, like, no more of, like, appending and all of that, which is really, really nice.

And if I really need to, you have access to this chat object, and you can see, like, here's, like, that JSON response, and I'm afraid to scroll up anymore because there might be an API key that gets exposed, but you do have access to all of those components, so what is also really nice is because of OpenAI and how popular that API is, a lot of other tools will model the OpenAI API interface, but you still have access to that JSON underlying information. So if you need to work with it in a more custom way, you totally could.

The 20 questions demo

There is a demo in here called 20 questions, and this is just to test your, like, understanding. The example is written in R, but I don't think it matters. But there's some cool stuff in here, and this is sort of why I just wanted to provide some space to bring up why chatlas is also a really cool tool. So let's play 20 questions.

But this application actually, like, yes, that's 20 questions, whatever. But actually the application is written in such a way just to prove one point, which is if we actually look at what actually is the model used behind each interaction, it is flipping between a totally different LLM model during each conversation. Why does this work? Again, it's because it's stateless, and the entire history is getting payloaded through each interaction. So this application, less about, yes, it's a Shiny app, technically, but it's really just to also drill the point that, like, these things are stateless. You can capture the full conversation just by getting the information from that last bit of whatever the last response is.

Here's an example of a system prompt. So this is asking it, like, here are all the rules for playing the game. That's how that works. But instead of Elmer, pretend this says chatlas. But that's also one of the really nice things about chatlas is there is a separate function that gives you a chat object that allows you to connect to all of those other API tools, API endpoints, or all those other chat endpoints. So it's less about OpenAI being the best or Anthropic being best. The Shiny team, or really Carson, really made it easy for you to just try out a lot of things. And as scientists, like, the better... The more tools you're given to be able to try out a lot of things to see what works for your use case, like, that's sort of what scientific exploration is or a part of it.

And so this is one of those tools that I want to bring up, because it really does make it easy to try new things. You can do the same thing with Llama models. Like, yes, you can download the 250 gigabyte model. That's great. It's 250 gigabytes. You might not have that much space on your computer. So could you work with a smaller prototype on your laptop and then in deployment work on a different model? Yes, it makes this operation much easier.

Could you use one of the free GitHub models and then compare the ChatGPT 4.1 to a Llama model? It is literally changing the parameter of that chat initialization feature there. So you have a lot of flexibility to change different objects from one another. And then the chat object, when you talk to another system, it will just reformat whatever payload it needs to just to be able to have a conversation with that. So this tool is less about telling you which tool is the best, but it's really just giving you a much nicer interface to work with these other tools.

And also, yes, we're at SciPy. It is an open source tool. So it's not like you're paying for chatlas or anything. If anything, you're paying for the API key. I never see any of that.

Building a Shiny chat interface

So why did I end up bringing all of this into this particular workshop? It's because, well, this thing that we just interacted with was a Shiny app. And so how do we go and build a Shiny app like this? And then the next part is, how do we build a Shiny app that has a chat interface that actually can work with data without it just hallucinating and making stuff up? And that's also one of the things that I think is pretty valuable for us to see and understand.

So there's two different tools that we can work with. If you're coming from the R world, there is a package called ShinyChat . If you're coming from Python, it is also a package called ShinyChat. It's the same name, but it works in both languages. And they both will connect to Shiny for R or Shiny for Python. So here is ShinyChat in a Python application.

We have a ShinyChat import, but the rest is the same Shiny application that we've seen before. So we have a title. We have to go and create the chat object, and here we're saying, here's the ID. And then I think by default it is set for, I think, Anthropic, but you can point it to a GitHub model. And then we say like, hey, show us the UI part of the app. And then the rest of it is code to display the actual chat application component.

So let me just run the code there, and then we'll see it in action, which I believe that will make a lot more sense. So this chat application, I think, doesn't do anything important. I think it just echoes back what I have said. So we're going to say, hello, class. It's just going to say, like, you said hello, class, right? But this is the core component of having a chat interface into a Shiny app, which is create the actual chat interface, and then use chat.ui to get the little chat-like thing where the user can type, and then the conversation output happening. That is the whole crux of having a chat component into your application.

Using the system prompt to work with data

So now the question is, how do we leverage this information to work with data? So in my, like, 20-minute way of telling you everything there is to know about how chat apps work, one of the things that I have also said was the system prompt is the command that you don't really get to tweak when you're having the conversation, but when you're programming with the chat application, you can provide a system prompt. And that system prompt is the instructions and steps you give the chat bot to say, like, these are your rules. Please follow these rules.

That's why in the 20 questions app, there was that block of text that said, you only play 20 questions. If someone asks you to write some Python code, just say that's not part of the game and ask for a follow-up question, right? You literally have to type those instructions out. Those instructions go into the system prompt section of the chat app. There's only a couple places you can modify a chat application. It's the system prompt or, like, your question to it in chat. There's other things, techniques, like RAG and all of that. At the end of the day, you're either modifying the system prompt, you're modifying the conversation, or you're doing something with the context window, which is what RAG is trying to do. But that's it. There's only a couple places you can ever tweak or fiddle with a chat application.

So if you get a little bit overwhelmed with all of the cool features or the hype of what's going on, just remember, everything's stateless. There's only a couple places that people really have access to tweak. And that's really the crux of it all.

Query chat: constraining the LLM to SQL

So on top of Shiny chat, which is a general, like, you have a conversation with it, what we've done is we've modified the system prompt such that it works better with data. And what query chat actually does is you feed it a data frame, and it will go column by column, figure out the data type of the column, and create essentially, like, the schema of your data frame. And then all of that information gets put into the system prompt. So that's step one. Your chat bot now knows some information about your data.

Whether or not this is a categorical variable. If it's categorical, and there's, I think, like, by default, there's less than ten unique observations. Yes, the bot will know about all ten unique observations. But if it's more, it'll treat it like a string. If it's a number, it'll know, like, is it an integer or a float? And then know the upper and lower bounds of it. So that's the information that the chat bot has access to.

The other thing that makes query chat really different is in the system prompt, we then say the only thing that you are able to give back to us is SQL. Like, you have to... Any time we ask you to say, like, filter my dataset, you have to give me the SQL command that tells me how to filter that dataset. Or if you need to do some calculation, SQL is pretty powerful. You can actually ask it for, like, calculate outliers by using, like, three times the IQR. There is an actual IQR function in SQL. And so you can get really far with just limiting its response to just SQL. And that's essentially what query chat is doing.

Is instead of that little side panel that opens and closes and you put in a filter, a slider, a checkbox, or all of that, if you replace it with a chat bot and you tell this chat bot the only thing you can do is speak in SQL, essentially I can take that SQL statement and through DuckDB or through Pandas or through Polars run SQL against it, just like that TIPS dashboard that we just had. Everything reacted to that one data frame. So once SQL is able to filter the data frame, the entire application can filter. And that's really cool.

Is instead of that little side panel that opens and closes and you put in a filter, a slider, a checkbox, or all of that, if you replace it with a chat bot and you tell this chat bot the only thing you can do is speak in SQL, essentially I can take that SQL statement and through DuckDB or through Pandas or through Polars run SQL against it, just like that TIPS dashboard that we just had. Everything reacted to that one data frame. So once SQL is able to filter the data frame, the entire application can filter. And that's really cool.

Because you give the user all the flexibility to specify how they want to filter a dataset through natural language. And it also frees you as the developer of, like, what happens if you have 50 columns? Are you going to put in 50, like, sliders in there? That looks really bad. I've tried it. It looks pretty bad. It never looks really good.

The other thing is you saw it even in the example code with the TIPS dataset, the dashboard was if I had two different filters and I wanted to filter the dataset, it was an and operator. It's either, like, all of this range or only this. There's no way to say any more complex type of logic. But if you're specifying in natural language, you can totally do that. You can say give me the top ten and you don't have to figure out what the range is for top ten. SQL can go and figure out, like, calculate the percentage, give you the top ten results. Like, limit ten. That's how you get the top ten. So you now have this ability to, like, yeah, all of those controls that we just talked about for the first two hours can in theory be replaced with a chat bot and then you constrain it to only speak in SQL given this data schema that you have.

And so we'll work with this particular application and you'll see it in practice. We're using query chat, we're using chatlas to talk to, I guess in this case, Anthropic, but you can be GitHub models. And then the rest of the app is really similar to what we've seen. Here's the Titanic dataset. This is some setup for query chat, which is just saying, like, hey, here's my Anthropic thing. There is a system prompt that's being involved. That's okay. I have to initialize it. But I am also giving you a greeting. So this is all the stuff that shows up in the chat app. And then the actual application is here's the chat part and then render the data frame based off of the chat.

And then I'm actually on time now by just talking through the concept and then showing you a couple of demos.

Query. Okay. It's working. All right. Cool. So this application, it's all this is. It's, like, small enough. Here's the UI. The chat UI. And then all I'm saying is, like, you're filtering a data frame. Here's the filter data frame. And you can actually ask it questions like, show me those who survived.

And then you'll see it's giving me the SQL. This is the thing that's really nice about the context of the chat bot, is this is inspectable. You can actually see, did it actually give me the right SQL? Select star from Titanic or survive equals one. That's what I asked. So if you have any question about whether or not this is doing the right thing, it's harder to hallucinate a SQL statement and it's easy to inspect a SQL statement. And that is sort of gives us a little bit more comfort in terms of if we want a chat bot to interact with our data, we can. And here's the data set that is filtered. And now if you need other metrics with it, you can do that.

The benefit of this is now you can say stuff like invert that or uh-oh. Perils of live coding. Showing live demos. Okay. Show me those who survived. And then invert that. But essentially what's going to happen is when you say invert that, in natural language it will realize that it's like, oh, no, instead of we're survive equals one, it's survive equals zero. Right? And so now you can actually have a somewhat nicer way to interact with something instead of like clicking reset, unclicking a whole bunch of stuff.

Is let me say like show me just the outliers who paid the most. So in this particular case, it was like selecting the 95th percentile within like the entire group. Right? So like, again, SQL is really powerful. SQL is also one of the things LLMs are really good at doing. And so you can leverage those two techniques for your actual data work.

And just to round off this particular section is side bot is not an actual package. It is a demo. But here's our tips data frame.

This is exactly the same thing that we just built before. And you can show me like show me the top tippers.

And again, it's going to replace this whole sidebar with my filter objects. And it's going to say order by percent descent or something like that. And there you go. It showed me the top tippers.

And then what's going to happen is remember our code that we just worked with is our entire application was reactive to this data frame. So if we have different cards, they're going to react. Or plots are going to react just because we have this reactive calc that has been filtered using SQL from an LLM.

That is one technique that I think we've like when we were all like we as the Shiny team, when we were tinkering around with LLMs, we sort of realized that this is a really powerful combination of techniques that work really well, especially if which many of you are really skeptical about LLMs. This is like one entry point into, hey, we can actually make something really useful and still have the ability to inspect this.

This is like one entry point into, hey, we can actually make something really useful and still have the ability to inspect this.

And then now we've reduced like if we show this to our like anyone who's working with our data, like, you know, how many times have we been asked to do something and they're like, oh, can you also add this one other thing? And it's like simple for us, but like, oh, man, that's like another 30 minutes because I have to figure out how to open the project again and like update or add another slider and hope that it works and et cetera, et cetera, et cetera.

Now we can say something like show me just the Sunday. And then like I don't know if it's going to put in the previous statement with the current one, but it doesn't matter. It gave me the SQL that's saying like, okay, no, it didn't show me the top Sunday, it showed me all of Sunday and I can inspect it. So this is like a render text of the SQL, even though that the conversation is also showing it to us.

And again, the code itself, all it's doing is like take SQL, apply to data frame, and then that's it. The rest of the app reacts to that rendered that reactive calc.

Break and upcoming modules section

So let's take a five minute break. And then I have the last section, which is how do we build modules that scale to, that help us scale our application. And I think because of the time of day, that is a please sit here and listen and then we'll talk about it. And I think the Shiny team will be here by then.

So take a five minute break. And if you haven't done so, you can ask a question. Ask all the questions about anything. Sorry that like your question is showing up. I was hoping that it would only show us this. But come back at the bottom of the hour and if you have questions, just post them here. And then when at the top of the next hour, we will talk about all of that. And we'll answer go through all of the questions that people have.

Shiny modules

Okay, so there's another topic, and this was the reason why I think it makes more sense to just shift it off and not do live code. If many of you aren't actively creating Shiny for Python applications, but there's a concept of a Shiny module that really helps with scaling your application if you have a lot of components and you want to section off all of this logic with all of these reactive calcs.

I just needed to kind of hide it away because it's all tied together, and maybe you have a multi-page app and you want each page to be in its own little world and reduce the clutter on any individual file. So there's this concept of similar to a Python module, which is a .py file that you import, there is a concept of a Shiny module, which can be a file that you import, but is really a way where you can hide away bits and pieces of Shiny code or reuse different components over and over again.

What is a component that you might, or something that you create that's reused over and over again? A few weeks ago, I created, with my master students, we had this project where we were working with different farms, and we were trying to measure the productivity, how much energy and the productivity a farm is able to provide. And we worked with one of the research labs at the University of British Columbia where we were saying, hey, we care about these metrics, and here are the figures.

But we had multiple farms to compare, so I had this idea of what if we treat each farm like a product of the Apple website, where you can compare your laptops together, give you the option to pick a dropdown and compare multiple farms with one another. And that was a really fun idea, because you can look across, and you'll see the same figure or the same number across multiple farms.

But I also wanted the ability to have that, like, compare field, where you can click on a button, and it will let you, it will create another column, and you can do another dropdown to pick another farm. So just like you can select three or five different Apple products that you can compare, you can select three or five different columns and then make a dropdown.

And so that was a really good example of a Shiny module case, because one column is just being repeated over and over again. But the most important thing that I'll bring up again that we mentioned in the first hour was, like, the idea of each component is really important. The other thing I didn't mention is the idea of each component has to be unique.

So I can't just, like, write a function that just pops in another, like, all of that data again, because the ID is not going to be unique. So you need a mechanism to change the ID of something. And that has to be tied to all of the output components, because you're reading it from the ID, and there's this tight coupling of inputs and outputs and the name of the ID. This is the problem that Shiny modules tries to solve, and this is how you can scale applications, especially if there's a component or a part of the application that needs to get repeated.

So here's another motivation of less of something that gets repeated, but you might have an application that just has a whole bunch of different parts, and you just want to segment off these parts away from your app. Modules are another way to sort of just, hey, this is all, like, related to one another. All of these reactive calcs, they're all together. They don't really get used anywhere else, but I have a 3,000, 5,000 line dashboard file, and this is just really hard to maintain. How do I sort of break these things apart such that, you know, maybe the data part is in its own little, like, Shiny module?

So you can also use modules as a way to help scale your application, which is technically part of the title of this workshop that we're getting to right now.

Walking through a module example

So here is an example of an application. And I believe this works.

So, again, it's not the perfect example of why we want to use modules, but we'll sort of break this part apart, this application, such that we'll convert this code into a module. And what is the thing that we want all tied together is these three filters with this particular dataset. And that's something that we want traveling together in modular form.

So here's our initial app. What does this app look like? So it looks like we have this input checkbox group. We have an input slider. We have an input slider. And those are the inputs of this particular application.

One of the things about this particular slide deck is I wrote all of this code in Shiny Core. So you'll have a glimpse of what Shiny Core looks like. Essentially, instead of just using width statements and just calling UI.input checkbox group, I'm saving all of these into a particular variable that contains all of the UI components. So it's a little bit more code, because I have to wrap everything in a UI component variable or a UI variable. But everything that I just showed you with Shiny Express is the same.

Modules aren't just tied to Shiny Core. You can use modules for Shiny Express. I just sort of said, hey, I'm not asking you to copy or run this code. So it's okay.

So in the server code, we've seen a reactive calc. In this reactive calc, it's really similar. We have some type of variable that will handle a mask. We will go apply each of the filters. So this is my original way of find all of the... Apply the filters for each of the columns. If it's a number, do that. Otherwise, if it's a categorical, do some other stuff. And then at the end of the day, just return my filter data frame and then show it.

So multiple ways where you can say, take in all of the components of the input and then apply those filters to our data frame.

So one of the ways that we can do it is write a for loop to create all of the UI elements. What we're trying to do is for each of the columns in my data frame... For each of the columns in my data frame, if it's a number, I want you to create a slider. If it's a categorical variable, I want you to go and create a checkbox group.

And so one way that you can do it is I have a list of things. I want you to iterate through this list and then go and create that list from it. So I've sectioned off the most important parts is, hey, I have all of my filters. And I'm gonna go through each one of my columns. And I'm going to look at the type of that particular column. And if it's a number, return a slider. If it's not a number, return a checkbox group.

And one of the things you'll also see is I am setting the filter... The ID to be filter underscore column name. And it's my kind of way of saying I'm assuming that each of the variable names or the columns in my dataset are going to be unique and I don't have two of the same column names. And I'm going to use that as the ID of my UI component.

But what happens if we need to track more information? We need to also track the kind of filter, not just if it's a number, make it a slider. So for example, what happens if I need to know... No, I really want this to be a two-way slider, because that's the range that I'm going for. This only makes sense if it's an individual number. So now I'm trying to track not just the column name, which then feeds into the ID. I now need to track some other variable that is like a piece of metadata on that column.

And so you can do that. You can sort of write it out as a list comprehension, and then now you have all of the pieces of information sort of traveling together one at a time to go in and reactively or dynamically create the inputs that you need.

So when you need to do something like that, what you need to do is... This is a part that we didn't really talk about today until now, which is you can actually write code that generates user inputs, and then use that dynamically generated user input as the actual input in your code. It's kind of meta in that sense, where instead of saying, like, I want a checkbox right here, and checkbox shows up, you can actually write some type of reactive code that says... If this happens, put in a checkbox. If that happens, put in a slider. And use that return object as something that shows up as the input component in your application.

So in Shiny Express, anything that was like an at render is essentially server code. So you can actually say, like, hey, I want to generate a piece of UI. That's what gets reactively generated. And then I just want you to put that right there. So you can also do that in Shiny as well.

Again, that example, we didn't really cover in our tips or our Penguins example. But that is another really good use case where... Yeah, like, if you're asking a user to upload a data frame, you probably want different things showing up on the screen, depending on... What happens if your dashboard accepts, like, two different types of, like, machine-generated output? And one is from machine A and one is from machine B? You can go and figure out what data set is coming from what machine. Maybe it's that simple as counting the number of rows. And then that governs, like, the entire how... What application layout gets loaded.

So that is something that you can do with this render UI component is, hey, when I get this output, render all of this other stuff in one way versus another.

When to reach for modules

So, there's a couple ways that we can, you know, when we are trying to create a whole bunch of UI elements, you can use a for loop to create the input components. You can use a for loop to read the input components. You can use a for loop to place the input components somewhere. You could define a helper function that essentially uses a for loop to put stuff there, or you could dynamically render it with something like render.ui.

But, and in our particular use case, when we are looking at a data frame, we want to track the column name, the label, the column type, and then like the actual input components. We're like trying to track four things all at once.

So, whenever you have that situation where you're like, hey, I'm trying to dynamically create something on my dashboard, and then like, I'm also now trying to keep track of a whole bunch of stuff, that should be, when you get to that point in writing a Shiny app, take a pause, and that's going to be, come back to this slide deck, or like I've also have a whole talk that went through this slide deck as well. Take a pause, because now you're in this situation where like you're trying to keep track of all of this stuff through a for loop, and that's not really the best way that you want to deal with that.

So, if you end up calling the same component to create a function multiple times, like take a pause. If you're creating a list of IDs, and like you're just appending stuff to make a bunch of IDs, because IDs have to be unique, also take another pause. That's probably, you can slowly start creating more problems for yourself. Maybe you're creating two lists, and you're trying to zip and like iterate the two lists, because they're all tied, the information is all traveling together. Don't do that. That is another place where you might want to pause.

And then, yes, like iterating across multiple lists to ensure things are captured together, like especially if you're using the zip function.

So, in our example, we have our filter component. For each column, we're going to do a whole bunch of stuff to figure out what component is in there. What, again, what you can do is like, maybe you have a list of columns, a list of column types, and a list of filters, and then now you're going to go dynamically generate that part of the UI. If you write code like this, like, again, that's one of those things where you might want to be, take a pause.

The other thing that we've talked about is if you have like, in your application, if there's a whole section that has like a bunch of reactive calcs, and it's all related to one another, and it doesn't really, like whatever the one output is gets used by the other parts of the application, but everything else, all of those calculations that's happening there, if that doesn't get used anywhere else, that is also another code smell when you end up writing a very large application that you're trying to, that you need it a little bit more maintainable. That's something that you're going to have to take another pause. Maybe this is another technique that you'll look into.

Dynamically creating IDs, again, if you are iterating through a for loop just so you can have a unique ID, that is something that, again, you don't really want to be able to do, or at least like minimize where you're doing that, because you're probably doing that in order to repeat some part of your dashboard. And then when you're repeating something that is against, you might need to think about other techniques on how to create these IDs dynamically.

How Shiny modules work

Okay, so Shiny modules. They allow us to encapsulate all of the IDs from the use, from the, what is it? From the inputs and the, from the UI and the server. And so we can, in our code, write, use the same ID. So everything is working together. And then when we encapsulate it into a module, essentially what it does is it prefixes or suffixes whatever you want to call that module. And then so it becomes our unique name.

And again, this is all on the premise of every single component, input component, has to have a unique name. And so when you end up in a situation where you're just trying to generate unique names, Shiny modules is one way where you can say like, hey, I want to reuse this whole thing again, but I don't want to manually go and recreate all of those IDs. Again, this is where Shiny modules come in.

So in our case, we are iterating through all of our columns. And each of our IDs need to be, they're tied to a particular column name. But then if we want to reuse all of this, they need to be also namespaced into one Shiny module name. So nothing is repeated.

And so the way we do this is we can take our existing code and we can say, create an actual module. And it returns like all of the calculated filters that we're working with. And then in our server, all we have to say is, here's our module, here's our new module, like namespace name, and then everything else gets prepended to it.

So we have our UI, it's iterating through and capturing all of the components that we need for our particular filter. On the server side, it is exactly the same code. It is the same reactive code that we want. And then we are literally copying, pasting it and setting it as like a module server. So we're encapsulating all of that logic that is doing the filtering in another separate function. And here we can see that, yes, this particular module or this particular function will return a mask that we can then use later on to go and filter our entire data set.

And so what this ends up looking like is once you modularize all of your code, you can then just say, here's this module that I want to go... I'm sorry. We want to go and call the module. So this is the module that we just created. We want to give it a namespace. So we'll call it module. And what this means that all of the filter IDs will have module, I think like prepended with like two underscores. And then that will make it unique if we want to use this whole system again.

Our module itself needs our Penguins data set so we can go and figure out for each column what type of output that we need. And then we will go and put those individual components somewhere in our UI. And then at the end of the day, our module returned back a filter mask by looking at the mask output of it. And we can use that just as something that we use regular pandas to subset on.

And then with that, what you can do is because now you have two separate functions, one for the server and then a separate, this server function also goes and you can create the UI, corresponding UI component of it. You can take those two pieces and move all of that code into like a helper.py or a module.py. And then what you can do in your application is now say import module.py those two components or those things, and you can just go and reuse them. So it is a way where you can go and take really big complex parts of your application, move it away into another module, a Python module if you need to, and then it'll slowly help simplify your application.

Now the fact that you have a single .py or Python module, what you can then do is create a package for it. And now if you have like a really complex, like, hey, if I feed you this type of data input from this particular type of machine, your module can take that as an input, do all of its calculation. And what that means is the end user can just work with the UI, work with the server. So it's really similar to when I just showed you the query chat example, where it's like, you just created the chat example. That's actually like a module, and you just use chat UI, that showed up there. And then the server code was all of the chat interactions there. And again, query chat is a Python package that you can install, and you can incorporate that into your Shiny application.

Now the fact that you have a single .py or Python module, what you can then do is create a package for it. And again, query chat is a Python package that you can install, and you can incorporate that into your Shiny application.

Okay, last. And then so what ends up happening is after you move all of that code out, instead of all of that long code with, let's go iterate through each of our columns and for each column, go figure out the data type, what you end up having is creating the server, putting that stuff in, putting the components where you want in the UI, the server returns a mask, and then you go and deploy the entire, or subset based off that entire application, right? So this code that you see here after the imports is like, that's the entire Shiny app, right? Like up here, that is like, okay, I'm loading some stuff up,

but all of that reactive logic is now hidden away into another file, and I'm just using that here in my application. So if you end, again, end up in a situation where your Shiny application is becoming like 1,000, 5,000, 10,000 lines long, figure out the parts that are completely tied together, and then turn them into a module if there's a bunch of related reactive calculations happening.

The last thing that you can do, and really what you want to be able to do with a Shiny module is reuse it. So if you look at the, if you remember the example I talked about, like, hey, I wanted like the comparison, like a product, what you can do is now you can reuse that entire module. So for example, if I wanted another tab that had the same penguins thing, these two things look the same, and it's because if I turn one into a module, I can actually have another thing just repeated again, and I think the code for that literally looks like penguins one, penguins two, that's the dataset, and then the module form looks like here's the Shiny code for module one, and I've already done all the logic with setting up all of that UI and all of that server code, so I just need to recall that module again, give it another namespace, and then I can have those two tabs without having to re-import, copy paste the same exact logic, just so I can change the ID for, because the IDs need to be unique in a Shiny application.