Using your dataset in Shiny Templates | Carson Sievert | Posit

Watch the Shiny team’s Carson Sievert change the dataset in a Shiny Template. Find the right template for you at https://shiny.posit.co/py/templates/

0:00 Intro with Carson Sievert
0:19 How to load the template code
1:14 Running your Shiny app in VS Code with a live reloading preview
1:29 How this template works
2:23 See the contents of your data in a data_frame
2:53 How this template imports data
3:33 A more optimized way to import a large amount of data
5:10 Changing the dataset
5:44 Troubleshooting the inevitable errors when changing the dataset


Transcript

This transcript was generated automatically and may contain errors.

Hi, I'm Carson. I'm an engineer on the Shiny team here at Posit. And today I'd like to show you a real basic Shiny template and show you how to get your own data into that template and start visualizing your data.

Right, so on the Shiny for Python website, I'm here on the templates page, which you can find under the components menu. We think of templates as kind of just a useful combination of components, layouts, and reactivity concepts. So I'm going to start off with one of our basic apps here. On the page for this template, I get some code that I can run in my terminal to get the code for this application. Right, so I'm going to grab this shiny create command and paste it into my terminal. And it's going to ask me for a destination directory for these files. I'm going to say basic. And then it gives me some instructions to open up the app file, as well as install instructions: go into this directory, and then install the requirements.

Running the app in VS Code

And then when you open up the app file in VS Code, as long as you have the Shiny VS Code extension installed, you should see a play button up in the right-hand corner to run your Shiny app and get a live-reloading preview of your application. But really what I want to focus on here with this application is, you know, at a very high level, we just have a page title here, we have some code to render a histogram, and that histogram is reactively reading the variable from the data set that we want to be visualizing. So here this input.var is referencing this input select down here. And this input select has the var ID, the label for the input control, as well as the choices for that input select.

So these choices here are specific to the data set that this is visualizing. So the names here are very intentional and are referring to the columns in this df data frame here. Before we actually go in and look at the details of how we're getting that df, let's just also get a nice interactive display of this df data frame. So here I can see the actual data frame behind this histogram below. And now if I wanted to, you know, I could add more input controls or modify this input select to look at the different variables in this data set. But let's actually have a look into how we're importing the data.

How data importing works

So we've set up a lot of these templates to be based on this kind of approach to importing data where there is a local shared Python file. So when we say from shared import df, that's going to execute this shared.py file. And most of these, at least at the moment, are set up to use pandas read_csv to read in a CSV file that you also get with the template. So this is a nice and easy and, you know, low-dependency way to read in some data for our application.

If you have larger data, you probably want to use a more optimized file format and a more optimized way of importing that file format. So you might want to consider using something like Arrow to read in Parquet files if performance is an issue for your app when it comes to importing the data. But this also demonstrates a useful concept when importing your data. You don't necessarily want to rely on relative file paths, since those depend on the directory the app happens to be launched from. Really what you want to lean into is using some sort of file structure like this, where if I can assume that the CSV file is a sibling to this shared Python file, then what I can do is leverage the __file__ variable (double underscore file) to get the file location for this Python file, then get the parent directory to make sure that I'm actually in the directory for this application, and then read the penguins.csv file. So if I'm reading a file from disk, this is a nice setup to do this in such a way that your code is very portable from machine to machine.
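
As a rough sketch of that sibling-file idea (not the template's actual code), the path can be built from the location of the Python file itself. The helper name sibling_path and the example location are made up for illustration.

```python
from pathlib import Path


def sibling_path(module_file: str, filename: str) -> Path:
    """Return the absolute path of `filename` in the same directory as `module_file`."""
    return Path(module_file).resolve().parent / filename


# Inside shared.py this would be called with __file__, for example:
#   df = pd.read_csv(sibling_path(__file__, "penguins.csv"))
# Here a made-up location shows what the helper resolves to.
example = sibling_path("/srv/apps/basic/shared.py", "penguins.csv")
print(example.name)
```

Because the path is anchored to the file rather than to the working directory, it should not matter which directory the app is started from.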


Another advantage of setting it up in this way, where you import from a local module, especially for Shiny Express, is that it will ensure that you're only importing the data once. There are situations where, you know, if you're not importing your data in a certain way, that data can end up getting imported twice. So again, for performance, it's kind of a nice abstraction here to do your data importing in this separate module.
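
One way to see why the separate module helps: Python caches imported modules, so a module's top-level code runs only the first time it is imported. The sketch below uses a throwaway module named demo_shared as a hypothetical stand-in for shared.py, with a small dict in place of a real data frame.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Create a throwaway module named demo_shared (a hypothetical stand-in for shared.py).
# Its body appends a line to a log file every time the module body actually runs.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "demo_shared.py").write_text(
    "from pathlib import Path\n"
    "with Path(__file__).with_name('loads.log').open('a') as log:\n"
    "    log.write('loaded\\n')\n"
    "df = {'tip': [1.01, 1.66, 3.5]}  # placeholder for a real data frame\n"
)
sys.path.insert(0, str(module_dir))

# Import the module twice, the way code that runs more than once would.
first = importlib.import_module("demo_shared")
second = importlib.import_module("demo_shared")

# Python caches modules in sys.modules, so the body (and the data load) ran only once.
load_count = len((module_dir / "loads.log").read_text().splitlines())
print(first is second, load_count)
```

Both imports hand back the same module object, and the log file records a single load.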

Changing the dataset

But just to demonstrate how we could get in a different data set here, I'm going to cheat a little bit in the sense that I'm not actually going to use a different file per se, but I'm going to leverage the fact that Seaborn prepackages data sets. We were just working with penguins, but let's do tips instead. And now I'm going to call that df since my app is importing from this file and looking for an object named df.

And now I'm getting some errors that are pretty expected in the sense that the Seaborn histplot logic is now receiving a different data set. It doesn't know what to do with this bill_length_mm because that's not actually in the tips data set. But our render data frame is just displaying the new data set that we're importing from the shared module. So I can just come to here for a nice preview of the data set that I'm working with. I can see a couple of numeric variables here, total_bill and tip. So let's, instead of these two variables, do total_bill and tip. And now the app is, by default, showing me, I believe, total_bill.

So let me put this data frame below the select input so that I can see the variable that I'm working with and just make sure that I can change this to tip and yep, my histogram updates to show me the new tip variable. So that in a nutshell is how we can take a template, bring it down locally, get it running in VS Code, and bring in our own data set.
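
The errors when swapping data sets come down to input choices that no longer match any column name. A small check like the one below can flag that before the plot code fails. The helper name missing_columns is made up for illustration, and the second penguins column in old_choices is an assumption about what the template lists.

```python
def missing_columns(choices, columns):
    """Return the entries of `choices` that are not column names in `columns`."""
    return [name for name in choices if name not in columns]


# Column names of Seaborn's tips data set.
tips_columns = ["total_bill", "tip", "sex", "smoker", "day", "time", "size"]

# Choices left over from the penguins version (the second name is an assumption).
old_choices = ["bill_length_mm", "body_mass_g"]
new_choices = ["total_bill", "tip"]

print(missing_columns(old_choices, tips_columns))  # ['bill_length_mm', 'body_mass_g']
print(missing_columns(new_choices, tips_columns))  # []
```

With a pandas data frame, df.columns could be passed as the second argument in place of the hard-coded list.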

Featured software