How to automate your reporting with Quarto Dashboards and Posit Connect
Get ready to up your reporting game! Isabella Velásquez dives into the practical side of lightweight dashboards made with Quarto, the next-generation R Markdown, and Posit Connect, our premier publishing platform. You’ll learn how to build and automate Quarto Dashboards with Posit Connect. We'll showcase a Python example, but the same principles apply to R, Julia, and Observable. Helpful Links GitHub Repo: https://github.com/posit-marketing/inflation-explorer Q&A room: https://youtube.com/live/d21PQyOGlgY?feature=share For anonymous questions: https://pos.it/demo-questions Quarto dashboard documentation: https://quarto.org/docs/dashboards/ Quarto dashboard gallery: https://quarto.org/docs/dashboards/examples/ You can add the monthly recurring event to your calendar with this link: https://pos.it/team-demo Timestamps: 1:17 - Overview of the project 2:14 - The data (Consumer Price Index released monthly by Bureau of Labor Statistics) 3:00 - The benefits of using Quarto 5:41 - Introducing Quarto dashboards 7:17 - Navigation bar and pages in Quarto dashboards 7:30 - Dashboard Layout: Rows 7:47 - Dashboard Layout: Columns 8:00 - Tabsets in Quarto dashboards 8:16 - Sidebars in Quarto dashboards 8:25 - Cards in Quarto dashboards 9:50 - Theming in Quarto 10:38 - Today's dashboard on BLS Data 11:30 - Options for deploying Quarto dashboards 11:50 - Posit Connect overview 13:02 - Demo starting in GitHub repo for project 14:15 - Creating a virtual environment for the project 15:50 - Scheduling a script to run when we know new data is coming in (using pins on Posit Connect) 19:17 - Publishing a script to Posit Connect from VS Code 20:15 - Adding environmental variables to Posit Connect 25:30 - Final dashboard reveal
image: thumbnail.jpg
Transcript#
This transcript was generated automatically and may contain errors.
Hi everyone, my name is Isabella Velazquez, and welcome to this month's end-to-end workflow with Posit Team. Have you ever found yourselves spending hours creating and updating dashboards, only to have them become outdated shortly after? Well, today we're going to tackle that challenge head-on. With Quarto, the next generation R Markdown, and Posit Team's suite of tools, we'll show you how to create dynamic dashboards and automate data updates.
Quarto is a technical content creation system, and when we say content, we mean content. Quarto allows you to harness the power of Python R, Julia, and Observable to create a wide range of outputs, from websites and presentations to the latest edition dashboards. And with Posit Connect, you can take your dashboards to the next level by automating data updates at regular intervals.
Throughout today's webinar, we'll walk you through the entire process, from setting up your project and extracting data, to pinning it on Posit Connect and creating a dynamic Quarto dashboard. Don't worry if you're new to these tools, we'll start with a quick overview before diving into the code. But this isn't just a one-way presentation, we want to hear from you too, so please feel free to ask questions and share your thoughts during the live Q&A session at the end. Together, let's explore some of the possibilities of Quarto and Posit Team, and I can't wait to see what you create.
Project overview and workflow
Let's go over today's project. This is the general workflow. First, we have our data, which can be in any format. Today, we're getting data from an API. Next, we bring that data into VS Code on Posit Workbench to clean and format it for our dashboard. This process is sometimes called Extract Transform Load, or ETL. We can do this in a script or a QMD file.
Within that document, we pin the dataset on Posit Connect using the pins package, making it easy to use the data in different places. The ETL process is automated by scheduling refreshes of the script or the document in Connect. Then, we create a Quarto document for our dashboard, which reads a pin to generate the visualizations. Finally, we publish our dashboard on Posit Connect and schedule it for automatic updates to pull the refresh pin from the ETL script.
The data we'll look at today is the Consumer Price Index, or CPI, a key measure of inflation that tracks the changes in the prices of a basket of goods and services purchased by households. It covers a wide range of categories, including housing, transportation, food, and more. The CPI is released monthly by the Bureau of Labor Statistics, or BLS, in the United States. Today's dashboard displays the data from the BLS website, refreshing every month when the new report comes out.
And the final dashboard will look like this. It displays key metrics in value boxes at the top, a time series visualization showing the trend of values over time, and comparison metrics with category breakdowns in a graph and table. And this is just one example of what we can create with Quarto dashboards.
Introduction to Quarto
Quarto is an open-source scientific and technical publishing system built on Pandoc. It's a project sponsored by Posit, taking what we've learned from creating our markdown with the goal of making it easier to create scientific documents. You can write together narrative and code to produce beautiful and professional outputs.
Quarto offers a unified syntax for core features and maintains consistency across all formats. Features like tab sets and code folding and cross-referencing all function in the same way. Additionally, Quarto provides a universal theming system based on SAS for all HTML output, using core variables defined by BSLib. Quarto can be called from the terminal or the command line and is completely independent of computational systems. While Quarto supports NIDAR, the engine behind our markdown, it can also support other systems. And ultimately, the goal is to create dynamic outputs that are reproducible and easy to create.
Quarto enables single-source publishing, where different output formats like websites, presentations, and more are created from a single source.
Here's how Quarto works in the backend. An engine is the tool that runs code. This is the basic structure that creates documents from the source, the Quarto markdown file or the QMD file. When we render a Quarto document first, the engine executes all of the code chunks and generates an intermediary markdown file that includes both the code and the code output. Then that markdown file is processed by Pandoc to create the finished document.
Today, we'll be using the Jupyter engine since we are creating a dashboard of Python. But all the same principles apply to the NIDAR engine for R because Quarto is inherently multi-engine.
And here is an example Quarto document. At the top in the YAML header, we specify the format and the engine we're using, in this case, Jupyter. Then within the document itself, we have text and executable code here in a Python chunk. Running or rendering this document on the left gives us the document or the output on the right.
Here's another Quarto document where instead of format HTML, we have format reveal.js. And changing just the format without changing the actual content of the document gives us another output but with the same information, the presentation on the right.
Quarto dashboards
Quarto dashboards is an exciting new feature that allows you to create static and interactive dashboards using Python, R, Julia, and Observable. Below are some excellent examples from my colleagues, which you can see in the Quarto dashboard gallery link below. Released in Quarto 1.4, users can develop beautiful dashboards with a diverse range of components such as Plotly, Leaflet, Tables, Text, and more. These dashboards offer extensive customization options, allowing you to define layouts. You can author content in your preferred notebook editor. And you can choose between static or interactive formats based on your needs and preferences.
This Python dashboard contained on a single page uses interactive Plotly visualizations to examine development indicators within the Gap Manager dataset. It features two rows, with the second row divided into two columns. And as mentioned earlier, what we need to do is define the format as dashboard.
All dashboards feature a top-level navigation bar, which offers a title, and optionally, a logo and author information. For dashboards with multiple pages, each page is linked within the navigation bar, as you can see here. Within a page, dashboard components are organized using alternating sets of rows and columns. These rows and columns are defined by markdown headings and computational cells. For instance, here's a layout with two rows, where the second row is divided into two columns.
By default, dashboard pages are laid out first by row, then column. However, you can change this by specifying the orientation columns document option. Tab sets are a useful tool for incorporating multiple perspectives of data or showcasing secondary content. You can create tab sets by applying the tab set class to a row or column. For instance, here's a layout where the bottom row is presented as a pair of tabs.
Sidebars are a great place to group inputs for dashboards. To include a sidebar, add the sidebar class to a level 2 heading. Cards serve as the basic building blocks for displaying content within dashboards. They are automatically generated for cells and markdown content within rows and columns. For instance, here, each cell would translate into a separate card within the dashboard layout.
Plots represent the most common form of content, supporting both interactive JavaScript-based plots like Altair and Plotly, as well as standard raster-based plots like Matplotlib and ggplot2. Data tables can be incorporated into dashboards as a straightforward tabular display or as an interactive widget with sorting and filtering capabilities. Value boxes offer prominent means of showcasing simple values within a dashboard. And while dashboards typically feature plots and tables within cards, you can also include arbitrary markdown content anywhere within a dashboard layout.
As mentioned before, there's extensive customization you could do with Quarto. Quarto offers users a selection of 25 themes sourced from the BootSwatch project. Here is the United theme specified in their theme. And here is an example of the Quarto dashboard using this theme. And you could use these themes as a foundation with more customization using SAS. So BootSwatch has over 1,400 SAS variables that control fonts, colors, padding, borders, and more. So you have a lot of control over the appearance of your dashboard.
Dashboard structure and publishing
Today's dashboard is going to feature several of these components. As mentioned earlier, we need to designate the format as a dashboard. We can add things like navigation buttons. For any code chunk that we don't want to display as a card, we can use Quarto's code chunk options. In this case, include as false. We're going to use a row-based layout to lay out all the cards. Within a card, it could have a value box here displaying the last updated date. Then within another row, it could have several columns. One column can showcase some narrative context. And then the other column can showcase our graphs and table.
You have many options for deploying dashboards created with Quarto, whether you're working with a static dashboard or a shiny dashboard. However, for today's purposes, our focus will be publishing on Posit Connect, which is a secure platform tailored for sharing data products within organizations.
Posit Connect is a publishing platform that allows you to securely share insights with decision makers, automate time-consuming tasks with code, distribute custom-built tools and solutions across teams, and scale the execution of data-driven applications and APIs. Posit Connect allows you to publish a variety of R and Python products, including Quarto documents. There are various options for publishing on Posit Connect, depending on your needs and your editor, such as push-button publishing, get-back publishing, or publishing from the command line, which is what we're going to showcase today.
Since we're using Python, we can install the rsconnect-python package from PyPy and then use the command in the terminal to deploy the document. If a document with source code is on Posit Connect, we can re-execute the document either manually or on a schedule using Posit Connect's scheduling options. This display shows up where we can specify that we want the dashboard to be updated every day at 6 Denver time. Notice, you also have the option to email documents once they've been re-executed, which can be shared with your collaborators, viewers, and more.
Demo: setting up the project
Let's begin in the GitHub repository for today's project. I'm going to copy this URL here, and then I'm going to enter my Posit Team environment, where you can see the three products that make up a Posit Team. Posit Workbench, Posit Connect, and Posit Package Manager. I'm going to open Posit Workbench. Then I'm going to create a new session in VS Code.
Once it is active, click to open it. Now I am on VS Code on Posit Workbench. VS Code allows you to clone GitHub repositories from within VS Code. I'm going to open my explorer, clone repository, paste that URL, and then open up the clone repository. These are all the files that we need in order to create the Quarto dashboard. However, to begin, there are a few things that we need to set up first.
First, we want to create a virtual environment for our project. I'm going to open up a new terminal, and then I'm going to create a virtual environment. Once that has been created, I'll activate it. Now I'm going to install all the packages that I need in order to run my project.
Now that the virtual environment is set up, let's talk about some of the files on the left. Under images, we have a logo we'll be using in the dashboard. We have a custom SAS file that makes small tweaks to the visual look of the dashboard. And then we have script.py. Script.py serves as the extract transform load process for our project. As mentioned before, these files can be documents or scripts, and we use them to pull in the raw data, transform it, clean it up, and then save it in a location that we'll use elsewhere, in our case, pins. And because the data is updated on a regular basis, we want to schedule this script to rerun every time that we know a new set of data is coming in, so that the pin stays updated with the latest information.
Some things that I wanted to mention are here in line 10 and 11, I mentioned the pins package. So the pins package is a package for publishing your data, models, and other objects from R and Python for use in other places or with sharing with your colleagues. The way that it works is that you pin this information on pin boards, such as folders, like Dropbox, Google Cloud Storage, and in our case, Posit Connect. Pins are very useful when you have that kind of ephemeral data that is constantly changing and you want to refer to the latest version without needing the heavy-dutiness of a database.
Next, I want to mention the .env package. So, Posit Connect is a secure server and therefore needs an API key in order to be able to log you in and publish your content. However, the sensitive information should be stored securely so that you don't accidentally expose it in your code. The .env package in Python is very useful for handling your environment variables. You store your variables in your .env file and then can recall them throughout your script or your document whenever that you need them.
For example, here, we are pulling the BLS key from the environment, the .env file, for use without actually writing it in our script. And we need this key because the Bureau of Labor Statistics requires an API key in order to use our API. Scrolling down, we see the rest of the script. So, here we have a function that's going to use that API key, call the API. A list of tables that contain the information that we want to display on our dashboard. A program that runs those tables through the API in order to pull their data into Python. A series of data cleaning steps, some transformation and creation of new variables. Then on line 157 and 158, we pull from the .env file to get our connect server name and connect server API key, which then we can refer to on line 160 and 161 to write that clean data on a pin, which we can then use to refer to in our dashboard.
Publishing the ETL script to Posit Connect
As mentioned earlier, we use the .env file to store our environment variables. To do this, you can open up a new terminal. And then here you can write touch.env. This creates a .env file. And then in this file, you can write in your environment variables. For example, dlsp equals .env key here. It's very important to never actually show your environment variables.
Now that our environment variables are set up, we can publish the script to Posit Connect. We open up a new terminal. And here I'm going to call the rsconnect Python package by writing rsconnect and deploy Quarto. I'm going to use a new flag so I can show you what this looks like from scratch. And then the name of the file, script.py. When we run this, what Posit Connect is doing, it's checking everything in the environment. It's looking at all the packages and making sure that it can recreate everything. And then it re-renders our document or our script.
If you notice, we actually hit an error. However, do not worry. The reason for this is, as you recall, we needed the BLS script in order to be able to run our file. Posit Connect doesn't have our BLS API key. And so we have to give it to it. So we can open up Posit Connect. And here's the script. And under vars, we can add our environment variables to Posit Connect. For example, BLS key.
When I save, I can rerun the document. And we can see that the script successfully ran. And this is because it needed that environment key and it didn't have it.
Let's look at the other configuration options. Under access, I can make the sharing available to anyone, no login required. I can add a vanity URL that's easier to understand and read. And under scheduling, I can schedule the output so that it reruns on either a manual basis or a scheduled basis. Again, in this case, I want this script to rerun every single month whenever BLS updates their files, so that it pulls in the new data, cleans it up, and then saves that updated information into our pin.
So I can check this. I can see when it's configured to rerun. I can change the schedule type to daily. Now it's going to run on the first of every month. And as mentioned earlier, we can actually send an email after every single time that this script is updated to ourselves or our collaborators.
One thing I wanted to mention too, going back to our content, we can actually take a look at our pin. Again, this is the data that's stored on Posit Connect every single time that script is rerun. Again, we can change the access and the URL if we want to. And in terms of what it looks like, we can see what is contained within the object. And it also provides the code if we want to use this data in other places.
Building and previewing the Quarto dashboard
Now let's go back to our project. Let's look at our Quarto document that creates our dashboard. It is in file index.qmd. So as we mentioned earlier, the top begins with a YAML header with some of our metadata like our title, the format as a dashboard, as well as pointing to where our logo is and creating some navigation buttons. And finally, our theme. In this case, it's a conjunction of the simplex boot swatch theme and our custom SAS file.
Down below, we have an executable chunk with a code chunk that says do not include this in the actual dashboard. This is really our setup chunk. So it imports all of the packages that we're going to need. It's going to connect us to our connect server and pull in that pin. And finally, make a few small values for us to display in our dashboard. Scrolling down, we can see our first row. And it contains three value boxes that are pulling those values that created up above.
And then another row, one column consists of markdown text. And then the other column is a tab set. So it has separate tabs for each of these cards. And these cards display either graphs or a table.
You may have noticed I actually have my graphs in two separate code chunks. The reason is otherwise plotly might display the same plot twice. So one of the code chunks has include as false, so that it doesn't actually show. And then the other chunk has false. And that's the one that will actually display as a card. And finally, I have a code chunk that creates a table using the Python package break tables.
So let's preview this. In VS Code, we can either do command shift K. Or you can also do this in the terminal. And now we have our dashboard preview. I've zoomed in, in order to make the code on VS Code a bit easier for you to see. But this is what our dashboard looks like.
As mentioned earlier, there's a navigation bar at the top. Three value boxes in a row. And then another row with text and our graphs and table. This graph is meant to show time series of changes of values over time. And just to showcase the table. This is created with a new package called the create tables. And it is a wonderful way of making display tables using Python.
Publishing the dashboard to Posit Connect
Now let's publish this dashboard on Posit Connect. To publish this to Posit Connect, once again, we want to create a new terminal. Here I'm going to write rsconnect deploy Quarto. I'm going to do the new flag again and then the name of the file. Now we want to add some of our additional files so that it knows to use it for the dashboard. In this case, we want to add the logo. And then also our custom SAS file.
When we run this, once again, what Posit Connect is doing, it's bundling everything up, recreating the environment, and then re-rendering our Quarto dashboard. Once it's done, it gives us our URLs. And we can click in and see our dashboard.
Taking a look at the configuration, we can change it to anybody and give it a new URL, say, pin score save. As before, we want to schedule this so that every single month after our Python script has rerun, this dashboard Quarto document also reruns to pull in the latest pin. Since it's a little later, it should pull in the latest information.
And then in terms of taking a look of what it would look like to your actual viewers, we can check it out. Look at our graphs. See our values. And there you go. We have an end-to-end workflow with Posit Team where we created an extract transform load process using a variety of packages, including pins. Every month, this ETL process is scheduled and repeated on Posit Connect. And then we have a Quarto document that reads that new pin every single month because it's scheduled on Posit Connect and creates this updated dashboard.
We have an end-to-end workflow with Posit Team where we created an extract transform load process using a variety of packages, including pins. Every month, this ETL process is scheduled and repeated on Posit Connect. And then we have a Quarto document that reads that new pin every single month because it's scheduled on Posit Connect and creates this updated dashboard.
Thank you so much. I hope that this was informative. And I look forward to hearing your questions and comments.