
PDF Palooza 🎉 Save time with dynamic PDFs powered by Quarto, Shiny & Posit

Many of us need to produce multiple PDF versions for monthly business reports. This workflow demonstrates how to save time by dynamically creating PDFs using Quarto and Shiny, while also showcasing the beautiful possibilities Typst offers for PDF design.

You will learn:
1. What is Typst and its advantages over LaTeX
2. Typst formats: posters, docs, flyers, articles, etc.
3. New Quarto 1.5 feature: Typst CSS
4. How to use a Shiny application to create Typst PDFs dynamically

Helpful resources for this workflow:
GitHub Repo: https://github.com/ryjohnson09/pdfpalooza
Q&A Recording: https://youtube.com/live/RTr5D4xV5_Q?feature=share

We host these Workflow Demos the last Wednesday of every month, and you can add them to your calendar with this link: https://www.addevent.com/event/Eg16505674

Oct 30, 2024
31 min


Transcript

This transcript was generated automatically and may contain errors.

Hello everybody and welcome to this month's Posit team workflow demo. It's been a few months since I last gave one of these demos so this is the first time we're meeting. My name is Ryan Johnson and I'm a data science advisor here at Posit. So in today's workflow we are going to focus in on PDFs, which is a file format we've probably all encountered at some point. Now we're going to kick off the demo talking about what are PDFs and why are they such a popular method for sharing documents. And then we'll jump into the actual demos where we'll demonstrate how to create some really beautiful scalable professional PDFs using a combination of Posit's open source tools and our professional tools.

Alright so just a few slides here to get started. And the first thing I want to talk about is what is PDF, which stands for portable document format, and particularly why is it so popular. So if you've ever been in a situation where you had to share a document with somebody, you probably had to stop and consider what is the optimal format for sharing this document. For example, if my document is a Microsoft Word document and I want to share it with somebody else, I'd probably ask myself some questions like: does that person also have Microsoft Word, and if so, do they have the same version, or at least a relatively recent or similar version to what I have? Maybe they don't have Microsoft Word but work in Google Docs instead. We're not sure.

So maybe you try a different format and you want to create an HTML document. So if you go to share an HTML document with somebody you have a whole new set of questions such as what internet web browser will they use to open up this document which is essentially a web page. Are they going to use Safari, Firefox, Chrome and depending on what browser they use it may render slightly differently. Now for a lot of these HTML formats they require some system dependencies such as JavaScript. So if they go to open an HTML document and they don't have JavaScript installed will it render appropriately we don't know. Also if you go to try to email an HTML document to somebody a lot of email clients will actually block emails that have HTML files as attachments so that's something else you need to consider.

And then, particularly important for today's demo, what if we want to share a document that's created using something like Jupyter Notebooks or Quarto or R Markdown? So if we go to share one of these documents, does that recipient also have R and Python installed? Can they even open up these documents, and if so, do they have the same package versions and language version that I do, to ensure that it's the exact same document and it's rendered the exact same way when they open it up on their system? So that's a lot of questions to answer, and this really is the driving force behind why PDFs were created, because they are perfect for cross-platform compatibility.

So if I create a PDF on my Mac computer and I go to share it with somebody using a Windows machine, I can be pretty confident it's gonna render exactly the same, because it preserves formatting, which is really important. PDF is also what's known as a what-you-see-is-what-you-get format, meaning the content won't shift, reflow, or break apart when opened on different devices. So unlike formats such as Word, where formatting can change depending on the software or device, things are just going to stay pretty consistent with PDFs. PDFs are also by default non-editable, so if you go to share one with somebody they're really not going to be able to edit it. Now there are some ways you can edit a PDF, but usually by default they're non-editable. And they're also print ready, so if you receive a PDF you can typically go to File, Print and that printed output will look pretty good.

LaTeX and Typst for specialized documents

And there's a whole lot of other reasons why PDFs are so popular for sharing documents. Now most software for creating documents has some type of built-in PDF export option, but when it comes to the creation of specialized documents that include things like mathematical equations, symbols, citations, other complex structures like chapters, sections, tables of contents, journal formats, complex tables and figures, chances are you're likely going to come across a tool called LaTeX. Now in full disclosure I have never written LaTeX code, so I can't speak too much to how complex it is, but from what I've heard it is a fantastic tool that does have some complications and limitations. Now fortunately many of the open source tools that we'll work with today have methods to create PDFs without ever having to know how to code in LaTeX, and there are some newer tools out there for creating PDFs, which we'll discuss later on today, that actually have no dependency on LaTeX.

Creating a PDF with R Markdown

For now let's start with a tool that has been around for years, and that's R Markdown. We are going to create a PDF report using R Markdown, but to do this we need some data to report on. So I'm going to be using the Gapminder data set, and some of you may be familiar with this, especially if you were at our conference in 2024. We actually had a keynote completely dedicated to the Gapminder data set, but the reason why I like to use it in my demos is that it's pretty simple. It has about 1,700 or closer to 1,800 rows but it only has seven columns, and you can see those seven columns here. So on the left hand side we're looking at country and the continent that country is in. We have the year and then we have life expectancy, which is the life expectancy for someone born that year in that country. We have the population for that year and also the GDP per capita.

Now this data set is actually ideal for this workflow because it allows us to demonstrate how to create a template PDF and efficiently generate multiple PDFs for each country in a customized time frame. So this is kind of a pictorial image of what we're going to be doing at least to start. We're going to create a template R Markdown document and this single R Markdown document is going to be used to generate any PDF we want for a specific country and also for a specific date range.

Alright so here I am within the home screen of Posit Workbench and you can see I currently have an RStudio session running. Let's go ahead and click into it, and within this RStudio project, which I've called R Markdown PDF, you can see I have a few files down here. We're going to start with this basic R Markdown document. This is going to be our template document. So once I click on it I'm going to go ahead and walk you through what you're seeing here. So at the very top we have our YAML, and we're going to focus in just on a single country to start, and we'll do the United States. And importantly you can see right here the output is pdf_document. So this is the output that, when we click Knit here at the very top, will generate a PDF using LaTeX.

I've set some chunk option parameters here and then this first code chunk we're going to load some packages and again we're focusing on the United States and for the Gapminder dataset we can also choose the date range. Now the Gapminder dataset goes from about 1952 to 2007 so we'll just encompass the entire date range. So I'll go ahead and run this code chunk by clicking the play button. And while this is running and finishing up running the next code chunk here is just going to generate a map just because I thought it'd be interesting or nice to have on the document an actual you know image of where does this country you know live in the world. So I'll click play here and this is the output from that code where you can see the United States is highlighted here in blue.

I've also included just some text here to replicate what a typical document would include: figures and tables, but also some text. Alright, down here we're going to create a table, and this table is going to be created using the very popular R package known as gt. If you've never used gt before, it's a fantastic package for creating really nice tables in R, and there's also an equivalent in Python called Great Tables which you can check out. So this table is going to summarize all the columns for a specific country, here the United States. So let's go ahead and run it and just take a look at this table.

So here you can see the United States overview and then by year we get life expectancy and I've also color-coded it here which will be important a little bit later on and then I have the population you can see the nice syntax here with millions and GDP per capita is the last column. It's a really nice looking table and this is kind of showing this HTML output and it's really important just take a mental note of what this looks like right now because as you'll see in a little bit it's actually not going to be as conserved when we render to a PDF.

All right so we have some more text down here and then we're going to create some plots using ggplot. So this first plot right here, I'll click play, is showing the United States life expectancy, which is shown here in this orange line, compared to every other country in the world over time, and we can generate a similar plot for population. So here we have the United States population compared to every other country in the world, shown as these light gray lines over time, and then finally we can do this for GDP per capita, and you can see where the United States falls with respect to every other country. So now we have these nice plots, we have a table, we have some text. Let's go ahead and render this document as a PDF using LaTeX.

Alright let me go ahead and bring it on to my screen here so you all can see it and here's that final PDF and it looks pretty nice let me zoom out a little bit so we can take a look so we see our plot here is nice and conserved right at the top here's some text and then here's our GT table and it still looks pretty good but it definitely looks a little different from the HTML version when we ran it within the R Markdown document and the rest of document again looks pretty nice the text looks good here's some of those images that are nicely embedded another image another image and there we go.

Parameterized R Markdowns

Alright so we're going to use this R Markdown document to basically again be our template so that we can input additional countries and additional time frames. So let me go ahead and just quickly come back to the slides here. Alright so now that we've created this template R Markdown, let's talk about using it as a template for additional PDFs, again especially if I want to change the country or change the date range. So currently I have this hard-coded in the document. You can see here on lines 22 and 23, if I have a specific country focus I have to add it as a variable here in this R Markdown document, and similarly for a date range. Now what'd be really nice is if, instead of hard-coding this, we could add these variables as parameters to this R Markdown, because again maybe someone else wants to view a different country or a different date range. Rather than forcing someone to come into the R Markdown, delete United States, type in their own country, and do the same thing for the date range, how can we make the variable selection process better and more enjoyable?

So that's where parameterized R Markdowns come into play. So what I'm showing over here on the left-hand side is an expanded YAML, and again the YAML is that very top chunk within our R Markdown document. So we're still generating a PDF document, but here starting on line 3 you can see I'm defining some parameters. Specifically I have my date range, where I have some default values, we're gonna make it a slider bar, and we have min, max and a few other things as well. And then we also have country focus, and similarly we'll make this a select input and we're gonna make the choices the various countries in the Gapminder data set. So once we've defined these parameters within this R Markdown YAML, there's actually a way when you go to knit this document: if you click the little drop-down, instead of knitting to a particular format you can say Knit with Parameters, and what that gives you is a really nice user interface for inputting parameters.

So here you can see I can easily change the date range up here at the top and then I can easily select a different country, and once I've selected these two I can hit Knit at the very bottom and it will generate an additional PDF report. Let me go ahead and switch back to Posit Workbench and we're going to demo parameterized R Markdowns.

Alright so I'm back in the same RStudio environment we were just in. I'm going to close out of our template R Markdown document and I'm going to open up this other R Markdown document you can see down here in my files tab called parameterized. So here's that same YAML that we just saw in the slides, and again what this allows us to do is knit this document with parameters. So again I click on this little drop-down at the very top and select Knit with Parameters, and once we do that we get a window here, just give it a few seconds to boot up, and it gives us the two input options. We have a slider bar to choose our date or year range and we can also select a different country. So just to make things interesting let's go ahead and change up the date range, so we'll just select 1977 to 2002, and for a country let's go ahead and just scroll up a little bit, we'll pick a random country, let's do Cameroon, and I'll select Knit.

So now it's generating another PDF document using LaTeX, but it's going to generate this PDF for that specific date range and for the country Cameroon. So let's just give this a few seconds to run. Alright so now the PDF is created, it's rendering, and here we are. So now you can see Cameroon is highlighted, and if I just scroll down a little bit you can see the PDF still looks pretty nice, and importantly you can see our table here is only showing the dates which we selected. So within that slider bar date range we have years 1977 going up to 2002. So that's how we can create parameterized R Markdowns.
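To make the effect of the year slider concrete, here is a minimal Python sketch (the demo itself is written in R, so this is an illustration rather than the code shown on screen). It assumes Gapminder observations are recorded every five years from 1952 to 2007, and the helper name `years_in_range` is hypothetical.

```python
# Assumption: Gapminder observations are recorded every five years, 1952-2007.
GAPMINDER_YEARS = list(range(1952, 2008, 5))


def years_in_range(start_year, end_year, years=GAPMINDER_YEARS):
    """Return the observation years inside the selected range (inclusive)."""
    return [year for year in years if start_year <= year <= end_year]


# The range chosen in the demo for Cameroon.
selected = years_in_range(1977, 2002)
print(selected)  # [1977, 1982, 1987, 1992, 1997, 2002]
```

Under that assumption, the table in the rendered PDF would contain one row for each of these six years.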

Introducing Typst

Now we can actually take this document and we could publish it to Posit Connect, but before we do that I want to come back to the slides and again talk about that table issue that we mentioned earlier. Alright so going back to the table that we created with the gt package in R Markdown, this is what we want, this is that nice HTML output, but this is what we got when we rendered it to PDF using LaTeX, and again it looks okay but the formatting is not quite what we want.

So this is a good introduction to a tool called Typst, and if you've never heard of Typst before, it is an open source markup-based typesetting system, very similar to LaTeX, and it's designed to be as powerful as LaTeX while being much easier to learn and use. Now Quarto, one of our other open source tools for creating scientific documents, includes the Typst command-line interface, so there's actually no separate installation of Typst required. What that means is that when you install Quarto, Typst is already baked into it, and this is actually different from how LaTeX works. Typically when you want to generate a PDF using R Markdown and LaTeX you do need to install LaTeX or some version of LaTeX outside of R. Typst also has super fast render times and you can easily create highly customized templates.

So the next couple slides here is just a show-and-tell showing you some of these really cool templates that you can create. For example if you're creating any type of journal article for a scientific journal you can build that with Typst as shown here. This is one that I'm particularly excited for so I came from an academic background where I had to present posters at various conferences and what you can actually do now using the Typst integration within Quarto is create scientific posters as shown here. You can even do something as simple as create letters and maybe even flyers. So for example for your chemistry department here's a really nice flyer you can build using Typst within Quarto.

So Quarto in version 1.5 which is the most recent version as of today it now allows HTML tables such as what we're seeing over here on the left-hand side with custom CSS styling to be output in Typst. So instead of the table looking like this we can now have it look like this when it gets rendered to PDF.

Quarto with Typst in Posit Workbench

So we're going to switch back to Posit Workbench again we're going to open up a Quarto document and show you some of this Typst integration and I'm also going to demonstrate how you can render a Quarto document with parameters in a similar fashion we did with the R Markdown document. All right so here we are back in Posit Workbench and I've actually switched to a separate RStudio project called Quarto Typst and within this project I have a document called QuartoDoc.qmd which is what you're seeing up here in the top left quadrant. So in this Quarto document you can see up here at the very top we have a very similar YAML to what was in the R Markdown document but here we're going to render this Quarto document so you render Quarto you knit R Markdown documents but we're going to render it to this Typst format and I've added some specifications you can see here. You can also see here starting on line 11 I'm adding in some parameters specifically country focus with the default of Peru and the date range from 1952 to 2000.

So if I wanted to render this document with parameters we don't have that same drop-down we have with R Markdown but instead we can do it from the terminal using Quarto's command line interface tool. So I'm going to go ahead and select the terminal here and we are going to run the Quarto CLI tool to render this document which is Quarto.qmd or QuartoDoc.qmd and then we're going to feed in some parameters to this line here. So to do that we're going to give dash capital P that's our argument for parameters and we'll give it the country focus parameter to start we have a colon and I'm just going to pick Canada as another country someone might be interested in and then for another parameter we're going to add in our date range and I'm just going to switch it up from the default so I'm going to go ahead and select 1977 to 2002 and I'll close off that bracket and once I hit enter that will render this Quarto document but again it's going to render a PDF using the Typst integration so we'll just give this a few seconds.
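The command typed in the terminal can be sketched as follows. This is a minimal Python illustration that only assembles the argument list for the Quarto CLI. The parameter spellings `country_focus` and `date_range`, and the bracketed list form for the year range, are assumptions based on how they are spoken in the demo and are not copied from the project files.

```python
import subprocess


def build_render_command(qmd_path, country, start_year, end_year):
    """Assemble a `quarto render` call that overrides two document parameters."""
    return [
        "quarto", "render", qmd_path,
        # Each -P flag passes one name:value pair; the names are assumed.
        "-P", f"country_focus:{country}",
        "-P", f"date_range:[{start_year},{end_year}]",
    ]


def render(command):
    """Run the command; requires the Quarto CLI to be installed and on PATH."""
    subprocess.run(command, check=True)


command = build_render_command("QuartoDoc.qmd", "Canada", 1977, 2002)
print(" ".join(command))
# quarto render QuartoDoc.qmd -P country_focus:Canada -P date_range:[1977,2002]
```

Calling `render(command)` would run the render. When typing the same command directly in a shell, the bracketed value may need quoting so the shell does not interpret the brackets.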

Alright it actually rendered it here within my file directory so if I click on this QuartoDoc PDF here is our new PDF document created using Typst so here we have Canada and you can see the country highlighted but probably the most important thing here is the table and you can see the formatting looks almost identical to the HTML format that we're really kind of desiring so this looks really good and this is again just highlighting some of the new features within Quarto where you can have some of these HTML custom CSS directly within your PDF documents. Alright so let's come back to the slides here and we'll move on to the next workflow.

Building the Shiny app for dynamic PDFs

Alright so to finish up today's workflow we are going to essentially create a Shiny application which replicates that user interface we saw with R Markdown for selecting the various parameters. So because that feature doesn't, at least yet, exist within Quarto, we're going to create a custom Shiny application, which is shown over here on the right hand side. This is at least the user interface, and you can see it's nothing special, it's just a series of inputs. So we have a drop-down for the various countries, we have our date range which we're used to, but I've also added in some extra flair with the help of the Typst integration, which gives users the ability to download either a document PDF or a poster PDF. So it's as simple as clicking a button and then selecting Download PDF Report, and a user can then download whatever document they desire.

So we're going to go ahead and show you this Shiny application and then we'll wrap things up by publishing it to Posit Connect and showing you how you can then share the Shiny application with your co-workers or whoever you would like.

Okay so here I am within that same RStudio project, but this time I opened up this app.R file, which is what you're seeing up here in the top left quadrant, and just as a side note we'll be sure to include all this source code in a GitHub repository and we'll make sure you all are aware of it. But let me just quickly run through some key features in this Shiny application. So at the very top we're defining our user interface, and here are all those various inputs that we just saw in the slides. So we have a select input for country, a slider for our dates, radio buttons so they can change the format, and here's our download button. Now if you scroll down to the server function starting on line 27, just to highlight right here, this is where we take the inputs from our Shiny application and create a list of parameters which will eventually be fed into the Quarto render function. So depending on whether they have document selected or poster selected, it'll render either the Quarto doc .qmd or the Quarto poster .qmd, and we'll feed in the custom parameters based on the various inputs the user selected.
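The server logic described here, turning the app inputs into a parameter list and choosing a report by format, can be sketched in a few lines. The actual app is an R Shiny application, so the Python below only illustrates the decision. The file names in `REPORT_FILES`, the parameter names, and the `AppInputs` and `plan_render` helpers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical file names standing in for the two Quarto reports in the project.
REPORT_FILES = {"Document": "QuartoDoc.qmd", "Poster": "QuartoPoster.qmd"}


@dataclass
class AppInputs:
    country: str
    date_range: tuple  # (start_year, end_year) from the slider
    report_format: str  # "Document" or "Poster" from the radio buttons


def plan_render(inputs):
    """Map the app inputs to a report file and the parameters to pass to it."""
    if inputs.report_format not in REPORT_FILES:
        raise ValueError(f"Unknown report format: {inputs.report_format}")
    params = {
        "country_focus": inputs.country,  # assumed parameter name
        "date_range": list(inputs.date_range),  # assumed parameter name
    }
    return REPORT_FILES[inputs.report_format], params


report_file, params = plan_render(AppInputs("Indonesia", (1952, 2007), "Poster"))
print(report_file, params)
```

Keeping the format-to-file mapping in one dictionary means a new report type could be added without touching the rest of the logic.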

Publishing to Posit Connect

So we could run this application here locally within Posit Workbench here in RStudio but what I'm actually going to do is publish it to Posit Connect. Now there are a variety of ways you can publish to Posit Connect but probably the quickest and fastest way to get something hosted on Posit Connect is to click this blue button here at the top of your screen. So once I click that we'll get this menu here to define all the files over here on the left hand side that need to be included in this deployment bundle and to briefly go through them here are the top three these are various extensions or it's one extension for the poster format. So this poster format was actually downloaded as a quarto extension so make sure we include that. Here's our Shiny application we're also going to include a PNG logo for Posit Team and then we also need those reports the quarto doc report and the quarto poster report. I'm going to be publishing it to an instance of Posit Connect which I've already connected to and I've actually previously published this so I'm just going to go ahead and update the same piece of content.

So once I hit publish it's going to go pretty quick. Here's my deploy tab that opens up, and what's happening is that it is capturing all of my dependencies in my environment here within RStudio: what packages I'm using, the versions of those packages, and what R version. Once it replicates that environment on Connect it can then deploy the Shiny application, and we'll just give it a few more seconds to render.

It's almost done and here it is. Again not a very flashy Shiny application but let's go ahead and give it a test. So I'm going to go ahead and select a different country so I'll just pick one randomly. I'll do Mexico. I'll just briefly tweak some of the dates here so we can see those effects on the output table and let's go ahead and start with a document. So I'll click download PDF report and we get this little status bar at the bottom of our screen and we'll just give this a few seconds to build this custom report.

Alright so it actually downloaded to my personal computer so I'm going to go ahead and open it up and make sure you all can see it here. And here we go we see Mexico is highlighted and then you can see the table for the statistics here are within the date range that we selected here in the table. So this looks really good a really easy way for someone to create a custom PDF report again using Shiny using the Typst integration. But if we wanted to have someone create a poster instead let's go ahead and select a different country. I'll do let's do Indonesia and we'll just do the full date range and I'll select poster and then we'll go ahead and download this as a PDF report.

Again this is going to download to my personal computer here, so let's give this a few seconds and let me resize it so you all can see it. Here we go. So here's Indonesia. You can see the map's a little smaller, and we can certainly tweak this if you want to go deeper into it. Here's our table down here in the bottom left, and we get this nice academic research poster, and you can even define the authors or the Posit Department of Awesome.

Sharing content on Posit Connect

So just to round out the demo here again when you have a piece of content like a Shiny application hosted on Posit Connect ultimately you want to be able to share the Shiny application with whoever you want and that's where these access controls come into play. So here I can click this little gear icon which brings up my content access or controls including access control. I currently have defined specific users or groups and so currently in this configuration I'm the only one that can view this content. But if I wanted to share it with say a colleague or potentially a group of users at my company I can just add them here so let's see if Rachel's actually on here. She is. So I can select Rachel and now we are the only two people here at Posit that can view this content.

But we have some other sharing settings, like all users, login required, so if someone has access to Posit Connect they'd be able to view this content, which is a great way to share content internally within your group but with no one outside your company. And then if your license allows for it, there's the ability to share an interactive piece of content like a Shiny application with anyone in the world, and the only thing you would need to do to share this content is grab the URL you see at the top of your screen. It's not the prettiest URL, it's got some random letters and numbers in it, so we can customize it. I can come down here at the bottom right and say Gapminder PDF generator and then create this custom URL, save this configuration, and now to share this application with anyone all I have to do is copy this URL and send it to somebody, just like you would share any other website.

All right so I hope today's demo was helpful and that you now have a better understanding of the features within Posit's open source and professional tools for creating custom PDFs and sharing them with the world. If you have any questions then stick around and we'd be happy to answer them during the live Q&A. Thanks everybody!