Resources

Nathan Stephens | Make PowerPoint Presentations with R Markdown | RStudio (2018)

Data scientists use R Markdown documents to create reproducible code that can be rendered in a variety of output types. Some of the most common output types include HTML, Word, and PDF, but new improvements make it possible to create PowerPoint presentations as well. PowerPoint presentations are still common currency for sharing insights in most organizations today. This webinar demonstrates how to create feature rich PowerPoint presentations from R Markdown and how to use these presentations to share insights, visualizations, Shiny apps, and more. About Nathan: Nathan has a background in analytic solutions and consulting. He has experience building data science teams, architecting analytic infrastructure, and delivering innovative data products. He is a long time user of R


Transcript

This transcript was generated automatically and may contain errors.

I'm excited to talk about making PowerPoint presentations with R Markdown today. This is a project that we've been working on for a while now, and it's nice to finally present some of the material.

So you're probably wondering, why would you want to create a PowerPoint presentation from R Markdown? Well, there's a number of reasons, but I'm going to point out a few.

So one reason is that everyone knows PowerPoint. It's a standard communication tool, especially used in business. People learn to speak the language of PowerPoint. And the fact of the matter is, it's very useful, because it works. It's very flexible. You can make slides fast. And everyone has personal experience with it. Some people love it. Some people, not so much. But everyone has an opinion on it, because we've all touched this tool.

But your work, of course, is coded in R. And that's because your code is your product. So your output is a manifestation of your code. So plots, tables, model output, that's all coming from your code. Well, that's where R Markdown becomes very useful. R Markdown makes your work reproducible. You can weave text and code to produce elegantly formatted output. You can use things like R Notebooks for doing interactive analyses.

R Markdown, if you haven't used it, will allow you to create a number of different output types, reports, apps, dashboards, HTML docs, PDFs, Word documents, but not PowerPoint. Not until now.

So now you can make PowerPoint presentations with R Markdown. As of RStudio version 1.2, you can render PowerPoint presentations from the IDE. This combines two great things, PowerPoint and R Markdown. You can think of it as the Reese's Peanut Butter Cup of presentation software. I'm not sure which one's the chocolate and which one's the peanut butter, but you can tell I put this together during the Halloween time period.

Live demo in RStudio

So let me show you a quick demo of doing PowerPoint presentations with RStudio. So this is the RStudio preview version 1.2. If I go up to File, New File, R Markdown, and then I choose Presentation down at the bottom, I have an option to do PowerPoint. Click OK. And this is an R Markdown document.

Again, if you've never seen R Markdown, it will contain the prose, some narrative, plus some R code chunks, some things about R that you might want to include in your document. And then the output type here is PowerPoint presentation. And the way you create the PowerPoint presentation is you hit this knit button.

So when I hit this, I'll give it a name. And it generates a PowerPoint deck with the content, some bullets, some R code output, and a nice little plot.

Now, if you don't want to use the RStudio IDE, that's fine. You don't have to use the RStudio IDE. You can always render this programmatically. This all works with open-source software. You can do this inside of an R console. Type in render, give it the name and the output format, and that will also generate the PowerPoint output.
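As a minimal sketch of that console workflow, the snippet below writes a tiny R Markdown file and renders it with `rmarkdown::render()`. The file name and slide content are placeholders invented for illustration, and the render step is skipped when the rmarkdown package or Pandoc 2.0.5 or later is not available.

```r
# Write a tiny R Markdown file to a temporary directory (illustrative content).
rmd_path <- file.path(tempdir(), "demo-deck.Rmd")
writeLines(c(
  "---",
  "title: \"Demo deck\"",
  "output: powerpoint_presentation",
  "---",
  "",
  "## A slide with bullets",
  "",
  "- First point",
  "- Second point"
), rmd_path)

# Render only if rmarkdown and a recent enough Pandoc are available.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available("2.0.5")

if (can_render) {
  # render() returns the path of the generated .pptx file.
  pptx_path <- rmarkdown::render(
    rmd_path,
    output_format = "powerpoint_presentation"
  )
  message("Wrote ", pptx_path)
}
```

Naming the output format in the call means the same source file could also be rendered to other formats without editing its header.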

Benefits of R Markdown for presentations

So some of the benefits of using R Markdown here. Well, in this book, R for Data Science, there's this lovely chapter at the end of this book. If you haven't seen it, I recommend that you read it. It's about R Markdown, about communicating insights to decision makers.

And it makes the point that R Markdown is a great way to influence other people in your organization and other people that will be making decisions based on the work that you do in your data science. It will allow you to do better data science because you'll be collaborating with other data scientists who are able to read your code, understand your code, and reproduce your code. And it gives you this nice environment in which you can do data science.

When you put your narrative together with the code and the output into a single document, I would argue through personal experience that you do better data science. It allows you to think more clearly and understand why you're doing what you're doing. From a presentation standpoint with PowerPoint, I think one of the big takeaways here is that you're just going to spend a lot less time iterating on presentation slides and a lot more time doing what you do best, which is data science.


How it works: knitr and Pandoc

All right, so how does it work? Well, you write your code in R Markdown format, and then you knit the code with the knitr package. And what knitr does is it takes your R Markdown code and it converts it into Markdown format. And Markdown format is a universal format that can be put into this other system called Pandoc. And Pandoc is what actually converts the Markdown document into PowerPoint. Now, all of this is already bundled into the IDE, so it's seamless, but this is what's happening under the covers.
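The two stages can be run by hand to see what each one contributes. This is only a rough sketch: `rmarkdown::render()` does more than these two calls (it handles output-format options and intermediate files, for instance), and the file names are placeholders. Both steps are skipped if knitr, rmarkdown, or Pandoc 2.0.5 or later is missing.

```r
# Paths for the source, the intermediate Markdown, and the final deck.
work_dir <- tempdir()
rmd_path <- file.path(work_dir, "pipeline-demo.Rmd")
md_path <- file.path(work_dir, "pipeline-demo.md")
pptx_path <- file.path(work_dir, "pipeline-demo.pptx")

# Build a one-slide source with a single R chunk (illustrative content).
fence <- strrep("`", 3)
writeLines(c(
  "## Summary of cars",
  "",
  paste0(fence, "{r}"),
  "summary(cars$speed)",
  fence
), rmd_path)

tools_ready <- requireNamespace("knitr", quietly = TRUE) &&
  requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available("2.0.5")

if (tools_ready) {
  # Step 1: knitr runs the chunk and writes plain Markdown.
  knitr::knit(rmd_path, output = md_path, quiet = TRUE)
  # Step 2: Pandoc converts that Markdown into a .pptx file.
  rmarkdown::pandoc_convert(md_path, to = "pptx", output = pptx_path)
}
```

Opening the intermediate `.md` file shows exactly what Pandoc receives, which is useful when a slide does not come out as expected.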

So I want to focus on this Pandoc piece for a second. Well, what is Pandoc? Well, Pandoc is the open source software that converts files from one format into another. You can think of it as a Swiss Army knife. And PowerPoint output was added to Pandoc in Pandoc version 2. The initial release was in December 2017, and then there was a number of progressive updates that happened at the beginning of this year to allow you to create this PowerPoint output.

And Pandoc is bundled into the RStudio IDE. The RStudio IDE version 1.2 has the ability to create PowerPoint documents, and RStudio Connect, as of version 1.6.4, which came out a few months ago, also has the ability to support PowerPoint documents.

Now, a lot of this work here with Pandoc was done by a Haskell developer. Pandoc's written in Haskell. He's a professor at Johns Hopkins University, and his name's Jesse Rosenthal. And Jesse's actually on the call today, so it's great to have him here and joining the webinar. But most of the development work that allows us to convert documents into PowerPoint was done by Jesse.

R Markdown's not the only way to do document conversion into PowerPoint. You have these other R packages that you might have used. And these are great R packages. I've used some of them. They allow you to have fine-grained control over what happens on those slides. And some of these will have features that aren't available in the Markdown format, conversion format. So, these packages are still relevant. You might still find these very useful.

Use case: render and customize

Let's talk about use cases for creating PowerPoint presentations from R Markdown. The first one is to render the presentation, and then customize the presentation. And that's probably going to be a very common use case for your PowerPoint slides. You're going to output the text, the tables, and the plots to PowerPoint. And then you're going to modify these or copy these into your final presentation.

This is a notebook that analyzes bank data. And in the course of this analysis, I create a whole bunch of visualizations that are interesting. And normally, what I would do with these visualizations is I would take these, and I'd copy these one at a time into my PowerPoint presentation. My PowerPoint presentation would become a mess, and then I'd lose track of which ones are which, and I'd have to go back, and when I go back to update my analysis, then things become out of sync. It'd be much better just to be able to render all of the slides in one shot from this notebook. And that's what you can do with the PowerPoint output.

So, I come up here and say, knit this presentation. This will go through and spin up a new R session, run all of the R code through that session, and then generate a PowerPoint presentation. Okay, so here's the slides. You can see that I've got my narrative again. And then I've got some graphs here. And I can take these graphs, and I can copy these into my final presentation. I've got a table, and that's really great, too, right?

And I can, if I don't like the order here, I can just move these around if I need to. If I want to change the background, change my template, I can do that as well. Choose a different template. If I'm not happy with the table, the table renders as a PowerPoint table, so I can go ahead and change that table design.

And I can come in here and say, you know, I've got more than one data source. I've got data sources. I'm going to change that. And now I've really broken the reproducibility paradigm, right? I've actually gone through and made some changes. And now if I go back and redo the slides, I'm going to have to do those again. But I'd argue that you're much better off because you can produce a lot of these at the same time, and you can keep them anchored to that R Markdown document that explains what you're doing.

So we think this will be a pretty common use case for taking slides and then manipulating them. People like to customize their PowerPoint presentations. And chances are, your slides are going to be one part of a larger presentation that ends up getting delivered to the executive or the client or whoever you're trying to communicate with. And that will work great in this workflow.

Use case: publish to RStudio Connect

The other use case is to publish to RStudio Connect. What is RStudio Connect? It's a publishing platform for R. It runs your R code, and it allows you to share your Shiny apps, R Markdown reports, and much, much more. So in this use case, what you're going to do is you're going to output the entire presentation with R Markdown, and then you're going to render updates programmatically on a schedule. And then you'll distribute those PowerPoint presentations by email, and you can even design them to accept user inputs.

Here's another presentation that I'm using. It's the stock data. And this is a parameterized R Markdown document. And if you haven't seen parameterized R Markdown before, they're extremely powerful. You set up parameters here and then inputs here. This is going to be the Tesla stock ticker symbol. And if I want to call that in my R Markdown document, I just use params$symbol.
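A parameterized document of this kind might start like the fragment below. It is a sketch rather than the demo file itself: the title and slide text are invented, and only the `symbol` parameter with its Tesla default comes from the talk.

```markdown
---
title: "Stock report"
output: powerpoint_presentation
params:
  symbol: TSLA
---

## Daily summary

This report covers the ticker `r params$symbol`.
```

Outside the IDE, the same parameter can be overridden at render time, for example with `rmarkdown::render("stock-report.Rmd", params = list(symbol = "IBM"))`, where the file name is again a placeholder.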

So these become very useful in their own right, even regardless of how you're, you know, accepting user inputs or what output formats you're using. Parameterized R Markdown, parameterization is extremely powerful in R Markdown documents. But here what I can do is I can say I'm going to knit this thing. I'll go ahead and knit it with parameters. And I'll choose like the IBM report, I guess, right? I'll knit this document.

Great. Now I have a document for the IBM candlestick chart. And this is what I want. But I can't anticipate everybody's needs here with all of the different stock tickers that they might need, nor am I able to automate this every day. You know, I don't have the ability to update this every day. I have to be able to automate it with the system. So I want to make sure that this report gets generated daily and sent out to the people that are interested in it.

So what I do is I publish this to RStudio Connect. And I can do that with this publish button and choose RStudio Connect. And this is what RStudio Connect looks like. Here's the default report. That's going to be the Tesla report. If I want to change the report ticker, I can go in here and I can say I want to change IBM, right? And you run that report. I'm not going to run the report because I've already done it for you. It's right here. And then if I want to, I can go ahead and email this report to myself, right? Or I can come in here and I can schedule this report to run every day and then send me an email when it's complete.

And I can add more people to this report. And what that report looks like is if I go to my inbox, you can see that I just got the report. It's got this nice graphic here that tells me that IBM closed at 120.03. And I come down here to the end and I can see that I've attached the PowerPoint presentation to the email. And again, this is going to be the updated report that I just ran.

So you can see that this would be a very useful way to communicate your insights with a lot of people who prefer to get their information by email and PowerPoint. I personally know a lot of people in that boat that would like to get their information communicated to them in email and PowerPoint. Perhaps you do as well. The nice thing here is that you only have to write the code once and then RStudio Connect can handle the scheduling and the distribution and accept the user inputs systematically.


Supported features

So those are some use cases. Let's talk about features. So the PowerPoint output accepts most of the markdown features that are supported in other document formats. For example, inline formatting is supported. So this is what the inline formatting looks like in your code. And this is how it gets rendered in the PowerPoint.

Supports lists, LaTeX. So if you're putting in math equations, hyperlinks, of course, block quotes, images from the web. This went out and snagged an image from the web and then put it into my slide. Images from file. This actually took an image from a file and put it into my slide. You can add captions to images. You can add a linked caption to the image. Or you can link the image. In this case, what happens is if I click this image, this will take me to the R for Data Science page. Supports tables. So you write your tables and markdown code that looks like this and they get rendered like this. Notice there's a caption down here.

Also supports these really nifty features like columns. So columns are great. What you do here is you create this fence. This is called a fenced div. And you put all the contents for the first column here. And you put all the contents for the second column here. And then it will show up in your two-column layout. It's called the two-content layout. Two columns, two contents.
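For readers without the slides, a two-column slide in Pandoc's fenced div syntax looks roughly like this. The slide title, bullet text, and image file name are placeholders.

```markdown
## Sales by region

:::::: {.columns}
::: {.column}
- Text for the left column
- Another bullet
:::

::: {.column}
![Regional sales](sales-plot.png)
:::
::::::
```

The outer fence uses more colons than the inner ones only to make the nesting easier to read.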

Supports speaker notes. Speaker notes are very useful. Again, you put a fenced div here and you can put the speaker notes that will render in the presentation. And templates. The templates become very useful also. And I will be talking about that a little bit later.
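A speaker-notes block uses the same fenced div syntax with the `notes` class. The slide content here is invented for illustration.

```markdown
## Quarterly results

- Revenue grew in all regions

::: notes
Mention that the growth figures are preliminary.
:::
```

The text inside the div ends up in the notes pane of the slide rather than on the slide itself.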

And of course, this translation supports R code. That's the whole point of this, right? You want to be able to put R code into your R markdown documents. What kind of R code does it support? Well, any of the R code that's in your R code chunks. And that typically is going to look like code. And your code gets highlighted. The syntax of your code gets highlighted in the output automatically.

Tables. So if you call the kable function, which is in knitr, it will create a table in PowerPoint. A ggplot or any other visualization will render as an image on a content slide. And if you do something interactive, like an HTML widget (here I'm using dygraphs), then, if you have installed the webshot package, the rendering process will go and grab a web screenshot of your interactive visualization and include it in your PowerPoint presentation.
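As a small sketch of the table case, this is the kind of code that could sit inside an R chunk. The dataset and caption are arbitrary choices for illustration, and the call is skipped if knitr is not installed.

```r
# Code as it might appear inside an R Markdown chunk.
if (requireNamespace("knitr", quietly = TRUE)) {
  # kable() turns a data frame into a Markdown table, which the
  # PowerPoint output then renders as a table on the slide.
  cars_table <- knitr::kable(
    head(mtcars[, 1:4]),
    caption = "First rows of mtcars"
  )
  print(cars_table)
}
```

Run at the console, this prints the Markdown table text, which is what gets handed on to Pandoc.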

And finally, Shiny. How do you include Shiny in your PowerPoint presentations? Shiny is interactive. PowerPoint presentations are static. So obviously, you can't put Shiny inside of PowerPoint. That would be ridiculous. What you're going to do instead is you're going to put a static image of your Shiny application on your slide and then link that to your Shiny application, which will be hosted in shinyapps.io, RStudio Connect, or Shiny Server.

In the course of my analysis, I created this Shiny application. It's a customer tracker app. The nice thing about this is it allows me to drill down into subgroups and other segments. But obviously, I can't put all of those segments or groups into a PowerPoint presentation.

I go back to the code and I include this line, include_app, and I put the URL for that application into my presentation. And when I knit this, it's going to go and take a screenshot of that Shiny application and put it back into my presentation. For example, this deck shows the total segment and the total group, gives me some high-level results and some details. And then it says, use the Shiny app to explore other segments and groups.
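The line in question is presumably a call to `knitr::include_app()`. A sketch of such a chunk body follows; the URL is a hypothetical placeholder, not the address of the demo app, and the height value is arbitrary.

```r
# Hypothetical address of a hosted Shiny app (placeholder, not the demo's URL).
app_url <- "https://example.shinyapps.io/customer-tracker/"

if (requireNamespace("knitr", quietly = TRUE)) {
  # As the last expression in a chunk, this embeds the app; for static
  # outputs such as PowerPoint, a screenshot is used in its place.
  knitr::include_app(app_url, height = "600px")
}
```

As noted in the talk for interactive widgets, taking the screenshot depends on the webshot package being installed.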

And this image here is a linked image. I can click on this image and it seemed to have lost my PowerPoint presentation. Let's try that again.

So when I click on this image, this will take me back to my Shiny application. And this Shiny, this will allow your viewers to interact with your application, choose a subgroup. And I've even put a download link here to take a snapshot of this report that is formatted in Excel and allows you to see the report that you've dug into. It gives you all the data to do cross tabs if you want to do that. But what I've done here is I've tied a PowerPoint presentation, an Excel output, and a Shiny application and put them all on top of the same, yeah, the same code base.

Resources and getting started

All right. So let's talk about resources. What resources are available? The first place to go to get help would probably be rmarkdown.rstudio.com. That's our website about R Markdown. There's also a new book that's out called R Markdown, the Definitive Guide. It gives you information about all of the output types and everything that you'd want to know about R Markdown. I've created a getting started page and a troubleshooting page for PowerPoint output specifically, so I recommend taking a look at those. If you have specific questions about your work, then community.rstudio.com is a great place to connect with other people. And then if you do find any issues, please let us know by submitting issues to the R Markdown GitHub repo.

Templates and slide structure

I want to address some frequently asked questions. There's a question about getting started, about using templates and presentations that we get frequently. So this is the getting started page. We come down here to templates. You can see that the way you set up a template is with this reference doc. And the template supports four of the many layouts that are in a PowerPoint template. These are the layouts that you want to include in your template for R Markdown documents.

The first one is the title layout. So this will give you the title and the subtitle on this slide. The next one is the section header. And the section header will only render the header and not the content. The title and content slide, which you typically think of as the main slide, and then the two-content layout, which allows you to do columns.

Now, layouts are important because Markdown doesn't allow you to programmatically change the design or size of your presentation. So what it does is Pandoc actually ships with a 4x3 vanilla template that is used to render presentations by default. So if you want to use a widescreen format, you're going to have to create a widescreen template and include that as a custom template here in your reference doc. If you want to change the background or if you want to change the layout, the placeholders in any way, that's all going to happen with the template.
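In the YAML header, a custom template is attached with the `reference_doc` option. A minimal sketch, assuming a template file named `widescreen-template.pptx` (a made-up name) sits next to the R Markdown file:

```markdown
---
title: "Quarterly review"
output:
  powerpoint_presentation:
    reference_doc: widescreen-template.pptx
---
```

Slide size, backgrounds, fonts, and placeholder positions are then taken from that template rather than from Pandoc's default one.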

The next thing I want to talk about is structuring the presentation. This gets a little detailed, but I think it's important to go through. PowerPoint presentations have a hierarchy of title, section, and content, whereas Markdown documents have a hierarchy of headers and contents. So in order for the PowerPoint presentation to get structured, Pandoc has to determine which Markdown header level should be used for the PowerPoint slide level.

So for example, in this Markdown document, you have a hierarchy of headers that go from 1 to 5. And this header is the one that Pandoc is going to use as the slide level by default. And it's going to use it as the slide level because it's the first header level that has content associated with it. So everything that gets rendered above this is going to be turned into section slides, and everything below this is going to be turned into subheaders on the content slide.

Now if you didn't want this to be the content slide and you wanted this to be the slide level, you would change slide level to 2 because 2 would match up to the two hashtags on this R Markdown section header. Okay, so you probably don't have to worry too much about this because the defaults should work well for you in most cases. If you want to avoid this issue altogether, you can just use the same level header for everything, and then everything will be rendered as content slides.
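A small sketch of an explicit slide level, with invented headings, shows how the header levels map onto slides when `slide_level` is set to 2:

```markdown
---
output:
  powerpoint_presentation:
    slide_level: 2
---

# Section title (rendered as a section header slide)

## Slide title (level 2 starts a content slide)

- Bullet on the content slide

### Subheading (stays inside the slide)
```

Leaving `slide_level` out lets Pandoc pick the level from the document's structure, which is the default behavior described above.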

The way that the slides get broken up follows these five rules. I'm not going to go through all these right now, but just know that there is a way that they get rendered, and there's an easy way to find out what those rules are. Just go to the Getting Started page, and it will explain it.

Troubleshooting common issues

All right, so troubleshooting. What types of issues might you expect to run into? Well, some of the common things that we see is that the PowerPoint presentation gets broken, and you get a message that says that it needs to be repaired or removed, or you get this weird one, which doesn't even give you anything at all. Typically, the reason for these errors is that the Markdown code that you wrote looks something like this, where everything is condensed together. When you write Markdown code, you need to put spaces in between the content, the title, and the speaker notes.

If you go and put those spaces back in and then restart PowerPoint, PowerPoint is an old program. It needs to be restarted every time you get into this weird state. If you go back and put the spaces in, restart PowerPoint, this should render more successfully. That would be the first thing to try.

How do I put an image and text on the same slide? There's two ways to put images and text on the same slide. One is to use captions. If you use a caption in Markdown, it's going to look something like this. If you use a caption in a code chunk, it's going to look something like this. You can write very long captions here. This is actually a very workable solution, probably the easiest solution to your code. Tables support captions as well. You have to use table caption on those.

The other way to do it is to use columns. This uses the two-content layout. You can put your text on one side and your image on the other side. You could actually add a caption to this image as well. This is a great way to separate text and images. You can put images in both columns or an image on this side and text on the other side.

The next question, of course, is how do I control the placement of images and text? There is no way to do that programmatically. You have to do that with the templates, as I mentioned earlier. Here what I've done is I've taken the two-content layout in my template, and I've stacked one of the content placeholders on top of the other one, and it renders quite nicely. It allows me to put an image on the bottom and text on the top. The only issue here is that you only get one of these, so you have to stick with this layout throughout your presentation. But this will work really nicely for this layout. If you need more programmatic control, then you're probably going to be in the situation where you need one of those R packages we talked about earlier.

How do I control the size of an image? Well, you can't. Pandoc automatically resizes images to fit a placeholder. Currently it's not possible to programmatically change image size. How do I create a build slide? Build slides are very useful. Currently that's not supported, so you can't do a build slide today. That's where you click the button and bullet point 1, 2, 3 show up in order.

How do I render a presentation programmatically? We talked about that already through the render command. And how do I check my markdown code? So sometimes it's useful to check your markdown code if you're trying to debug your rendering process. And the way you do that is you use the clean equals false option, and that will output the markdown file that gets generated in the rendering process. You can open that up and see if it's creating the markdown that you intended to create.
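A sketch of that debugging step, using a throwaway file with placeholder content. The render is skipped when rmarkdown or Pandoc 2.0.5 or later is not available, and the exact names of the intermediate files depend on the rmarkdown version.

```r
# A throwaway source file to render (illustrative content).
rmd_path <- file.path(tempdir(), "debug-deck.Rmd")
writeLines(c(
  "---",
  "output: powerpoint_presentation",
  "---",
  "",
  "## Slide title",
  "",
  "Some content."
), rmd_path)

if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available("2.0.5")) {
  # clean = FALSE keeps the intermediate files, including the Markdown
  # that knitr produced and Pandoc consumed.
  rmarkdown::render(rmd_path, clean = FALSE)
  # List what was left next to the source file.
  print(list.files(tempdir(), pattern = "^debug-deck"))
}
```

Comparing the retained Markdown with the source usually shows quickly whether a problem comes from the knitting step or from the conversion to PowerPoint.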

So all of the things that you saw today were generated in R Markdown with the PowerPoint output, including this presentation that you're looking at today. This was also generated from an R Markdown document, and all of that is stored in a GitHub repo under the sol-eng organization, and it's called PowerPoint. I've included a number of vignettes in here as well that I didn't show today, but are very useful in understanding how these features work. If you're interested in looking at that, you can download the GitHub repo today and try it out.

But make sure that you install RStudio version 1.2 preview, and you can get there by Googling RStudio preview, downloading the latest copy, and trying this out today. So I hope you have a lot of fun making PowerPoint presentations with R Markdown, and thank you very much for the time. Let me toggle over and take some questions now.

Q&A

What customizations of PowerPoint can you do within R? I think the best, can you do any customizations on tables? Yeah, so that's all going to be dictated. I think the question is, like, yeah, can I programmatically change the table color or shape or size? All of that is going to be dictated by the templates. So you're going to want to define all of that in the template. So if you want to change, like, the link color, for example, you'll change the color of the link in the template, and then Pandoc will automatically render those links in that color.

Are those RMD codes embedded in the PowerPoint file? The RMD codes embedded in the PowerPoint file. If I understand correctly, you're talking about the syntax highlighting and how the code is being outputted to the PowerPoint file. That's in the echo. But I'm a little confused about this question because you won't actually have code in the PowerPoint. There's no code that's going to run in the PowerPoint file, right? You're just going to echo the code into the PowerPoint slides, and Pandoc is going to do the syntax highlighting for you. I hope that answers the question.

Can you trigger an email based on output in RConnect? So you can manually email yourself. The way you do that is you can email yourself here. That's one way to email. The other way is to send an email after the report has been rendered, after it's completed on the schedule. So basically you can send emails either on demand or programmatically.

Does it support BibTeX citations as footnote of the slide? I don't think I've tried that. That's a good question. I haven't tried that. There are some footnotes that belong to the layouts, and you might have seen that in the presentation. So I would have to see if the layouts actually supported that footnote.

Regarding my Shiny question, what I'm envisioning is a button in my Shiny app that an end user can press that will output a PowerPoint which will contain an image of the graph that the end user has customized. Yeah, so here's the rule of thumb with Shiny. You can do anything with Shiny, right? I mean, it's basically unbounded. So if you want to have a Shiny application that renders a PowerPoint presentation from a button press and then allows you to download that or even pipe that to some other system, you can do all of that. And you might use an R package to render that PowerPoint presentation, or you might use an R Markdown file to render that presentation. It's really up to you to determine how to do that. But if you use the R Markdown format, you'll want to use that render command that we showed you to generate the PowerPoint output.

Do custom templates need to be stored someplace specific? No, you can reference them from any location in the YAML header. Does it matter if you create the PowerPoint on a Windows laptop or Mac? Will the images work if you open the PowerPoint in the other OS later on? Yeah, cross-platform will work just fine.

Do these templates allow for background images? Yes, you can put images on there on the template. That will work. Can you use a custom layout that is not one of the four mentioned above? No, you have to use those four layouts that are mentioned in the documentation. Any idea when this will be released for real and not in the preview? I do not have a release date for RStudio 1.2 for you today, but please keep track of our blog. You'll see that there's lots of updates that are coming out. We're very excited about the next release.

Can you create a PowerPoint graph with the data behind it so someone else can modify the graph in PowerPoint directly? No, that is not a feature that is supported today. What about font size or type of font? Again, all controlled from the template.

I'm just going to pause here on some of these questions because there are a lot of questions that are stating, I like to have fine-grained control over my PowerPoint presentations, and if you really want fine-grained control, you should either use one of those R packages and code those up explicitly or you should do the custom work. That would be my recommendation.

How can I choose different slide packages in R, such as IOSlides? If you want to do different slides, I think this is the question. If you come up here and do new file R markdown presentation, you can see here's IOSlides and Slidy and Beamer, and we actually support more than just this. There are other ways to create presentations. I hope that answers your question.

Usually I have to create slides with the company's standard design. Is there a way for R Markdown to accommodate PowerPoint templates? Yes, templates are a big feature in rendering PowerPoint presentations, and you can do that with the reference doc option that is shown here. Okay, and I would point you to the help documentation online.

Can you customize different slides differently, like a two-content layout for one slide and another layout for another slide? So these are the four layouts. I think this is a good question, actually. These are the two layouts for content slides, and these are the only two layouts that you have. Now, you can modify these to be top and bottom, side by side, big and small, but these are the only two that you have. And the way you use this one is you specify columns, and columns look something like this. And the way you specify the other one is not to use columns. So those are the two options today. Maybe we'll have more options in the future. Please submit your feedback. We always love to hear from our customers what they think is useful.

Can you embed VBA code? That is a dark, dark question, and I'm just going to move on.

Is it possible to have more than four templates? No, just the four layouts today, but we appreciate the feedback, and maybe we can add some more in the future. Can I e-mail you for other specific questions I have? Yes, for other follow-up, I would go on to community.rstudio.com. I'm on there, Yihui is on there, and multiple people are on there that can address these types of questions. So please look for us on community.rstudio.com.

Will you create a repository for sharing PowerPoint templates that can be used with R Markdown for making these presentations? Yeah, you know, that's not a bad suggestion. I thought about that. There's a number of other issues at RStudio that are kind of like that where, you know, maybe you need a collection of CSS templates or you need a collection of add-ins that are really useful, some PowerPoint templates. I think it's a pretty reasonable question. My best suggestion right now is check out the GitHub repo that I pointed to earlier. It's this PowerPoint, and this has a number of templates already inside of it that you can use.

Okay, yeah, so we don't have a 1.2 release date for you today. For the finer-grained control, I recommend the officer package. Yes, yes, so that's exactly right. If you want fine-grained control where everything is, the images are placed exactly where you want it to be and size and layouts are complex and you've got other things going on in that slide, by all means use an R package or do it by hand. The R Markdown, keep in mind the R Markdown format is really used for creating reproducible documents that you can collaborate on and do data science with, and they're used to output into multiple output formats, not just PowerPoint, but to Word, HTML, PDF, and many more. I think that's a good place to pause. I really appreciate the questions and really appreciate the attendance. Thanks, everyone.