R Markdown

Jeff Allen talks about recent R Markdown changes in a webinar from RStudio

Transcript

This transcript was generated automatically and may contain errors.

Alright, so I want to talk today about the next generation of R Markdown, which is a package that has been available for a couple of years now, but over the past couple of months we've really overhauled it and added a lot of new interesting features.

So before we dig in too deep, I'll just introduce you to Markdown if you're not familiar. Markdown is defined as a plain text formatting syntax, but the idea is that you're really just typing in plain text and then it's going to render to more complex formats. So if you've ever created a .txt file, then you've already done half the battle.

Really, the format allows you to focus primarily on creating the content that you want to write, without having to worry about the formatting and the markup and things like that. What's great about this is that you can just write your text, focus on the content, and then later render that to HTML or PDF or some other format. And so R Markdown is a package that allows you to do this from within R.

So just as an example, here are a few Markdown conventions. Again, you're just typing normal text most of the time, but if you want to do different levels of headers, you can do those. You can create bold things. You can create hyperlinks or lists or tables. So the power is there if you do want to create these more complex formats, but typically it allows you just to focus on the content rather than needing to worry about making sure that everything's styled correctly and flowing correctly.

Literate programming with R Markdown

All right, so R Markdown is a package in the realm of literate programming. The idea is that rather than embedding comments in a string of code that you have, you would instead embed the code within the document that you're writing. So you can write this narrative, this prose that describes the analysis that you're doing, and then within that just embed chunks of R code. Obviously there are a variety of different ways that you can leverage the tool, but that's one of the more popular ones we see in terms of reproducible research and literate programming.

But the idea is that you're going to render the textual output and the graphical output and anything else that R is creating just inline within your document. So this is an example here. On the left you can see what would be the input format. There's this convention of three backticks, and then you specify that you're using the R language. Then in here you just write any R code that you want, and you close that with three more backticks.

And what that would render is what you see on the right here. You have your input commands, and you can see those just match the first two lines of your input, followed by any textual output. So when you run length(x), you get the output here, which is 100. And then afterwards we call hist(x) to produce the histogram, and you can see as well that the image is just going to be embedded directly in your document or your slideshow or whatever it is you're creating.
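As a rough sketch of what the body of such a chunk might look like: the rnorm(100) call is an assumption, since the webinar only shows that length(x) returns 100 and that hist(x) draws the histogram. In the .Rmd source these lines would sit between the opening three backticks (followed by the r language marker) and the closing three backticks.

```r
# Body of an R chunk as it might appear in the input document.
x <- rnorm(100)   # assumed data: 100 random draws from a normal distribution
length(x)         # textual output, shown inline in the rendered document
hist(x)           # graphical output, embedded as an image in the document
```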

So this is a really easy way for you to work with your R analysis in a format that you want to share with someone who may not be comfortable running your R code themselves.

Old workflow and knitting to HTML

So I had mentioned that the R Markdown package has actually been around for a while. And so this was primarily the old workflow: you had R Markdown, which would render to Markdown, and then we used a tool called Sundown to create HTML. You could actually create LaTeX and PDF out of this as well, but it was kind of a more simplified pipeline.

So this is just an example R Markdown file I have. You can see, again, most of it's just text, like I had explained before. We have some special formatting. We have a couple of R chunks here. And then we're doing actually a couple more advanced features, but I won't go into the syntax of these. But just so that you can see that they are possible here, you can do LaTeX style equations, you can do footnotes, you can do all sorts of things like that. But again, primarily the focus is just on creating prose and embedding your R code within that.

So what I'm going to do is I'm going to knit that document to HTML, and you can see what that's going to produce. So this is an HTML file that happens to be shown in this RStudio viewer, but we could open this up in an external browser, we could share it, send it via email, whatever we want to do. And you can see that all the conventions that we created in our Markdown document are all produced: hyperlinks, bold, any commands that we executed, the output of those commands, any equations or footnotes with hyperlinks, tables, and even graphical output from your R code.
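Outside of the RStudio Knit button, the same step can be sketched from the R console. This is a minimal example under a few assumptions: the file name example.Rmd and its contents are made up for illustration, and the render call only runs if the rmarkdown package and pandoc are both available.

```r
# Write a tiny R Markdown file to a temporary location, then knit it to HTML.
fence <- strrep("`", 3)                       # three backticks, built in code
rmd_path <- file.path(tempdir(), "example.Rmd")

writeLines(c(
  "---",
  "title: \"Example\"",
  "---",
  "",
  "Some **bold** narrative text with a [link](http://rmarkdown.rstudio.com).",
  "",
  paste0(fence, "{r}"),
  "x <- rnorm(100)",
  "length(x)",
  "hist(x)",
  fence
), rmd_path)

# Rendering needs the rmarkdown package and pandoc, so check for both first.
can_render <- requireNamespace("rmarkdown", quietly = TRUE) &&
  rmarkdown::pandoc_available()

if (can_render) {
  # render() returns the path of the file it produced.
  html_path <- rmarkdown::render(rmd_path, output_format = "html_document",
                                 quiet = TRUE)
  print(html_path)
}
```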

New output formats

But now, as of the latest overhaul, we actually have a variety of different formats that you can very easily get into, including Microsoft Word. And so I can show a couple of examples here.

Okay, so knitting to PDF, again, has been possible before, but just to show you that it works, again, you get the equations formatted as you'd expect, you get footnotes, you get tables, all the things that we were seeing previously, you get in PDF as well. But then even more impressive is that all these things actually work in Microsoft Word as well.

And so if you are fortunate or unfortunate enough, depending on your perspective, to need to work with Microsoft Word, then you can easily now create documents while working with tools that are a little more streamlined and efficient, but ultimately produce documents that you can share with other collaborators who may want to use Microsoft Word. And again, as you can see, all of the conventions that we've been using actually work. There's an editable Microsoft Word equation in here that represents the equation that you produce. Again, you have your input, your output, your images, tables that you can actually go in and edit and resize and do things that you want to do with.
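To make the one-source, several-formats idea concrete, here is a small sketch. The helper render_formats is hypothetical (it is not part of rmarkdown); it simply calls rmarkdown::render once per format name. PDF output additionally assumes a working LaTeX installation.

```r
# Render one R Markdown source file to several output formats.
# render_formats is a hypothetical helper written for this example.
render_formats <- function(input,
                           formats = c("html_document", "word_document",
                                       "pdf_document")) {
  stopifnot(file.exists(input))
  # One render() call per format; each call returns the output file path.
  vapply(formats, function(fmt) {
    rmarkdown::render(input, output_format = fmt, quiet = TRUE)
  }, character(1))
}
```

With a document called report.Rmd (an assumed name), render_formats("report.Rmd") would return a named character vector of the generated .html, .docx, and .pdf paths.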

All right, so many of the output formats are already defined. We've shown you that you can do Microsoft Word, you can do HTML, you can do LaTeX or PDF. But the entire system is also pluggable, meaning that you can define a custom format and render your output into that format if you so desire. I'll show you the documentation for that at the end. We're not going to go through that exercise today, just for the sake of time.

But what's interesting about this is that you can create output formats that are entirely novel. So if HTML or Word or PDF aren't cutting it for you, you can actually create entirely new output formats or just modify existing formats.

Custom templates

So for instance, if you are happy with HTML, but you want your company's CSS style sheet applied to it or something of that sort, then we can certainly do that. So I'll show you just an example, again, not of creating the element, but perhaps if your company had certain CSS styling that you wanted or a certain image header or something like that, perhaps you have a certain scaffolding that you want to provide around all the documents that you produce, we can certainly do that.
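One way to sketch the "modify an existing format" case is a thin wrapper around the built-in HTML format. The function name company_html and the file company.css are illustrative assumptions, not something shown in the webinar.

```r
# A hypothetical house format: standard HTML output plus a company stylesheet.
# company_html and "company.css" are illustrative names.
company_html <- function(css = "company.css", ...) {
  # Reuse the built-in format and only override the stylesheet;
  # any other html_document options pass through unchanged.
  rmarkdown::html_document(css = css, ...)
}
```

A document could then be rendered with rmarkdown::render("report.Rmd", output_format = company_html()), where report.Rmd is again an assumed file name.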

So just as an example, I've created a new output format, or rather an output template. When I go to R Markdown from template, you can see that I've created this package, Jeff package, and it has a new template within it. So you can imagine that your company could invest one time in producing these artifacts and then allow other users to share them within the convenience of R Markdown.

So this is the template that I defined in my package. And you can see, perhaps at your company you always need to end your documents with a conclusion, or if you had a certain journal that you were submitting to and they had certain style guides for what sections you should provide, you can do that. But what's neat about this is that within my template, I've specified that there's a certain CSS style sheet that I want to be applied. And you can see this does not look like the original HTML document that I produced, but rather it has different colors and different fonts and things like that. So you can imagine that this could be styled for your company or your personal use if you prefer a different styling.
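For reference, a template like this lives in a conventional directory inside the package: inst/rmarkdown/templates/, with a template.yaml and a skeleton folder. The sketch below builds that layout in a temporary directory; the package name jeffpackage, the template name company_report, and the file contents are all assumptions made for illustration.

```r
# Build the directory layout that rmarkdown looks for inside a package.
pkg_root <- file.path(tempdir(), "jeffpackage")           # hypothetical package
tmpl_dir <- file.path(pkg_root, "inst", "rmarkdown", "templates",
                      "company_report")                    # hypothetical template
dir.create(file.path(tmpl_dir, "skeleton"), recursive = TRUE,
           showWarnings = FALSE)

# template.yaml names and describes the template.
writeLines(c(
  "name: Company Report",
  "description: Report with the standard sections and house style",
  "create_dir: true"
), file.path(tmpl_dir, "template.yaml"))

# skeleton.Rmd is the starting document handed to the author.
writeLines(c(
  "---",
  "title: \"Untitled\"",
  "output:",
  "  html_document:",
  "    css: styles.css",
  "---",
  "",
  "# Introduction",
  "",
  "# Conclusion"
), file.path(tmpl_dir, "skeleton", "skeleton.Rmd"))

# Supporting files in the skeleton folder are copied along with the document.
writeLines("body { font-family: Georgia, serif; color: #333333; }",
           file.path(tmpl_dir, "skeleton", "styles.css"))
```

Once such a package is installed, a new document could be started from the console with rmarkdown::draft("my-report.Rmd", template = "company_report", package = "jeffpackage").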

And then also you have custom formats. So if you've worked with Beamer, which is a LaTeX presentation format, that's historically been kind of difficult because you have to write the LaTeX by hand. What's great about this is that R Markdown, since it can render LaTeX, it can actually render Beamer as well once we've defined the Beamer template, which we've already done for you.

And again, this is full-fledged R Markdown. So you can embed R code, you can embed images, you can do all the things that you're used to doing in R Markdown. And then at the end, knit to PDF and you get this Beamer document that has all the content that you were expecting in it, without having to go through the headache of learning LaTeX. Even if you know it, sometimes using it can be a bit painful. And so this allows you just to continue working with R Markdown, a very convenient and efficient and streamlined format, but you can produce very rich and complex output formats.

And then also we have different HTML slide templates.

And then kind of as the last example here, you can really go all the way with this and create entire templates that really define the entire structure of what you want to do. So you may have peeked at this when I was showing the previous template, but if you are submitting to JSS or you're submitting to UseR or something of that sort, then you can really, you know, you can define a template that you can continue to reuse. And so if you envision yourself submitting to a journal frequently or even if you're just submitting once, it may be worth the time to create this output template for you.

So again, this is a UseR submission template. And you can see that we have all the basic R Markdown stuff that we've been using up until now. But what's interesting is that we're using a bibliography now as well. And by providing this references section, that's going to encapsulate all of the references that we've defined here. In this case, we're just using a BibTeX format, but it actually supports a wide variety of different bibliography formats that you can use.

So if I go to knit, I'm going to get a PDF of my UseR submission. And you can see that it actually populates the bibliographies for me. It populates the citations. It does all these things that I would want it to do. And so if you're submitting to UseR, this is a very easy way to do it. But also if you're submitting to some journal that has certain conventions, you know, or a certain LaTeX style sheet or something like that, this is now a much easier way to interact with them.
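The bibliography wiring can be sketched roughly as follows. The file names, the citation key rcore, and the one-sentence body are assumptions; the pieces that matter are the bibliography field in the header, the [@key] citation syntax, and the final references heading under which the reference list is placed.

```r
# Set up a tiny document with a BibTeX bibliography in a temporary folder.
work_dir <- file.path(tempdir(), "submission")
dir.create(work_dir, showWarnings = FALSE)

# A one-entry BibTeX file (the key "rcore" is chosen for this example).
writeLines(c(
  "@Manual{rcore,",
  "  title = {R: A Language and Environment for Statistical Computing},",
  "  author = {{R Core Team}},",
  "  organization = {R Foundation for Statistical Computing},",
  "  address = {Vienna, Austria}",
  "}"
), file.path(work_dir, "refs.bib"))

# The document points at the .bib file and cites the entry by its key.
writeLines(c(
  "---",
  "title: \"Submission\"",
  "output: pdf_document",
  "bibliography: refs.bib",
  "---",
  "",
  "The analysis was carried out in R [@rcore].",
  "",
  "# References"
), file.path(work_dir, "submission.Rmd"))
```

Knitting submission.Rmd would then fill in the citation and append the reference list after the final heading, assuming pandoc's citation support and LaTeX are installed.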

Shiny interactivity in R Markdown

And then the last point here, just as kind of a teaser (I won't go into a whole lot of detail on this), is that now when you're using the HTML format, you can actually support interactivity. You may be familiar with the Shiny package, which allows you to do kind of interactive web analysis within R. If I go to R Markdown and I click Shiny, I can create a Shiny document or a Shiny presentation. We'll just do a Shiny document for now, but Shiny presentations are actually kind of fun, because you can imagine that you're using an HTML slideshow, but in the middle of your presentation you have some interactive element that you can go in and dig into, which is kind of a fun thing to be able to do.

So you can see here that this is just an R Markdown document, but if you're familiar with Shiny, you'll recognize some of these functions here. It's a very easy way to get into Shiny. There's really no overhead or boilerplate that's required. But when I go to run this document, I'll save it first, and when I go to run this document, you'll see that it's a regular R Markdown document just like we've seen before, but now it also has Shiny interactive components. And so when I go through and I toggle these different widgets that I can play with, I'm able to change all these things interactively within my R Markdown document.
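A Shiny document of this kind might look roughly like the sketch below. The slider, the use of the built-in faithful dataset, and the file name are assumptions for illustration; the key parts are the runtime: shiny line in the header and the ordinary Shiny input and render functions inside a chunk.

```r
# Write a small interactive R Markdown document to a temporary file.
fence <- strrep("`", 3)                       # three backticks, built in code
shiny_rmd <- file.path(tempdir(), "interactive.Rmd")

writeLines(c(
  "---",
  "title: \"Interactive example\"",
  "output: html_document",
  "runtime: shiny",
  "---",
  "",
  "Drag the slider to change the number of histogram bins.",
  "",
  paste0(fence, "{r, echo=FALSE}"),
  "sliderInput(\"bins\", \"Number of bins:\", min = 5, max = 50, value = 20)",
  "",
  "renderPlot({",
  "  hist(faithful$eruptions, breaks = input$bins)",
  "})",
  fence
), shiny_rmd)

# rmarkdown::run() starts a local Shiny server and blocks until it is stopped,
# so only launch it from an interactive session with both packages installed.
if (interactive() &&
    requireNamespace("rmarkdown", quietly = TRUE) &&
    requireNamespace("shiny", quietly = TRUE)) {
  rmarkdown::run(shiny_rmd)
}
```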

So this is a really great way if you kind of envision that you're creating a largely static document, but there are little pieces where you want to add some interactivity, then this is a really nice way to be able to do that.

Resources

So finally, here are a couple of resources, and we'll make these slides and everything else available to you online afterwards. The best resource for R Markdown is rmarkdown.rstudio.com. That will have all the details that you need about creating custom templates, custom formats, anything that you'd want to create, and the slides from today are available there as well.

The one trick here is that you'll probably want to download the latest version of RStudio, just for your convenience, to make it easier to use all these things. Now, again, if I didn't mention this before, R Markdown is an open source, freely available R package. It's downloadable from CRAN, and so you're certainly welcome to use this from any R editing environment that you choose. It's just that in RStudio, we've done the work to make some of these things a little more convenient in terms of just adding some buttons for you and simplifying things.