Jake Thompson | Branding and Packaging Reports with R Markdown | RStudio (2020)

The creation of research reports and manuscripts is a critical aspect of the work conducted by organizations and individual researchers. Most often, this process involves copying and pasting output from many different analyses into a separate document. Especially in organizations that produce annual reports for repeated analyses, this process can also involve applying incremental updates to annual reports. It is important to ensure that all relevant tables, figures, and numbers within the text are updated appropriately. Done manually, these processes are often error prone and inefficient. R Markdown is ideally suited to support these tasks. With R Markdown, users are able to conduct analyses directly in the document or read in output from a separate analysis pipeline. Tables, figures, and in-line results can then be dynamically populated and automatically numbered to ensure that everything is correctly updated when new data is provided. Additionally, the appearance of documents rendered with R Markdown can be customized to meet specific branding and formatting requirements of organizations and journals. In this presentation, we will present one implementation of customized R Markdown reports used for Accessible Teaching, Learning, and Assessment Systems (ATLAS) at the University of Kansas. A publicly available R package, ratlas, provides both Microsoft Word and LaTeX templates for different types of projects at ATLAS with their own unique formatting requirements. We will discuss how to create brand-specific templates, as well as how to incorporate the templates into an R package that can be used to unify report creation across an organization. We will also describe other components of branding reports beyond R Markdown templates, including customized ggplot2 themes, which can also be wrapped into the R package. Finally, we will share lessons learned from incorporating the R package workflow into an existing reporting pipeline.
https://rstudio.com/resources/rstudioconf-2020/branding-and-packaging-reports-with-r-markdown/

Transcript

This transcript was generated automatically and may contain errors.

Good morning. My name is Jake Thompson. I'm a senior psychometrician at Accessible Teaching, Learning, and Assessment Systems at the University of Kansas, and there I work with K-12 assessment data, so taking student test results, turning them into scores, and then the final step is writing technical documentation to support the use of those assessments and the scores.

What I want to talk about today is a workflow that we've developed for doing these reports in R Markdown.

So before we get too far into it, I do want to acknowledge two of my colleagues, Noelle Pablo and Jeff Hoover, who've contributed a lot to the work, both on the package and just thinking about this workflow, and so Noelle is here today, so if you see her, be sure to say hi. Jeff was not able to make it.

Why R Markdown for reports

So I probably don't need to sell this crowd on R Markdown too much, but I do think it's worth repeating why we wanted to use R Markdown for these reports. The first reason is that R Markdown reports are reproducible, and I mean that both within and across years. So if we have an analysis and someone says, where did this number come from? Because we have the code, it's easy for us to say exactly where that analysis happened and how we got those results.

But then it's also reproducible across years. So if you think about educational assessment, students test every single year, which means we have to redo those reports every year with the updated data, and so by using R Markdown, we can just drop in the new data files and then get the updated report for that year. So it ends up saving us a lot of time.

Secondly, these reports are dynamic. So I think we've all probably experienced an email that says, oh no, we have an updated data file. The one you used is wrong. But by using R Markdown, we can just drop in that new data file and get corrected results really quickly. And the other reason is that we can get multiple output formats with R Markdown. So we can write reports out to Microsoft Word, or PDF if we're using LaTeX, or you can even make flexdashboards or some slides or a website.

The branding problem

So the problem that we faced was that the default look of R Markdown reports that you see over here on the left does not really match what our normal reports look like on the right. And so often organizations, ATLAS included, have specific branding. So we say if we're going to make a report, this is what it needs to look like. So in order for us to use R Markdown for our technical reports, we had to make sure that it matched our brand guidelines. So today what I want to talk about is how we went about making these branded reports in a reproducible and scalable way.

Step one: find your brand

So the first step is to find your brand. And that can mean many things. Some organizations have a style guide or brand guidelines that will specify what fonts you're supposed to use, the colors, spacing of the margins. So anything that is really annoying and gives you a headache if you have to look at it for too long, that's probably what's in the brand guidelines. If you don't have that, you might have a marketing team who puts together your reports for final distribution, or you might even just have an editing team. So if you have someone who edits your final reports right before they go out, they are probably doing the little editing stuff so you don't have to. So you find them, and that will help you figure out what your final result needs to look like.

Step two: build your Word template

Step two is to then build your template, which is obviously not that easy. It's an iterative process that we're going to talk a little bit more about. So basically you'll start a template, you'll render your document, realize it's not exactly what you wanted, make some tweaks, re-render, and so on and so forth until you get to that final version that you want. So today we're going to talk about how to do this with a Microsoft Word template, because I think it's the most straightforward, but you could do this with any type of R Markdown output.

So I think the easiest way to do this with Microsoft Word is just to start writing your document in R Markdown. So you can start it just like you would any document, and then once you get a little bit into it, you can hit that knit button, and what you'll get is an output that looks like this. And this is the default Microsoft Word template if you don't specify any changes.

And so what you'll do is you can click your little cursor into some section of the document, so for example if we were in the title, and then you'll click the styles pane up top, and there you'll see exactly what style the default template has chosen for you. So if we are in the title, it'll say that we are in the title style, and then if we hit this arrow, we can modify that style. So when we modify, we can change the font that's being used, the color, the size, the spacing around the paragraph, basically anything that you want to change for that style, you can modify.

And then once you've done that, you save that Word document, and that will be your template. And so I normally save it as template.docx, just to make it very clear what I'm doing, and then in your R Markdown YAML header, you still specify that you want to use word_document, but then you include this extra option that says reference_docx: template.docx, so that template file that you just saved. And so now when you knit that R Markdown document, it's going to use all those styles that you defined when you were modifying those styles in your Word document.
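As a minimal sketch of the step just described, the YAML header is shown below as comments, followed by the equivalent call from the R console. The helper name render_with_template and the file name report.Rmd are hypothetical; word_document() and its reference_docx argument come from the rmarkdown package.

```r
# YAML header at the top of the .Rmd file (shown here as comments):
# ---
# title: "My report"
# output:
#   word_document:
#     reference_docx: template.docx
# ---

# Equivalent call from R. `render_with_template` is a hypothetical helper;
# it assumes template.docx sits next to the .Rmd file.
render_with_template <- function(input, template = "template.docx") {
  rmarkdown::render(
    input,
    output_format = rmarkdown::word_document(reference_docx = template)
  )
}

# Usage (not run here because it needs an actual report and template):
# render_with_template("report.Rmd")
```

Either route should give the same result; the YAML version is what the knit button picks up.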

So there are a lot of different ways that you can modify styles in a Word document to do some really cool things. So these are some resources that I found helpful when we were developing our Word templates. So Daniel Hadley actually gave a talk at rstudio::conf a couple years ago called Branding and Automating with R Markdown, where he gives some really good advice about how to do this in Word and some different tricks about how to get page breaks by using a level six heading, I think. There's also a post on the RStudio website from Richard Layton called Happy Collaboration with Rmd to docx. And then there's a chapter in R Markdown: The Definitive Guide about Word documents.

Polishing reports: figures and chunk options

So once we have our templates, you're not done. So your reports probably need a little bit of polishing. So for example, if you're including figures, you might want your figures to also match your brand guidelines. So use the same fonts as your report, maybe have a color palette that matches your brand's colors.

And so there are some really good examples about how to do this. I'm not going to talk about ggplot2 themes and how to create them today. That's kind of beyond the scope of what I want to talk about here. But I did include some packages that I think give some really good examples for how to do this. So the hrbrthemes package, and then also Artistic and bbplot, I think do a really good job of creating custom themes.
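For readers who want a starting point, here is a rough sketch of what a branded ggplot2 theme and color palette might look like. The names theme_brand, scale_fill_brand, and brand_colors, and the hex values, are all made up for illustration; they are not taken from ratlas or the packages mentioned above.

```r
library(ggplot2)

# Hypothetical brand palette; replace with the colors from your style guide
brand_colors <- c(navy = "#1B365D", teal = "#00857D", gold = "#F2A900")

# A theme built on theme_minimal() with a few brand-specific overrides
theme_brand <- function(base_size = 11, base_family = "") {
  theme_minimal(base_size = base_size, base_family = base_family) +
    theme(
      plot.title = element_text(face = "bold", colour = brand_colors[["navy"]]),
      panel.grid.minor = element_blank(),
      legend.position = "bottom"
    )
}

# A discrete fill scale that uses the brand palette
scale_fill_brand <- function(...) {
  scale_fill_manual(values = unname(brand_colors), ...)
}

# Example plot using a built-in data set
p <- ggplot(mtcars, aes(factor(cyl), fill = factor(cyl))) +
  geom_bar() +
  scale_fill_brand() +
  theme_brand()
```

Setting base_family to the report's font is one way to keep figures and text consistent, assuming that font is available to the graphics device.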

And then you also might want to think about setting some default knitr chunk options. So if you think about how big you want your figures to be in your report, if they need to follow a certain aspect ratio. These are all things that you can define as defaults within your R Markdown document.
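A small sketch of such defaults, as they might appear in a setup chunk near the top of the document. The specific values are assumptions chosen for illustration, not the values used at ATLAS.

```r
# Default chunk options for every chunk that follows in the document
knitr::opts_chunk$set(
  echo = FALSE,      # hide code, show only text and results
  fig.width = 6,     # figure width in inches (assumed value)
  fig.asp = 0.618,   # height = width * 0.618 (assumed aspect ratio)
  dpi = 300          # resolution for raster output (assumed value)
)
```

Individual chunks can still override any of these in their own chunk header.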

Wrapping everything into an R package

Okay, so we have all of that. We have our templates, we have our ggplot2 themes. The question is, what do you do with that? One option that we tried for a little bit was just having things live on a network drive. And that got really messy really quickly. So every time you start a project, you're then copying the templates, the code for the ggplot2 themes, all that into every project. And then if there's an update, you then have to go copy that into all the projects to make sure everything's using the most recent version of the template. And it gets really messy, and no one's sure if they're using the correct version of things, and you have things called final and final underscore final two.

And so what we did was we wrapped it into an R package. And this ensures that if you have the most recent version of the R package, you're always using the correct version of the template. It's easy to distribute if you host it on GitHub. And finally, you can include documentation. So we have vignettes included that demonstrate to people who maybe don't use R Markdown that much exactly how to write a report with this, how to include figures, and any other thing like that.

So the package we created is called ratlas. And you can find the code for this at the link for the talk. So if you want to use it as a template for your own organization, please feel free to adapt it and use however you see fit. But this includes templates for different types of reports that we create, as well as some convenient project templates, which I'm going to talk about in just a second, custom ggplot2 themes, and vignettes for documentation.

So when you're creating this package, there are several directories that I think are important. We're going to talk about three of them. Two of them live in the inst directory, and that's the rmarkdown and rstudio folders, and then the R directory.

So the rmarkdown directory is basically just where your templates are going to live. So we have the inst directory, then rmarkdown, then templates, which defines that we're making templates, and then for the ratlas package, we have two types of reports that we make generally. We have topic guides and we have tech reports. So inside the topic guide, we have resources and the template.docx, because those go out as Word documents, and then the tech reports get rendered as PDF documents, so we have a LaTeX template that gets used there.
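To make the folder structure concrete, the sketch below builds an empty skeleton of such a package in a temporary directory and lists it. The package name mybrand and the individual file names are assumptions; the template.yaml and skeleton/skeleton.Rmd entries follow the rmarkdown package's template convention rather than anything stated in the talk, and all files are empty placeholders.

```r
# Build the skeleton under a temporary directory so nothing is overwritten
pkg <- file.path(tempdir(), "mybrand")

dirs <- c(
  "R",
  "inst/rmarkdown/templates/topicguide/resources",
  "inst/rmarkdown/templates/topicguide/skeleton",
  "inst/rmarkdown/templates/techreport/resources",
  "inst/rmarkdown/templates/techreport/skeleton",
  "inst/rstudio/templates/project"
)
for (d in dirs) {
  dir.create(file.path(pkg, d), recursive = TRUE, showWarnings = FALSE)
}

# Empty placeholder files showing where each piece would live
files <- c(
  "R/render.R",                                                  # wrapper functions
  "inst/rmarkdown/templates/topicguide/template.yaml",           # template metadata
  "inst/rmarkdown/templates/topicguide/resources/template.docx", # Word styles
  "inst/rmarkdown/templates/topicguide/skeleton/skeleton.Rmd",   # starter document
  "inst/rmarkdown/templates/techreport/template.yaml",
  "inst/rmarkdown/templates/techreport/resources/template.tex",  # LaTeX template
  "inst/rmarkdown/templates/techreport/skeleton/skeleton.Rmd",
  "inst/rstudio/templates/project/topicguide.dcf"                # project template
)
invisible(file.create(file.path(pkg, files)))

layout <- sort(list.files(pkg, recursive = TRUE))
print(layout)
```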

So this isn't anything that is super crazy. It's more just making sure that you have structured your folders in a way that makes sense to R and the functions that we're going to use to wrap it. And so that brings us to our wrapper functions, which go in the R directory. So ours is called render.R since we're rendering the documents. And so this is the function that we use when we're creating a topic guide. So the only arguments that get passed in are the ... arguments, and the first thing we do is we just tell R what template we want to use. So because we put that topic guide template in the inst directory, we can use the system.file() function to find that template. And then we just call our normal rendering function.

So the default in an R Markdown document is usually word_document, if you're knitting to Word. Here we use the bookdown word_document2 function because it has a bunch of other features and functionality that are useful for us. And we just tell it that we want to use the reference_docx, we want to use that template, and then any other arguments that got passed get forwarded on to that bookdown function. And then this is also where we specify our default knitr chunk options. So here we've said that by default we don't want to show the code, we just want the text and the results to show, and we also set a default aspect ratio for our figures. And so now you'll call that function in your YAML header, which we have an example of in just a couple slides.
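The wrapper described here might look roughly like the following. This is a sketch under stated assumptions, not the actual ratlas source: the package name mybrand, the path inside inst, and the aspect ratio value are all hypothetical. It relies on system.file(), bookdown::word_document2(), and the knitr options list carried by rmarkdown output formats.

```r
# R/render.R in a hypothetical package called "mybrand"
topicguide_docx <- function(...) {
  # Find the Word template installed from inst/ (assumed layout);
  # mustWork = TRUE gives a clear error if the file is missing
  template <- system.file(
    "rmarkdown", "templates", "topicguide", "resources", "template.docx",
    package = "mybrand", mustWork = TRUE
  )

  # Start from bookdown's Word format and forward any extra arguments
  base <- bookdown::word_document2(reference_docx = template, ...)

  # Default knitr chunk options applied to every document using this format
  base$knitr$opts_chunk$echo <- FALSE      # hide code by default
  base$knitr$opts_chunk$fig.asp <- 0.618   # assumed default aspect ratio

  base
}
```

Once the package is installed, a document would opt in with output: mybrand::topicguide_docx in its YAML header.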

Project templates for easier adoption

And so the last thing we tried to do was make it easy for people to use this package who maybe aren't as familiar with R, or may not be as familiar with R Markdown. So with RStudio you can make project templates, and so if you don't use projects, I highly recommend it. I think it's a great way to keep your work organized. But the project template also lives inside the inst directory, inside an rstudio folder. Again, we define that we're making a project template, and there are two important things. There's the topicguide.dcf, which is going to specify the name of your template and what files you want to open, and then also the resources for that, so what is going to be used.

And so now if you go into RStudio and click I want to create a new project, you can click the new directory, and now your project template will live in this little nice GUI for users to click on. And so here we can click we want a topic guide using ratlas, and it will automatically create your project for you, and it will open this index file, the index.Rmd file that we've created, so this is the template Rmd file that automatically gets opened when you create that new project, and so it has the YAML filled out. You can see here that we've specified the output type as ratlas::topicguide_docx, so that's the rendering function that we just made.

And then it has some default bibliography settings, and it also includes a setup chunk that loads some functions, and basically what we're trying to do is just make it as easy as possible for someone who's not as familiar with R to access this workflow, so they can just click new project, have the document opened up, and start writing and not have to worry about defining the templates correctly, loading in all this extra ggplot2 theme code, and things like that.
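A rough sketch of how such a project template could be wired up. RStudio project templates pair a DCF file with an R function that receives the new project's path; the DCF contents are shown as comments. The function name topicguide_project, the package name mybrand, and the starter file contents are hypothetical.

```r
# inst/rstudio/templates/project/topicguide.dcf (hypothetical contents):
#   Binding: topicguide_project
#   Title: Topic Guide using mybrand
#   OpenFiles: index.Rmd

# Function named in the Binding field; RStudio calls it with the project path
topicguide_project <- function(path, ...) {
  dir.create(path, recursive = TRUE, showWarnings = FALSE)

  # Build the chunk fence programmatically to keep this example readable
  fence <- strrep("`", 3)

  index <- c(
    "---",
    "title: \"Untitled Topic Guide\"",
    "output: mybrand::topicguide_docx",
    "bibliography: references.bib",
    "---",
    "",
    paste0(fence, "{r setup, include = FALSE}"),
    "library(ggplot2)",
    fence,
    "",
    "# Introduction"
  )
  writeLines(index, file.path(path, "index.Rmd"))

  # Empty bibliography so the YAML reference resolves
  file.create(file.path(path, "references.bib"))

  invisible(path)
}

# Try it outside RStudio by pointing it at a temporary directory
demo_path <- topicguide_project(file.path(tempdir(), "demo-guide"))
```

The design idea is that everything a new author needs is created in one click, so the only thing left to learn is Markdown formatting.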

Other examples and resources

So ratlas is definitely not the only example of this happening. There are many other examples that we've found very useful when putting our package together that I think are good resources for you if you're thinking about making a package for your own organization. So the sorensonimpact package I mentioned earlier, Daniel Hadley gave a talk a couple years ago that was very similar to this one, and they use the sorensonimpact package.

There's also the thesisdown package by Chester Ismay that provides a LaTeX template for theses at Reed College, and if you go to this repo, in the README there's a link to about 50 other packages that have forked thesisdown for their own colleges or universities, so that's a really good place if you want to see how to take an existing package and kind of modify it to meet your own needs. I'd really recommend going and looking at that work. And then also the rticles package wraps a bunch of different LaTeX templates for rendering R Markdown documents to different journal specifications.

Q&A

So thank you. We have a few minutes for questions, and I see there's already a few on Slido, so I'm going to start here. The first one was, can you get the CSS out of a Word document template and then use it for PDF or HTML outputs? So that's not anything that I'm familiar with. I think if you want to go to CSS or HTML, it's probably better to use one of those rendering functions or look into the pagedown package, which lets you use CSS to define PDF output.

Another question on Slido was, how do you include logos? Yeah, so in your R Markdown document, there's a function in the knitr package called include_graphics, so if you have your logo in your project directory, you can just say include_graphics and then the path to that figure and it'll drop it right in for you.
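In a code chunk, that might look like the sketch below. The path figures/logo.png is hypothetical; the file.exists() guard is only there so the example runs even when no logo is present, since include_graphics() raises an error for a missing file by default.

```r
# Hypothetical location of the logo inside the project directory
logo_path <- file.path("figures", "logo.png")

# Inside an R Markdown chunk, this embeds the image in the rendered report
if (file.exists(logo_path)) {
  knitr::include_graphics(logo_path)
}
```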

We have a few more minutes, so I'm going to go ahead and ask, can you knit PDFs using your docx templates? No, you have to have a LaTeX template if you're going to knit to a PDF.

Do you have any concerns with using an R Markdown document as a requirements document? For example, using it for communicating model racks and writing test cases? I don't. I personally use R Markdown for everything, because I think it's reproducible. You can see exactly what you did in terms of code, so I mean, I would prefer that to a Word document. So I'm probably biased and maybe not the best person to ask, but I don't have any problems using R Markdown.

How are you hosting and sharing these docs? So all of our documents get hosted on the company website. So most of the reports that I write are in support of the Dynamic Learning Maps assessment. So if you go to dynamiclearningmaps.org, there's a research publications page where you can see all the different reports that we've written.

So last question is, what challenges did you encounter in the adoption of these templates? So the biggest challenge was the first one, before we had the package, when there were all these different template versions running around, and no one was sure which one to use. That is a really big headache, so I really recommend writing an R package. Beyond that, if you think about trying to share it with a broader company, I really think it's about removing barriers for people who maybe don't use R as often, so that's where those project templates I think can be really helpful. If you can make it easy for someone who doesn't know R or R Markdown to just open up a document and start writing, it's a much easier ask to say, learn some R Markdown formatting, like use double asterisks for bold, as opposed to learn how to use R. That's a much tougher ask that people are going to give you some more pushback on.
