
Rob Hyndman | How Rmarkdown changed my life | RStudio (2020)
Over the last few years, Rmarkdown seems to have taken over my life, or at least my written communication. These days I use Rmarkdown to maintain my website, write my blog, write textbooks, write academic papers, prepare slides for talks, keep my CV up-to-date, help my students write theses, prepare university policy documents, write letters, prepare exams, write reports for clients, and more. I haven't quite got to the point of using it for shopping lists, but perhaps that's my next Rmarkdown template. I will reflect on the journey in getting to this point, what I've lost and what I've gained. I will also speculate on what might be next in the Rmarkdownification of my life.
Transcript
This transcript was generated automatically and may contain errors.
Thank you. I'm Rob Hyndman. I'm a professor at Monash University in Australia, and I want to take you to a land far, far away, Australia, to a time long, long ago when I looked more like that. I'm going to start about there. So I'm 17 there, so we're talking mid-80s, and I had my first computer, so the first time I was ever trying to type a document on a computer.
It looked a little bit like this. That's actually not my first word processor, but it was a couple of years later. It was called WordStar 2000. I thought it was awesome. There's only one other person in the room old enough to remember WordStar 2000. Unfortunately, they went bust fairly quickly, so I switched to WordPerfect (a few more of you remember that one), which was even more awesome because it was a markup language, and you had two sections of the screen. You could have WYSIWYG at the top, and down the bottom you had markup, and you could see where bold was on and bold was off. I mean, even Microsoft Word does not have that today.
So this was good. I used that for a while, but then when I went to university and tried to type mathematics, that wasn't so helpful. So eventually I switched to using TeX, as in not LaTeX, TeX. I used that for a while until somebody said, hey, there's this other thing just come out called LaTeX, which has sort of better markup facilities. So I switched to LaTeX, and I used that for a long time, many, many years. It was fantastic. Could write beautiful documents. I wrote several theses using LaTeX, lots and lots of reports for clients, academic papers, and so on.
Early days of statistical computing
So that was for the writing side, and then, of course, I was doing data analysis. I'm a statistician, so I needed something for the statistical side, and around 1987, I discovered this new tool. Well, it was new to me. It was before the days of R, before the days of S+. It was an early version of S, running in a new IDE that went bust in about a year, but it was called Ace, and that was pretty cool. So I was using S for a few years to do my statistical computing and using LaTeX to do the writing.
So I haven't got a screenshot of S. I couldn't find one, but those of you who grew up knowing S will recognise some of those books. Then eventually S+ came out, and S+ had quite a neat IDE. So S+ for the computing, LaTeX for the writing. It was pretty good, actually. You could do quite a lot, except that, of course, you needed to keep these things in sync, and I would be writing S+ code, generating graphics, saving them as PostScript graphics, pulling them into my LaTeX documents, and if I changed the S+ code, I'd then have to re-run the LaTeX document to make sure it was up to date.
And so I developed this way of doing that, using... I switched to R at some point, too, yeah, about 2000, switched to R. That was the R website at around the year 2000. Hasn't changed very much, has it?
Makefiles and the switch to R Markdown
Okay, so I invented this way of trying to make these things work together. So a Makefile is a thing used by programmers to compile programs, and so if you make a small change in some function, you don't want to recompile everything. So you can use that in other tools as well. So I was using it for R and LaTeX. If I made a change in my R file, it would recompile the graphics. If I made a change in the LaTeX file, it would recompile the LaTeX. But it was smart enough to notice if the graphics had changed and that it then needed to recompile the LaTeX, even if the LaTeX file hadn't changed, and I wrote these sort of complicated Makefiles to do this. I thought this was super cool. I thought everyone would do this. I would change the way everyone did R and LaTeX together. I wrote blog posts about it, and I tweeted about it. Here's a tweet about it. Got one like, so you know I had a following.
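The rebuild rule described here can be sketched in a few lines of Python. This is only an illustration of the timestamp logic that make applies, not the speaker's actual Makefile: the file names (`analysis.R`, `figure.pdf`, `report.tex`, `report.pdf`) and the `rebuild_plan` helper are hypothetical, and modification times are plain numbers rather than real files.

```python
def rebuild_plan(rules, mtimes, now):
    """Return the targets a make-style tool would rebuild, in order.

    rules:  dict of target -> list of prerequisites, listed so that
            prerequisites come before the targets that depend on them.
    mtimes: dict of file name -> modification time (absent = never built).
    now:    timestamp given to anything that gets rebuilt.
    """
    mtimes = dict(mtimes)  # work on a copy so the caller's data is untouched
    rebuilt = []
    for target, prereqs in rules.items():
        target_time = mtimes.get(target)
        # A target is stale if it is missing or older than any prerequisite.
        stale = target_time is None or any(mtimes[p] > target_time for p in prereqs)
        if stale:
            mtimes[target] = now  # rebuilding makes it newer than everything else
            rebuilt.append(target)
    return rebuilt


# Hypothetical project: R code produces a figure, LaTeX pulls the figure in.
RULES = {
    "figure.pdf": ["analysis.R"],
    "report.pdf": ["report.tex", "figure.pdf"],
}

up_to_date = {"analysis.R": 10, "figure.pdf": 20, "report.tex": 5, "report.pdf": 30}

print(rebuild_plan(RULES, up_to_date, now=100))
# [] : nothing changed, nothing to do
print(rebuild_plan(RULES, {**up_to_date, "analysis.R": 40}, now=100))
# ['figure.pdf', 'report.pdf'] : R code changed, so the figure and then the report
print(rebuild_plan(RULES, {**up_to_date, "report.tex": 40}, now=100))
# ['report.pdf'] : only the LaTeX changed, so the figure is left alone
```

The second case is the one the talk highlights: the LaTeX source is untouched, but the report is still rebuilt because one of its prerequisites, the figure, became newer.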
This was 2012. But this guy from the U.S. called Yihui noticed and he tweeted back saying, "or another way of separating R from reports", and a little link to a thing he was developing called knitr. And I looked down and said, yeah, I know, but I reckon my way is simpler. Does knitr have any real advantages over my approach with Makefiles?
So I persisted on for a while and sort of kept an eye on what Yihui was doing. And after a while I realized that, you know, he was right. And that although Makefiles are useful for a lot of things, putting the R code and the LaTeX stuff together was going to make a lot more sense. And so I switched and I started trying to write my academic papers using R Markdown, but of course they had to have a certain look, so then I ended up writing R Markdown templates.
That's a recent paper I wrote in an R Markdown template, and I was writing client reports, so I had to write an R Markdown template for clients. And this was life changing. Like, this made my life so much easier. I was writing papers and writing reports, just had to have a couple of templates and everything was good.
So I put those things together and I couldn't think of a decent name for my package, so I called it MonashEBSTemplates, EBS being econometrics and business statistics, the department in which I work. And all you have to do is to use the template and you can generate these nice things.
And, before I show you that, there's lots of stuff that goes into the front end of an academic paper, so we need a title, we need a list of authors and where they come from, their addresses, the abstract, keywords, what number of working paper it is in my department. They always want you to stick JEL codes on things and so on. So I just put all this stuff into the YAML and I can generate working papers.
Building a website with Blogdown
So that was sort of my first attempt at writing an R Markdown template and as I said it changed my life and then I thought well what else, I do a lot of other writing, maybe R Markdown can be used for other things. So at around the same time as this LaTeX and S+ thing, I also got into making websites. The web was developed in 1993 I think, it took about four months before I had a website and it looked a little bit like that. That was in 1997, four years later, but it didn't change that much in the first four years. Some academics still have websites that look like that.
Somewhere here, S+ functions for time series, that became the forecast package for R eventually, but the functions were originally on my website. Still handwriting HTML code, just getting better at it. Still more hand crafting of HTML code. At this point I thought I should have my own domain name and so I bought one, which is the one on the screen there. Don't go there. I changed my domain name to something else with a dot com on the end eventually. Didn't worry about the old domain name, but because there were links to it all over the web from all of my papers, an unpleasant organisation got it and has bad stuff on there, so don't go there.
Then I thought handwriting HTML code is no fun. There's these new things for content management systems, so I switched to one called Joomla. That was painful. I switched to another one called WordPress, which was a little better, but slow. I was trying to write blog posts about R and every time I would put my R code in, it would sort of strip out everything after the less-than and hyphen of the assignment arrow (<-), thinking that this is an HTML tag, and then I'd have to go back in and sort of try to copy and paste my thing back in, and I was ending up spending about as much time sort of trying to put all my stuff into the blog post as actually writing the code, so that was no fun.
That's the theme for WordPress, and then I thought, so this brings me to about 2016, and Yihui wrote Blogdown, so I thought, okay, I use R Markdown for everything else from all my other writing. Let's see if I can make a website using this thing called Blogdown. You need a theme. I looked around the place. I figured out that Blogdown is built on Hugo. Kieran Healy had a great theme. He had his website on Hugo, so I forked his website, deleted his content, stuck my website, my content in. With permission, he says in his GitHub repo that you're allowed to do that. Thank you, Kieran. Did all this on a very long flight between Melbourne and Amsterdam, copying and pasting and making sure everything worked, and then thought, great, now let's do the R bit, and it didn't work. Unfortunately, Kieran's theme didn't work with Blogdown. Not every Hugo theme works with Blogdown, so flight home again. I start again. This time, I found another theme, and then I adapted it to look like Kieran's because I still liked his website, so my own Blogdown theme that's on that GitHub repo, and it's saved so much time doing a website that way.
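To see why an R assignment arrow can vanish from a blog post, here is a small Python sketch. It is not how WordPress actually processes posts; `naive_strip_tags` is a hypothetical, deliberately crude tag stripper, used only to show the failure mode and the usual fix of escaping the code before it goes into HTML.

```python
import html
import re


def naive_strip_tags(text):
    """Crude tag removal: delete from each "<" up to the next ">" (or end of text)."""
    return re.sub(r"<[^>]*(>|$)", "", text)


r_code = "x <- c(1, 2, 3)"

# The "<" of the assignment arrow is mistaken for the start of a tag,
# so everything after it disappears.
print(repr(naive_strip_tags(r_code)))
# 'x '

# Escaping first turns "<" into "&lt;", which no longer looks like a tag.
escaped = html.escape(r_code)
print(escaped)
# x &lt;- c(1, 2, 3)
print(naive_strip_tags(escaped) == escaped)
# True

# A browser (or html.unescape) turns it back into the original code.
print(html.unescape(escaped) == r_code)
# True
```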
Writing books with Bookdown
Okay, so another thing I was doing was writing books. My first book published in the late 90s was this one. It was, by textbook standards, an extremely successful book, but I didn't make any money out of it because the publisher makes a great deal of money and they didn't give much to the authors, and I decided that I would never, ever do another book with a commercial publisher again because they screwed my students and didn't benefit me. So I thought if I'm going to do another book, I'm going to put it online for free because my students wouldn't buy it. It was too expensive for the students to buy, so they would be looking around on dodgy websites, getting bad information about forecasting, and using that instead of the book that I thought was reasonably good. So I thought, let's do it online for free.
This was before Bookdown existed. I didn't know anything. I mean, Yihui hadn't thought about it as far as I know, so I thought, let's put it on. I can do it. This was also before my website was on Blogdown. So I tried to make it in WordPress. The first version of it came out in WordPress. I decided that wasn't so good, so I switched and I just copied everything over to Drupal, a long, painful experience. That worked better. The first edition of the book came out in Drupal, but then I saw Bookdown and I thought, that's what I wanted, but it's just a few years too late. So the second edition of the book I did in Bookdown, and that meant that I could have a PDF version and an online version and everything would work beautifully, and it really did make an enormous difference.
Slides, CVs, theses, and more
So that's the book as it is now. That's the online version. It's a relatively simple CSS styling, but it's just running Bookdown. The LaTeX is more complicated because I wanted the LaTeX to look a certain way. That's a page from the LaTeX version, which is sold on Amazon.
Okay, at this point, I was getting right into R Markdown. I'd made websites, I'd written books, and I had templates to do things. What about the slides that I used for talks? So way back in the day, my slides looked like that using the LaTeX seminar package, and then after a while, I switched to a package called Prosper, which gave nicer looking slides, and then I switched to Beamer, which was a little easier in some ways to do things, and then I switched to R Markdown, and that is my slides that I use these days.
So that's a package called binb. binb stands for binb is not Beamer, and it generates Beamer output from R Markdown, and binb is on CRAN. I'm using the Monash theme, which is why that has Monash as a template on the front, and it's very, very easy. This particular talk, that is my YAML. Title, author, I switched the date out, not in the thing you can see, the date's actually here, but in the final version, I switched the date out for my Twitter handle, changed the font size, pretty much that's all I had to do to the YAML, and away you go, writing R Markdown. Super good.
My CV, which had been maintained for years, I thought we could do this in R Markdown too, so there's my CV in R Markdown, that's using the vitae package, written by Mitch O'Hara-Wild and me. I called that particular template after my own name, because that's the template I was using in LaTeX, but there's other templates that you can use, and it's, again, the YAML's pretty simple, you just stick in things like your Twitter handle and GitHub, that creates this stuff at the top, and then everything else is standard R Markdown.
My students were writing theses, I said I've found this thing called R Markdown, you should all be using R Markdown, it'll change your life. I wrote them a nice template, stuck it on my GitHub account, they have to fork it, and then they just use that, and it generates theses in exactly the format that the Monash Research Office thinks is correct.
I became head of department recently, the chair of my department. I have to send memos to staff, naturally what am I going to send my memos in? The university template is a Word document, so I switched it out and made it into an R Markdown template, and I send around memos like this with all the branding and so on. I can take off the branding and I write the same, I use exactly the same template for writing referee reports for journals.
Pretty simple YAML in this case, you've got a title, you've got an author, do you want the branding on or off, so the Monash branding at the top or not, and then just the name of the template, which is the MonashEBSTemplates memo.
Had to write letters now, I'm head of department with nice logos and stuff on the top. Again, there was an awful Word template that the university gave me, so I turned it into a nice R Markdown template and put it in the package. This time there's a little bit more in the YAML because you've got to say, well, I'm putting in who I am, what my qualifications are which come in under here, my address. I made it general enough so that other people can switch their names in as well, who I'm writing to, what their address is, how I start, what my signature file is, and then it all just gets generated and it looks fantastic.
And the other thing I do as a professor is I write exams for students. Again, the university gives you this awful Word template where the logo is actually the wrong aspect ratio, it's sort of squished, and the fonts change halfway through the front page. So I made that look better, still looks like what Monash says the template should look like, and then the exam is all written in R Markdown so it's reproducible, the graphics get pulled in, everything's great.
What I've lost and gained
So basically everything I now write, I write in R Markdown. I spend my life in a text editor either writing R Markdown or writing YAML, or occasionally fixing up the packages that produce the templates. What have I lost? I can't actually remember how to write Makefiles anymore. Even though I thought it was the greatest thing, I just, it's complicated and I can't remember. If I have to have a Makefile, I'll find one that did something similar and copy and paste it and hopefully that will work.
What have I gained? It's so much faster, both in terms of the ability to write things but also the speed at which documents get rendered, the speed at which my website runs, the speed at which I can update things. It's just incredibly, it's an incredible gain in speed. It's so much simpler. The documents are really, really simple. I don't have all that LaTeX preamble at the top of my documents. I just have a fairly simple YAML header. Everything I do is text, so it's really simple. It's all GitHub, all on Git repositories, so you've got version control, everything's reproducible. There's enormous gains by switching to R Markdown for pretty well everything in my life.
So when I was writing this, I thought, well, do I write anything these days that's not in R Markdown? And I can only think of about three items that I still type on my computer but I'm not inside an R Markdown document. One is I maintain the tennis club website. One is I write shopping lists sometimes and I type them out. And then there's email, social media, tweeting and that sort of stuff. So I thought, well, okay, I can solve this. So I went and recreated my tennis club website in R Markdown between the time I started this talk and now. It's now all running in R Markdown. It runs about 30 or 40 times faster than it used to run and it's much, much easier to maintain. I don't think I'll bother with the shopping lists. They're actually not too bad without R Markdown.
But email would be great. So, Yihui, I'm not sure what your plans are next, but I'd like to see Maildown, please, with a Chrome browser plug-in so when someone sends me an email, I can just hit reply and it'll come up with an R Markdown box that I can go knit and send and away it goes.
So I'd just like to finally just say thank you to Yihui. You really did make my life so much easier. I'm sorry I didn't believe you when you first said, have you looked at knitr? But you were right and I was wrong. Thank you for that.
So the slides for this talk and all the links to packages and links to anything else that I've mentioned are going to be on that if you just go to that link there. Thank you.
Q&A
Any advice on converting awful Word templates to beautiful R Markdown templates?
That's actually really hard. I look at the Word template and I copy and paste the text and then I just start from scratch. I don't try to do anything in the Word styling itself. I just start again because I find the Word templating is too difficult to convert.
Which aspects of building templates or related YAML are particularly challenging or error prone? What advice would you give to a novice in this area?
Almost all of the templating I've done has been with PDFs, so it's LaTeX in the backend. So you do have to be quite good at knowing about LaTeX, but I'd been using it for 20 years, so that was not a problem. But writing a PDF template, you have to know a reasonable amount of LaTeX to know how to do that. I have less experience at writing HTML templates and no experience at all of writing Word templates, so I wouldn't even know how to do that.
Someone else asked, do you use Bootstrap or other frameworks to style your templates? No.
And this is getting a few upvotes here. I like R Markdown, but I find Google Docs much easier for collaboration and iteration with others. Any thoughts on that?
So Google Docs are fine as long as you don't have a lot of R to do, but putting in R graphics inside a Google Doc, there's cutting and pasting to happen there, so that's painful. So I find R Markdown works much, much better if you've got any real statistical computing happening. And for collaboration purposes, I generally try and teach my collaborators how to use GitHub, and mostly that works okay. Occasionally I'm working with people that don't know GitHub or are not prepared to learn to use it, and then it's a little painful. There's lots of emails and, can you please change this section, here's the words I want to add in, and things like that.
