Introduction to Bookdown (R Package) | RStudio Webinar - 2016

This is a recording of an RStudio webinar. You can subscribe to receive invitations to future webinars at https://www.rstudio.com/resources/web... . We try to host a couple each month with the goal of furthering the R community's understanding of R and RStudio's capabilities. We are always interested in receiving feedback, so please don't hesitate to comment or reach out with a personal message. Read more on our blog: https://blog.rstudio.com/2016/12/02/announcing-bookdown/

Transcript

This transcript was generated automatically and may contain errors.

Hello everyone, this is Yihui, and today I'm going to talk about a project that I have been working on for the past half year, which is called bookdown, and it's an R package for writing books with R Markdown.

So first I'd like to talk about some motivation behind this package. Actually I wanted to do this two years ago, but I managed to get some time earlier this year, so I started this project earlier this year, and there are some problems that I want to solve.

And I have been a student in the major of statistics for over 10 years, and I believe that there are some problems that we should solve for books.

The first one is that books should have been much easier to write technically. Normally people use LaTeX or Word to write books, but these things are either not flexible or too complicated, and there are too many technical challenges, and today I will show you how easy it can be to write a book.

And the second problem is that I believe most books are just way too expensive. My ideal price for a book is something like $10 or $20, but the textbooks that you buy when you are a student typically cost around $70 or $100, or sometimes even $200. That's just too expensive, and I believe we can cut some of the cost of books.

The third problem is that the books, especially the printed books, are never interactive, and the content is not rich enough.

If you use bookdown, you will see that you can have dynamic content in your books, which makes it much richer to read, and for example you can embed interactive HTML widgets or even interactive shiny apps right in the book, and when you read the book, you can just interact with these widgets or apps.

So you can just imagine that when you read a book on, for example, linear regression, you can just fit a linear regression by yourself as you read the book, or if you read a book on machine learning, you can just tune the parameters in your models and see the output directly. So that is just like a personalized book, just like personalized medicine. So you can have your own version of the book, so it's very interactive, and the content can be very rich.

And the fourth problem is that typically books are very slow to iterate. After you publish the first edition of a book, you often have to wait for several years to write the second edition, and I believe that's just way too slow. It can be much quicker to iterate, and I will show you how.

And the last problem is that the books are often written by one person or a limited number of authors, and the feedback that you get when you write a book is very limited as well. So typically your publisher will find some reviewers for you, it's like three anonymous reviewers, and you only get the feedback from them, and I firmly believe that you can attract a much larger number of contributors to your book, and you can get much, much more feedback on the book, and I will also show you how.

Problems with traditional book writing

So when you are thinking about writing a book, you may open your LaTeX editor or Word. When you want to buy a book, it can cost something like $300, which is too expensive. And when you are considering writing another edition of the book, I guess you probably often look like this: you simply give up, because there are many problems. We need to solve these problems; we don't want them to happen over and over again.

We need to end that. For example, if you are familiar with LaTeX, you should understand what I mean in this GIF. So if you work with LaTeX, sometimes LaTeX can be very fragile: even if you change only a little bit in your document, you can totally screw up your layout. Your figures and tables can float anywhere in the book, and it's just very distracting when you write the book.

And how about Word? I know many people use Word, and Word is good if you write a few paragraphs, and maybe after you write a few more paragraphs you still feel very good. But if you write a long document, like a book or a very long report, eventually it will just blow up, like this.

Why Markdown?

So we don't want you to be in that situation. So what I would recommend is that you can use Markdown. Markdown is a very simple language. I assume most of our attendees today are reasonably familiar with Markdown.

So basically, if you are able to write emails, you should be able to learn Markdown. And actually, I have a personal award, which is if you are not able to learn the basics of Markdown in 10 minutes, I will just award you $10.

So Markdown is very simple to write. It's just as simple as this. You just hop in the car and you're ready to go.

Actually, because Markdown is so simple, of course there are some features for writing books that are missing in Markdown. Actually, the original Markdown was designed for writing HTML content, and it was very simple.

But we have developed a package called R Markdown, which adds two powerful components to Markdown and makes it suitable for writing articles and books.

And these two components are Pandoc and R. So first, we added R to Markdown, which means we can mix R code with your Markdown text. So you can embed your computing right into your book.

So when you write a book or an article, you can just embed R code chunks there and you click a button and the R code chunks will just be executed. And you will get the output and you don't have to run the code separately and copy and paste the results.

So we added R, which means you can do statistical computing, data analysis, data visualization in your document. So that's one source of force in Markdown.

And the other component is Pandoc. And Pandoc is a very powerful tool for converting documents. And it can convert Markdown documents to many other formats, for example, to HTML or to LaTeX PDF or to ebooks and even Word documents and presentations.

So it's another powerful tool. So with R and Pandoc, Markdown can be as powerful as this. So it has a lot of force.

Introducing bookdown

So that's R Markdown. And the story has not ended there. So now we have bookdown. And just one sentence summary of bookdown. It's a tool. It's a very simple tool that makes you look awesome.

Because when you write a book, you know, when you tell other people, oh, I wrote a book, and of course you look very awesome. So bookdown is just that simple tool behind you that makes you look awesome.

So you can write a book with very simple tools. And so I'm just going to show you some demos using bookdown. Before that, I want to briefly explain the basic structure of books.

So when you write a book or a long report, you often do not want to write everything in one single R Markdown document. Instead, you have multiple R Markdown documents. And one R Markdown document is just one RMD file.

So you have multiple RMD files. Typically, each RMD file is a chapter. So you have multiple chapters, and bookdown can convert these multiple chapters into a book. And the output format can be like PDF or HTML or EPUB. EPUB is a format for ebooks.

And in bookdown, I also extended some Markdown features. As I said, the original Markdown was very simple. Pandoc's Markdown is more powerful, but it still lacks some common features, especially for academic writing.

For example, it currently does not have the feature of numbering your figures and tables or cross-referencing your figures and tables. I added this feature in bookdown, and I will show you examples later.

You can also embed interactive content, like videos or HTML widgets or Shiny applications in your book. And you may ask, when the output is not HTML, for example, if it's PDF or EPUB, how can these HTML widgets or Shiny apps work?

And that would be a very good question, and the answer is that if the output format is not HTML, bookdown will just automatically take screenshots of HTML widgets and Shiny apps and embed a static screenshot in your book.

But if your readers want to know more about these interactive widgets, they can either go to the HTML output of your book or go to the URL of the Shiny apps to interact with this interactive content.

Gitbook style and output formats

You can have different styles in the output as well, and the style that I personally like very much is a style called Gitbook style, and some people might be familiar with this. So this style was actually borrowed from another open source project called Gitbook, so you can know more about that from Gitbook.com. And we borrowed the style from them, but we replaced the internal Markdown renderer with R Markdown.

So basically that means we are using Pandoc to render the book instead of their simple Markdown renderer.

And the Gitbook style is also very responsive: when you read the book on smaller devices like your phone, the layout will just automatically adjust to the smaller screen. And there are also different themes like lighter themes or darker themes, and I will show you these themes later.

And also you can click an edit button to edit the R Markdown source document, and there are other buttons on the toolbar. And let me just quickly show you the example.

So the source document for this example is actually on GitHub. It's under RStudio's account. It's called bookdown-demo. So if you are interested in trying that out right now, you can just go to this GitHub repository.

So this is just a minimal bookdown demo. So as I said, a book consists of multiple RMD files. I have an index.Rmd, which is the first chapter or the homepage of your book. And then you have 01, 02, 03 up to 06, and these are R Markdown chapters. So you have these chapters.

As this is only a minimal example, these chapters are just pretty much like one sentence. So you have multiple RMD files.

To compile the book you need to have bookdown installed. Actually I uploaded bookdown to CRAN just last night, so you might be able to install bookdown from CRAN. If it's not in your CRAN mirror yet, you can just install it from GitHub using devtools.

You can use devtools::install_github("rstudio/bookdown"). So that's how you can install the package.
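The two installation routes mentioned above can be sketched in R as follows. The install commands are left commented out so that running the snippet only checks whether bookdown is present; the variable name `has_bookdown` is just an illustrative choice.

```r
# Check whether bookdown is already installed, without attaching it.
has_bookdown <- requireNamespace("bookdown", quietly = TRUE)

if (!has_bookdown) {
  message("bookdown is not installed; run one of the commands below.")
}

# Released version from CRAN:
# install.packages("bookdown")

# Development version from GitHub (needs the devtools package):
# devtools::install_github("rstudio/bookdown")
```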

Once it's installed, you will be able to compile these RMD files into a book. So the minimal configuration is this. In index.Rmd, you have a site option, which is bookdown::bookdown_site. Then there is the output option, which is bookdown::gitbook. Gitbook is just one of the output formats in bookdown.
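As a minimal sketch of that configuration, the R snippet below writes an index.Rmd with the two options into a temporary directory. The title, the "Preface" chapter, and the use of `tempdir()` are assumptions made for illustration; they are not taken from the demo repository.

```r
# Front matter for index.Rmd; the title is a placeholder (assumption).
front_matter <- c(
  "---",
  'title: "A Minimal Book"',
  "site: bookdown::bookdown_site",  # marks the project as a bookdown book
  "output: bookdown::gitbook",      # Gitbook-style HTML output
  "---"
)

# Write the file into a temporary directory so nothing is overwritten.
index_file <- file.path(tempdir(), "index.Rmd")
writeLines(c(front_matter, "", "# Preface", "", "Some text."), index_file)

# Show the resulting file.
cat(readLines(index_file), sep = "\n")
```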

So let's just take a look at the Gitbook style first. So another thing I want to mention is that I'm using the preview version of RStudio now. So it's 0.99.1251. So that's the preview version. If you don't know where to download the preview version, you can simply Google for RStudio preview version.

So the reason to use the preview version of RStudio is that with RStudio preview version, you will see a build tab here, and there is a menu here listing the possible output formats of your book. And the first one is Gitbook.

So let's just click this build book button to build these RMD files into the Gitbook format. So this is the Gitbook format that I just mentioned. And let me just walk you through the layout of the Gitbook style.

So the Gitbook output is essentially some HTML web pages. So each chapter will be a single page by default. So you can navigate through the book by clicking the navigation button at the bottom. The button is at the bottom because my screen is very narrow right now. If your screen is wide enough, you can see the navigation button will be on the left and right side.

So you can navigate to the second chapter or the third chapter. So you can navigate through different pages. And so on the left side, you can see there's a table of contents showing the chapter titles. And of course, you can collapse the sidebar.

So especially when your screen is too narrow, you may want to just hide the table of contents on the left. And you can search in the book. For example, if you want to know more, if you want to search for tables in the book, you can just type in that search box. For example, tab. And you will see that your keyword will be highlighted on the page. So that's the search button.

And here is the button for setting the themes and font size of the book. For example, you can make the font bigger or smaller. You can choose different font families. You can use serif or sans serif. You can set the white theme, sepia theme, night theme. So different themes.

And as I mentioned, there's an edit button on the toolbar. Let me pop out the webpage in my web browser. So here's an edit button. And this edit button does not mean you edit the R Markdown source locally. This button is there for your potential contributors.

So when your reader clicks this button, it will take them to the R Markdown source document on GitHub. As I said, this repository is on GitHub. For example, it will be very helpful if your reader finds a typo in your book: he or she can just fix that typo, commit the change on GitHub, and send you a pull request.

In case you are not familiar with pull requests, let me just show you an example of pull requests. So basically a pull request is that you can fork other people's GitHub repository to your own account and make some changes and then send the changes back to the original repository.

So here is the repository of the book R for Data Science by Hadley and Garrett. And you can see many people have contributed to that book. And I guess the most common pull requests are the ones that fix typos. And you can see many of the pull requests have been merged.

Let me show you what the changes look like. So basically when you read a book, if you find some typos, you can just click that edit button, make the changes, and submit them back to the original repository. And when the author sees your changes, he can just click a button called Merge pull request to merge your changes back into the main repository.

So this little edit button, I believe it will be very helpful if you put your book in a public place.

Hosting and sharing books

We have a website called bookdown.org that allows you to host your book online for free. And when other people read your book and find some problems or they want to make suggestions, they can just hit this edit button and it will take you to the R Markdown source. And you can make whatever changes you want and send the changes back to the authors.

So that would be a very interesting way to provide feedback or help the authors write their books. So that is what I meant in the beginning. The books are often written by a small number of authors, but they should attract attention from many, many other readers. And these readers can contribute to your book.

And so the last button on the left on the toolbar is a download button. So as I said, the Gitbook style is basically a series of HTML pages, and you can download the book in other formats, for example, PDF and EPUB.

So let's go back to the build tab again. So you can build this book using the Gitbook format. You can also choose the PDF book format to build the same R Markdown documents into a PDF document.

So basically, bookdown will just call the pdf_book format and LaTeX to compile these R Markdown files into a book like this. You can see this is a PDF book. We have a table of contents. We have different chapters. We can have floating figures and tables in the book. So that's the PDF output.

And we also have the EPUB ebook. We can render the book to EPUB. And since I'm using Mac, it opened the ebook in iBooks. So still the same content, different output formats.
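Outside the RStudio build tab, the same three outputs can be produced from the R console. The sketch below is roughly equivalent to choosing each format from the build menu, assuming the working directory is a book project containing index.Rmd and that LaTeX is available for the PDF; it does nothing if bookdown or index.Rmd is missing.

```r
# The three output formats shown in the demo.
formats <- c("bookdown::gitbook", "bookdown::pdf_book", "bookdown::epub_book")

# Only render when bookdown is installed and we are inside a book project.
if (requireNamespace("bookdown", quietly = TRUE) && file.exists("index.Rmd")) {
  for (fmt in formats) {
    # Compiles all chapters into one book in the given format.
    bookdown::render_book("index.Rmd", output_format = fmt)
  }
}
```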

And another cool thing about this multiple output format is that pretty much all the features will work in all the output formats. For example, when you have cross-references of chapters or figures and tables, all these cross-references will work in EPUB and HTML and PDF. So you don't have to worry too much about your possible output format.

So that's the PDF and EPUB output. And finally, on the Gitbook toolbar on the right, there's a share button. So you can click these buttons to share the link of your book on social media like Facebook or Twitter.

Starting a book from scratch

So that's the bookdown demo. Because this is an existing demo, people may ask what if I want to start from scratch. And I can also show you how you can just start from scratch to write a book.

So basically you can create a project from RStudio: File, New Project. And you can start with a new directory, an empty project. Let me create a project under the documents directory. Let's name the book directory testbook.

So now this is an empty project and I can put a number of R Markdown files in this project. Let's say the title of the book is An Awesome Book. So you create a new R Markdown document and you just save that as index.Rmd in this project.

And then, in case you have forgotten, there are two options that you need to set. One is the output format. Let's use bookdown::gitbook as the output format. And the other is the site option, which has to be bookdown::bookdown_site.

So you save this. Because this is a new project, let me restart this project so that RStudio can recognize this project as a book. Let me open it again.

So still the testbook project. And if you open it again, now RStudio can recognize, OK, this is a book project. So it will show you a build tab and now you can build this single document into a book, into the Gitbook format.

You may see that the chapter number is weird here, because I should have used the top-level section header, which means one single hash. Let's build the book again. OK, so now we've got one chapter.

So it's pretty simple to start a book, right? You just set these two options. And if you want to add more chapters, you can simply add other chapters. So another R Markdown document. Let's save it as, for example, chapter2.Rmd.

So a chapter title should start with one single hash. Let's use the chapter title Hello World. Let's just leave two paragraphs here. Another chapter, let's build the book again. So we have two RMDs, so now we have two pages in the book. So that's how you can start a simple book.
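A sketch of that second chapter file in R. The directory under `tempdir()` and the paragraph text are placeholders; the point is only that a chapter title is a top-level header with a single hash.

```r
# Hypothetical location; the talk uses a project directory named "testbook".
book_dir <- file.path(tempdir(), "testbook")
dir.create(book_dir, showWarnings = FALSE, recursive = TRUE)

# A second chapter: the title is a top-level header (one hash).
chapter <- c(
  "# Hello World",
  "",
  "First paragraph.",
  "",
  "Second paragraph."
)
writeLines(chapter, file.path(book_dir, "chapter2.Rmd"))

# Count top-level headers in this chunk-free example:
# each one starts a new chapter in the book.
n_chapters <- sum(grepl("^# ", chapter))
```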

Just to repeat, these are the two options that you need to set. So that's a simple project. Let's go back to the bookdown demo project again.

Key bookdown features

There are some details that I didn't mention, and I want to mention them here. So one thing that I mentioned earlier is that you are able to number your figures and tables. And the way to do that is that the code chunks have chunk labels. For example, I have the label nice-fig for this code chunk.

And then if you are using bookdown, you will be able to refer to this figure using the syntax \@ref() with the figure label in the parentheses, which is fig: followed by the chunk label, for example \@ref(fig:nice-fig). So that's how you can refer to figures. And for tables, it's the same.

So see, you have this in the RMD source document, and you have the actual figure number in the output. And your figure is numbered here, figure 2.1.
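To make the syntax concrete, the sketch below assembles a tiny chapter as an R character vector: a code chunk labelled nice-fig with a caption, and a sentence that refers to it. The chapter title, sentence, and caption text are illustrative; a caption (fig.cap) is included because bookdown only numbers figures that have one.

```r
# Build the chunk fence programmatically to keep this example easy to embed.
fence <- strrep("`", 3)

chapter <- c(
  "# Figures",
  "",
  # In the .Rmd file this reads: See Figure \@ref(fig:nice-fig) ...
  "See Figure \\@ref(fig:nice-fig) for a scatterplot.",
  "",
  # Chunk label nice-fig plus a caption, so the figure gets a number.
  paste0(fence, "{r nice-fig, fig.cap='Here is a nice figure!'}"),
  "plot(pressure)",
  fence
)

# Print the chapter source as it would appear in the .Rmd file.
cat(chapter, sep = "\n")
```

For a table, the reference prefix would be tab: instead of fig:.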

And there are many other features. And you can know more about these features from the bookdown documentation, which you can find on bookdown.org. So this actually will be a printed book in the future. Let me just walk you through some features that I think are very cool.

So there's basic Markdown syntax in chapter 2. And I'm not going to repeat these features. And one thing I want to mention is that you should be able to write math expressions in your book. So you can see some math equations there. So that's just the LaTeX syntax. Basically, you just write your math expressions in a pair of dollar signs.

So talking about math, there is actually an add-in in bookdown called Input LaTeX Math. You can get that from the RStudio toolbar. And this add-in helps you write your math content in a visual way.

So, you know, personally, I really hate reading a full screen of backslashes, especially when I type the math. You know, it's very easy to screw up the math content if you cannot see the math expressions visually.

So this RStudio add-in can help you type the math expressions. And it also shows you the LaTeX source of the math content. So when you are done, you can click done, and you will see the math equation in your R Markdown source document. And you just add a pair of dollar signs, and there will be a math equation there. So that's the math.

And you can embed R code chunks. You should be very familiar with that. You can number figures and tables and cross-references. And you can also have some custom blocks. Like sometimes you may want to write some notes or warnings or information in your book. So you can use these custom blocks. I'm not going to show you the detailed syntax, but I just want to let you know that this is a possibility.

And I mentioned HTML widgets and Shiny apps. So when the output format is HTML, you will see the HTML widgets live in your book.

For example, I'm using an HTML widget package called DT here, which can convert your data frames and matrices into a table like this in your book. So basically, you can print a data frame in multiple pages. You can sort the table interactively. You can search in the table. So there are many, many other HTML widgets packages that you can use. And this is just one of the examples.
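A minimal sketch of such a table, as it might appear inside a code chunk in a chapter. The `iris` dataset is used only as a stand-in for the data frame shown in the demo, and the snippet falls back to NULL when DT is not installed.

```r
# Create an interactive, paginated, sortable, searchable table from a data frame.
widget <- if (requireNamespace("DT", quietly = TRUE)) {
  DT::datatable(iris)
} else {
  NULL  # DT is not installed
}

# In an R Markdown chunk, printing the widget embeds it in HTML output.
widget
```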

And the same goes for Shiny apps. When readers come to your book, they can interact with certain content in it. For example, I embedded a Shiny app here. The app has been published to shinyapps.io, so I just include the URL of the app in the book. So now this is actually live. So you can zoom in the map. You can interact with the map. You can see different tabs in this Shiny application.

You can see the source data or other tabs. So that's how you can interact with Shiny apps in the book. And when the output format is not HTML, this will just be a screenshot.
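One way to include an app by URL from a code chunk is knitr's `include_app()` function; whether the demo uses exactly this call is an assumption. The URL below is a placeholder, not the app from the talk.

```r
# Hypothetical URL of a Shiny app published to shinyapps.io.
app_url <- "https://example.shinyapps.io/my-app/"

# Embed the app in the page at a fixed height; NULL if knitr is unavailable.
app <- if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::include_app(app_url, height = "600px")
} else {
  NULL
}
```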

Customization

Let me talk a little bit about customization. So actually these output formats are highly customizable. I just want to show you one simple example. Let's just use LaTeX output as an example.

I'm not sure how many people have used the LaTeX document classes from KOMA-Script. The LaTeX document class is called srcbook, and you may need to install a LaTeX package. But anyway, I just want to show you how you can change the style of the output, just using some simple options.

Oh, sorry. It's not src. It's scr, so the class is scrbook. Let's build this book to PDF again. So it will be using KOMA-Script to typeset your book. So you can see now the style is different. And just with a very simple change, you can totally change the appearance of your book.
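The exact option edited in the demo is not visible in the transcript. One plausible way to make this change, shown here as a sketch, is the documentclass field in the front matter of index.Rmd; the title is a placeholder.

```r
# Front matter selecting the KOMA-Script book class for PDF output.
pdf_front_matter <- c(
  "---",
  'title: "A Minimal Book"',        # placeholder title (assumption)
  "site: bookdown::bookdown_site",
  "documentclass: scrbook",         # needs KOMA-Script in your LaTeX setup
  "output: bookdown::pdf_book",
  "---"
)

# Print the front matter as it would appear at the top of index.Rmd.
cat(pdf_front_matter, sep = "\n")
```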

And of course there are a lot of other things that you can customize. For example, for the Gitbook style, you can choose which buttons to show there. For example, you may disable this sharing button. And you can hide this download button. Or set the font size or font family in advance. There are many, many options for you to customize the book.

Publishing and wrapping up

Okay, one last thing I want to mention about bookdown is that there's a function called publish_book. So after you have compiled your book, you can call this publish_book function to publish your book to this website, bookdown.org. So currently it's free for you to host your book. And just one simple call of this function and your book will show up here.
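A sketch of that call. The book name is hypothetical, and the call is guarded so it only runs in an interactive session, since publishing requires an account to be set up first.

```r
# Hypothetical name under which the book would appear online.
book_name <- "testbook"

# Publish the already-compiled book; skipped in non-interactive runs.
if (interactive() && requireNamespace("bookdown", quietly = TRUE)) {
  bookdown::publish_book(name = book_name)
}
```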

So now we have a number of books like the book for bookdown and R for Data Science, Efficient R Programming, R Programming for Data Science, and a lot of other books. And if you want to learn from other people's examples, you can start from here. And there are detailed instructions on how you can get started with RStudio and bookdown.org.

Oh, another thing I should have emphasized is that since you have multiple chapters, and just to save you some time compiling the book, the default behavior of clicking the Knit button on the RStudio toolbar is only to preview the current chapter. So that means if you are in this chapter and you click the Knit button, it will only compile this chapter.

And sometimes this can be confusing because if you only compile this chapter, other chapters may or may not work. They may not have been compiled. So in that case, you shouldn't navigate to other chapters because their content may not be correct. So this Knit button is only for preview purposes. So if you click this button, it only compiles this single chapter. So that may save you some time if you have some time consuming computation in other chapters. If you want to compile the whole book, always use the Build button here. This will compile all the chapters.

So just one thing to notice about the RStudio IDE. Yeah, that's pretty much what I have today. And if you go back and try out the bookdown package by yourself, you may run into certain issues that you don't understand. And just don't worry because I just released the first version of bookdown to CRAN yesterday. So I will not be surprised if there are still some bugs or issues.

So if you run into these issues, please just file them to the GitHub repository. Or you can ask questions on Stack Overflow with the tag bookdown. And lastly, if you can't remember all the things I talked about today, you can just go back and check out the website bookdown.org.

And if you want to send me any feedback, you can either let me know on Twitter or GitHub or find my email on my homepage. Or you can contact RStudio. I want to thank you very much for being here.

Q&A

One person asked, how does one reorder the chapters? Okay, that's a very good question. So you have multiple RMD files here and the default order is just the natural order of the filenames. That's why I named these RMD files 01, 02, 03. If you want to reorder these chapters, there is a configuration file called _bookdown.yml. There you can have an option named rmd_files, and you basically provide your own order of these RMD files using this rmd_files option.
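A sketch of such a configuration file, written from R into a temporary directory. The chapter filenames are hypothetical; here the second chapter is deliberately listed before the first to show a custom order.

```r
# Contents of _bookdown.yml with a custom chapter order.
config <- c(
  "rmd_files:",
  '  - "index.Rmd"',
  '  - "02-second.Rmd"',  # hypothetical filename, moved ahead of 01
  '  - "01-first.Rmd"'    # hypothetical filename
)

# Write to a temporary location; in a real book this file sits next to index.Rmd.
config_file <- file.path(tempdir(), "_bookdown.yml")
writeLines(config, config_file)
```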

Someone asked, can cross-references be forward references? And the answer is yes. It doesn't matter where your figures or tables are. You can simply use the chunk label with the syntax that I mentioned, \@ref(). You may generate a figure later in the chapter and refer to it earlier in the chapter.

And the next one is, can you use Python with bookdown? And the answer is also yes. So to use Python, instead of three backticks with r, you just use three backticks with python. Actually this is not perfect at the moment and we will improve this in the future.

So yeah, actually besides Python you can use many other languages. For C++ you can use Rcpp, you can use C code, you can use Fortran. And also if you are familiar with Bash, you can use Bash. So all these features come from knitr actually.
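Since these language engines come from knitr, you can ask knitr which ones it knows about. The sketch below checks for the engines mentioned in the answer; the variable names are illustrative, and whether an engine actually runs also depends on the corresponding interpreter or compiler being installed.

```r
# Names of all language engines registered in knitr (empty if knitr is missing).
engines <- if (requireNamespace("knitr", quietly = TRUE)) {
  names(knitr::knit_engines$get())
} else {
  character(0)
}

# Engines mentioned in the talk; a chunk uses one by naming it in the header,
# e.g. {python} instead of {r} after the three backticks.
wanted <- c("python", "Rcpp", "c", "fortran", "bash")
available <- intersect(wanted, engines)
print(available)
```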

Alright, the next question is regarding math: how to align and label equations in R Markdown or bookdown? I believe you can use the align environment. So even if this is converted to non-LaTeX output, for example HTML output, the equations should be preserved. I'm not sure if I'm typing the correct syntax here, but I think the align environment is in LaTeX and you can still use that for HTML output as well.

And for equation labels, you can just use the \label syntax. Actually this does not work very well for the HTML output. If you label an equation, you can only refer to it in the current chapter. You cannot refer to an equation in another chapter.
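As an illustration of the align environment mentioned in the answer, here is a small example with two equations aligned at the equals sign. The formulas (sample mean and sample variance) are chosen only for illustration and are not from the talk.

```latex
\begin{align}
\bar{x} &= \frac{1}{n} \sum_{i=1}^{n} x_i \\
s^2 &= \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
\end{align}
```

The ampersand marks the alignment point in each row, and the double backslash ends a row.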

The next question is, is the \@ref() syntax generic to R Markdown or only available in bookdown? And the answer is, it's only available in bookdown.

Actually there's one thing I forgot to mention. If you don't want to write a whole book with bookdown, but you want the feature of numbering figures and tables and cross-references, you can actually leave out this site option and set the output format to be html_document2 or pdf_document2 or word_document2. So these formats are designed specifically for the feature of numbering and cross-referencing figures and tables. For example, the word_document2 format is almost the same as the word_document format in R Markdown, but it just adds the features of cross-references and figure numbering.

The next question is, is the Tufte book format supported? And I'm very glad that you guys are asking fantastic questions. And the answer is yes, Tufte books are supported. For example, if you want the HTML output for Tufte books, you can just use the tufte_html_book format. If you want PDF, I believe it's simply tufte_handout2 or tufte_book2. Yeah, so you can use Tufte.
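A sketch of rendering a single document with one of these formats from R. The filename report.Rmd is hypothetical, and the render call is skipped unless bookdown is installed and the file exists.

```r
# Single-document formats from bookdown that add numbering and cross-references.
single_doc_formats <- c(
  "bookdown::html_document2",
  "bookdown::pdf_document2",
  "bookdown::word_document2"
)

# Hypothetical standalone R Markdown document (no site option needed).
report <- "report.Rmd"

if (requireNamespace("bookdown", quietly = TRUE) && file.exists(report)) {
  # Render to HTML with numbered figures and tables.
  rmarkdown::render(report, output_format = single_doc_formats[1])
}
```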

Next question is, is bookdown a replacement of Gitbook or a competitor? I'm not sure how to answer that, but the only thing I can tell you is that Gitbook is one of the output formats in bookdown, and the major difference between Gitbook.com and bookdown is that the Markdown renderer is different. So we use Pandoc to render Markdown documents, and they use, I don't remember, their own or a simpler Markdown renderer. So I believe there are many features that are missing there. For example, they definitely don't have the feature of numbering figures or writing bibliographies.

Next question is, can you comment on differences between this approach and using the readthedown RMD template from the rmdformats package? Yeah, I saw the readthedown format a while ago, but I haven't tried it, and it looks very attractive to me. But I need to read more about it to tell you the exact differences.

Next question, is there a complete list of YAML options documented somewhere? And the answer is yes, it's documented in the bookdown documentation in section number 1.4.

And let me pick a shorter question. What is the possibility of using this in conjunction with something like IEEE journal data templates so that we can write conference and journal articles? That's also a very good question. And the answer is that instead of using bookdown, there's a package called rticles, also under the RStudio account on GitHub. And there are several examples of using RMD for journal articles.

Probably I can show you a very quick example of writing a paper for the Journal of Statistical Software. Let's create a new document under Documents. So this is just an RMD document, and it's using the rticles package to compile this document. And now you can see you have a journal article. And the only thing you have to do is write a LaTeX template. And as I said, there are many examples here. So you can probably study these templates and then use the IEEE LaTeX style. And then you can combine RMD with these journal styles.
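A sketch of compiling such a paper from R, assuming the rticles package provides the `jss_article()` output format for the Journal of Statistical Software. The filename paper.Rmd is hypothetical, and nothing is rendered unless the packages and the file are present.

```r
# Hypothetical R Markdown source of the paper.
paper <- "paper.Rmd"

# Render only when the required packages and the source file are available.
can_render <- requireNamespace("rticles", quietly = TRUE) &&
  requireNamespace("rmarkdown", quietly = TRUE) &&
  file.exists(paper)

if (can_render) {
  # Produces a PDF typeset with the journal's LaTeX template.
  rmarkdown::render(paper, output_format = rticles::jss_article())
}
```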