
Rich Iannone || Making Beautiful Tables with {gt} || RStudio
00:00 Introduction
00:37 Adding a title with tab_header() (using Markdown!)
01:47 Adding a subtitle
02:48 Aligning table headers with opt_align_table_header()
03:48 Using {dplyr} with {gt}
06:03 Create a table stub with rowname_col()
07:35 Customizing column labels with cols_label()
09:45 Formatting table numbers with fmt_number()
12:10 Adjusting column width with cols_width()
15:39 Adding source notes with tab_source_note()
16:55 Adding footnotes with tab_footnote()
18:55 Customizing footnote marks with opt_footnote_marks()
19:10 Demo of how easy managing multiple footnotes is with {gt}
23:41 Customizing cell styles with tab_style()
27:07 Adding label text to the stubhead with tab_stubhead()
28:15 Changing table font with opt_table_font()
29:25 Automatically scaling cell color based on value using data_color()

With the gt package, anyone can make wonderful-looking tables using the R programming language. The gt philosophy: we can construct a wide variety of useful tables with a cohesive set of table parts. These include the table header, the stub, the column labels and spanner column labels, the table body, and the table footer.

It all begins with table data (be it a tibble or a data frame). You then decide how to compose your gt table with the elements and formatting you need for the task at hand. Finally, the table is rendered by printing it at the console, including it in an R Markdown document, or exporting to a file using gtsave(). Currently, gt supports the HTML, LaTeX, and RTF output formats.

The gt package is designed to be both straightforward and powerful. The emphasis is on simple functions for everyday display table needs.

You can read more about gt here: https://gt.rstudio.com/articles/intro-creating-gt-tables.html
And you can learn more about Shiny here: https://shiny.rstudio.com/

Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/

Content: Rich Iannone (@riannone)
Design & editing: Jesse Mostipak (@kierisi)
Transcript
This transcript was generated automatically and may contain errors.
So we've got gt, we've got palmerpenguins. Now we take the penguins dataset, pipe it to gt(), and in your viewer you get the table. Which is pretty sweet, you don't have to do too much just to get a basic table, right? So I'm just going to move this out a bit so we can actually see more of the table. Great, great. So say you wanted to make this look nice, like maybe add a title, because titles are nice: they tell you what's going on in the table. We can do that with tab_header(). And there's an option right there, you can see it in the pop-up help: title. Subtitle seems to be optional.
I like a subtitle sometimes. But I'll put in title. I'll put in "penguins dataset". Alright, so we can just run it like that as it is. But we have this cool thing called Markdown. So we can actually use that too. You just have to wrap this with md() and you can do cool things with Markdown. And now we can have these backticks to say this is going to be in a code font. And there we go. Instantly, you can just work iteratively and see the result on the other side. And this is kind of nice because you can learn how to make tables this way. And then when it comes time to publish these tables in an R Markdown document or Quarto document, you've already practiced up. This just becomes second nature. So yeah, tab_header(), that gives you titles. Let's actually put a subtitle in because that's kind of sweet.
Adding a subtitle with markdown
Subtitle, we want to use markdown because it's a good thing. So what's something we can say about the penguins dataset? Three years, three years of data. That's not bad. On penguins on three islands. We can make this long because the subtitle is pretty small. So we're using markdown. Let's not let that go to waste. So let's actually accentuate something here. Not bad. Go with two asterisks on each side of three. We can make that bold. And just for fun and just to prove that works, we can do this in italics. So let's run this whole thing. Not too shabby. So we get like the markdown treatment here. It just appears.
And by default, it's just centered. I mean, that's a cool thing. We can also make it left aligned if you want that. We can get to that a little bit later. Never mind, let's do it now, because no time like the present. So opt_align_table_header(). Okay, great. So now we just have to say align equals "left" for that nice left-aligned look. Actually, that looks more pro than before, for some reason. I really like it. Okay. So that is how to make a title, like a header, on a table.
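For reference, the header steps described so far might look roughly like this in code. It is a sketch reconstructed from the narration rather than a copy of what was typed on screen: the exact title and subtitle wording, the object name penguins_tbl, and the native pipe (which needs R 4.1 or later) are assumptions.

```r
library(gt)
library(palmerpenguins)

# Raw penguins data piped into gt(), then given a Markdown title and
# subtitle. md() tells gt to interpret the text as Markdown.
penguins_tbl <-
  penguins |>
  gt() |>
  tab_header(
    title = md("The `penguins` dataset"),
    subtitle = md("**Three** years of data on penguins on *three* islands")
  ) |>
  # The header is centered by default; this left-aligns it.
  opt_align_table_header(align = "left")
```

Printing penguins_tbl at the console should show the table in the viewer, as in the video.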
Building a summary table with dplyr
So now we're thinking, okay, we have all this data. That's a lot for a summary table. Let's actually just make a summary table, right? This is a lot to chew on. It's basically just raw data in lots of ways. So the cool thing is we have things like the tidyverse, like dplyr, tidyr, all that stuff. I'm just going to load in the tidyverse. Great. And we can do stuff to penguins before we put it into gt. So we can do all the fun stuff like group_by(), we can group by maybe species. Let's do that. Great. And then we'll do a summarize. Here's the thing, I don't even know what to summarize. How about the mean body mass and, just means, we'll just do some means, means everywhere. Fine. We'll just do one, bill length. And then we'll just see if this even works with funs. Whoa, whoa. Okay, good. Going with it.
Maybe. So we have to do, okay, so let's move this back to where it was. And there's an NA. Oh, yeah, it's the mean. It's going to be like this, where we have na.rm. I'm totally not sure if that's going to work. That totally works. Okay, good. Okay, so we're doing this. So we can do mean. I'm not sure how you do multiple functions. I think it might be list. I'm not sure if you just put, you know what, I'm not going to bother with that. I'm just going to put a bunch of vars in here. And away we go. Okay, so we'll do bill length, bill depth. Great. Let's do flipper. Yeah, we'll make it a bunch of things like that. I mean, I could probably just give the indices, like three to whatever, but I like to make things hard for myself. Okay, so flipper_length_mm. Okay, and now body_mass_g. Great, I think that'll do it.
Excellent. So now we have a summary table. This is kind of cool. So we have our summary table, which is up here, it's a little bit bigger. And we have species, and we have all the data involved. The cool thing about gt is it allows you to put things in a stub. These are kind of like labels for a row. So we can use something called rowname_col. And we can put species in there, that column name. And the great thing about that is that it just sets it off, puts it to the left, and sets it off with an automatic border. And there's nothing above that either, which you can change. But it's nice, because it looks more like a table than it does when that column isn't separated out like that. So we do that with rowname_col equals "species".
So this is still cool. We can actually change the title. So it's like "summary of the penguins data set", which is actually a little more interesting. And this is no longer valid, I don't think. Three years of data on, I would say maybe "uses three years of data on penguins on three islands". Maybe that's a bit more correct to say. So now if you run this whole thing, let me make this bigger. Okay, great. If you run this whole thing now, we get this, "uses three years of data", and you can always change things like the title, like so. Run it again, pretty quickly. So now we see the beginnings of a nicer table. The only problem we have is the column labels look not so great. I mean, they're great for your tables in dplyr. But they're not so great for summary tables you're going to present to other people. So we can make those nice.
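The summary step and the stub could be sketched as follows. Note one substitution: the narration suggests an older dplyr idiom (something like vars and funs), while this sketch uses across(), which is the current dplyr approach and an assumption about equivalence, not what was typed in the video. The object names are also made up for illustration.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

# Mean of the four measurement columns per species. na.rm = TRUE drops
# the missing values that would otherwise turn the means into NA.
penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

# rowname_col moves the species column into the table stub.
summary_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  tab_header(
    title = md("Summary of the `penguins` dataset"),
    subtitle = md("Uses **three** years of data on penguins on *three* islands")
  ) |>
  opt_align_table_header(align = "left")
```

The result has one row per species (Adelie, Chinstrap, Gentoo) and four columns of means, with the species names set off in the stub.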
Customizing column labels
We can totally do that with something called cols_label(). Okay, so if we do bill_length_mm, there we go, and then put an equals sign, then in quotes, the thing you want to change it to, that works. And you just have a bunch of those set off like it's a list, essentially, without having a list around it. So we can just say "Bill length". And the cool thing is we can do things like this and put in mm below with a br tag. Let's see if I'm right about that. So I'm going to tie this all together with that. No, okay, not quite. The thing we have to do is put md() around it, because the br tag is HTML. And Markdown recognizes HTML. So that's a good pairing. There we go. So now we have "Bill length" and "mm". We probably want a comma, as we often see. A comma would be nice to separate those things. So now we have "Bill length, mm". And it looks a bit nicer. It's a bit more narrow, which is great. And we can do the same thing for the rest.
So we just have to copy this as a template, work down, change it. So now I'm going to change this to depth here. Depth. And then this becomes depth. Same sort of things apply. Add the comma, rerun. And the next one's going to be flipper length. Great. And one more, body_mass_g. Okay. Body mass, comma, br, and then g. Great. Let's see how this looks after writing all this code. Not bad. It's looking pretty good. It's looking more table-like. A little bit less wide. So these are just some cool things you can do.
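A sketch of the relabeling step, assuming the same summary table as before (rebuilt here so the block runs on its own). The label wording is inferred from the narration; labeled_tbl is a hypothetical name.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

# Each argument is `original_column_name = new_label`. The <br> tag is
# HTML, so the label has to be wrapped in md() to be rendered as a
# line break instead of literal text.
labeled_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  cols_label(
    bill_length_mm = md("Bill length,<br>mm"),
    bill_depth_mm = md("Bill depth,<br>mm"),
    flipper_length_mm = md("Flipper length,<br>mm"),
    body_mass_g = md("Body mass,<br>g")
  )
```

Relabeling changes only what is displayed; later functions still refer to the columns by their original names.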
Formatting numbers with fmt_number()
The numbers, we probably don't need that many digits after the decimal point. That's a lot. So gt comes with a whole ton of these formatting functions. They all begin with fmt. So you see a few of them here. The thing you really want here is fmt_number(), which has just a plethora of options that you can use. So we'll go with that. And just by using the defaults, it's two digits after the decimal place all the way down. And it operates on columns. So you need to have that, absolutely. And the cool thing is you can borrow things from dplyr. You can use everything(). Because all the columns here that are in the data are numeric. So you can totally use that. Great. And then the cool thing about it is that if you use fmt_number(), it uses these digit separators. You see commas. And that just gets put in where it needs to be put in automatically. That's kind of cool.
And another cool thing is that this might not be what you want. Body mass, g. Maybe you want kilograms. So we can actually do that. We can put another formatter function in, which is fmt_number() again. And we can target just the body mass column. So to do that, we use just the original column name like this, body_mass_g. Like so. But now we're going to use options. We're going to do something a bit more interesting. We're going to use an option called scale_by. In this case, we want kilograms. So we're going to scale it by 1 over 1000. So we're going to divide by 1000. Cool. So let's do that. Now we have it in kilograms. So we're going to go back and change the label to be kg. And this is kind of nice because it takes all those digits and compresses them down to something that's a bit more readable. We have fewer digits overall. And if you're starting with g, you might as well go to kg. It's not a stretch of the imagination. So you can actually scale numbers with the formatters in gt, which is kind of cool.
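The two formatting calls described above might be written like this. It is a sketch under the same assumptions as earlier (summary table rebuilt for self-containment, hypothetical object name formatted_tbl).

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

formatted_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  # Defaults: two decimal places and digit separators, on every
  # (numeric) data column.
  fmt_number(columns = everything()) |>
  # A second pass targets only body mass and scales grams to kilograms.
  # The later formatter replaces the earlier one for this column.
  fmt_number(columns = body_mass_g, scale_by = 1 / 1000) |>
  # Keep the label in sync with the new unit.
  cols_label(body_mass_g = md("Body mass,<br>kg"))
```

Scaling happens only at display time; the underlying data still holds grams, which is why the label has to be updated by hand.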
Adjusting column widths
Now that we have this, we may notice a few things that are not great. For one thing, and this probably just bothers me, the width of these columns is not the same. Across the table, it's basically governed by the width of the largest label. In this case, flipper length makes the column much wider. But we can change that, we can change anything. We can use this function called cols_width(), right there. And what we can do is select the columns. I believe it's going to be okay. You can use tidyselect on the left side. So we can use this right here, we can just use this vars list, I think. Okay, so bring that down. Like this, because we just need the names themselves, not in quotes. Oh, I apologize. Actually, no, those are the original names. So we get this.
And then on the right side, we're going to put in a width value. So we can use this px() helper function if you want. It just means that any value put in there is going to be in pixels. It's the same thing as writing, in quotes, some number with px to the right of it. So let's put in 50 to see where it starts. And we'll just copy that down to each of these. So, let's see what it gives us. No, okay, so I got that wrong. Instead of equals signs, we should be using this [the tilde]. Great. Okay, let's try this out. Okay, not bad. I mean, a little bit small, but the cool thing is we can stretch this out. So you can make this 80, for instance. And then we'll get it to a width that's kind of nice. Oh, not quite right. Let's make this even bigger. 120, I think. There we go. And we might even bring it down a bit. It's all iterative. You're going to adjust things as you go. Luckily, it's pretty fast. There's almost no time cost. I think we missed a sweet spot here. I think it's going to be 110 or thereabouts, where things stop wrapping. Maybe 120 was good. Okay, I'm going to go with this. This is not bad.
So now we have all the columns, at least in the data part, being of the same width, which is kind of nice. It leads your eye in a nice way, since each of these numbers is spaced out. They have the same amount of space between them, which is kind of cool. And you can also do the same thing for the stub as well. You can just use, at the very end, everything(). And that means, in this case, everything else. And we can make that a bit bigger. I'm not sure where it's starting at. But if you just do 60, it's definitely going to change things. And I'm going the wrong direction. Maybe we want that to be 100. And there we are. So now we have our table with the widths that we want. So we're basically focusing on presentation here. And we're trying to make it look good, I guess. But these are tricks you can do pretty simply. And of course, you can use tidyselect here. If you have a statement that works nicely for all these four columns, that works. You'd use that on the left side there and just use px(120) for all of them.
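The width adjustments could be sketched like this. The widths (120 and 100 pixels) are the values settled on in the narration; the rest of the setup repeats the earlier assumptions so the block is self-contained, and sized_tbl is a hypothetical name.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

sized_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  # cols_width() takes two-sided formulas: columns ~ width.
  # Note the tilde, not an equals sign.
  cols_width(
    bill_length_mm ~ px(120),
    bill_depth_mm ~ px(120),
    flipper_length_mm ~ px(120),
    body_mass_g ~ px(120),
    # Placed last, everything() catches whatever is left over,
    # which here is the stub.
    everything() ~ px(100)
  )
```

As the narration notes, a single tidyselect expression on the left-hand side that matches all four data columns would do the same job in one line.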
Adding source notes and footnotes
So that's that. So say you want to add some notes. We added a header section. But what about a footnote section? We can definitely have that. And there are two ways to put stuff in the footer. We can use tab_source_note(). Great. These are just general notes that you pop in at the bottom. And a good thing we can put in here is where the data set comes from. So "Dataset is from the palmerpenguins package". And we can just say "R package" and put a period. Bring in a little bit of space. So I'm going to just line break here. And we can try that out. But I know it's going to be a mistake. And the mistake is that we're using Markdown when really we didn't declare it. So in all these parts, all these opportunities to put in text, we can definitely use the md() helper function. And it just means "interpret as Markdown". And any output format you use will render that to look nice. And we might even accentuate R just like this. There we go. So that's a source note. Basically, it's a note that's not tied to any particular place in the table.
But we can do that. We can use tab_footnote(). And this takes two arguments. In the previous one, you just had to put in a note. In this case, you have to put in the note, plus what location the note is referring to. So let's add in the footnote first. And I guess now's a good time to talk about what we're going to actually point out. We can just say Gentoo looks to be the biggest penguin by most accounts. I mean, not so much in bill length and bill depth, but certainly flipper length and body mass. Gentoo reigns supreme, I guess, amongst these three. So we'll just say Gentoo is largest. I'll just say it's "the largest of the three penguins". Great. That's the note that's going to appear at the bottom. So where are we going to attach this to? I mean, the most logical place in my mind would be the label itself for Gentoo right here, in the stub. So in locations, we can use this huge number of location helper functions. And they all begin with cells. And you see the autocomplete gives us a guide as to which ones are available. In this case, we want cells_stub(). So it's going to appear a little bit lower. There we are. And if you just take a look on the right side, rows is the only argument. So in this case, we want to say rows equals "Gentoo". We just want to repeat the label that's in the stub. Okay, great. So let's run this. Great. And there's our first footnote. It's really great. So as footnotes often do, they have some sort of glyph for where the note is referring to. And there's the note right there. So the footnotes, the numbered footnotes, they appear above the source notes.
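A sketch of the source note and the first footnote, with the note wording taken loosely from the narration. As before, the summary table is rebuilt so the block stands alone, and noted_tbl is a hypothetical name.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

noted_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  # A source note is a general note, not attached to any location.
  tab_source_note(
    source_note = md("Dataset is from the `palmerpenguins` **R** package.")
  ) |>
  # A footnote needs both the text and the location it refers to.
  # cells_stub() targets stub cells; rows repeats the stub label.
  tab_footnote(
    footnote = "The largest of the three penguins.",
    locations = cells_stub(rows = "Gentoo")
  )
```

In the rendered table the footnote, with its mark, appears above the source note in the footer.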
Customizing footnote marks and managing multiple footnotes
And there's a cool thing, we can also change the glyphs themselves. We can use opt_footnote_marks(). And we can say marks could be something other than numbers, it could be letters. So we do that, we begin with a, and we can do capital letters. And going further, we could have a custom set of marks. So you might do star, double star. If you have any sort of set, it will just run through the set. So we do that as well. So you can totally change the marks to be whatever you want them to be, which is super nice. So that is the first footnote. I'm going to take this part out just to demonstrate the order of footnotes. So the cool thing about gt is that you don't have to manually manage your footnotes, you can just go nuts adding tab_footnote(), tab_footnote(), and gt will handle the ordering and all the other small details when it comes to footnotes.
So we can say things like, um, we can elaborate on what, say, flipper length is. I mean, we can add a footnote for flipper length. I don't know the details about the data set or how the data was acquired, but we can say flipper length was measured with a tape measure. It's a little bit silly here. Maybe it was, who knows, but this is maybe a little bit of fiction. Okay, so let's say that's true. "Flipper length was measured with a tape measure." So the note would appear right here on the column label itself, flipper length. So, cool, we can do that. We can say locations equals cells, and autocomplete is your friend here. It would be cells_column_labels(). This takes the columns argument. And again, we just put in the column name. So I'm going to type that all out. Columns equals flipper. Good. Autocomplete again. Okay, I'm going to run this. There we are. So where Gentoo was number one before, now it's number two, because that appears later, which is kind of cool. We didn't have to match that. We didn't have to say what the glyph was. gt does a sensible thing of indexing footnotes from top to bottom, left to right. So we don't have to worry about things like that.
And here's another really cool thing. Say, for instance, you wanted two footnotes on the same spot. Okay, so tab_footnote(). So we'll add an additional note about, this is very silly. I'm going to copy this whole thing because it's going to be on the same location. We'll say something like "the tape measure unfortunately suffered some frost damage". Because, you know, they were in the Antarctic. So, you know, things happen. So yeah, maybe you have to disclose everything. Okay, so now we have two notes on the same column label. So what's going to happen? Seems like some sort of conflict. But no, gt handles that. You can just run this. We get one and two. So it's basically two different notes. They're attached to this same cell. Then Gentoo is number three, which is great. So it'll handle multiple notes. And the cool thing is you can even have a list of locations. So we can have this note being applied to more stuff. So I can say cells_data(), or I think it's cells_body(). And we want columns to be the same column, flipper length. And basically what it's going to do now is add the glyph on the column label and all of the values in that column. So if we run this, you see that two is repeated down because, you know, the tape measure is compromised. So it's basically putting the glyphs on the actual values themselves, too. It's a bit silly, but I just want to show you that for notes, it's very flexible. It does the right thing most of the time. Nobody's ever complained about it, which is good news. And I think people actually like it because they didn't complain about it. So that is good. So footnotes work, is what I'm saying.
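The multiple-footnote demo might be reconstructed as below. The footnote texts follow the narration (including the admittedly fictional tape measure); the choice of lowercase letters for the marks is just one of the options mentioned, and footnotes_tbl is a hypothetical name.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

footnotes_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  tab_footnote(
    footnote = "The largest of the three penguins.",
    locations = cells_stub(rows = "Gentoo")
  ) |>
  tab_footnote(
    footnote = "Flipper length was measured with a tape measure.",
    locations = cells_column_labels(columns = flipper_length_mm)
  ) |>
  # One note, several locations: pass a list of location helpers.
  tab_footnote(
    footnote = "The tape measure unfortunately suffered some frost damage.",
    locations = list(
      cells_column_labels(columns = flipper_length_mm),
      cells_body(columns = flipper_length_mm)
    )
  ) |>
  # Letters instead of numbers; a custom set such as c("*", "**")
  # is also accepted.
  opt_footnote_marks(marks = "letters")
```

The calls are written in the order they were added in the video, not in display order; gt assigns the marks by position in the table, so the Gentoo note ends up last even though it was declared first.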
Styling cells with tab_style() and labeling the stubhead
So what if you wanted to highlight a certain bit of data in this table? Okay. We can totally do that, too. We can use tab_style(). Great. And it operates under the same sort of principles as tab_footnote(). I mean, it has locations. And we use the same location helper functions. So let's say, for whatever reason, we wanted to highlight the entire row of the Gentoo penguin, just the data. I don't know why, but let's say you wanted to do that. Okay. So we can do that. So we do cells_body(). Great. We'll use columns equals everything(). Great. But in this case, we want rows to be limited to just Gentoo. Okay. This looks odd, but it just means that we're going to target all the columns, but just the rows in those columns that correspond to Gentoo. And that, of course, is in the stub. When you have something in the stub, that's an additional advantage. You can use the rows argument more precisely. You can actually just use the labels, provided that they're unique. And most of the time they are. Otherwise, it'd be of a little bit less value, I'd say. Okay. Great. So we have that now. We have the cells_body() statement. Great. So now what are we going to do with that? We're going to actually add a style. Great. So we have a bunch of style helper functions. So basically, tab_style() uses two different types of helper functions. One in locations, one in style. In this case, it's singular, cell. And in this case, you want to fill the cells with some sort of color. There's a default. But I'm going to go with color. And I'm going to go with steel blue, if I can spell it. There we go. And let's just see this work. So what I'm expecting to see is the Gentoo row, just the data cells, will have that color in the background. And it worked. That's great. So now we have that right there. We can change the color. It accepts all color names, which is great. Light blue, I believe, is a color. That looks quite a bit nicer.
And we can also use hex colors, if you know them. So you can do, I don't know them. Let's try some random one here. Let's see if this gives us anything nice. Not bad. Not bad. I just guessed at that. Okay, that worked out pretty well. But say, for instance, you had too dark of a color, and you wanted to maximize contrast, like make the text white, for instance, you can totally do that. You can have multiple styles for the same location. So we can wrap these in a list. cell_fill() is that. And now we can use cell_text(), and change the color of the text to be white. I'll just tighten this up a little bit, so it looks a bit better. Okay, so in this case, I have the cell fill color, and then a text color, which is being changed. Great. Not too bad. So now we have better contrast, because we chose a darker background color. So that is that. So you can definitely do things like style certain cells. This applies to everything. You can style the header. You can style the footer. You can style the stub. Every single location here is stylable.
One more thing, just to complete this table: we have this top left corner here that's customarily empty. I mean, the default is that it's empty. But we can change that. There's a special function just for adding content to that top left corner. It's called tab_stubhead(). And it just accepts a label. And as we often do, we can use Markdown to make that nice. And in this case, it's kind of a wildcard. You can describe the columns. You can describe the rows. You can describe both. It's up to you. I'm just going to put in penguin, so I'm going to treat it more like a column. So I'm just going to say "Penguin species". Great. So I'm going to add that in. And there it is in the top left. So you can definitely use that once-empty corner. And you just have to remember the function name, tab_stubhead(). There's just the label argument, so it's not too difficult to use. You'll get it. And that's my nice table.
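The styling and stubhead steps might look like this in code. The colors (steelblue fill, white text) come from the narration; the Markdown in the stubhead label and the name styled_tbl are illustrative assumptions.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

styled_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  tab_style(
    # Several styles for one location go in a list. The style helpers
    # are singular (cell_*), the location helpers plural (cells_*).
    style = list(
      cell_fill(color = "steelblue"),
      cell_text(color = "white")
    ),
    # All data columns, but only the row labeled "Gentoo" in the stub.
    locations = cells_body(columns = everything(), rows = "Gentoo")
  ) |>
  # Fills the otherwise empty top-left corner above the stub.
  tab_stubhead(label = md("Penguin *species*"))
```

Because the location helpers are shared with tab_footnote(), the same cells_body() call could be reused to attach a note to the highlighted row.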
Changing fonts and using data_color()
And there are tons and tons of options. So say you wanted to change the table font. You can do that with this opt function called opt_table_font(). This is a pretty popular option because sometimes the default font is not what you want. And a great thing about opt_table_font() is you can use local fonts. So if you have Helvetica on your machine, you can totally use that. And it changes the look. But say you want something that is not on your machine, maybe part of Google Fonts, you could use google_font(). Then choose a font from their service. As long as you know the name, it's not too hard to do. So Montserrat is one of them. Now let's use that. Great. And it totally changed. And we have all sorts of options here. We have things like the weight, which can be changed. Usually these are numbered weights. So we can change it to be heavier, like say 600. There we go. So we definitely see that now. And you'd have to go to the Google Fonts page to see what weight ranges are available. But that can totally be used. There are options for styles. You can choose italic, oblique, things like that. Let's actually do that. Let's see if that works. There we go. So everything was changed to the italic version of the font.
So what I'm going to do is take out this tab_style() thing for now. I'm just going to comment it out because we're going to apply our style with data_color(). Okay. So data_color(). This is what I do every time I use data_color(). It's really hard to use because it requires that you use functions within the function, like function calls. And they're not even gt functions. They're functions from the scales package. So I would always use the help. I'm going to go to data_color like so. And then of course it tells you what it does. And like all the help files, it gives you some visuals showing you what it does, which are paired to the examples. But yeah, my point is I would go to the example that works and then just take it and go from there. So in this case, what we have here is it's working on a single column. And we have the colors argument. It requires a function call. Typically it's something from scales. And the help tells you about this. But I'm just going to copy this in because it's much easier to show than it is to explain it beforehand. And we're going to adapt this to our particular example here. So if you look at our viewer again, we have these ones right here. We have columns right there. So I'm just going to say columns are going to be the same as before. For the sake of brevity, I'm going to set this to bill length and bill depth. Great. So now those will basically be colorized according to the values in the columns. So that's great. So we use col_numeric() to create our set of colors that we're going to interpolate through. So the range is going to be the domain. And we can have this being NULL. And that just sets the domain to the range of the values. But I kind of like actually setting the domain. So I'm going to use c() with 30, and then 50 for the upper part of the range.
And then I'm going to set maybe three colors, because it's going to be 30, 40, and 50. So I'm just going to choose red, orange, green, and omit the blue. Great. Great. So let's see if this actually works. The key thing is, will this actually work? Okay. It does sort of work. The reason why it doesn't work here is because this is actually out of range. So if you want to have different domains, we can definitely do that with multiple calls of data_color(). Or we can set this to NULL. Like that. And again, this domain is actually inside the scales::col_numeric() function call. So a little tricky. But when it works, it works awesomely. This will be nice, because this will treat each column separately.
Actually, no, it treats it holistically. So the point being, if you just had, say, bill length here, this would be much nicer. Then we can actually choose the domains that make more sense for the particular type of data we have in there. Let's do 30, 50. There we go. So we can see here that the red is not really going to be in the picture, because we're pretty far away from 30. We're getting closer to orange, which is sitting at 40. And then green sits at 50, which we do see quite a bit of. But now we have a template. We can actually do this for each and every column, just tweaking the values of domain. So for instance, we want to do this for bill depth. So we can change this to be something like 10 to 20. And we're going to see that as well, basically. There we are. So on and so forth, it does that. You can even choose super dark colors. If you want to have brown, for instance, I think it'll actually reverse the text. It'll do that for you. It'll essentially look at contrast and say, oh, this background color is a bit too dark. It's going to flip the text color around. And you can also disable that, but it's just sort of there.
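A sketch of the font change described above. The Montserrat font, the 600 weight and the italic style come from the narration; the "Helvetica" fallback in the list and the name font_tbl are assumptions added for illustration.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

font_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  opt_table_font(
    # google_font() refers to a font served by Google Fonts; the
    # local font listed after it acts as a fallback (an assumption).
    font = list(google_font(name = "Montserrat"), "Helvetica"),
    weight = 600,
    style = "italic"
  )
```

Building the table object does not fetch anything; the Google font is only requested when the HTML output is displayed, so rendering it as intended needs network access.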
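The per-column coloring could be sketched as follows. One deliberate difference from the narration: the video passes the scales function through the colors argument, whereas this sketch uses fn, which is the argument name in more recent gt releases. Treat that as an assumption about the installed version; with the older release shown in the video, colors = would be used in the same position. The name colored_tbl is hypothetical.

```r
library(gt)
library(dplyr)
library(palmerpenguins)

penguins_summary <-
  penguins |>
  group_by(species) |>
  summarize(across(bill_length_mm:body_mass_g, ~ mean(.x, na.rm = TRUE)))

colored_tbl <-
  penguins_summary |>
  gt(rowname_col = "species") |>
  # One data_color() call per column, so that each column gets a
  # domain that suits its own range of values.
  data_color(
    columns = bill_length_mm,
    fn = scales::col_numeric(
      palette = c("red", "orange", "green"),
      domain = c(30, 50)
    )
  ) |>
  data_color(
    columns = bill_depth_mm,
    fn = scales::col_numeric(
      palette = c("red", "orange", "green"),
      domain = c(10, 20)
    )
  )
```

Values outside the stated domain are the failure mode described in the narration, which is why a single call covering both columns with a 30 to 50 domain did not color the bill depth cells properly.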

