Resources

Rich Iannone || New features in {gt} 0.6.0! || RStudio

00:00 Introduction
00:18 sub_missing()
03:51 Markdown formatting in sub_missing()
04:51 sub_zero()
07:34 sub_small_vals()
13:08 sub_large_vals()
16:25 Final thoughts

A new version of the R package {gt} has been released! We are now at version `0.6.0` and there are now even more features that'll make your display/summary tables look and work much, much better. Let's run through some of the bigger changes and see the benefits they can bring!

New functions for substituting cell data

We now have four new functions that allow you to make precise substitutions of cell values with perhaps something more meaningful. They all begin with `sub_` and that's short for substitution!

sub_missing() (formerly known as fmt_missing())

Here's something that's both old and new. The sub_missing() function (for replacing NAs with... something) is new, but it's essentially replacing a function that is old (fmt_missing()). The missing_text replacement of "---" is actually an em dash (the longest of the dash family). This can be downgraded to an en dash with "--" or we can go further with "-", giving us a hyphen replacement. Or, you can use another piece of text.

If you're using and loving fmt_missing(), it's okay! You'll probably receive a warning about it when you upgrade to {gt} 0.6.0 though. Best to just substitute fmt_missing() with sub_missing() anyway!

sub_zero()

The sub_zero() function allows for substituting zero values in the table body.

sub_small_vals()

Next up is the sub_small_vals() function. Ever have really, really small values and really just want to say they are small? With sub_small_vals() we can reformat smaller numbers using the default threshold of 0.01. Small and negative values can also be handled but they are handled specially by the sign parameter. Setting that to "-" will format only the small, negative values. You don't have to settle with the default threshold value or the default replacement pattern (in small_pattern). This can be changed and the "x" in small_pattern (which uses the threshold value) can even be omitted.

sub_large_vals()

Okay, there's one more substitution function to cover, and this one's for all the large values in your table: sub_large_vals(). With this you can substitute what you might consider as too large values in the table body. Large negative values can also be handled but they are handled specially by the sign parameter. Setting that to "-" will format only the large values that are negative. You don't have to settle with the default threshold value or the default replacement pattern (in large_pattern). This can be changed and the "x" in large_pattern (which uses the threshold value) can even be omitted.

Final thoughts

We are always trying to improve the gt package with a mix of big features (some examples: improving rendering, adding new families of functions) and numerous tiny features (like improving existing functions, clarifying documentation, etc.). It's hoped that the things delivered in gt 0.6.0 lead to improvements in how you create and present summary tables in R. If there are features you *really* want, always feel free to:

File an issue: https://github.com/rstudio/gt/issues
Talk about your ideas on the Discussions page: https://github.com/rstudio/gt/discussions

Learn more about the gt package here: https://gt.rstudio.com/

Got questions? The RStudio Community site is a great place to get assistance: https://community.rstudio.com/

Content: Rich Iannone (@riannone)
Motion Design & editing: Jesse Mostipak
Music: Nu Fornacis by Blue Dot Sessions https://app.sessions.blue/browse/track/98983

Transcript

This transcript was generated automatically and may contain errors.

largely based missing values for something else. And the setup is, you know, we use that with the data; data here just means the gt table data. So you always start with gt(). And then you use this function, and that's data and columns. Basically, you can focus this function over any columns you want. But by default, it's everything. So it'll just go through every single column. And no big deal if there's no missing values in some of those columns; it'll just, you know, skip over those. So it's kind of nice. It's almost like paint by numbers, or I'm trying to find the right analogy, but I can't seem to do that. It'll just find things which apply. And if they don't apply, no big deal. And you can even run it if you don't have missing values at all. It'll still work. No change, but you can feel safe doing that in case you have some table inputs that may or may not have missing values. You don't know that. So that's kind of cool. You can do that.

So let me show you this. We have a built-in dataset in gt called exibble. It's a little hard to fit. Although the idea is that it does fit. It's only eight rows. And a small table, it's just for messing around with tables in gt. So I'm gonna make this even smaller. Get rid of a few columns, get rid of row and group, and then put that into gt with this gt() function. So if you know nothing about gt, this is how you start. And this is a way we can use sub_missing(). So in this case, we can use column names. But for this one, I'm just using indices. So 1 and 2 are num and char. And 4 to 7 are the rest of the columns here. And in columns 1 and 2, where we have NA values (there's one here and one there), we're gonna replace NAs with the text "missing". And just to be a bit different, in columns 4 to 7, we'll replace the NAs with "nothing". Two ways of saying it's missing.

So you can totally do that. It's great. So I'm gonna run this. And immediately you'll see the effect. Missing, missing. And a whole lot of nothing here in these columns. So kind of cool.
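
The call described above can be sketched roughly as follows. This assumes the gt and dplyr packages are installed and R 4.1 or later (for the native pipe); the object name `tbl` is only for illustration.

```r
library(gt)

# exibble ships with gt; dropping `row` and `group` leaves seven columns
tbl <-
  exibble |>
  dplyr::select(-row, -group) |>
  gt() |>
  # columns 1 and 2 are `num` and `char`
  sub_missing(columns = 1:2, missing_text = "missing") |>
  # columns 4 to 7 are `date`, `time`, `datetime`, and `currency`
  sub_missing(columns = 4:7, missing_text = "nothing")

tbl
```

Column names would work just as well as indices here, for example `columns = c(num, char)`.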

By default, I'm just gonna change things here a bit. I'm gonna take out this missing_text. It's not the default value. So if I do this and run it again, the default is an em dash. And it's hard to write something that represents what an em dash is. Here the default in gt is three dashes. It magically takes that and converts it into an em dash. There's no way in Markdown to say I want an em dash. You just can't do it. This actually becomes like a horizontal rule, which is not what you want in a cell, probably. Maybe you do, but I don't think you do.

So with that said, you can even use two to get a smaller dash. In this case, that's an en dash. And maybe you can see it, maybe you can't. It's a slightly smaller dash. And of course, you can go back to this, which is a hyphen. So there you go. So the dash becomes smaller and smaller with fewer of these hyphens used. So, yeah. So that's sub_missing(). And this is actually nothing really new. We used to have this function. It used to be called fmt_missing(). It's just been renamed to sub_missing() because I want to move things into a separate family and have a bunch of other functions which do the same sort of thing. You can still use the former, but it gives a warning, which you may not want to see each and every single time you want to make a table. So just migrate your fmt_missing() calls over to sub_missing() if you have some code that you're just going to run in the future.
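
As a small sketch of the three dash variants mentioned here (the table and object names are illustrative, and only the `num` and `char` columns of exibble are kept to make the NA cells easy to spot):

```r
library(gt)

small_tbl <- dplyr::select(exibble, num, char)

# Default: missing_text = "---", which gt renders as an em dash
tbl_em <- gt(small_tbl) |> sub_missing()

# Two hyphens give an en dash
tbl_en <- gt(small_tbl) |> sub_missing(missing_text = "--")

# A single hyphen stays a plain hyphen
tbl_hyphen <- gt(small_tbl) |> sub_missing(missing_text = "-")
```

Code that still calls fmt_missing() should only need the function name swapped for sub_missing(), since the description presents it as a rename.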

Markdown formatting in sub_missing()

Let's find out. It does, yeah. You have to wrap it with md(). And so, let's actually do this. Let's do just go on. Not there. Okay. Let's do bold just so we can really see it. So I'm going to run this. Yes. It does take Markdown, which is great. Easy way to make, you don't have to style it after the fact. You can just use that. So let's do emoji. So let's put it inside this md() thing. I'm pretty sure that is super important because that just becomes like HTML. It worked. It really worked. Okay. That's great. This is fun. I love like, you know, I didn't know for sure that would work. Will it work without md()? Probably not, right? But I don't know. Let's give it a whirl. Let's see. Whoa. Oh, yeah, yeah, yeah. It just gives you a different flag. You don't need that md(). Okay. That's cool. That's cool. That is very cool.
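
A minimal sketch of the Markdown experiment described above, assuming the replacement text is wrapped in gt's md() helper; the bold "missing" label is an illustrative choice, not necessarily what was typed in the video.

```r
library(gt)

tbl_md <-
  exibble |>
  dplyr::select(num, char) |>
  gt() |>
  # md() marks the replacement text as Markdown, so it renders in bold
  sub_missing(missing_text = md("**missing**"))

tbl_md
```

Going by the transcript, an emoji character passed as a plain string also works without md().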

sub_zero()

So here's some really new stuff. Like, it's totally new. It's in 0.6. sub_zero(). Cool function. You can either think about a blue ninja or a refrigerator. It's totally up to your imagination what that might evoke in your mind. But basically what it really is: if you give it a zero value, like it's truly zero, not rounded to zero, but really, really zero, then you can replace that zero with something. And it might be kind of fun to do that. Who knows? We'll just find out. Okay. So basically the same idea as sub_missing(). Instead of missing_text, it's zero_text. And the default is "nil". I don't know why I chose that. I think I saw it somewhere and I thought, that's kind of cool. You never see that. I'm going to put that in. Instead of just, you know, using this, that's even kind of too boring for me because you're just replacing the thing you have with the same thing. Okay. For this, I'm going to actually make a new table using the tibble() function from dplyr and the tibble package.

I can make a small table which has a bunch of values, but also some zeros. And these are perfect zeros. They're not 0.0001 or whatever. They're just zeros. Okay. So with this table, the zero values will be given replacement text. Okay. And in this case, this is kind of cool, right? Because you can just use this without anything. And the default will just work. I mean, you may not like the default value, but you know, it's pretty easy to use. You have to admit. Okay. So let's try that. Let's run this and see if it works. It totally works. Okay. That's awesome. And the cool thing is, if you use fmt_number() on the same column, this will still work. It won't cancel things out like other formatters do. That's kind of a different thing with these sub functions. You can run it within the same columns as the formatters. And it won't interfere. It won't do anything weird. It won't cancel out the formatting. It's kind of nice.
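
Here is a rough sketch of that sub_zero() run. The values in `zero_tbl` are an assumption chosen to mix exact zeros with other numbers; the video's own table may differ.

```r
library(gt)

# A one-column table with two exact zeros among other values
zero_tbl <- dplyr::tibble(num = c(10^(-1:2), 0, 0, 10^(4:6)))

tbl_zero <-
  zero_tbl |>
  gt() |>
  # Formatting the same column first does not stop the substitution
  fmt_number(columns = num) |>
  # With no arguments, exact zeros are shown as the default text "nil"
  sub_zero()

tbl_zero
```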

Okay. So let's try with a different zero_text. Again, Markdown will totally work. Like zero should totally work. Yeah. That's great. I wonder if there's emoji for that. Let's give it a shot, if only to check our knowledge of this package. Yeah. Yeah. sub_zero(). Really simple. It's meant to be simple. Why would you use that? I don't know. I've seen it, or sometimes rounding down to zero is different than a true zero. It's almost like a truly missing value is different than something that's not entered. Who knows? There's a difference there. It just hits different. These zeros, whether they're really zeros or not.
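
A custom zero_text might look like the sketch below. The italic "zero" label and the four-row table are illustrative assumptions.

```r
library(gt)

tbl_zero_md <-
  dplyr::tibble(num = c(0.5, 0, 12, 0)) |>
  gt() |>
  fmt_number(columns = num) |>
  # Replace exact zeros with Markdown-styled text instead of "nil"
  sub_zero(zero_text = md("*zero*"))

tbl_zero_md
```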

sub_small_vals()

So now we have ones which are a bit more complex, but still useful, I think. So this one here is called sub_small_vals(). So you ever get this issue where you have really small values, but you don't want to put in all the zeros? You just want to say it's really small. It's below this number. It's not zero, but it's really small. So that's what you can do with this one, sub_small_vals(). So same as before, it's got the data. You can focus it on columns and rows. There's a threshold value. This tells gt what small is. That's the definition of small in this case. And this is a bit complex. Usually you don't put an if-else statement in your pattern. So that's also another thing: we have a pattern in this case, small_pattern. The pattern is basically this curly-brace x, {x}. That'll basically be the threshold value. This will go into there. And then you can wrap stuff around it. So basically if the sign is "+", which means the values are positive, any values which are below this threshold will get this less-than-threshold value. Okay. And we also have negative values, because we can have small negative values. It's a weird case. It's handled separately. You just set sign to "-", the negative sign. And then the other case will be, you know, this is very strange, less than the absolute value of minus the threshold value, which is kind of weird, but I couldn't really find a better way to represent that. But I imagine you don't get very many cases where you have small negative values, but it's there in case you need it.

So let's take a look at this. We'll play around. So we got this table here. Let me just get rid of this. Let's see it again. And I'll bring this down a bit. Okay. So we have some really small values and we're getting to like bigger values. So obviously like we're going to like target these values at the top here as being small. So the default value is 0.01. So somewhere around here, we don't know if it's going to clear this or not. We're about to find that out. But yeah, I'll show you the beforehand one. We can still format the number in that column. I'll make this bigger. Okay. So all the values are formatted, which just means that before we had this and we don't really want that. So now we have this after formatting. We have two decimal places as a default. We can't even see like the actual small values here. So this is just right for replacing with like something smaller than some threshold.

So let's actually run this. And there it is. So these are smaller than that value. It doesn't seem to include it; it seems like it's less than, not less than or equal to that value. I think it's actually maybe a good thing to have it like that, because you probably want to keep these values and just have it less than, not including that value. Okay. So that's how that works. And we can change the pattern. Let me try that, actually. We can even not include this at all. Well, let's actually just run this by itself. So that is basically that without anything around it. This is the default right here. There. And then later on, I'll show you that you don't have to even use this. But first I'm going to show you the negative case. So let's pretend, for instance, we have the table we had before. We're going to make all the numbers negative. So going into gt it's going to look like this. So it's all unformatted. So we're going to format that column first, with fmt_number(). Great. And now we have these values here, which did round, I think, to like positive. So it's really kind of weird, right? But we know these are negative values. So it's kind of a weird thing already. So let's actually just run this. In this case, the sign is negative. So it's going to consider all the negative small values, really close to zero. Okay. So we're going to run this. Yeah. And now we get this sort of less than the absolute value of this threshold in the minus direction.
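
The positive and negative cases walked through above can be sketched like this. The values in `small_vals` are an assumption (powers of ten from 1e-4 up to 100, plus a zero and an NA) and may not match the table in the video.

```r
library(gt)

small_vals <- dplyr::tibble(num = c(10^(-4:2), 0, NA))

# Positive case: values strictly below the default threshold of 0.01
# are replaced using the default small_pattern
tbl_small_pos <-
  small_vals |>
  gt() |>
  fmt_number(columns = num) |>
  sub_small_vals()

# Negative case: flip the sign of the data and set sign = "-";
# the threshold itself stays positive
tbl_small_neg <-
  small_vals |>
  dplyr::mutate(num = -num) |>
  gt() |>
  fmt_number(columns = num) |>
  sub_small_vals(sign = "-")
```

As observed in the transcript, the comparison looks strict, so a value exactly equal to the threshold should keep its formatted number.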

So kind of cool. We use the same default. We didn't have to change it to minus 0.01. It just knew, by virtue of the fact that we had the minus sign, to handle it as the negative case. So here's another example. The threshold is much smaller. I think it will get one value in there, which is the top value. And we just replace it with a literal string, like "small", like this. So check it out. Yeah. Formatting as before. Yeah. We don't know how small these are because we cut off the actual digits, but this will reveal it. There you go. That one's small. And we can always do this too. If you want to just touch up the formatting here, you can do decimals = 3. And I believe in this case, the top, yeah. The one below the replaced value still shows a decimal place. And if you really don't like that, you can do, I believe there's one called n_sigfig. And that's the number of significant digits you want to keep. And I believe that is very similar. Oh, not quite. It rounds in some direction.
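
A sketch of the custom threshold and literal pattern described here. The threshold of 0.0005 and the data values are assumptions picked so that only the smallest value is replaced.

```r
library(gt)

small_vals <- dplyr::tibble(num = c(10^(-4:2), 0, NA))

tbl_small_custom <-
  small_vals |>
  gt() |>
  # Three decimal places, so the next-smallest value stays visible
  fmt_number(columns = num, decimals = 3) |>
  # A pattern without {x} is used as a literal replacement string
  sub_small_vals(threshold = 0.0005, small_pattern = "small")

tbl_small_custom
```

Swapping `decimals = 3` for `n_sigfig = 3` in fmt_number() would format to significant figures instead.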

sub_large_vals()

Okay. So we did the small values. Now there's also this sort of complement, which is sub_large_vals(). Because you never know, you might have these large values. You don't want all the zeros. And you know, you may not want to use scientific notation, which is supposed to solve that. But yeah, we can totally do that. We can substitute large values in the table body with something else. And it uses the same sort of arguments: threshold for one, obviously much bigger, like one times 10 to the 12. So big. large_pattern is different. It's not complex like the other one. You could basically just write it yourself if you want it to be less than. And sign is there as well. So very similar. Let's run through some examples.

Okay. So again, I'm making my own table. There we go. Basically, zero and NA just for good measure to have those in there, just to see they're not affected by this. And super huge values. Okay, so I'm gonna make this a bit smaller, because we looked at that. Step by step, I'm gonna show you that, yes, the numbers are really big. If you didn't format the numbers, you probably don't want that. You probably don't really want this either. But if you did, and you really don't want the really big values, you could run this. And there you go. And these values are bigger than, like, one times 10 to the 12. Unfortunately, this is not really nicely formatted. And there's no real way of formatting this actual thing, which is kind of crappy. But you can put in your own thing, at least. And I think in later updates, maybe I'm gonna make it possible to format this itself. But at the moment, it's not easy to do. It's really tough. But it's okay. Because we can actually change things. We can actually say, do I have an example here? Yes, I do. Okay.

But let's get to the negative case, because you never know, you might have large negative numbers, depending on what you're doing. So those are large negative numbers. Write those again. Same as before, except it has a negative sign. And in this case, we only have to do much the same thing, we just have to use sign = "-". And we have this, and this automatically changes to less than, because it's weird. Now we're going, yeah, it's gonna be lesser of a negative number. Man, this math stuff is tough. First significant digits, and now this. Yeah. So we're gonna have values which are less than this large negative value. Okay. It makes sense. You just have to think about it for a sec, because we're in negative land.

Okay. So we don't have to settle with, like as I said here, the default threshold value, or the default replacement pattern, in large_pattern. I can change it. So in this case, I change the threshold to be a bit smaller, and we're doing positives. In this case, large_pattern is just a word, "hugemongous". We're gonna run this. And then here we go. All the way down, you get this thing. Yeah, Markdown. Definitely. It should totally work. Just to prove it, it works across all these functions. Let's run this again. All right. Yeah. Great.
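
The three sub_large_vals() runs described above can be sketched as follows. The data values and the 5e10 threshold are assumptions for illustration; only the "hugemongous" label comes from the transcript.

```r
library(gt)

# Zero and NA are included to show they are left alone
large_vals <- dplyr::tibble(num = c(0, NA, 10^(8:14)))

# Default: values at or above the 1e12 threshold are replaced
tbl_large <-
  large_vals |>
  gt() |>
  fmt_number(columns = num) |>
  sub_large_vals()

# Negative case: same threshold, applied to large negative values
tbl_large_neg <-
  large_vals |>
  dplyr::mutate(num = -num) |>
  gt() |>
  fmt_number(columns = num) |>
  sub_large_vals(sign = "-")

# Custom threshold and a literal replacement string
tbl_huge <-
  large_vals |>
  gt() |>
  fmt_number(columns = num) |>
  sub_large_vals(threshold = 5e10, large_pattern = "hugemongous")
```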

Final thoughts

Yeah. These substitution functions are like finishing touches, just in case, you know, if you need it, it's there. You know you need it. But if you don't need it, don't worry about it. But some people need it. Like, one case, especially, is not the sub_zero() one, but sub_small_vals(). You may have values which are below a certain threshold, could be measurement stuff, like below detection rate. You can have all sorts of specialized words, depending on your area of focus, like your research subject, which denote that these values are small, but not zero. This is probably the most important one. But this is also important, too, for really huge values, just really big. You can just see that.
