Hadley Wickham | An introduction to R7 | RStudio (2022)

Transcript

This transcript was generated automatically and may contain errors.

Thank you. So I was thinking earlier, what idiot scheduled this talk at this time? Because immediately afterwards, I have to go and introduce Jeff for the keynote. And of course, the idiot was me. So today, I want to give you a first look at this new OOP system for R called R7. And obviously, I'm giving this talk, but I didn't do all the work behind it. This is a joint effort from the R Consortium Working Group on object-oriented programming. We'll talk a little bit about that later, because I think that's a really important part of this project. This is not just some idea that I've had. It's a team effort from a number of very important stakeholders in the R community.

Why we need OOP

But to begin, I wanted to talk about why we need OOP. And I want to do that by talking about this Bizarro function. The goal of this Bizarro function is to take an R object and turn it upside down somehow. So for example, if we've got a numeric vector, maybe we'll flip the sign. Or if we've got a logical vector, maybe we'll turn trues to falses and falses to trues. Or if we've got a character vector, we'll reverse the letters in each of the strings in that vector. Or if we've got a factor, we'll reverse the order of the levels. And if it's anything else, we'll just throw an error.
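The slide itself is not part of this transcript. A minimal base R sketch of the if-else function described above might look like the following; the helper name `reverse_strings()` and the wording of the error message are assumptions made for illustration, not taken from the talk.

```r
# Hypothetical helper: reverse the characters of each string in a vector.
reverse_strings <- function(x) {
  vapply(
    strsplit(x, ""),
    function(chars) paste(rev(chars), collapse = ""),
    character(1)
  )
}

# One function that checks the type of its input and branches on it.
bizarro <- function(x) {
  if (is.numeric(x)) {
    -x                                   # flip the sign
  } else if (is.logical(x)) {
    !x                                   # TRUE <-> FALSE
  } else if (is.character(x)) {
    reverse_strings(x)                   # reverse each string
  } else if (is.factor(x)) {
    factor(x, levels = rev(levels(x)))   # reverse the order of the levels
  } else {
    stop("Don't know how to make a bizarro <", class(x)[[1]], ">", call. = FALSE)
  }
}
```

Every new kind of input means another branch in this one function body, which is the growth problem discussed next.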

So what's wrong with this function? Well, there's nothing really wrong with this function, because it's so simple. Like, you can already see it on one slide. But there are some problems with this kind of general approach. And the first one is, as we handle more and more types of data, this function is going to get bigger and bigger and bigger. And you're going to have to have all of that code in a single file.

And if you think about functions in base R that have to have different behavior for different types of things, this is going to become really problematic. I think the function in base R that does the most different things for the most different types of objects is print. And if you were to write print with this kind of if-else style, this is what it would look like just for the types of objects in base R that begin with the letter A. That's already too much to fit on a slide, and I haven't even included the implementations. So you can imagine, as you deal with more and more types of things, it's going to get bigger and bigger and bigger.

And the other, possibly more important, problem is that there's only one person who can add new types of behavior to that function, who can teach that function to handle new types of object, and that's the original author. And again, you can imagine for R core, if every time someone added a new type of thing, a new class, some code in R itself had to change, that's obviously going to be a huge, huge hassle. So that's really the inspiration for OO programming in R.
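To make the contrast concrete, here is a sketch of the same function written with a generic and separate methods. It uses base R's existing S3 system rather than R7's own API, and the data frame method at the end is a hypothetical extension that someone other than the original author could add without touching the original code.

```r
# The generic only dispatches on the class of x; it contains no type checks.
bizarro <- function(x) UseMethod("bizarro")

# One small method per class. These could live in separate files.
bizarro.numeric <- function(x) -x
bizarro.logical <- function(x) !x
bizarro.character <- function(x) {
  vapply(
    strsplit(x, ""),
    function(chars) paste(rev(chars), collapse = ""),
    character(1)
  )
}
bizarro.factor <- function(x) factor(x, levels = rev(levels(x)))

# Fallback for anything without a method.
bizarro.default <- function(x) {
  stop("Don't know how to make a bizarro <", class(x)[[1]], ">", call. = FALSE)
}

# Hypothetical extension written by someone else: reverse the row order
# of a data frame. The generic above is left unchanged.
bizarro.data.frame <- function(x) {
  x[rev(seq_len(nrow(x))), , drop = FALSE]
}
```

With this layout, supporting a new class means adding a new method, not editing an ever-growing if-else chain.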

And so instead, rather than trying to fight this very natural tendency, R7 accepts it, while giving you the ability to change the internals of your class if you need to with these dynamic properties.

And then finally we've tried to dig a pit of success with thoughtful function names, argument names, documentation and errors. So hopefully after you've used it a little bit you can start to guess what the code you need is and if that guess is slightly wrong, you will get useful feedback that points you in the right direction.

So really, we'd really, really appreciate it if you would try out R7. As I said, it's currently a GitHub package which you can install pretty easily if you want. We'd love to know not just what doesn't work or how it doesn't help you solve your problems, but also what doesn't make sense and what you don't understand. Where are the problems in the documentation? Where are the problems in the error messages? That way we can do better and make R7 just right. Thank you.
