R Not Only In Production - posit::conf(2023)

Transcript

This transcript was generated automatically and may contain errors.

Thank you for that lovely poem, Hadley. I am so excited to be here talking to you all at PositConf today. This is the conference that I look forward to all year long, and every year I get so excited seeing all of the amazing projects that everyone is working on. And when we were fully online in 2021, I got so excited that I literally decorated my office with streamers and balloons, just to capture that feeling of excitement that I feel being here.

PositConf feels like the club where everyone is in on the cool things that R can do, and we're all here to hype each other up. And for those of you who are here more for the Python content than for R, now that it is PositConf and not RStudioConf, I'm sure that you will have that experience as well.

When I stroll through these conference sessions and have conversations with all of you in the halls, I almost get this feeling that I'm in this amazing community garden, where all of these projects that people have lovingly tended and cultivated are being shared with the community. And that feeling is really nourishing and meaningful, I think, because when we go back to our day-to-day lives and our jobs, it doesn't always feel that way.

So I'd like to see a show of hands of who has ever felt isolated or siloed in their work. It's a lot of you, and I have too. When you go back to your day jobs and your workplaces, you may not have colleagues to share a lot of ideas with. You might be self-taught in R or your programming language of choice, like many others here, and you might be struggling to figure things out on your own. Or you might face resistance from people in your organization who say, R can't do that, R's not good for that, R's not a real programming language.

And it can be sort of jarring to go from environments like this conference where you can see all of the amazing things that R can do to being told that, no, it can't.

So I have been there before too. And I have also been the stubborn person at an organization insisting on using R because I knew that I could be productive and build what I needed to build with it. And having come out on the other side of that experience, I can say definitively that I was right. Not only is it possible to build quality software in R (I think everyone at this conference knows that), it is also possible to have an organization where the strengths of R and the people who use R can be brought to bear to influence the entire organization and its mission as a whole.

Jenny Bryan once said that if you use software that lacks automated tests, you are the tests. Well, if we build software that lacks automated tests, then that means our users are the tests. And by extension, their patients.

Continuous integration allows us to automatically run these tests so that we are constantly making sure that our code is behaving as intended. So every time we push new code to GitHub, we kick off a pipeline that gets that latest code, it builds the R package that it contains, and it runs the tests. And then it's going to send its results back to GitHub in a way that we can clearly see.

So if we were to look at the git history of our project, we might see something like this. We have a commit where we add our function to detect AKIs, and we write our tests along with that so all of those tests are passing. And we get this nice green checkmark, which is very satisfying to me. And then maybe we realize that the performance of our function is lacking. We need to refactor it in some way to make it faster. And so we rewrite it, and we think that we've preserved its previous behavior, but actually we have changed how it handles some edge case like how it deals with missing data. And so tests that used to pass start to fail. And our continuous integration pipeline is going to send this angry red X back to GitHub, which I find incredibly unsettling and impossible to ignore.
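The scenario described above can be sketched in a few lines of base R. Everything here is hypothetical and for illustration only: the function names `detect_aki()` and `detect_aki_refactored()`, the toy data, and the simplified "1.5 times baseline" rule are assumptions, not the speaker's actual code and not a clinical definition. The point is only to show how a refactor can quietly change the handling of missing data, and how a test catches it.

```r
# Hypothetical original version: flag a reading when creatinine is at
# least 1.5 times the baseline. A missing input yields a missing (NA) result.
detect_aki <- function(creatinine, baseline) {
  ratio <- creatinine / baseline
  ratio >= 1.5
}

# Hypothetical "faster" rewrite. It looks equivalent, but which() silently
# drops NA, so a missing reading now comes back as FALSE instead of NA.
detect_aki_refactored <- function(creatinine, baseline) {
  flagged <- logical(length(creatinine))        # starts as all FALSE
  hits <- which(creatinine / baseline >= 1.5)   # NA positions are discarded
  flagged[hits] <- TRUE
  flagged
}

# Toy data: the third reading is missing.
creatinine <- c(1.0, 1.8, NA)
baseline   <- c(1.0, 1.0, 1.0)

# A small test of the edge case: a missing reading must stay missing.
missing_stays_missing <- function(detect) {
  identical(detect(creatinine, baseline), c(FALSE, TRUE, NA))
}

missing_stays_missing(detect_aki)             # TRUE: the green checkmark
missing_stays_missing(detect_aki_refactored)  # FALSE: the red X
```

In an R package, a check like this would more typically live in a test file run by a testing framework such as testthat, so that the continuous integration pipeline runs it on every push.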