Reproducible Examples with the reprex package

Transcript

This transcript was generated automatically and may contain errors.

Welcome everyone to today's webinar. We're going to talk about reproducible examples from a conceptual point of view and why they're surprisingly important, and then also a great deal from a mechanical point of view: how to make your reproducible examples so that they're easy to share with other people. This short link, the rstud.io reprex, I promise it will always point to something very relevant to this package that links to absolutely everything else.

Basic usage of reprex

So the first thing I want to do is show basic usage, and we're just going to get right into it and then we'll unpack what you just saw. So I'm sitting here in a fresh RStudio session, and I have a little bit of code up here in my source editor. I'm going to make a factor x, a factor y, combine them, and get what to most of us is kind of a puzzling result. So this is just going to be an example of a small piece of code that maybe you want to talk about on the community site or share with your local R expert and ask what's going on.
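
The exact code on screen is not captured in this transcript, so the snippet below is only a sketch of what such an example might look like. The factor values "a" and "b" are assumptions. What c() returns for factors depends on the R version (before R 4.1.0 it returns the underlying integer codes, from R 4.1.0 on it returns a combined factor), so no output is shown here.

```r
# Hypothetical values; the transcript only says "a factor x, a factor y".
x <- factor("a")
y <- factor("b")

# Combine the two factors and print the result, which is the part
# the speaker describes as puzzling.
combined <- c(x, y)
print(combined)
```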

So this is how you would use reprex to turn this little snippet of code into a reproducible example. This is the path of least resistance; we'll talk about other methods later. So I select the little piece of code and copy it to my clipboard. And over in the R console, I type reprex(). And you will see that the little piece of code is run. And then basically a beautiful, attractive version of it is stored on my clipboard, and I can preview it here.
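
As a rough sketch of the clipboard workflow just described: with the snippet already copied, attaching the package and calling reprex() with no arguments is what gets typed in the console. This assumes the reprex package is installed. The interactive() guard and the can_run variable are additions for illustration, so that the block is harmless when run outside a live session where there is no clipboard to read.

```r
# Only attempt the clipboard workflow in a live session with reprex installed.
can_run <- interactive() && requireNamespace("reprex", quietly = TRUE)

if (can_run) {
  library(reprex)  # attach the package so reprex() can be called directly
  reprex()         # runs the code on the clipboard and puts the rendered result back on it
}
```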

So if I were to paste the contents of my clipboard right now, you would actually see what's called Markdown, and this is what's necessary to create the attractive version of this code. And why is this helpful? Because you can go to places like GitHub, the RStudio community site, or Stack Overflow and paste this Markdown in. So I'm going to show you what this would look like in a GitHub issue.

So that's that same markdown that you just saw. GitHub lets you preview things. And you'll see that it looks just the way it did locally for me. It's been rendered, it's syntax highlighted. We have a tiny little ad down here that tells people how you did this. And I could submit that as a GitHub issue. So that is the basic process.

And the reason I can just type reprex() is that I always have this package attached. You might need to attach it yourself first with library(reprex), and we're going to talk a great deal about that next. So that is what basic reprex usage looks like. It takes a small piece of code, runs it, renders it nicely, and leaves it ready to paste into other places.

Motivation: help me help you

So this is just a static version of what we just did. And this is the GIF that I use on the reprex website. I rewatched this clip. It's from a movie called Jerry Maguire, and it's still highly recommended. And basically the reason for bothering to do all of this is that, if you're going somewhere to have a conversation about R, to have questions answered, or to describe a bug in software, being careful about how you make your reproducible example makes it much, much, much easier for other people to help you.

And I want to explain where this word, reprex, came from. So Romain first tweeted this, and I thought it was just a great made-up word. It is short for reproducible example. So it is completely made up, but it's just very handy. And I'm going to use the word reprex over and over and over again in this webinar, so I want to be very clear that I'm using it in, I'd say, three distinct but related ways.

So I think people, at least in the small R community, are starting to say reprex just as a noun, as in "a reprex is a reproducible example." And that has nothing to do with whether you use this package or not. But then today's webinar is going to show you how to use a package with that same name, reprex, that you can install from CRAN. I'll show you how to do that in just a moment. And then this is a pretty small package. It has a couple of functions, but really the main function it has is also called reprex. So this webinar is going to talk about how to use the reprex function inside the reprex package to produce a good-looking reproducible example.
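
To make the package-versus-function distinction concrete, here is a minimal sketch. The install line is left commented out so the block can be re-run without reinstalling anything. The main_fn variable is just an illustrative name, not something from the package.

```r
# One-time setup: install the reprex package from CRAN.
# install.packages("reprex")

# The package and its main function share a name, so the fully
# qualified form of the function is reprex::reprex.
if (requireNamespace("reprex", quietly = TRUE)) {
  main_fn <- reprex::reprex   # the function, looked up inside the package
  print(is.function(main_fn))
}
```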

The last selfish point that I'll make is it turns out when you sit down to make a good reprex out of your problem, and you keep it self-contained, you strip down your giant hairy data set to the smallest data set that reproduces the problem, it is amazing how often you end up answering your own question in the privacy of your own home, and you didn't have to make yourself vulnerable to other people.

So this is a great revelation. And I think the reason this works is that when you have a problem, it's very easy to just keep going in circles and banging your head against the desk. But there's something about preparing it for other people, and the reprex package is also a real hard-ass about making sure that your problem is self-contained. It kind of knocks you out of that very unproductive place and gets you back on the path of actually working the problem. So what most people report when they first start making reproducible examples is that it's kind of amazing how often this exercise means you actually answer your own question.
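
As an illustration of stripping a data set down, here is a sketch with entirely made-up data: instead of loading a large file, the example defines the few rows needed to show the surprising behaviour inline, so it runs anywhere. The data frame, its column names, and the missing-value problem are all hypothetical.

```r
# Hypothetical stand-in for a large data set: just enough rows and
# columns to reproduce the problem, defined inline so nothing external is needed.
small <- data.frame(
  group = c("a", "a", "b"),
  value = c(1.5, NA, 3.0)
)

# Per-group means; group "a" comes back as NA because one of its
# values is missing and mean() does not drop NAs by default.
group_means <- tapply(small$value, small$group, mean)
print(group_means)
```

Building the tiny version is often the moment the cause becomes visible, which is the point being made above.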