
Field Guide to Writing Your First R Package - posit::conf(2023)
Presented by Fonti Kar
I recall writing my first package being an intimidating task. In my talk, I will share how a Biologist's mindset can make R package writing more approachable. This talk is an encouragement and a gentle stroll through the package building process. I will show you ways to be curious when you get stuck and how to prepare for the unexpected. I hope sharing my perspective will help others see package development as wonderful as the natural world and dispel any hesitation to start!
Presented at Posit Conference, between Sept 19-20 2023. Learn more at posit.co/conference.
Talk Track: Package development. Session Code: TALK-1135
Transcript
This transcript was generated automatically and may contain errors.
Hi everyone, I'm super bummed I can't be there in person, but I'm really excited to be able to talk to you all today. My name is Fonti and I am a biologist. This here is my office. And for the past 10 years of my career, I've been trying to understand some of the complexities of our natural world. More recently, I came across an opportunity to work with other biologists to transform their ideas into R packages. And suddenly I found myself in this really unfamiliar landscape. There were all these new concepts I didn't understand, and I didn't know where to start.
Up until this point in my career, I considered myself an R code consumer. I used code developed by other package maintainers to get to the answers for my research questions. And the expectation now felt like I had to become an R code producer. And the difference between these two roles felt really intimidating. And I think this is a really common experience when you're first trying to pick up a new skill. So in spite of all these mixed feelings, I followed my instincts and used my skills as a biologist to navigate through this R package landscape. And today I want to share with you all my field guide to writing your first R package.
So this field guide isn't a step-by-step walkthrough. If you want one, you can check out the workshop that I wrote for R-Ladies. And this field guide isn't really about the technical stuff either. As you heard in Jenny slash Hadley's talk earlier, the R Packages book is really an incredible resource for that. Instead, this field guide is an encouragement to start your first R package, a very gentle stroll through one, and some of my tips on how to deal with the things that you're likely to encounter along the way.
My hope is to support and help those who feel really overwhelmed about this process, which was me at the beginning, and to also remind seasoned package maintainers of what the beginner's mindset is like, and to encourage them to use some of the ideas I share today as a way to support those who are learning the ropes.
So you might be thinking, why go through all this effort in building an R package? I hope that, from the content of all the previous speakers, the reason might be more apparent. But I'd like to also think about it like this. The code that you wrote for a particular project that's now hidden away in your directories may actually be a solution to another data scientist's problem. And maybe with some tweaks, it could be a solution to 10 more data scientists' problems. What I'm trying to say is that the impact of your code cannot be truly realized until it's in a shareable format. And that's where the R package comes in.
A map of an R package
So for us to go on this journey together, we're going to need a few things. And these packages have already been mentioned in the previous talks. So usethis and devtools are essential to the package building tool belt. And as Matt has alluded to, it's great to store your package on GitHub so you can unlock things like GitHub issues and all the goodness that GitHub has to offer.
So where would we begin? Whenever I enter a new landscape for a walk, I like to get my hands on a map. So here is a really rough sketch of a map of an R package. At its core, your R package contains your ideas, your functions or data that you want to share with the community. Next to those things is your documentation. These are your help files and vignettes to give your users some guidance on how to interact with your R package. There's also something to describe your R package. So this DESCRIPTION file will give the user some higher level information about who wrote the R package, what it does generally, what dependencies it may have and where to go if they have any questions.
There's something called a NAMESPACE. This controls what comes into your R package, so what functions you're borrowing from other packages, and what goes out of your R package, so what gets made available to your user when they load your R package into memory. Most packages will contain tests. These, I think, are the most elusive part of an R package. They ensure that your R package is behaving as it should and let you know when it isn't. What I really like about having a map of an R package is that it breaks up what seems like a really complex idea into these smaller manageable pieces. And so as you're learning, you can focus on one thing at a time. And using this map, you can also see the interconnections between all these subcomponents.
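The map described above can be sketched as files on disk. This is only a rough illustration in base R that builds an empty skeleton in a temporary folder; the package name fieldguide and every field value are hypothetical placeholders, and in practice a helper such as usethis::create_package() would usually set this up for you.

```r
# Sketch of the "map": an empty package skeleton built with base R only.
# "fieldguide" and all field values below are made-up placeholders.
pkg <- file.path(tempdir(), "fieldguide")

# Your ideas: functions live in R/
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)
# Documentation: help files in man/, long-form guides in vignettes/
dir.create(file.path(pkg, "man"), showWarnings = FALSE)
dir.create(file.path(pkg, "vignettes"), showWarnings = FALSE)
# Tests: checks that the package behaves as it should
dir.create(file.path(pkg, "tests"), showWarnings = FALSE)

# DESCRIPTION: who wrote it, what it does, what it depends on
writeLines(c(
  "Package: fieldguide",
  "Title: A Hypothetical First Package",
  "Version: 0.0.0.9000",
  "Description: Placeholder text describing what the package does."
), file.path(pkg, "DESCRIPTION"))

# NAMESPACE: what comes into the package and what goes out
writeLines("# imports and exports are recorded here", file.path(pkg, "NAMESPACE"))

# Look at the map
print(list.files(pkg))
```

Listing the folder after each step is a small way to see how the pieces sit next to each other before any real code goes in.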
The science poke: being curious when you get stuck
So as you're building your R package, questions will likely come up and this is totally natural and normal. And when they do, I invite you to be curious. And one way you can do that is by using what I call the science poke. So the science poke originated from me going on nature walks and coming across something new and peculiar and me grabbing a stick and investigating it from a safe distance. So in your R package, science pokes are really simple ways to help you get unstuck that are really low effort and low stakes.
So for example, I was showing my friend Mark how to write his first R package and he goes, how do you add package documentation? And I'm like, well, what do you mean? He said, oh, it's that piece of writing that pops up when you call the help file for dplyr, for example. And I said, off the top of my head, I don't know, but there are functions in usethis to add a README, add vignettes and add articles. Maybe there's something in there for package documentation. And so Mark and I poked around the namespace for usethis and found a relevant looking function for his purposes. The next thing we did was to call the help file for use_package_doc() to figure out what it does and how to use it and if it's really suited for his purposes. And that was all it took to get Mark unstuck.
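A science poke like the one with Mark can be done from the console. The sketch below is one possible way to do it, not necessarily what was done in the talk: poke_namespace() is a hypothetical helper written for this example, built on base R's getNamespaceExports(), and the usethis part is guarded in case the package is not installed.

```r
# A "science poke": list what a package makes available and filter by keyword.
# poke_namespace() is a hypothetical helper, not part of any package.
poke_namespace <- function(pkg, pattern) {
  exports <- getNamespaceExports(pkg)        # everything the package exports
  sort(grep(pattern, exports, value = TRUE)) # keep names matching the keyword
}

# Works on any installed package; utils ships with R.
print(poke_namespace("utils", "^help"))

# The poke from the story, only run if usethis is available.
if (requireNamespace("usethis", quietly = TRUE)) {
  print(poke_namespace("usethis", "doc"))
  # Next step: read the help page of a promising candidate, e.g.
  # help("use_package_doc", package = "usethis")
}
```

The point is that the poke is cheap: nothing is changed, and the worst outcome is an empty result and a new keyword to try.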
Another way I like to get unstuck is actually to have a poke around in the GitHub repository of an existing R package. There's something super comforting in seeing another programmer's code doing exactly what you want to achieve. It's like holding the answers to an exam that you never got to prepare for. And it just gives you that extra bit of confidence to take inspiration from their code and to try things out in your own R package. Which leads me to my next point, which is that experimenting with your code is a super effective way of getting unstuck. And the idea is that you change one thing at a time in your code and call the functions document(), which updates your documentation, or load_all(), which loads a preview of the recent changes. And doing this, you can see the immediate effects of the changes you've made on the R package. And it's through this iterative cycle that you can figure out how the mechanics of an R package work.
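The change-one-thing-then-look cycle can be wrapped up in a few lines. This is a minimal sketch assuming devtools is installed and that you are working inside a package folder; refresh_package() is a hypothetical convenience wrapper for this example, while document() and load_all() are the devtools functions named in the talk.

```r
# Sketch of the iterative cycle: edit one thing, then refresh and try it out.
# refresh_package() is a made-up wrapper; `path` should point at a package folder.
refresh_package <- function(path = ".") {
  if (!requireNamespace("devtools", quietly = TRUE)) {
    message("devtools is not installed; skipping.")
    return(invisible(FALSE))
  }
  if (!file.exists(file.path(path, "DESCRIPTION"))) {
    message("No DESCRIPTION file found in ", path, "; is this a package?")
    return(invisible(FALSE))
  }
  devtools::document(path)  # update the documentation from your source comments
  devtools::load_all(path)  # load a preview of the package with recent changes
  invisible(TRUE)
}

# Typical use while experimenting, run from the package's root folder:
# refresh_package()
```

Changing only one thing between refreshes keeps cause and effect easy to read, which is what makes the cycle useful for learning.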
Embracing the unexpected
So, as your R package takes shape, you're going to come to a point where you're going to release it to the community. And that can feel a bit scary. So, I just want to say, whether it's a bug that you've uncovered post-release or someone's launched a GitHub issue or a feature request or a question, all of these unexpected encounters are really part of the package building process. And when it comes to that time, I invite you to embrace all that uncertainty in all its glory. And to treat those steps as an extension of building your R package and not a reflection of your ability as a package developer.
As a matter of fact, whenever I go on field trips with my friends, and sometimes we're lucky enough to come across another creature in nature, 100% of the time, all these unexpected encounters are considered positive and enriching experiences. They teach me something about the diverse life forms across all these different landscapes. And I think the same can be said about the code bugs that we uncover in our R packages. They teach us something about the context in which the bug was produced. You'll learn how to detect them in your tests. And ultimately, addressing the bugs becomes this iterative cycle that leads to the improvement of your R package. And in some cases, these bugs can actually be sources of inspiration for new features.
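To make the idea of detecting a bug in your tests concrete, here is a small sketch. Everything in it is hypothetical: count_species() is an invented function with an invented bug, and the testthat package is an assumption on my part, since the talk does not name a testing framework.

```r
# Hypothetical package function: count distinct species in a field survey.
count_species <- function(sightings) {
  # Earlier (imagined) bug: missing records (NA) were counted as a species.
  # The fix drops them before counting.
  length(unique(sightings[!is.na(sightings)]))
}

# A test that pins the fix down so the bug cannot quietly come back.
# Guarded in case testthat is not installed.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("missing sightings are not counted as a species", {
    testthat::expect_equal(count_species(c("gecko", "skink", NA, "gecko")), 2)
    testthat::expect_equal(count_species(character(0)), 0)
  })
}
```

Each bug report can end up as one more test like this, which is how the cycle of finding and fixing bugs gradually improves the package.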
But I understand that our initial reaction to bugs and anything uncertain is to resist. So if you want to feel more prepared for these things, I suggest you seek out the bugs. So whenever I'm on field trips with my friends, we don't wait for some cool animal to cross our path. We go looking for them. So we'll be flipping rocks or turning over logs. And you can do the same with your R package. You can get your coding buddies to try it out and give you some feedback or get someone to read your documentation with a fresh pair of eyes.
So I hope that seeing how I've approached writing R packages has demystified some of that process and dispelled that hesitation to start. If there's one take home from this talk, whether you're a seasoned package maintainer or just starting out with your first R package, I hope it's that we can all code more like a biologist. By that, I mean be more curious and be more willing to scientifically poke and prod at your code to get it to work and overall be more embracing of all the things that you're going to encounter along the way. Because we as an R community are going to need all the ideas we can get to sustain this thriving, growing R ecosystem. And with that, I know this is a short talk. I want to thank you all for listening and tuning in. And, yeah, thank you, Posit, for inviting me. I'm happy to hear feedback or questions. Thank you.
Q&A: marketing your first R package
Thank you, Fonti. That was great. I do have one quick question. I was wondering if you had any advice on, like, marketing that first R package to either your friends or people on the internet after you've built it.
Oh, generally, if you're building an R package, it's for a dedicated community. So I think initially, before the mass release, I would maybe shoulder tap or message a few folks that would find that R package interesting as a kind of soft release. After that, I would suggest relying on social media. Like, Mastodon is a really great thing for reaching out to the R community.
