
R-multiverse: a new way to publish R packages (Will Landau, Eli Lilly) | posit::conf(2025)
Speaker(s): Will Landau

Abstract: R-multiverse is a new dual repository for R packages, based on infrastructure from R-universe and GitHub. We would like to invite the developer community to contribute packages. With R-multiverse, users have a central place for installing packages. Automated quarterly production snapshots enforce quality. Package maintainers retain most of the freedom and flexibility of self-publishing. Maintainers directly control package releases through GitHub or GitLab. R-multiverse originated from the R Consortium Repositories Working Group. It has transparent governance, and it operates in a collaborative and open way.

Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
Transcript
This transcript was generated automatically and may contain errors.
We are all in great company here. I count myself as a package developer as well. And when we write an R package, chances are we want to share it with as many people as will find it useful. We want it to see the light of day.
And that brings us to the dreaded ordeal of publishing. That's something we have in common with authors of books, of journal articles, and of other kinds of media. The standard practice is generally to go through a central publisher, like CRC Press for books or the Journal of the American Statistical Association for journal articles. And for us, we have repositories like CRAN, Bioconductor, and rOpenSci.
This is generally the best way to help our work see the light of day, to make sure people use it, to convince people to trust what we put out there. But it comes with a challenge.
The challenge of central publishing
Inherent to the model of central publishing, whether it's a book or an article or a package, is editorial review. And for sure, these review experiences can be extremely rewarding and edifying. But they also come with a lot of frustration and a long wait for feedback. And then there's ingesting that feedback, coming up with revisions, and you might even get rejected. Even if you aren't rejected, you might have to do something to your package that you find difficult or maybe don't even agree with.
And so a lot of folks turn to self-publishing, which is possible, again, for books, journal articles, and packages. We all have ways of doing this. The reason we might want to is that, as package developers, we control publishing. No one has to tell us what to do. The freedom is really nice, but it comes at a cost, the other extreme of the trade-off, because we have a hard time convincing users to trust our work.
As a user, I might express more skepticism toward something that's self-published than something that's formally vetted in a central repository. Your work is also harder to find if you self-publish. And speaking of obscurity, let's talk a little bit about siloing. If I'm a user and I install a package from GitHub or even from R-universe, not only do I need to know the name of each package I want to install, I need to know every package's individual repository. And as a user with hundreds of installed packages on my system, I don't want to have to do that for every single one.
Introducing R-multiverse
Which brings me to R-multiverse, a new dual repository built on top of the R-universe project. What it aims to do is provide a centralized place for self-published releases, and on top of that, automated checks and quality control for production scenarios. What we're trying to do is meet in the middle, to combine the best of both worlds of self-publishing and centralized publishing with review.
Now, let's get into how this works. R-multiverse is very new. It's a dual repository, so it's not just one repository, it's two. Our first repository is the community repository: plain and simple, an R-universe with all the latest releases of registered packages. And on top of that, we have a production repository with quarterly snapshots of the healthy releases from the community repository.
So my goal for the rest of this talk is to help you understand each of these repositories, what they mean for users, and what they mean for developers who want to participate and contribute packages.
The community repository
So a bit about this community repository. Like I said, it's plain and simple an R-universe, just like any other. It's got a nice dashboard with package documentation, metrics, and test results, and it's transparently available. We have a nice custom URL for it, both for the dashboard and for downloading packages. And the latest releases come directly from the individual source repositories on GitHub and GitLab, just like any other universe on the R-universe platform.
And here are the implications: this is a single universe that is offered to the entire community to participate in. Usually, individual universes are maintained by individual developers or labs who want to publish their own work. This is something we all share, and it's also focused on releases. That makes it a middle ground between development on the one hand and production on the other.
There's this term QA, or quality assurance, in industry and software development. To my knowledge, we've never had this for R before, but that's exactly what this community repository aims to provide. It's a middle ground: releases that the developers endorse for production, but that are not necessarily ready for production, precisely because we don't enforce checks or guarantees of compatibility at this level. That comes later, in production.
So that's what the community repository is. As a user, you might install a small number of packages from the community universe, packages you're familiar with and trust, where you want just the latest and greatest. And developers can get releases there very quickly and easily.
Registering a package
Now, how do you, as a package developer, get your package into the community universe and registered with R-multiverse in general? We keep our submission requirements very minimal at this level, because we want to make this experience as close to self-publishing as we can. What it really boils down to is that the artifact you register has to be an R package with at least one release on GitHub or GitLab. It has to be devoid of malicious activity: it has to be in good faith, not have malware, et cetera. And it's got to have a free and open-source license in the most obvious standard place.
That way we, as the administrators of R-multiverse, have permission to distribute the code. So all we're asking is that it's an R package and it's legal and safe. We do have policies, and there's maybe a bit more to it than that, but we generally take after the GitHub terms of service and acceptable use policies. So it's very straightforward for well-meaning authors to abide by.
And that is all we ask for registered packages in community. The registration procedure, again, is simple and open-source. We have a contributions repository with a collection of registered package listings. These are just text files, one per package, and each file usually contains the URL of the package's source code repository. All you do is go into the web interface and submit a pull request to add a new text file, where the name of the file is your package and the contents are just one line with the source code URL.
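For example, a listing file for a hypothetical package might look like this. The package name and URL here are made up for illustration; the file itself would be named after the package (e.g., a file called mypackage):

```text
https://github.com/myorg/mypackage
```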
And then what happens in the pull request is that a bot runs about every hour, scans the pull requests, and checks: is this a package? Does it have a license? Are there any conflicts with an existing package of the same name on CRAN, et cetera? If those very light checks pass and there's a high degree of trust, say the maintainer is a member of a trusted GitHub organization, then the bot can automatically merge and register the package. In a lot of cases, if a check turns something up, it doesn't necessarily mean a rejection, it just means a manual review is required.
And again, that review is according to these very light submission requirements. Downstream of that, another automated process compiles all these listings and automatically generates an R-universe package manifest with the package URLs, setting the branch field equal to "*release". What that means is that instead of pulling the latest commit, we pull the latest release off of GitHub or GitLab.
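As a sketch, an entry in that generated manifest (R-universe's packages.json format) for a hypothetical package might look something like this; the package name and URL are placeholders:

```json
[
  {
    "package": "mypackage",
    "url": "https://github.com/myorg/mypackage",
    "branch": "*release"
  }
]
```

The "*release" value is what tells R-universe to track the latest tagged release rather than the latest commit.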
The implication is that if you want to update a registered package that already exists in R-multiverse, you do not have to go through a human. You can just create a new release on GitHub. This is what a lot of us as package developers do anyway when we release a package and submit it to other standard repositories; it's good practice to create a release on GitHub or GitLab regardless. And this is all it takes to publish a new version to the community repository, after which your package is straightforwardly available to install.
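As a sketch, publishing an update can be as simple as tagging and releasing the new version on GitHub. One convenient way from R is the helper in the usethis package; the version released is whatever your DESCRIPTION file says:

```r
# Run from the package's source directory after bumping the
# version in DESCRIPTION and pushing to GitHub. This drafts a
# GitHub release for the current version; once the release is
# published, the community repository picks it up automatically.
usethis::use_github_release()
```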
No matter what package is in community, for users it's this one unified URL they can install from with install.packages(). It's very simple and easy. The dependencies might come from other repos like CRAN, so including getOption("repos") is useful to make sure dependencies are also installed if needed.
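A minimal sketch of installing from the community repository, with a hypothetical package name and the community URL as described in the talk:

```r
# "mypackage" is a placeholder. Listing the community repository
# first, with the default repos as a fallback, lets dependencies
# come from CRAN and friends if they are not in R-multiverse.
install.packages(
  "mypackage",
  repos = c(
    community = "https://community.r-multiverse.org",
    getOption("repos")
  )
)
```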
The production repository
So that's the community repository. The production repository has in mind that we use R for tightly regulated, tightly controlled situations, not just exploratory analyses. We run R for high-stakes analyses that really need to be correct and based on software we trust. That's what production is aimed at, and our approach is a bit like Debian's.
Jeroen Ooms came up with the spark for this idea, borrowing from Debian's philosophy of gradually building snapshots at intervals. And we're optimistic that this way of thinking will turn out to not only provide the same quality for users, but also provide the same convenience for developers.
So how this works is we take quarterly snapshots of the community universe, and we only include healthy releases with healthy dependencies. We build each snapshot gradually from the ground up, to give developers enough time to make the adjustments they need leading up to a snapshot, and to avoid last-minute volatility and panic.
What we require for production is a bit more. All the requirements of community apply, plus we enforce no warnings or errors in R CMD check on Mac, Linux, and Windows, for a particular version of R. We require that the version number of the current release is higher than previous ones, which is a benefit for users. And we also require that the package's dependencies in R-multiverse pass the same checks.
R-universe actually has this really nice feature: as soon as the checks of a package finish, all the degree-one downstream packages that depend on it in the same universe are also rechecked. R-multiverse waits for those checks to complete, and effectively that means we get the same reverse dependency check guarantees that we're used to and rely on, but from a different operating model.
Now, our operating model looks like this, in terms of where we want to go with a snapshot. We want a collection of releases from R-multiverse set in stone, built on a particular version of base R, and a snapshot of packages from CRAN, also set in stone.
That's where we want to arrive. What we start with is not a pyramid of stone but more like a pyramid of water, where any package could change at any time. From there, we gradually freeze the ecosystem from the ground up. We start by fixing a version of base R and a snapshot of CRAN packages from Posit Public Package Manager. Those aren't going to change for the entirety of the snapshot, and we give an entire month for developers to adjust their packages and checks.
After that, in the next month, we start to freeze the packages that already work in R-multiverse. We freeze the healthy packages and give time for bugs to be fixed. This is a nice, gradual soft freeze that allows the changes that enforce quality to happen, while providing stability along the way.
And what we end up with is maybe not all the packages from community, but certainly a lot of them. It's a nice model that guarantees the compatibility of the packages and their dependencies in this ecosystem.
And as a user installing from production, you would supply a dated snapshot URL, along with the Posit Public Package Manager snapshot from two months prior.
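As a sketch, assuming a hypothetical snapshot date and URL scheme (check the R-multiverse documentation for the actual production address), an installation from production might look like:

```r
# Both repository URLs below are illustrative, not authoritative.
# The production snapshot is paired with a Posit Public Package
# Manager (PPM) CRAN snapshot dated roughly two months earlier,
# per the talk.
install.packages(
  "mypackage",
  repos = c(
    production = "https://production.r-multiverse.org",        # dated snapshot (hypothetical)
    ppm = "https://packagemanager.posit.co/cran/2025-01-15"    # PPM CRAN snapshot (illustrative date)
  )
)
```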
Summary and call to action
And so to recap: R-multiverse tries to centralize these releases and enforce quality for users, while also putting power and control in the hands of the maintainers, to make it as much like self-publishing as we can.
And the reason I'm talking with all of you is that this is the debut of R-multiverse. We are ready to scale, and we are inviting package contributions from all of you. We also want to scale up the moderator team to make sure the review experience is fast and simple. So if you'd like to participate as a moderator as well, please come talk to me.
I maintain R-multiverse with three other folks listed here, and we couldn't have done this without the R Consortium, rOpenSci, or R-universe. Thanks very much, and I'll be happy to take questions.
Q&A
So we check against the R Consortium advisory database, which exists to report security findings. We check new submissions against that database, and we enforce it again in production. I didn't list that check in the slides because it's not unique to production, but we do enforce it.
Okay, so the question is about injecting a package with a name similar enough to an existing package that it might do something nefarious. We rely on the advisory databases for that as well, and we rely on the community to report those findings.
We also have a manual review process for most package submissions. In terms of automatically accepting packages, we take a people-first approach to safety rather than a package-first one. There's a short list of trusted organizations, like rOpenSci and the R Consortium, and we're open to growing that list. If a developer isn't part of one of those trusted organizations, then we always do a manual review of new packages. We've thought about automated checks, say fuzzy matching of package names, to account for that kind of spoofing, but we haven't implemented that yet.
