
Personal R Administration
From R/Medicine 2025

Does the release of a new R version fill you with dread? Are there passwords in your R code? Do you look at the output of a failed package installation and think to yourself, "WTF?!" If you said yes to any of those questions, then you need Personal R Administration. You'll come away with tips, tricks, tweaks, and some hacks for building data science dev environments that you won't be afraid to come back to in a year.

Speakers: David Aja and Shannon Pileggi

E. David Aja is a Software Engineer at Posit. Before joining Posit, he worked as a data scientist in the public sector.

Shannon Pileggi (she/her) is a Lead Data Scientist at The Prostate Cancer Clinical Trials Consortium, a frequent blogger, and a member of the R-Ladies Global leadership team. She enjoys automating data wrangling and data outputs, and making both data insights and learning new material digestible.

Resources:
- R/Medicine: https://rconsortium.github.io/RMedicine_website/
- R Consortium: https://www.r-consortium.org/
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
Hi, everyone. I hope everyone is having a great day three here at R Medicine. My name is Joy Payton. I use she, her pronouns, and I am a member of the organizing committee of the R Medicine Conference. And on behalf of everyone behind the scenes, we are so glad that you are here. And we hope that you're enjoying the diverse topics that our speakers are covering this week.
One of the topics that I love to talk about as a data science educator is shame. Because I know that shame can keep people from asking for help. And some of what I've dealt with in the area of shame is realizing that my personal R workflows were not the best. So I have not always used projects. And I have been known to store secrets insecurely once or twice. And I have also been known to have package dependencies that were brittle, which has led me to be a little afraid to update R.
So that is why one of the workshops I have most been looking forward to this year is this one. Offered by David Aja and Shannon Pileggi. Addressing personal R administration. So these two speakers are masterful presenters on any number of topics related to scientific computing. But I suspect that today they're going to treat all of us, not just to their technical genius, but also to their personal warmth and wisdom. So, Shannon, David, take it away.
Thank you, Joy. Yeah, it's a pleasure to be here with you all today. I'm David. Shannon will be hanging out with me today. It's nice to meet all of you. And I just want to say, you know, all have sinned. I am here speaking because I have done a number of catastrophic and embarrassing things with my development environments, and I'm just here to help you avoid repeating some of those mistakes.
Shannon, do you want to introduce yourself before we jump in? Sure. I'm Shannon Pileggi. I work at the Prostate Cancer Clinical Trials Consortium. David and I have co-taught a number of workshops under the umbrella of What They Forgot to Teach You About R. And I'm here to be his sidekick today; I'll help manage chat and questions and communication as we go along.
Course overview and project-oriented workflow
Cool. Thanks, Shannon. So let's jump in. This is Personal R Administration, and I think the goal here is right in the title. Often you'll say, you know, "I can get this to work on my machine." And what we're hoping to do is get you to a place where you can just get to "it works," right? That you're confident doing things across different computers.
So just to provide some context on this course: I'm going to drop the link for the slides in the chat. There are some portions that will be interactive, so you can follow along at home. And if you go to rstats.wtf, that's a book-shaped website that has some of the information we'll be talking about today, and the repository where most of the other materials for WTF live is at github.com/rstats-wtf.
Just to give you a little bit of context about me: when I started using R professionally, roughly 10 years ago, I had to use it on a bunch of different computational environments. I had some laptops, I had some other laptops, I had some instances of Workbench, I had some instances of Shiny Server. And so I got into the habit of having to take the work I was doing and spread it across a bunch of different machines. A little bit later, I was formerly a data scientist at an advertising agency, and again, just kind of jumping all over different laptop environments, some on Windows, a couple on Linux. And now that I work at Posit: I was a solutions engineer for several years, doing a lot of demos, showing people how to do different things. I recently moved to software engineering, but a lot of that still entails the same thing: setting up demos, reproducing problems, and doing that on a collection of really different environments.
And the reason I tell you all this is so that you understand that a lot of what I will be recommending is shaped by the experience of like how to get these things safely and reproducibly from one set of machines to another. So that's just some context about me.
And then our objectives for today, we're going to hope to try to answer some of these questions. How do I upgrade the version of R I'm using for this project? How do I track the package versions I'm using for this project? How do I move this project from one machine to another? How do I use credentials without exposing them? You'll notice that there's a refrain of for this project across all these things. And so a lot of what we're assuming is that you have decided to onboard a project based workflow.
Normally, we spend a lot more time talking about that. But today I'll give you sort of the briefest possible version of some of the things that it will be helpful to have be true when you decide to start working this way, because it's just going to make things a lot easier for you. You'll see this is actually a link to the full set of project oriented workflow slides. If we're doing this as a two day course, we usually spend much longer time on this section. But I'm just going to give you some highlights right now.
So the first thing we're going to talk about is embracing the blank slate. If you have RStudio open, you can copy and run this snippet right now. And the blank slate is going to mean that you set RStudio up so that every time you start working, you're working in a fresh session where you're not carrying around stuff that you computed two weeks ago. So this means disabling the option that will cause you to save your RStudio workspace. RStudio won't ask you to save it when you quit. It won't load it when you start. This last line is just like a sort of file hygiene thing. But embracing the blank slate, this is just going to make your life a lot easier as you prepare to start moving between different projects.
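The snippet referred to here isn't captured in the transcript. A minimal sketch of an equivalent setup, assuming you use the usethis package (use_blank_slate() is its helper for exactly these RStudio preferences):

```r
# Tell RStudio to start every session with a blank slate:
# never save the workspace to .RData on exit, never restore it on startup.
# scope = "user" applies this to all of your RStudio sessions;
# scope = "project" applies it to the current project only.
usethis::use_blank_slate(scope = "user")
```

You can also set the same preferences by hand in Tools > Global Options > General, which is what this function changes for you.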
In combination with that, one of the things you want to be prepared to do is to restart your R session often. If you have objects that you think are valuable, there are ways to save them. You can save them as RDS files. You can save them as a QS file. There's a bunch of formats you can use if you need to serialize something in particular about an R object. But most of the time, what you really want to make sure you've saved is the source for how you created that R object from your raw data, which you did not modify. And so being in the habit of restarting your R session is going to help you make sure that you're not carrying around stuff that you calculated weeks ago. Because that can often be the source of really confusing and difficult to debug problems.
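As a concrete example of saving a valuable object explicitly rather than relying on a saved workspace (the file path here is illustrative; tempdir() keeps the sketch self-contained):

```r
# Fit a model, persist just that one object, and read it back.
fit <- lm(mpg ~ wt, data = mtcars)

path <- file.path(tempdir(), "fit.rds")
saveRDS(fit, path)        # serialize a single R object to disk
fit2 <- readRDS(path)     # restore it later, under any name you like

identical(coef(fit), coef(fit2))   # TRUE: the object round-trips exactly
```

The point is that the .rds file plus the source code that produced the object together make the workspace itself disposable.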
And so these are the shortcuts for restarting an RStudio session, or a Positron session if you're in one. And if you have it open, you should run it right now.
And then the last thing here: there are a couple of these. These two pieces of code, rm(list = ls()) and setwd(), are often signifiers that something about your project workflow might not be quite complete. This is a link to the tweet that people often cite, where Jenny Bryan, the original author of this course, has threatened to set your computer on fire if you do one of these two things. And the reason we don't want them is that if you're trying to work in a reproducible way, where you're not carrying things with you between sessions, then if you're running rm(list = ls()) at the beginning of a script, it's because you're hoping that it resets you to a clean state, and it doesn't. Right? There are things you can modify about a session, like your global options or environment variables that you set as you're computing, and this isn't going to reset those. And similarly, if you are changing your working directory with setwd() because you're trying to access files that are in different places, that computation is much less likely to be successful if you're trying to take it with you across machines.
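A quick demonstration of why rm(list = ls()) is not a reset (the option value and environment variable name below are made up for illustration):

```r
# Session state that rm(list = ls()) does NOT touch:
options(digits = 3)              # a global option
Sys.setenv(DEMO_FLAG = "on")     # an environment variable (illustrative name)
x <- 42                          # an object in the global environment

rm(list = ls())                  # clears the global environment...

exists("x")                      # FALSE: the object is gone
getOption("digits")              # still 3, not the default 7
Sys.getenv("DEMO_FLAG")          # still "on"
```

Only restarting R gives you a genuinely fresh session.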
So if you have either of these things in your code, again, it's an opportunity to think about adopting that blank slate and thinking about changing that workflow. One of the things that we'll cite that's particularly helpful here is the here package, which gives you a way of referencing paths relative to a project's root directory, whatever that is. That might be a project where you have the RStudio project file. That might be something where there's a git directory. But using this to construct paths in your project means that you can get RStudio or R to reference files in the correct place without needing to change your working directory.
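A sketch of the here package in use, assuming a project that contains a data/ directory (the file name is illustrative):

```r
library(here)  # finds the project root (.Rproj file, .git directory, etc.)

# Build a path relative to the project root, wherever the project lives:
here("data", "raw.csv")

# The same call works from a script in the root or in a subdirectory,
# on any machine the project is copied to, with no setwd() required:
raw <- read.csv(here("data", "raw.csv"))
```

Because the path is computed at run time from the project root, the code moves between machines without edits.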
The project onion
All right, so I want to talk about The Onion. Not the satirical publication, though I'm a fan. But this thing that I've been calling the Project Onion. And this is going to be kind of our way of orienting the different conceptual things that we're trying to manage as we work on an R project. A dirty secret is that at some point I will try to convince you that this is actually true of all software projects. And if you want to ask me later about why I hate Conda, my explanation will reference this.
So what we're going to do is we're going to start by talking about first things you can change to exercise control over individual R sessions. Then we're going to expand to think a little bit about managing your package environment. And those are the two things that I think are the most common sources of frustration as you start thinking about trying to administer your own or other people's R projects. And then as we progress, we'll also kind of expand to think about what it might look like to manage the version you're using, the version of R you're using, as part of a project. That might mean adopting some software that helps you manage those versions explicitly. And then the sort of last piece, which we may or may not get to today, is talking about how you think about managing all the software on your computer, which makes it easy for you to take this workflow and reproduce it across machines, new computers, other things.
So we'll start. This is kind of the roadmap for what we'll be talking about over the course of the day. We're going to start with a little warm up exercise, just to get people, you know, computing, actually sort of doing things and set some projects up so that we have something to work with a little bit later. And to do that, we're going to start by working on our project libraries, right?
Exploring package libraries
So, the library, the package library. And I'm being a little sloppy about the difference between packages and libraries; for this purpose, it's not that important. So you have an R package, right? We think about those a lot. That's how you distribute R code: you have some collection of functions that you want other people to run. And a collection of packages is stored in a library. If you install a version of base R in any of the typical fashions, you're going to get this set of 29 packages: 14 that are base, 15 recommended. Unless you compile R yourself, that is, in which case you might not get the recommended ones. And if you're doing that and you're in this class, I'm confused, but that's fine.
If you were, for example, going to try to draw graphics, but you wanted to do so in a base-only fashion, you could use the lattice package, right? It comes with your R installation: you could install vanilla R from CRAN, type library(lattice), and you should expect that to work. So the packages get stored somewhere on your computer in the default library. If you enter the .Library object in the console, it will tell you where that default library is stored. If you run the .libPaths() function, it will show you all the libraries that are available to your session. And installed.packages() will print all the packages you have installed that are accessible to your current version of R.
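In concrete terms, the three commands just described:

```r
# The default library that shipped with this R installation:
.Library

# All libraries R will search for this session, in order:
.libPaths()

# Every package visible to the current session, as a character matrix:
pkgs <- installed.packages()
nrow(pkgs)            # how many packages you have
head(rownames(pkgs))  # a few of their names
```

The first path in .libPaths() is where install.packages() will put new packages by default.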
And so what we're going to do is take a little time for you to explore your package library. If you run usethis::use_course() (I eliminated the package prefix on the slide there; that's a fail), this is going to download this rstats-wtf explore-libraries project to wherever it is you typically keep projects on your computer. You don't have to supply the destination directory argument, but when I run this, I'm going to put it in a particular place on my computer, because I have feelings about that. And if the project doesn't activate automatically when you download it, you can get the path that the project was installed to and then open that up.
So let me show what that's going to look like, and then I'll give you a little bit of time to do it. So I'm going to copy this, and I'm going to jump over to RStudio.
All right. You know, I'm seeing some counts. 528, Kaylee? I think you win. Yeah, I have 186, and I was like, oh, that's too much. But I've been doing some package development. Anyway.
So let's step through it. And, you know, just to revisit some of the previous advice: one thing I'll do right away is restart the R session. You'll notice that the object I had in my global environment is gone. But because I have the code for computing it, I don't need to save that particular object. I can just create it again.
Okay. So I'm going to load the tidyverse and fs packages. And the first thing we're going to look at is which paths my libraries are in. You can see I'm working on a Mac today, and so I've got these two library paths; we're going to dig into the meaning of this later. And the default library here is under /Library/Frameworks, et cetera.
If we run installed.packages() on its own, we see it returns this kind of matrix situation, which we're going to jam into a tibble. And then if you compute the number of packages there, you can see I have 186. You can see which of those are the base and recommended packages: we have those 29 packages I was talking about earlier, and you'll notice that these are in this /Library path, so these are things that came with the version of R I installed. And then you'll see that these other 157 packages are installed under my user directory. We'll call the first of these the system library, and the second the user library. And again, we're going to get into more detail about that a little bit later.
What proportion of these packages need compilation? It's about half and half: about half of them don't, and about half of them do. And again, we'll talk about exactly what that means in a couple of sections. And then, what version of R were they built on? In this case, R 4.5 is the most recent version of R, so all of my packages were built for this version. If I were to switch to an older R version that I have on the machine, that might look a little bit different, depending on when exactly I installed the packages.
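The exploration above can be reproduced without the tidyverse; a base-R sketch:

```r
# installed.packages() returns a matrix; a data frame is easier to query.
pkgs <- as.data.frame(installed.packages(), stringsAsFactors = FALSE)

# Base vs recommended vs user-installed (user-installed show up as NA):
table(pkgs$Priority, useNA = "ifany")

# Roughly what share needed compilation from C/C++/Fortran sources:
table(pkgs$NeedsCompilation)

# Which R version each package was built under:
table(pkgs$Built)

# Which library each package lives in (system vs user):
table(pkgs$LibPath)
```

The Priority, NeedsCompilation, Built, and LibPath columns are exactly the fields the walkthrough is reading off the screen.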
Cool. So we've splashed around the library a little bit. We're going to keep this project open; there are some things we're going to do in this context. But let's jump over to the first section of the onion. I guess it's the innermost section. You have to fully dissect the onion to access the innermost section. Anyway. So we'll talk about some things related to starting R. The first thing we're going to talk about is taking some control over what startup looks like, to change how the session behaves. And the reason we're going to investigate this is that it gives us the opportunity to change things about what our code does depending on the context it runs in. And you can do some of those things without having to actually change the code.
So when you start R, a bunch of stuff happens, and we are not going to talk about most of it. But there are a lot of opportunities for you to change the way R behaves as it starts. Sometimes, if you're working on a system that's administered by someone else, some of these things may be changed before you can even start, so that you end up with access to the right data sets; or, if you're working in an environment that's restricted in some way, your administrator may be using some of these mechanisms.
The things we're going to focus on are in these two highlighted sections here. So we're just thinking about setting some environment variables and running some startup scripts; we're really just looking at a small subset of what you can do to influence the behavior of R as it starts. And as you expand your focus, you'll see other contexts in which these matter. For example, if you end up working with something like GitHub Actions, you might notice that some of these flags are getting set as you're trying to run things in a reproducible way in some other context.
The .Renviron file
So we'll start by talking about the .Renviron file. It enables you to set what are called environment variables. Environment variables are key-value pairs that let you change the way processes behave on your computer. They just store some information about the state of a machine, and then you can get the value for a particular key and use it to change how a program executes.
So, if you're going to create a .Renviron file, there are some things you should put in there, right? R-specific environment variables, say, if you want to change the number of lines you retain in history. Or API keys and other secrets: things that you want to be available to your code, but that you don't necessarily want to put in your code. The .Renviron file is where those should go. Importantly, what you should not put in the .Renviron file is R code. R code doesn't get evaluated in those files; it's really just those key-value pairs.
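For example, a .Renviron file might look like the commented lines below (the variable name MY_API_KEY is illustrative, not any real service's), and code reads it back with Sys.getenv():

```r
# Contents of ~/.Renviron -- KEY=VALUE lines only, no R code:
#
#   R_HISTSIZE=10000
#   MY_API_KEY=not-a-real-key
#
# After restarting R, retrieve values with Sys.getenv(), so the
# secret itself never appears in a script:
key <- Sys.getenv("MY_API_KEY", unset = NA)
if (is.na(key)) message("MY_API_KEY is not set in this session")
```

Keep the user-level .Renviron out of version control; a project-level .Renviron containing secrets should go in .gitignore.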
So, to edit a .Renviron file: if you know where they are on disk, you can open them in whatever way you choose. There's also a usethis function, usethis::edit_r_environ(), which will enable you to edit them. It has a scope argument, and you can provide either the user scope or the project scope; depending on which of those you supply, you'll either get the .Renviron file that is in your user directory or the one in your project. And just a note on this, if it's unfamiliar: this tilde, or twiddle, is a shorthand for your personal home directory, which has a slightly different meaning across different operating systems.

Not every repository serves compiled packages, which might mean that you have to compile the packages from source yourself. Or you can use Posit Public Package Manager. That is a lot to say, so I will probably refer to it as P3M going forward. And P3M has binaries available for Windows, macOS, and Linux; for macOS, that's both ARM and Intel. So it's just easier: you can get whatever kind of binary packages you need by always using Public Package Manager. If you want to know whether a binary is available for the specific package you're trying to install, there are a couple of places to get that information. If you're looking at a standard mirror of CRAN (I pulled this screenshot from cran.r-project.org), what you'll see is that we have the package source, which is just all the files that comprise the package, tarred up and then compressed; that's what the gz means. And then we also have Windows and macOS binaries. So you can see there are Windows binaries, as of this screenshot, for r-devel, which is the version of R that is the next one to be released. But then you can see, for example, that there are no binaries available for the currently released version of R, or for the next oldest version of R. And this is as of this screenshot.
This information changes over time, so if you're trying to understand whether a binary is available from CRAN, this is where you need to look. And then you can see the same thing on the macOS binary side: there are binaries available for the current release and for the previous release, and those are available for ARM, but they're not available for x86. So that's all the information being presented to you in this box here. Something that is sometimes true is that if CRAN does have a binary package, it may not be the latest version of the package that's been released on CRAN. So in this screenshot, you can see the parallelly package. The source is at version 1.32.1, but some of the binaries are a bit older than that; the Windows binary for the released version of R, in this case, is slightly behind. And so if you see this note as you're installing a package from a standard CRAN repository, that's what it's asking you about: you'll often have the option of either getting a slightly older version of the package, which is compiled already, or getting the newest version and then compiling it yourself. And again, that's what the message here is explaining: there's a binary version available, but the source version is later. CRAN has already compiled a binary of an older version, but if you want the newest, you might need to build it yourself. If you're looking at Posit Public Package Manager (and here I'll just pop open this link so we can take a look): if I go to a package like dplyr and scroll down a little, these things are links, and so if I'm looking for a Windows binary for R 4.4, this is going to tell me that a binary package is available. And I can see that for any combination of operating system and R version.
The other thing you'll see is that the information you need here, the package version, the R version, and the architecture, are all exposed as things you can ask the server about. Those things are links, so you can click on them to figure out whether there's a binary version of the package available. And then once you're at the point of installing the package, there will be some information in the log that tells you whether you obtained a binary package. You can see, for example, that if you're downloading something from a standard CRAN mirror on Windows, you'll get this message about it. Binary packages for Windows are also served as zip files, so looking at the content-type header is something else that will tell you whether you got a binary package. On macOS, again, the message will say "downloaded binary packages," but you can also look at the extension: typically, binary packages for macOS have this .tgz extension. If you're downloading things from Package Manager, then the header is set a little differently, but your R installation will still understand that you're getting a binary package here. And if you install packages using renv, which we'll talk about in more detail later, renv will just explicitly tell you whether you installed a binary package or a source package.

Is going to the bleeding edge going to be wonky?

So when you say bleeding edge, do you mean released versions on CRAN, or do you mean development versions from GitHub or something?

Yeah, I mean on CRAN, not necessarily getting from GitHub or some other source. But when I see "do you want to install from source," sometimes I think, oh yeah, give me the latest. And sometimes I think, oh, do I want the latest, or do I want to see how the latest is received by the community? So I wonder, is there a cognate to the long-term-support versus generally-available model, or is it just package by package?

Yeah, so for R packages, as far as I know, no.
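One way to put this into practice, as a sketch (the repository URL is P3M's documented CRAN mirror at the time of writing; check packagemanager.posit.co for the current one):

```r
# Point install.packages() at Posit Public Package Manager (P3M),
# which serves pre-built binaries for Windows, macOS, and many Linux distros.
options(repos = c(CRAN = "https://packagemanager.posit.co/cran/latest"))

# type = "binary" insists on a binary and fails rather than compiling;
# the default, type = "both", prefers a binary but can fall back to source
# when the binary lags behind the released version.
install.packages("dplyr", type = "binary")
```

Putting the options() line in your .Rprofile makes P3M the default for every session.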
So, I'm going to attempt to give a description of CRAN's behavior here, which is that the packages on CRAN all have to work with each other. I think how CRAN builds binaries is not totally transparent; some of the stuff about needing to build it from source yourself, I don't know. In general, I'd say for R packages it's generally safe to take the latest version. It's guaranteed to work with the latest version of everything else on CRAN. That might not be true for what's in your environment currently, but it will work with everything else on CRAN. And so I usually don't worry about trying to do a lot of compatibility solving; I just take the latest of everything. The other thing to note in this context is that Package Manager, to the best of its ability, serves binary packages for as many things as are on CRAN, relatively soon after they're released. So if you want a higher chance of obtaining a binary package, you can just try getting it from Package Manager, because that will often be a little bit more straightforward.

So, using the Package Manager interface, let's take a couple of seconds and try to figure out whether Package Manager serves a binary of the RPostgreSQL package. I'm going to drop a link to Package Manager in the chat, and let's try to find out whether it serves a binary for this package. We'll just give that two minutes.

All right. So let's jump over to Package Manager, Posit Package Manager, I should say. If we look up the RPostgreSQL package, you'll notice there's also an RPostgres package; this will be an important detail in a moment. So if we look at the RPostgreSQL package, and then we try to find a binary package for R 4.3... I didn't specify a distribution, but that's fine, because it doesn't serve a binary package for any of them.
So if we change the version here... oh no, so it does serve it for Windows, but perhaps not for Red Hat 9. Sometimes a binary is available for your distribution and version, and sometimes it isn't. Do I know why? I do not. It is for 4.3, but not for 4.4; that's surprising. That said, and this will be a general piece of advice: the RPostgres package does a lot of the same things, and is a little bit less painful to work with. And that one does serve binaries that are easier to get. So sometimes you may want to look to see whether a similar package that provides the functionality you're looking for, and for which a binary is available, might be an easier thing to switch to. Particularly if you are, for example, using Excel packages that depend on Java: don't do it. But no, Konstantin, I actually don't know. I will ask the team about that, because right now we don't really surface information about why a particular binary build might not succeed, so it's a little bit difficult to discover. But yeah, I will ask. And if you find yourself in the situation where you have a dependency you need and Package Manager isn't serving a binary, if you post an issue on forum.posit.co under the Package Manager category, someone from the team will try to respond to you. Like I said, I'll get back to you if we can figure out a reason why that build might not have succeeded, but that's where to go to get the information if for some reason you can't find it. I will say, a lot of the time it's just that the different operating systems are painful and complicated. And the one thing I'll talk about at the end of this section is an approach we're taking that we hope solves some of these problems. Right, so we've talked a bunch about binary packages. If you can obtain binary packages, that's because somebody else compiled them.
If you obtain the sources of a package, then you have to compile them in your environment. For packages that are just R code, R is all you need. If your package has dependencies on other compiled languages (these are some of the examples; there are a couple of others on CRAN), then you're going to need some extra tools at your disposal to be able to compile those packages in your environment. If you don't have those tools available (and I've set up a couple of environments that deliberately don't), this is the kind of thing you'll see. What exactly you'll see will vary with the way in which those tools are missing. For example, if this is a version of R running on Windows where I don't have the make command, and I try to install a version of dplyr, it's going to fail because it needs to compile some C code and make is not available. And the same thing is going to be true if I'm missing this compiler, which R is looking for in a specific place. To get the tools you need to compile packages from source: if you're on Windows, you want Rtools. Rtools installs a collection of things that R can use to compile packages, in a known location, and that makes the process go a lot more simply. If you're on macOS, what you'll need is Xcode; installing the Xcode command line tools is the easiest way to get them. And if you're on Linux, you'll need to install the tools the way you install typical system packages. If you run devtools::has_devel(), you'll get a message about whether your system has all the things it needs to compile packages; if it does, this is what you'll see, and if it doesn't, you'll get instructions on how to install the tools we were just talking about. The other important piece of installing R packages, particularly on Linux, though you can also encounter this problem on other operating systems, is what we call system dependencies.
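The toolchain check just mentioned, plus a lighter-weight alternative from the pkgbuild package (both assume the respective package is installed):

```r
# Reports whether the build toolchain (Rtools on Windows, Xcode command
# line tools on macOS, compilers on Linux) is available, and errors with
# installation instructions if it is not:
devtools::has_devel()

# A similar check without the full devtools dependency;
# debug = TRUE prints details of what was found or not found:
pkgbuild::has_build_tools(debug = TRUE)
```

Running one of these before a long source installation can save a confusing mid-compile failure.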
So sometimes your R code depends on, say, C code that's written by the developer of the R package. That's true of dplyr, for example: dplyr has a collection of C code that comes with the source of the dplyr package. In other cases, there are Linux system dependencies that expose the same functionality, at a lower level, to a number of different programs. If you've spent any time trying to install packages at all, you might be most familiar with this from geospatial packages. System dependencies like GDAL or udunits are things that R will look for; they're low-level Linux libraries that make some of that geospatial functionality available across programs. And if you don't have those system dependencies available, the package is going to fail with messages that look like this, where it's checking for some other thing, and you need to take some action to install it, right? So the help you need is often going to be in the error message for how the package fails to compile, but this is the kind of thing that will let you know you're missing a system dependency. One other important piece of the puzzle: this is infrequently true, but when it's true, it's quite painful. Sometimes, and I'm picking on the geospatial libraries in particular here, because this is where the problem most often presents, in addition to being required to compile the package, you need to have these system dependencies available when you run the package as well. So if I call library(sf) in an environment that has those relevant system dependencies at runtime, you'll see a status message (not an error, just a message it emits when you start) saying that you're linking to these things, which are on your system. If you don't have those available at runtime, you can still install the package successfully.
But then when you try to load the package, you'll see an error message like this. And again, that's an indication that you're missing some important dependency that you need to run the package, not just get it installed. If you want to figure out which system dependencies you need to ask your administrator to install, go to Package Manager and type in the name of the package: there's a system requirements section that spells out the commands you need to ask your administrator to run so you can get those installed. One of the reasons these are called system dependencies is that they have to be installed system-wide; they're not something you can typically install yourself. However, in the last couple of days (this is very new, and still kind of in preview), there's something that might solve a lot of problems for people who work in environments where getting an administrator in the loop is very slow. We're calling these manylinux binaries. There's a parallel Python project that solves this problem, and the things that project produces are called wheels. The product manager and I are in a dispute: I think we should just call these reels. He is not with me, but I'm doing it anyway. This is a link to the blog post where we talk about the strategy we're taking to make it much easier to install packages, even on Linux, without administrative permission, and to get all the system dependencies you need at the time you need them. So, without getting into too much detail here: this is a container that has no system dependencies installed in it, but if I use this manylinux repository on Package Manager, then I can actually get those geospatial packages installed and running without any issues. Or, not without any issues, but I can get them installed and running. So, this is currently in preview.
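If you'd rather query system requirements programmatically than browse the Package Manager website, one option (an assumption on my part; the speaker only mentions the website) is the pak package:

```r
# Ask which system libraries a package needs on the current platform.
# Assumes pak is installed; if not: install.packages("pak")
pak::pkg_sysreqs("sf")
# On a Debian/Ubuntu system this lists the apt commands for libraries
# such as GDAL, GEOS, and PROJ that an administrator would need to run.
```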
If you have this kind of problem, where you're working in an environment where you often find yourself going back and forth with an administrator trying to get things installed, check this out and give us feedback. It's available on public Package Manager, so you can use it. I'm very excited about this as a way of solving the system dependency problem. Okay. So, in order to practice installing packages for ourselves, what we're going to do is install a package from R-universe. R-universe is another place you can get packages; those packages are often things that don't necessarily make sense to put on CRAN, but it's just another way of distributing R packages. So what we're going to do is update our .Rprofile and install a package that's not on CRAN. In the project we have open, first try to install this gitcellar package, then add this to your project .Rprofile so that you end up with the rOpenSci R-universe in your repositories, then restart R and try to install the package again. And we'll give people five minutes to do that. Can I make a suggestion, David? How about we extend this to ten minutes and let people take a stretch break and do the exercise, since we're about halfway through our time. Thank you, Shannon. Excellent idea. We'll give it ten minutes. And on behalf of Big Water, also drink water. Have some water, y'all. We'll see you in ten minutes. Okay, so we're going to walk through installing this package from R-universe. The first thing I'll try to do is install the package. I stretched, I drank some water. So I'm going to attempt to install this package, and you see this failure, right? gitcellar is not available for this version of R. So what we need to do is find a way to let R know about this additional repository we want to install the package from.
So in this case, I'm just going to grab this setting where I set my repositories to include the rOpenSci R-universe repository in addition to the public Package Manager mirror. I accidentally pasted this into the console. If I run it there, it will also work, but then it won't persist if I restart my session, right? So if I run that, restart, and try again, I'm still going to see the same message. If I want something like that to persist, a good thing to do is to put it in a place where it will be available to each R session I start. And now if I try to install the package, you'll see I get the package from R-universe. And you'll notice (this is a question here) that the package I downloaded is a binary package as well. If you publish things through R-universe, those do get compiled, so you have an alternative way of distributing binary packages. If you're in an organizational context where you have something that's very complicated to build, and you want to set that up once and distribute it to people publicly, then R-universe can be a good option for that, in addition to the standard distribution channels. All right. So, yes, we installed gitcellar from binary, and we know that in this case because the message says that the downloaded package is a binary. When you're installing things from Package Manager, they're all going to say this, but sometimes there are also hints about whether you've got a binary in the headers that come back from the CRAN server. So if you can get binaries, you should, because it makes your life easier: things go a little faster, and it's a little less work. Before you move on, David, do you want to make any recommendations? In that exercise, you set the options at the project level. Do you want to make any recommendations for what people can do at their user level just to streamline installation?
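The setting from that exercise looks roughly like this in the project's .Rprofile (the repository names in the vector are my own illustrative choices):

```r
# Project .Rprofile: make the rOpenSci R-universe available alongside
# a CRAN mirror (here, Posit Public Package Manager).
options(repos = c(
  ropensci = "https://ropensci.r-universe.dev",
  P3M      = "https://packagemanager.posit.co/cran/latest"
))

# After restarting R, install.packages() can now resolve packages
# published only on the rOpenSci R-universe.
```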
I would say it depends a little bit on how you decide to make your environments reproducible. A thing I would probably do, and actually I can't explain why I haven't done it on this machine, so we're learning in real time, is to use the .Rprofile. I'll edit it with usethis::edit_r_profile() in the user scope, and I would set my default repository to Package Manager all the time. That's because, most of the time, it will have binaries for whatever distribution of R you're working on. If you're doing this on Linux, then it's a bit more important that you find the URL that matches the Linux distribution you're using. Once you start isolating project environments, the way in which you record that information starts to look a little different, and we'll talk about that in the next step. Okay, so reproducible environments: we're going to be thinking about recording the package environment, and we're also going to start capturing some other information about the version of the language we're using; we'll talk a little bit about what that means. The thing I want to help you think about here is this map of reproducibility strategies. On the x-axis we have who's responsible, and on the y-axis we have how open the environment is. Most of the strategies we're discussing for this class focus on this top-right area, where you're in control of where you get your packages and the environment in which you're obtaining them is relatively permissive.
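At the user level, that recommendation amounts to something like this (the Linux URL is an illustrative Ubuntu 22.04 example; substitute your own distribution):

```r
# ~/.Rprofile, user scope. usethis::edit_r_profile() opens this file.
# Default to Posit Public Package Manager so installs prefer binaries.
options(repos = c(P3M = "https://packagemanager.posit.co/cran/latest"))

# On Linux, point at the URL built for your distribution instead, e.g.:
# options(repos = c(
#   P3M = "https://packagemanager.posit.co/cran/__linux__/jammy/latest"
# ))
```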
We try to discourage people from, say, moving into a quadrant where they're in control but the environment is locked down. If your environment has restricted connectivity, for example, you don't want to end up in a situation where you try to install something and it just fails with a networking error and there's nothing you can do, because that's going to put you in a miserable situation. Down here on the validated side, when a lot of administrative control is exercised over the environment, you want to have a collaboration with the people who are managing that environment to make sure they understand your needs. The shared baseline is a good jumping-off point for a lot of organizations: you can have a base set of packages installed, and then people who know what they're doing can push themselves up and to the right. So, like I said, we'll mostly be talking about the snapshot strategy, but know that there are other ways of thinking about managing this problem, and understanding what you're doing up here also helps you talk about what happens down here, if that's what needs to happen. The two tools we're going to focus on for constructing reproducible environments are public Package Manager and the renv package, both of which give you a little more control over what your package environment looks like, and where you're getting things from, than you might have in the standard workflow. We'll start with public Package Manager. Something you'll have noticed when we set the Package Manager address is that the URL contains this "latest", and latest just tracks the current state of CRAN. It lags by about a day: typically, within a day, public Package Manager will reflect the set of packages available on CRAN.
So if you just need something that behaves like a standard CRAN mirror, grabbing things from latest is ideal. If instead you want something from a little further back in time, you can use what Package Manager calls a date-based snapshot. I'm just going to jump over to Package Manager and into the setup tab. I'm working on macOS today, and right now you see that this URL points to latest: it's cran, latest. If I want a repository that behaves the way CRAN behaved in July of 2022, I can use a URL that has a date string in it instead. That means that if I supply this as the repository to my install.packages() command, or to other things, then when I install packages I will get them the way they looked on CRAN as of July 1st, 2022. If you're trying to bring an old project that you haven't worked on back to life, and it's too difficult to figure out exactly which set of packages you need, using the date-based snapshot workflow can be pretty nice. It's also really helpful if you're trying to reproduce specific package installation problems. So what we're going to do, in the project, is check what our current version of dplyr is. If you don't have dplyr installed, you can do this with some other package; jsonlite is another one that's relatively easy to install. Check what the current version is, then put a date-based snapshot into that .Rprofile, restart R, install that package again, and see what version you get. All right. Cool. So let's see what we got. I'm going to grab this, open RStudio, and check the package version I have, which is 1.1.4. Okay, and then I'm going to set my package repository to December of 2022.
The repos option here is a named vector, and it's probably helpful to keep the name consistent, which I'm not doing right now: I'm not going to change this to RSPM, I'm just going to call it P3M here. So I'm going to restart. I'm going to install dplyr. Did I? Then I'm going to run library(dplyr), which loads, and that's fine. Now I'm going to try something. There we go. Yes. Hey, that's new. I'm going to try that one more time. First I'm going to confirm that my repos are set correctly. They are not, because I'm not in a project. We're doing it live, folks. Okay. And if I check my repos again, I'm still not in a project. Let's edit the .Rprofile. Oh, you know what it is? I think the project .Rprofile exists and I didn't set it in there. Hey, you know when you describe the short-circuiting behavior of .Rprofile files and then fail to understand that it has implications for the demo you're doing? That's me. I know. Okay, so we'll restart again, and now that my repository is correctly set to the previous state, I should get an older version of dplyr, which I might have to compile from source. So you can see, like many other people here, I got version 1.0.10, and since that version is much older than the currently released version, and I'm running this on R 4.5, there was no binary and I had to compile it in my environment. But that results in me having a version of dplyr that is now older than the one I had installed previously. Any questions about what went wrong there? Does everyone understand what I missed? Okay, great. So, in order to back ourselves out of this situation: since I put that configuration into my project .Rprofile, I'm going to take it out now. And if I restart, you'll see that, oh, it's options(repos = ...), but it's been reset to the standard. I was using 1.1.4 before, so I can grab this, update it to 1.1.4, and install. Wow, typing is amazing.
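Put together, the round trip from the demo looks roughly like this (the snapshot date and version numbers are the ones from the demo; yours may differ):

```r
# Check the currently installed version.
packageVersion("dplyr")          # e.g. 1.1.4

# In the project .Rprofile: pin the repository to a date-based snapshot.
options(repos = c(
  P3M = "https://packagemanager.posit.co/cran/2022-12-01"
))

# After restarting R, installing fetches the package as CRAN looked on
# that date (possibly compiling from source on a newer R):
install.packages("dplyr")
packageVersion("dplyr")          # e.g. 1.0.10

# To undo: remove the options() line from the .Rprofile, restart,
# and reinstall to get back to the current release.
```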
And so you can see I'm going to reinstall my packages, and now I'll end up back at the version of dplyr I was on before. Martine, thank you. Yes, typos on the slide. You can tell these slides were written by a human being. So this process should feel a little janky to you: installing a particular version, not liking the results, and going back to a different version by calling this. If you imagine that there's a better way to do that, there is. It's renv, and that's what we'll be talking about next: taking some of the things we've got and managing our project environments. Shannon? Yeah, just before we move on, I want to be the R mom here and say: if you installed the older version of dplyr, make sure you go back and install the newer version again, otherwise you're going to try to execute some code, pull your hair out, and wonder why it doesn't work. So just make sure you reinstall that newer version of dplyr, because the old one is in your library now. Hey Shannon, what's a library? We're going to get into it. Okay, so the way things work if you don't do anything, if you just open up R and start working, is that you might have project one and project two and project three (with a little bit of Mermaid clipping the numbers on this diagram, which I'll ignore), but all those projects depend on a shared user library. So when you update something because you're working on a particular project, you're actually updating this shared user library. This is where we're going to talk about .libPaths(). So if I run this .libPaths() function, which I ran before, you can see I have these two paths. One of them is under /Users/me: this is my user library. And then there's /Library/Frameworks/R.framework: this is the system library.
Right, the base packages and the recommended packages are the only things installed in the system library for me. All the other packages I install get installed into my user library, and this means that for any project I'm working on where I'm using R 4.5, if I don't do anything else, I'm getting all those packages from here, in this user path. The path looks slightly different depending on what operating system you're running: when I wrote this example I wrote it on Windows, so it's under the C drive, in the user path, my name, and then, you know, Windows paths, but this is my user library on Windows. And similarly there's a system installation; it looks a little different on Linux, but the same idea shows up in basically every operating system. What renv enables you to do is take each of your projects and operate them with a library that is isolated from all the other libraries: project one has its own library, project two has its own library, and so on for project three. Now, that doesn't necessarily mean they're taking up three times the space on your computer, because what you're going to do is maintain a global cache, and as you need things that are in the cache, you link them into each of these project libraries. But from the perspective of your project, what it will look like when you run .libPaths() (and I'll show you what it looks like in the session) is that the project has its own library within the project directory, and it also makes reference to a cache directory that has all the cached packages for the project.
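For concreteness, here's roughly what that looks like in a plain session; the exact paths are illustrative and vary by operating system and R version:

```r
# Where R looks for installed packages, in search order.
.libPaths()
# e.g. on macOS:
#   [1] "/Users/me/Library/R/arm64/4.5/library"              (user library)
#   [2] "/Library/Frameworks/R.framework/Resources/library"  (system library)

# Which library a given installed package actually lives in:
find.package("dplyr")
```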
So if I install the renv package here and then initialize it... actually, before I initialize it, I'm going to call .libPaths(), so we see these are the packages I'm working with. Then when I initialize (that's new), you'll see that my R session got restarted and a bunch of things got created, and we'll talk about those in a second. But if I run .libPaths() now, we're looking at the project directory: me, Documents, projects, the project we created, renv/library. So in this directory there's an renv directory, and you can see there's also a library cache directory where the other packages are getting cached. I've gone from using that user library to instead having a project-specific library. Some of the advantages of doing this: it makes it much easier for you to do things like what we just did, installing a package, without worrying about doing things to your other projects. renv also has some machinery for making it easy to write down the set of packages you're using and share that with someone else, and they can use that to build an environment that looks exactly like yours. And the caching is nice because if you've already installed a package, renv can just link it in from the cache, which speeds up some of the workflow a little bit.
So what we're going to do is go through that same process I just went through. You're going to create a new project, call it wtfrnv, and put it not inside the current project. You're going to install the renv package, then call renv::init() to initialize the project. You may see a message that's different from the one I see if you've never used renv before. Then you're going to call renv::status(), and we'll give people a couple of minutes to do that. If you're in a project right now, like the explore-libraries one, I would recommend exiting out of that project before you execute the create-project step. Okay, so, what Shannon mentioned: a thing you don't want to do is create a project within a project, and there are two ways to avoid that. One is, as in the slide, where I say "wherever you typically put projects": for me, I have a place I typically put projects, so I can say I want to create a project called wtfrnv in my projects directory, which is not below the project I'm currently in. So if I use this create_project() call with a path that's not under my current directory (and I've created this project before, it seems, so I'm going to overwrite it), that will launch RStudio in that session. That's one way to do it. The other option is to go here and close the project, which kicks you out into an RStudio session that's typically in your home directory, and from there you can go through the new-project flow and create a project in a new directory. So I'll create a new project, as a subdirectory of, my goodness, projects, and I'll call it wtfrnv2.
I can choose to create a git repository, and I can also have RStudio initialize renv for me when I create the project; in this case I won't do that. I'll just switch into that project and install the renv package there. Oh, ha ha. Guys, I am committing lots of own goals today: the version of the package I installed in this case was 0.16, which is quite a bit older than the latest renv, because that date-based snapshot is still in my .Rprofile. So let's go edit that .Rprofile and come back to the future. All right, I'm going to restart R and we're going to do it again. Okay, so now that I have the latest version of renv from CRAN, I'm going to initialize the project. One of the things you'll notice is we get some messages here about things that will be updated in the lockfile; the other thing you see we're capturing is the version of R. So now if I open my renv.lock file... we'll talk about the contents of this file in a second, but did everyone get there? Did I drive anyone else into the ditch of installing an old version of renv? Sorry, Martine. Okay, so we're going to talk a little bit about what it looks like to manage dependencies with renv. If you already did the init, it's safe to do again. If you upgrade the version and do it again, renv will ask you whether you want to reinitialize the project and throw away the existing information, and you can say yes. So if you find yourself in a sticky place with renv, reinitializing is a totally valid way to just kind of move forward, if that's safe for you to do.
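The initialization step, and the kind of lockfile it produces, look roughly like this (the version numbers in the excerpt are illustrative):

```r
# Inside the new project:
install.packages("renv")
renv::init()    # creates renv/, a project .Rprofile, and renv.lock,
                # then restarts the session into the project library

renv::status()  # reports whether code, library, and lockfile agree

# renv.lock (excerpt) records the R version and repositories too:
# {
#   "R": {
#     "Version": "4.5.0",
#     "Repositories": [{ "Name": "CRAN", "URL": "..." }]
#   },
#   "Packages": { "renv": { "Package": "renv", "Version": "1.0.7" } }
# }
```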
So what I'm going to do is create a new R script, which I'll just call main.R, and, as described in the activity here, I'm going to write some code that assumes a new dependency for my project, and then run status and snapshot so that I can see what renv is changing. I'll do this and then give you an opportunity to do it. So I write library(parallelly). Did I spell that correctly? That's double l, double l. Okay, so first I'm going to erase this and check on the status: no issues found, the project is in a consistent state. If I add the library call and run status again... I'm going to follow the prompts to install the package. Okay, I don't think this is a renv-backed project yet; this is really confusing. I think the first project I created was not actually in this directory. Okay, so: I say that I want library(parallelly) in this directory, and then I call renv::status(). Then I get some information from renv about the discrepancy between what my code says, what's in my package library, and what's in my lockfile. So there are three places that the information is recorded. I've installed the parallelly package: it's in the library. I'm using it in my code: it appears in my code file. What I haven't done, right now, is record that I'm using the package in my renv.lock file. The lockfile is where we store the information about which packages we're using, and right now, if you look in the lockfile, there's a lot of information there, but the main thing we see is that only one package is actually listed, and it's renv. So what we need to do is get our lockfile to a place where the package is installed, recorded, and used. Right.
And so those are the three states we're trying to harmonize. In this case, what I can do is call snapshot. You can see I'm getting some information here about what change this is going to entail: it's taking the parallelly package and recording the version of that package I'm using in the lockfile. So, do I want to proceed? Yes. There's a message that the lockfile has been updated, and you can see the parallelly package is now also recorded in the lockfile as something I'm using. If I do the same thing with another library, for example jsonlite, you can see the IDE knows that jsonlite is required, because I've asked for it in my code, but it's not installed. If I call status, I'm going to see the same thing: the package is used but not installed. So I'm going to install jsonlite, and in this case I already had jsonlite installed, so you can see that rather than fetching a fresh version from CRAN or Package Manager, renv just links it in from my package cache. Now if I call status, we're in the same situation again: jsonlite is installed and being used, but it's not recorded. To update my lockfile to record that I'm using jsonlite, I call snapshot, and if I look at the lockfile: jsonlite, parallelly, and the renv package. Then the last piece: if I decide to remove a dependency, I now have the situation where parallelly is installed and recorded in my lockfile, but I'm not actually using it in my code. Again, calling snapshot, you can see the operation that's going to be taken: we're going to remove the record that parallelly is a package we're using. And if I go back to the lockfile, you can see it's now just the jsonlite and renv packages. So I hope everyone stepped through that same kind of process, right?
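The cycle from the demo, condensed (the package names are the ones from the exercise):

```r
# 1. Use a package in code: main.R gains the line `library(parallelly)`.

# 2. Install it into the project library:
renv::install("parallelly")

# 3. Compare code, library, and lockfile:
renv::status()    # reports parallelly as installed and used,
                  # but not yet recorded in renv.lock

# 4. Record it in the lockfile:
renv::snapshot()

# Deleting the library() call and snapshotting again removes the
# package's entry from renv.lock.
```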
Add a library() call to a file in your new project, make sure the package is installed, call status to make sure you understand how it's getting recorded, then remove it, call status again, and snapshot as appropriate. I'll give people five minutes to do that. So, yeah, one nice thing about renv is that there are a lot of package installation workflows that it makes a little bit simpler. If I show my instance of RStudio, you can see that when I type install.packages here, it says "renv shims". What this means is that renv has taken over the behavior of install.packages(): if you type install.packages(), you're actually calling renv::install(), which makes it possible to do things like install jsonlite from GitHub by just typing that into install.packages(). That won't work if you're not using a session where renv is active. But it can be a nice convenience if you're trying to, for example, work with the development version of a package, or install a specific version, or install a package from a specific commit hash, as your workflows become more complicated. It handles all of these different workflows for you. These are all also things you could do with the devtools package, but if you're already used to typing install.packages(), it's convenient to do it that way. The other thing about renv::install() is that it works even if you're not in an active renv session: if renv is not managing your session, you can still use it to install packages. So again, if you're trying to grab something off GitHub or solve some problem like that, it's a useful piece of shorthand to have at your disposal. One question I sometimes get is about how to work on things where you don't care about reproducibility.
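The remote specs renv understands look like this (the repository, version, and commit strings here are illustrative; jeroen/jsonlite is jsonlite's actual GitHub home):

```r
# renv::install() accepts several kinds of package specs:
renv::install("jsonlite")                 # latest from the repositories
renv::install("jsonlite@1.8.8")           # a specific version
renv::install("jeroen/jsonlite")          # development version from GitHub
renv::install("jeroen/jsonlite@abc1234")  # a specific ref (branch/tag/commit)

# Inside an renv project, plain install.packages() is shimmed to the
# same function, so these specs work there as well.
```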
There are such things, you know: you're trying to reproduce a problem for someone else, or you just feel like tweaking some code, or whatever. One pattern I like for this is what I call a scratch directory. I actually have a bunch of them, because my directory structure is a little bit insane. But I have a scratch project that renv is active in, and I just YOLO-install things into it all the time. That works for me as a way of keeping the projects I care about separate from things where I care a lot less, and where I'm willing to install incompatible versions of things and generally make a mess. If you're so inclined, another thing that can be nice is to set a gitignore rule where you globally ignore your scratch directory. Question: when collaborating with coworkers on the same project, each on their own machine with their own RStudio, is renv then a suitable solution to avoid clashes with packages and things? Or should we each work on a copy of the project with the renv lockfile? Yeah, so I would say it's a good solution for doing that. Hopefully that means you're collaborating by sharing files with git. Like, are you using git, or is everybody on a shared drive linking to the same thing? What's the situation there? If we go back to the reproducibility strategy map we were talking about a little bit ago: one of the assumptions implicit there is not just that everyone is working on their own machine, but also that they're working on their own file system. You definitely can all work on a shared drive. I have found that experience to be, in the long run, ultimately pretty painful, because you're all just mutating the same thing.
Unless you're working in an environment that's really specifically configured to support that workflow, and most environments honestly are not, and even when they say they are, they're kind of lying. Someone's going to update something, right? One of the things you've seen, for example, is that the user and project libraries vary based on the person who's running them. I've seen enough weird conflicts there that I think having an renv lockfile that keeps track of the state of the project is better than nothing, but I would start moving in the direction of getting people to work on their own copy, because ultimately that's going to be an easier, more auditable way to figure out who is making what change. That way you don't end up in a situation where somebody decides they want to pull in the latest version of something in that project without telling you, and then your expectations about what the project does change silently. So yeah, having renv on a shared drive is an improvement, because then at least you know which packages you're using and you have some explicit way of tracking that. But moving to a situation where each person works on their own copy of the project, and you have a process for negotiating how you make changes to the project, is the best place to go. Yeah, and I think it's also important to talk about the onion in the context of renv. We said that the slide said renv helps you create an isolated project environment, so, in the context of the onion, what does that do and not do for you? For example, renv does not control the outer layers; that's what it doesn't do for you, which is noted in the documentation, and the documentation is very clear about this: renv is not a panacea.
Right, and I know we're kind of talking about it lightly in this context, but you should really read all of the vignettes in the package documentation to really understand what is happening. Yeah. So one thing that renv does not do, where I can imagine things getting weird, is that it's possible that you and your collaborator do not agree on what version of R you're using. Right. If you're both working on a shared file system but from different computers, you can have different versions of R, and it's not that renv can't handle that situation, but it's one symptom of a thing that you're not actually controlling by working on the project in the same place: you haven't quite negotiated all the things you need to say that you're all collaborating at the same layer, because you're not managing the R version collectively. Okay, so we talked about writing down the language version; that gives us some information about what our expectation is when we open the project. And ultimately, what we might want is a system for bringing the version of R we want to use to the project we're working on, so that we can actually exercise control over each of those facets of the project. So, a thing I believe (though I did not link this image correctly) is that, if we come back to the onion one more time: when you install R and when you upgrade R, you're actually doing the same operation. And so, if you think about taking an existing project and moving it to a different version of R, the process you go through to install R and then get packages for that project can be the same as starting a new project and bringing things in.
The last part of this will focus on some things you can do to make it easier to set up your environment so that you can work on different versions of R without worrying about what it means for the rest of your workflow. This is also just how I live now, and it's the part I'm most excited to talk about. There was a question earlier focused in particular on presenting things to other people. I am doing something slightly precarious by running this workshop on the laptop I use for work, which has real production credentials that I have managed not to show you. If you want a way to experiment with things without actually breaking your machine, or if you just want an R session where you can try to reproduce somebody's behavior from scratch, there are some tools I really recommend checking out, depending on your operating system. If you're on Windows, the Windows Sandbox, on any version from Windows 10 onward, makes it trivial to spin up a fresh session of Windows that doesn't have anything in it. It's meant for testing software of uncertain provenance, but I find it really useful just for getting a clean session where I can build something start to finish. If you use macOS, there are a couple of virtual machine options: one is called UTM, one is called Tart, and I have experimented with both. If you're on Linux: I don't use a Linux desktop often, I'm usually using Linux servers, and sometimes the easiest thing there is Multipass from Canonical, which makes it very easy to launch a Linux command line.
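To make the Multipass option concrete, the whole throwaway-environment loop is a few commands (the VM name here is arbitrary):

```shell
# Launch a fresh, disposable Ubuntu VM
multipass launch --name r-scratch

# Open a shell inside it; install R, break things, experiment freely
multipass shell r-scratch

# When you're done, throw the whole thing away
multipass delete --purge r-scratch
```

Nothing you do inside the VM touches your real machine, which is exactly the property you want for getting reps on installation workflows.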
You also have the option of using a VPS provider like DigitalOcean or Linode to start a Linux instance and just play around with things without breaking anything. Some of the stuff I'm going to show in the next couple of minutes may be stuff you don't feel ready to do on the machine you need for your work, so installing one of these tools, which makes it possible to get some reps experimenting, is a good first step. The other thing we're going to start thinking about is ways of being able to tell people "go run this" instead of "go click through the world's least navigable website to download the version of R you need." There are ways of doing things where it's much easier to copy and paste a command, execute it, and move on to the next step. If you use macOS, this may be familiar to you already; on Windows, it's a less familiar workflow, but it's becoming more common. There is a thing called a system package manager, which unfortunately has no relation to Posit Package Manager; it's a different thing, the outermost layer of the onion. It's a way of installing anything you need in order to run software: I install RStudio using Homebrew, for example. I'm not going to go into a ton of detail, but I will say that, even on laptops I don't manage, I've had a lot of success using tools like these to get software where I want it to be. And then the thing I want to show you in the last couple of minutes is a tool called rig. If you go to the rig website, or the rig GitHub page, you can grab it off the internet, but you can also use one of these package managers to get it installed.
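On macOS, for example, the whole chain can come from Homebrew, assuming Homebrew itself is already installed (the cask name below is the current one and could change):

```shell
brew install rig              # the R version manager shown next
brew install --cask rstudio   # RStudio Desktop
rig add release               # then use rig to install the latest released R
```

The point is that each of these is a single copy-pasteable command, which is a much easier instruction to give a colleague than a sequence of download-page screenshots.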
What rig does for you is make it really easy to install and switch between different versions of R without driving yourself crazy. rig is what we call a version manager; most languages have some kind of version manager, and rig is the one written for R. It's written in Rust, which is useful to know because it means it doesn't depend on R to work. If someone tells you they have written something in R that's going to help you manage R, you should be a little skeptical. And that's not just true of R: if someone says they have a thing that manages Python that's written in Python, you should be skeptical there too. So, I have R 4.1 through 4.5 installed here, in this case for my ARM CPU. If I type rig, you'll see that rig manages R installations. With rig list, you can see a bunch of versions of R installed. Now I'm going to say I want to change my default version of R to 4.4. If I run rig list again, you can see the default is now 4.4, and if I start RStudio again, you can see 4.4.2 is the version of R it starts with. There are some other ways of switching the version of R that runs when you start RStudio; some of them only work on Windows, and some are in RStudio Pro. If you use Positron, this is much easier to deal with, because you can switch R sessions on a per-console basis. But if you depend on running R scripts or anything like that, being able to make sure the default is set the right way across your whole system is something rig makes a lot easier. You can also use it to do things like install R 4.6, which I'm not going to do now because I don't need to trash my internet connection.
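The demo above boils down to a handful of rig commands (the version numbers are whatever you happen to have installed):

```shell
rig list           # show installed R versions and which one is the default
rig default 4.4    # make R 4.4 the default for R, Rscript, and RStudio
rig list           # confirm: 4.4 is now marked as the default
rig add devel      # optionally install the development build of R as well
```

Because rig sets the system-wide default, scripts and scheduled jobs pick up the same version you see interactively, which is the consistency the speaker is after.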
But if I wanted to install R-devel, to do some preview work and investigate things on a new build, I could also do that; having access to something like rig makes that a little easier. You can see here I've opened R 4.4.2. I'm not in a project, so I'm going to reopen the renv project from earlier. What you can see now is that I get some messaging from renv about how the version of R I'm using is different from the version that wrote my lock file. Now I can call, as the instructions say, renv::restore(), and this results in me having the right packages installed for the version of R I'm using. So I've changed the version of R, and then I've reinstalled the packages I need for this project. When a new version of R comes out, if you adopt this project-based workflow, then you don't need to worry about whether updating R is going to wreck all the projects on your machine; you can upgrade them as you have time. That's basically all I have to say today. Thank you all for coming and for your attention. I'm happy to answer questions in the remaining couple of minutes we have here, or we can just let everyone go. And thank you again, Shannon. Well, while people type any last questions they have, I just want to take this opportunity to thank you both. Thank you so much for a fantastic presentation. It's clear that you are pros at this, and you did such a great job of curating your content, really helping disambiguate and demystify a lot of terminology and concepts that underlie what we do every day. I at least sort of knew about these things, and I've poked at my R profile, but I've been afraid to poke it too hard. So thank you for making this a very friendly, approachable topic for those of us who really want to take the next steps in administering our own R experiences, and for giving us the tools to take those steps.
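Putting the pieces together, the per-project upgrade loop from the demo is roughly this (the version number and project path are illustrative):

```shell
rig add 4.5                   # install the new R version alongside the old ones
rig default 4.5               # switch the system default to it
cd ~/projects/my-analysis     # open one project at a time
R -q -e 'renv::restore()'     # reinstall that project's packages for the new R
```

Projects you haven't visited yet keep working on the old R version, so nothing forces you to migrate everything on release day.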
So, I know I'm speaking for everyone who is present here today when I say: I learned a lot, I had a really great time, you're really engaging in the way you present your content, and we're so, so grateful that you brought your talents here to R/Medicine 2025. Thank you very much. With that, I will let you answer anything that popped into chat, but I just wanted to make sure I thanked you on camera and on video. Thanks for having us.

Chapters#

Checking binary availability
Q&A: bleeding edge packages and CRAN compatibility
Live demo: checking RPostgreSQL binaries
Compiling packages from source
System dependencies
If you don't have those available, you can still install the package successfully, but when you try to load the package, you'll see an error message like this.
Installing packages from R-universe
Walking through the exercise
So if you're in an organizational context where you have something that's very complicated to build, and you want to set that up once and distribute it to people publicly, then R-universe can be a good option for that.
Reproducible environments
Public package manager snapshots
Understanding libraries and renv
What renv enables you to do is to take each of your projects and operate them with a library that is isolated from all the other libraries.
Live demo: setting up renv in a new project
Managing the renv lock file
So what we want is to get our lock file into a place where each package is installed, recorded, and being used. Those are the three states we're trying to harmonize.
renv and package installation workflows
Using a scratch directory
Collaborating with renv on shared projects
Managing R versions with rig
