
RStudio Cloud Demo with Dr. Mine Çetinkaya-Rundel
Much has been written in the statistics and data science education literature about pedagogical tools and approaches to provide a practical computational foundation for students. However, a common friction point for getting students (and faculty) started with computing is installation and setup. Circumventing the installation and setup steps early in the course by having students access R and RStudio in the cloud can minimize frustration and improve buy-in. RStudio Cloud is a lightweight solution to this problem that is easy to set up and use. In this talk we will discuss pedagogical reasons for teaching computing with R on the cloud as well as share best practices and tips for setting up your learners for success on RStudio Cloud. We will also provide an opportunity for the audience to experience computing in RStudio Cloud firsthand, demo its newest features, and highlight a suite of ready-to-use resources for teaching R to new learners. Read more in the follow-up blog post: https://www.rstudio.com/blog/teaching-data-science-in-the-cloud/
Transcript
This transcript was generated automatically and may contain errors.
For those of you who have not come across me in the past, my name is Pete Nest, and I'm going to be your host once again today. So this is the third and final session of our first RStudio Cloud Live series.
Now, in terms of housekeeping, unlike prior events, we're actually going to start with a live cloud demo, then to follow, much like in the past, we're going to have a live Q&A. Now, there's no need to wait till the demo's over. Feel free to ask any questions along the way in the YouTube live chat.
And at this point, I'd actually like to introduce my colleague, Mine Çetinkaya-Rundel, who is a professional educator and a data scientist here at RStudio, as well as a professor in academia.
Hi, Mine.
Hello. Thanks for having me here today. And I'm excited to talk about RStudio Cloud today. So I use RStudio Cloud both as part of my life at RStudio and as an educator at Duke University. And my goal today is to talk to you a little bit about how I use RStudio Cloud for teaching and also give you a few tips and tricks. So let's go ahead and get started.
So if you are interested in getting a hold of these slides, you can find them at rstd.io slash rscloud-demo. I will show this link at the end as well. And there will be a bunch of kind of demos throughout. So the slides only give you a glimpse of part of what I'm going to say. But if you would like to get a hold of the PDF, you can do so there.
What is RStudio Cloud?
So let's first talk a little bit about what RStudio Cloud is. RStudio Cloud looks just like this. When you land on rstudio.cloud and you log in, you can see that you have a place for your projects and you can start with a new project. And we'll start with an RStudio project for now. And you'll see that in my browser, I have something that looks just like RStudio if I had it locally on my computer as well.
And I can give it a name. So that outer shell is a little bit different than what I tend to have locally on my computer where I can name my projects. But beyond that, I have my console. I have my environment tab. Everything is as I know it. What's happening is that the compute is happening on RStudio Cloud as opposed to happening on my computer. Well, what that tells me is that if this lives on the web, I should be able to easily share it with others as well.
Why use RStudio in the cloud for teaching?
But before we get there, maybe let's answer the question, why would I want RStudio in the cloud in the first place? So instead of working locally, why might I want that? Well, particularly in the context of teaching, if you are going to be teaching R with RStudio, but your students are going to be using RStudio locally, they're going to have to do a few things when they get to the first class. They're going to have to install R. They're going to have to install RStudio. And they're going to have to install a bunch of packages that you might use throughout the semester.
That's actually not that difficult necessarily, but you do have to utter these words to them, right? These things tend to work relatively easily for a majority of us, but it becomes a thing you have to walk them through before you get to the exciting part of doing data analysis. Then you also need to talk to them a little bit about what's installing a package versus loading a package. And if you, like me, teach with version control, you also have to get your students to install Git locally on their computers as well.
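The difference between installing and loading a package can be sketched in a few lines of R. This is a minimal illustration rather than code from the talk: the helper names `is_installed` and `is_attached` are hypothetical, and the built-in `tools` package stands in for a course package so that the example runs without network access.

```r
# Installing copies a package into the library on disk. It is done once
# (per machine, or per cloud project) and needs network access, so it is
# left commented out here.
# install.packages("praise")

# Hypothetical helpers, for illustration only.
is_installed <- function(pkg) requireNamespace(pkg, quietly = TRUE)
is_attached  <- function(pkg) paste0("package:", pkg) %in% search()

# "tools" ships with R, so it is already installed.
installed_before <- is_installed("tools")

# Loading attaches an installed package to the current session. It has to
# be repeated in every new session.
library(tools)
attached_after <- is_attached("tools")
```

Installation is the one-time step an instructor can do ahead of time; loading is the step students still run at the top of each script or session.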
And trying to hit all five of these major bullet points in the first 10 minutes of the class is probably not going to happen. Realistically, you'll end up having to set aside a bunch of time for it and potentially do quite a bit of debugging for maybe a small number of your students, but there will be some who will get stuck on some of these stages.
Instead of starting off your teaching with lots of friction points, you might want to be living in this land with much less friction, where you go to rstudio.cloud in your browser, you log in and you can start coding. Where I'm always trying to get my students is a place where when they sit down to work on their computer and they're like, all right, I'm going to work on my stats or my data science course, it doesn't take them more work to get started than launching a video on Netflix, for example. And that's how easily you can get started with something like this.
Options for cloud-based RStudio access
Now, we talked about why RStudio in the cloud. Where students go could be rstudio.cloud or it could be other options as well. So how do we get access to RStudio in the cloud? One option is using RStudio Workbench, which I'll just talk about briefly here and then move away from, but there may be reasons why this is what you prefer.
So if you have sysadmin experience yourself or IT support, perhaps from your university, and you also have either some hardware or a local virtual machine or some cloud computing credit where you could be running the compute, and obviously you yourself have a bit of RStudio experience to be teaching with it, this might be an option for you. So RStudio offers RStudio Workbench, which you can run in your browser once it's been set up for you. And you can get the software for teaching purposes for free from RStudio.
If you would like to set things up in this way, there is a bit of setup, and that's why I have the first bullet point about either having sysadmin experience or IT support available to you. But if you are able to set things up in a way where your students can authenticate and use the platform, you may find helpful the article we have in The American Statistician called Infrastructure and Tools for Teaching Computing Throughout the Statistical Curriculum, where we describe the setup we have at Duke University, where we use a version of this setup for getting our students to do computing in their browser.
Another option, instead of needing to have all three of these: if you just have RStudio experience yourself as an instructor, but maybe not the sysadmin experience and not the kind of machine where you could be running this, RStudio Cloud might be the right option for you. The setup is a lot less cumbersome and a lot less resource intensive. So that's what we're going to be mostly talking about today.
RStudio Cloud features for teaching
Let's talk a little bit more about why RStudio Cloud. Beyond the fact that it does not require IT experience to set up, it also has some really nice bells and whistles that are designed particularly for teaching. For example, there is the notion of a workspace where you can organize your class as a workspace. So either if you're teaching multiple classes in the same semester, or from semester to semester, you're able to kind of have them organized in a nicely contained way.
There is a notion of roles. So things like instructor, teaching assistant or student can be assigned to particular roles on RStudio Cloud. I just showed you earlier how to fire up a project in RStudio Cloud. You can turn that project into an assignment that your students can quickly get started with. And as an instructor, something I really, really like is that you can peek into student projects. So that is really very similar to being, say, in a computer lab with your students and, if they have a question, kind of peeking over their shoulder. Being able to do this has been particularly helpful when we've had to do a bunch of remote teaching.
The other thing is that you can set up a base project that contains any R packages you might need, or any other documentation that you would like for your students to have for every single one of their projects. You can set that up for them and modify it throughout the semester, and ensure that each time a student starts a new project, they're all starting with the same base project.
Certain system libraries work out of the box. For example, Git will work out of the box. And also you can render to PDF or Word, and that works out of the box as well. These are generally things that, for local use, the user would have to set up themselves. If you are teaching in a situation where your students are using their local setup, they would have to install these for themselves. And usually the installation and setup of these tends to require more skills than just being able to do some computing with R.
Demo: projects in RStudio Cloud
All right, so now that we've kind of done an introduction to RStudio Cloud, I'd like to do a few demos where I can actually take you through RStudio Cloud and talk a little bit, and show you a little bit about how life in RStudio Cloud works. So let's start with this notion of a project. So this is what an RStudio Cloud project looks like. And here I am in an RStudio session.
And if you do use RStudio locally as well, this is basically the same idea as a local project. And there is a nice UI, as I said, up top as a banner, where you can kind of keep track of which project you're working on.
Another thing that you can do when you're starting up a new project is open up a Jupyter project as well. So this is currently in beta, but it is available. And when you fire this up, you basically get an IPython notebook. And you can see that in the same environment. So I'm still in RStudio Cloud, but I'm able to run a Jupyter notebook as well.
And you can see just like a regular Jupyter notebook, I have a little bit of kind of text and a little bit of code intermingled and I can run my cells, so on and so forth.
Other things that you can do is you can also start with a project from a Git repository. So you can simply put the kind of the URL of your Git repository here. I'm cloning the praise package. That is an R package that, you know, gives you nice pleasantries as you run your code, gives you some positive reinforcement. And you can see that I'm able to clone that project and kind of start working with that as well. This does require that you kind of sync up your GitHub account with your RStudio Cloud account, which you can do so from your profile. But once that's all set up, you're able to interact with Git easily as well.
Sharing approaches for teaching
So teaching with RStudio Cloud beyond just using it for yourself really does not require any further kind of setup than this, except the fact that you probably will want to prepare some projects for your students, and you might be thinking about various ways that you can share it with them. So there are really two ways that I think about teaching when I'm kind of thinking about what sort of sharing approach I'm going to use for my students.
Often I ask myself the question, is this a shorter engagement, something like a workshop or a short course, where I can organize content in a single RStudio project or just a few projects, and I have no need for keeping track of my learners. So that's usually for me a setting where maybe I won't be grading them, or I'm not necessarily responsible for keeping track of their work over an extended period of time. If that's the case, you can simply create a project like we've shown before and share that project simply by sharing its URL.
If you have a longer engagement, so that is perhaps a semester long course or a multi-day workshop where there's lots of content. And instead of kind of putting all of that into a single or just a few RStudio projects, you can think about it as many, many projects that need to happen throughout a semester. So maybe many homework assignments, a final project, some lab work, so on and so forth. And you have a need or desire to keep track of your learners, either for grading purposes, or because you want to assess either for your kind of enjoyment or to be able to give feedback to the students, how much they're interacting with both the platform and also the learning materials that you've prepared for them. In that case, I recommend creating a workspace and then inviting your students to that workspace.
Demo: sharing a single project
So I'm going to demo both of these approaches next. So let's start with sharing an RStudio Cloud project. So we're in the instructor view here. You can see that I have prepared a single RStudio project. And I have a few folders in there where I've organized my teaching materials. This happens to be a short course I gave on introduction to Shiny. So we have four modules in this short course, and I've created those and put them in my files pane.
And what I'm going to do is show you the inside of these folders. We can see that I have the R scripts that I want to share with my learners. And I also have packages installed as well. So I am installing any packages that I'm going to use in this short course for my students so that when they come in, they can simply run the code and they don't have to worry about installing packages. It's not that difficult to install packages, but this ensures that everybody is using the same version of the package.
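One way to check that a shared project really does give everyone the same package versions is to record them in a small table. The following is a sketch under stated assumptions, not something shown in the talk: the package list is hypothetical, and built-in packages are used as stand-ins so the code runs anywhere.

```r
# Hypothetical list of packages prepared for the short course. Built-in
# packages are used as stand-ins; a real course might list "shiny".
pkgs <- c("stats", "utils", "tools")

# Look up the installed version of each package as a string.
versions <- vapply(
  pkgs,
  function(p) as.character(packageVersion(p)),
  character(1)
)

# Collect the results in a data frame that can be printed or saved
# alongside the teaching materials.
manifest <- data.frame(package = pkgs, version = versions, row.names = NULL)
print(manifest)
```

Running the same lines in the instructor's project and in a learner's copy should produce the same table if the copy was made from the prepared project.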
I have prepared everything for them. And what I can do next is make my project shareable. It's private by default, but I can go to access and say, anyone can view this project now and just grab its URL. Now, when I say anyone can view the project, I should mention that they can only see it if I share the URL with them. It's kind of privacy by obscurity. You can see that the URL has some random numbers at the end, and not just anyone could guess what that is, obviously.
So I'm going to grab that URL now that I've enabled sharing. And then I share that with my learners. So I maybe email it to them or message it to them or something. And all they need to do is put that in their browser and log in. They might choose to log in with Google or GitHub or with an RStudio Cloud account. And you can see that they're in the same looking environment and they don't need to install a package anymore. They can simply load a package because the package is already installed for them, and they have the same file structure as I have and they're able to run the code.
I want to point your attention to one more thing here: just under student view on my slide, it says temporary copy. So when a student receives this link, they're in a temporary copy of this project. And next to that, there's a button that says save a permanent copy, and they can make it their own. At that point, RStudio Cloud is simply grabbing a snapshot of that project and making a copy for the student that they can continue working on. And when they come back, they're able to continue working on their own copy, and any changes they make do not propagate back to my project and also do not get lost either.
So this idea of sharing an RStudio Cloud project, on the plus side, is super easy. First of all, I created my project and all I had to do was flip a switch for shareability and share the URL. Another plus is that students land directly in a project upon login. So, you know, we're telling them we're teaching RStudio and we give them a URL and they're right there. There are no additional steps they need to do.
I find that beyond teaching, this sort of sharing a project is actually great for just collaboration, particularly if I have collaborators with whom I want to share my code, but maybe they're not Git or GitHub users, so that's not the path we want to go. It's also great for creating reproducible examples that might be a bit more extensive than a little snippet you might share with somebody on Slack or GitHub. So oftentimes if I'm trying to debug, for example, an extensive Shiny app, even if I make it pretty minimal, I find that there's a lot of code there to just post into a message. So often I'll create a single RStudio Cloud project, and it ends up having the packages that I have installed ready to go as well, to share with somebody else to ask a question.
Now on the negative side, that temporary copy thing, even with the handy save a permanent copy button next to it, can sometimes trip students up because they need to remember to make a permanent copy of the project, which honestly, for you as an instructor, means you need to remind them to do so. Otherwise their work can get lost.
Also as an instructor, you can't keep track of which students started their assignments because once they make a copy, they own that. And if they want you to take a look at their project, you need to be granted permission back. You don't have that by default, even though you have originally shared it with them.
Demo: teaching with a workspace
So we've talked about creating a single project and sharing that. And I mentioned that this is a useful approach for kind of short engagements, but what about for longer engagements where maybe you're teaching a semester long course and you need things a little better organized. So in that case, the approach that you might take is inviting your students to an RStudio Cloud workspace.
So I'm again here on the instructor view, and you can see that on the top bar, it says your workspace. So what I'm going to do is I'm going to click on the little menu next to it and create a new space. So in this case, I'm creating a space for a new course that I'm giving, I'll call that Shiny Essentials. And now that I have moved into this new workspace, I can create projects in here. So these projects are only going to be accessible to folks that are in this workspace.
So I can similarly create modules for my students to learn from, and I can set things up in many ways. So when I go to share this project, you can see that it now says everyone in Shiny Essentials, and it allows me to turn it into an assignment as well. What this means is when the students land in this workspace, they're going to see this first project, and instead of assignment next to it, it will say start for them. So they can get started with a permanent copy of their own that lives in this RStudio Cloud workspace. So they don't have to remember to make a copy for themselves.
Once students have started their assignments, I as the instructor can track their projects as well. So in this case, I was teaching this Shiny Essentials course, and over the four modules, you can see that there are many kind of derived projects from each of these. So 187 students came to this project and they made their own permanent copies and they started working on them. And I can actually go in, I can click on that view derived projects link, and I can go in and see my students' names and even open up their RStudio Cloud projects. So if somebody is having difficulty to the point where we're unable to like help them get unstuck by maybe discussion forum posts or something, I can actually go in and directly help them in their RStudio session.
Another thing is, how do we get students into this workspace? Under the members tab within your workspace, you can see the names of your members. So I have grayed those out for privacy reasons, but you can see that I can scroll through and see who the members are of this workspace. And I can invite more folks either via an invitation, where I would need to enter their email address and send it to them, or I can use a sharing link. So I can create a sharing link, set a particular default role for that sharing link and simply share the link with them. And at some point I can reset that link if I want no further people to join the workspace.
So these particular roles are admin, moderator, contributor, and viewer. I think about these as follows. The admin is the instructor. So that's me. I can do whatever I want. I can manage users. I can view, edit, and manage all of the projects. The moderator level to me in an instructional setting is more like my TA. They can view, edit, and manage all projects, but they don't have the privileges to get more people in or kick people out of the class. The contributors are my students. They can create, edit, and manage their own projects only. They don't get to touch other people's projects. And then the viewer role is something that you may not even use at all, but instances where I've used it have been either auditors to the class or guests. So these people can only view projects that are shared with everyone and no other projects.
Sometimes I have colleagues who want to take a peek at what's happening in my class to see how I've set up my RStudio Cloud workspace, for example, but I perhaps don't want them to see my students' names or their projects. So that's what I have mostly used the viewer role for.
We can also set permissions, and these come unchecked by default. And I would recommend leaving them that way unless you have good reason to change them. One of them, for example, is about whether contributors can see the members list. Another one is whether contributors can make their projects visible to all members. I usually leave that unchecked because I don't want my students to readily share their assignments with each other.
Base projects
Another perk of using this workspace setting, and perhaps my favorite one, is this notion of a base RStudio project. So what that is is an RStudio project that you can set to say, this is the image I want every single project to start with. So I can actually decide on the version of R I want everyone to use, the installed packages and their versions that I want everyone to use, and also any files that I want every single project in my workspace to contain. Usually for me, those are things like a statement of code of conduct, maybe some instructions for how to submit work. They're not things that change from assignment to assignment. They're things that need to stay constant. They're the sort of things you might tell your students, go read in the syllabus. So I end up putting those there.
These base projects can be updated throughout the semester, which means that, say, a newer version of a package you use comes out and you want to be able to teach with the newer functionality that's been implemented. Mid semester, you can update the base project so that going forward, you're using a newer version of that package and every project that's started from that point onwards is affected by this update. But it's not retroactive, and that's good because you're not gonna break old work students may have done.
I think this is also a really nice perk because when you are using maybe an RStudio on a server type of setup that may be shared throughout the university or throughout your department with other instructors, oftentimes IT professionals would be hesitant to update this sort of base image because changes that you make could affect others as well. But by the definition of a workspace, it's only your class, and your class's projects are the only ones that are going to be affected by this change. So you really can do as you please.
Tips for using RStudio Cloud
So the first tip that I would mention is that changes you make to an assignment after a student has started it won't propagate to their copy. So this is not like Google Docs, is I think the best way to describe it. When your student makes a copy of a project you have set up for them in either of the sharing approaches, they're really taking a snapshot at that time. So if you need to change your assignment, and I'm sure we've all been there, you assign a homework assignment, then you realize something is wrong, you're gonna need to get that information to them in another way, or have them start with a new project, or have your projects be linked to a Git repository as well.
Packages in the base project are installed, but not loaded. So when you're writing your instructions, you want to remind your students to call library() with whatever the package name is, because they won't be able to use the package immediately in the console.
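The reminder to load packages can be written as a short setup chunk at the top of an assignment. This is a minimal sketch, assuming a hypothetical vector `course_packages`; built-in packages are used as stand-ins for whatever is preinstalled in the base project, so the code runs in any R session.

```r
# Hypothetical list of packages preinstalled in the base project. Built-in
# packages are used as stand-ins; a real course might list "tidyverse".
course_packages <- c("stats", "tools")

# Attach each package. library() needs character.only = TRUE when the
# package name is held in a variable rather than typed directly.
for (pkg in course_packages) {
  library(pkg, character.only = TRUE)
}

# Confirm that every package is now on the search path.
attached <- paste0("package:", course_packages) %in% search()
```

For a single package, students would simply type the usual `library(packagename)` line; the loop is only a convenience when an assignment relies on several packages.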
This information is always on the RStudio Cloud guide and the What's New page. RStudio Cloud is still evolving. It's not changing in a way where it's going to be a completely different UI the next day your students land in it, but there are new features being added. And so it's nice to keep up with those. And if you encounter any sort of slowness or glitches, say your students are reporting, we can't log on, there's a current system status link on the side that's super handy. It allows you to check to see if there is a real outage, and if not, you can go ahead and report it.
Another tip is I strongly recommend creating an additional account. So I have this additional account with another personal email address I had, and I invite that persona as a student, so as a contributor, into my workspace, so that I can log in and see things the way students see them. That can be helpful when you are taking screenshots for your students as well, if that's the sort of thing you like to put in your instructions.
You can set computational resources allotted to projects. By default, you get a decent amount, but if you're working with either bigger data sets, or doing something that's more computationally intensive, you may wanna up those computational resources that are available to each project. And the one way to figure out if this is going to be needed is to actually test your assignment. So if you write answer keys, for example, you may try running them in RStudio Cloud to make sure things hold up.
The functionality for peeking at your students' projects is super nice, but from a pedagogical perspective, I would recommend that you use that feature sparingly. It's important for us to teach our students how to ask good questions without us saying, hey, just move away from the keyboard, I'll do it for you. Because it's a little bit of that: move away from the computer, let me take over and do it. And sometimes that's exactly what a student needs. But it's useful for them to try to articulate the question first, and not go there right away.
I also use that assignment feature for things that are not assignments as well. So even if I don't necessarily expect the students to complete and turn in something, you might still use that feature, because it's just a handy feature to say, let me get started with where you left things off. And so one of the things I do is if I'm doing any sort of live coding in class, I'll create a skeleton of what I'm gonna get through, maybe with some prompts throughout an R Markdown file, and leave that as an RStudio Cloud project for students to get started with as I start coding as well.
There are various plans that you can choose for RStudio Cloud. And there's a free plan as well. And for either hobby usage or short amounts of teaching, what's allotted there might very well be sufficient. But I've found that if I'm teaching a course with many students over a 15-week semester, I probably need either the instructor or the organization plan.
I'll also mention the rscloud package. It's not currently on CRAN, but it is available on GitHub. And our goal is to get that on CRAN, perhaps sometime in 2022. But it has a nice set of functions where, as you can see, you can query the API to get a list of your users, to get some information on their statistics, like are they still engaging or not? You can use it to send invitations. So it's almost like you get to upload a roster and send invitations to all of your students. So if you are using RStudio Cloud heavily, and particularly with large groups of members and students, the package might very well be helpful for you, especially if you're like me and you don't like clicking around things too much, and you'd rather script things up.
If you do start working with rscloud, you're going to need an API key, which you can get in the RStudio Cloud UI. So just from your profile, you can go ahead and request an API key. And if there are features that you would like to see in this package, please open an issue. I'm actively working on it, and we'd be happy to try to implement them or at least let you know if that's not feasible due to the setup of the API.
And if you would like to learn more about RStudio Cloud, Jesse Mostipak has put together this really nice playlist of short videos on YouTube that are about individual things you might want to do on RStudio Cloud. So I recommend taking a look at that as well. I think there may be stuff you're familiar with, but sometimes it's nice to see how somebody else approaches it. And there very well might be things that you didn't know existed there as well.
All right, I think that's it for me. Once again, the slides for this can be found at rstd.io slash rscloud-demo. And I would be happy to take any questions. Thanks for listening.
Q&A
Hey, Mine, thank you so much. That was super interesting. Hopefully our guests walked away learning something new. In terms of Q&A, I think we have a few lined up. For anyone that might've joined a little late, if you do have any questions, feel free to enter those on the YouTube live chat and we'll be sure to surface them.
I'm gonna start with a question that I actually get pretty often. So I get asked a lot about base projects. Could you talk a little bit more about how you handled those scenarios before the base project concept existed?
Yeah, so if I don't have access to this notion of a base project, what that means is that every time I set up an assignment, I have to install the packages that are needed for that assignment. For me, that often means I need to install tidyverse, for example. So homework one, install tidyverse. Homework two, install tidyverse. And any other projects as well. Submission instructions are another thing that comes up a lot, or any sort of list like: these are the things you're allowed to do for this assignment, these are the things you're not allowed to do. So I would have this plain text file that has this information that I would upload to every single assignment.
The way I tend to teach in a workspace, my students are rarely creating a project just out of the blue in there. They're often starting off of an assignment that I have prepared for them. So the base projects have mostly made my life easier as I create new assignments. But every once in a while, there may be a situation where your student just wants to create a project in there and work on something related to the class, but maybe not necessarily attached to any assignment. And then they get the benefit of using the same R version and the version of packages that you have pre-installed for them as well.
The other thing that I really like about the base projects is that I can update them throughout the semester and they don't break projects that happened in the past. So student work doesn't get affected, but I get to benefit from say a new update that may have come up in a package.
I see Brad just sent over a question that I think you somewhat touched on, but just to make sure that they heard the answer. Is there a way to distribute data files to students without setting up a project?
So if you're setting up a project, upload it there and then it's distributed. If you're not setting up a project, there isn't necessarily an RStudio Cloud specific way of getting it to them. But one way you might do it is if you have access to a place where you can upload your data sets, and that very well might just be a GitHub repository, which would be a free place to host them.

