Resources

Using RStudio Connect in Production

Jun 16, 2017
38 min

image: thumbnail.jpg

Transcript#

This transcript was generated automatically and may contain errors.

So if you're new to Connect, Connect is an enterprise publishing platform for static and dynamic content that you're creating with R. So that can include documents, presentations, dashboards, and Shiny applications, and basically anything that you're producing in R. It is an on-premises application, so you run this on your own server behind a firewall if you want, and you're going to host it on your own equipment there. And the goal is really around sharing data science artifacts that you produced inside your organization. So if this is new to you, I would recommend actually that you check out the introductory webinar that Bill referenced a moment ago. So we recorded this two weeks ago, the recording is available online, as are the slides. And I'd recommend that you start there because we're going to be building on some of the foundation that we laid a couple weeks ago.

So if you are new to Connect, you might want to start there before you jump into some of these more advanced topics. But without any further ado, let's go ahead and dive into some of the content that we want to cover today.

So my goal for today is really just to give you pointers to the different tidbits that I think are important for you to know if you're going to be managing RStudio Connect in a production environment. And so I won't go into too much depth about a whole lot of these, but really I just kind of want to give you links and pointers to the different things that you might be interested in if you are responsible for managing a production environment with RStudio Connect. So pardon me if I'm kind of flying through some of these topics, but feel free to ask questions and we can come back and do a little more detail on some of these things later. Or just check out the links once we post them online and hopefully you can kind of find the information that you need there with the links that I provide.

So today I want to cover first of all user management, then secondly the management of resources on the server, and then lastly we'll talk a little bit about kind of the system management and system security implications for the server.

User management

So we covered this a bit last time, so I won't belabor it, but we have three different user roles on the system. The first is an admin user, who has all privileges on the server and can access and manage anything they need to, though irregular actions are audited, and I'll dive into what that means in a moment. Below that is a publisher, someone who can upload or publish content onto the server. And lastly there's a viewer, who is just a consumer of content and can't author content that gets executed on the server.

So let's talk a little bit about what the experience of a Connect administrator looks like. First of all, they have access to special admin-only actions. For instance, this includes the admin tab: if you're an admin on the server, you're able to access certain pages within the dashboard that other users can't, pages that show you things like metrics and let you manage users. And then lastly, you're able to customize some application settings that other users aren't able to: you can define vanity URLs for an application, you can customize the RunAs argument for an application, and I'll show you what all this means in just a moment.

But the trick is that you don't get everything for free as an admin. Some of these things you can just hop in and start changing; other things you actually have to go out of your way to explicitly grant yourself permission to, things you otherwise wouldn't have had permission to do. I think that's probably best covered by an example, so let's look at one.

So first of all this is me logged in as an admin user on Connect. And so you can see here that first of all I have this admin tab and so that takes me first and foremost to this metrics page where I'm able to view information about the CPU and RAM usage across the server. I also have an audit logs page here where I can view kind of different changes that have been taking place on the server recently. And then lastly when I dive into particular content, even though this is not authored by me, this is authored by someone else and I don't have particular permissions on it, I am still able to go in and define custom settings for this content. So for instance here I can define a vanity URL.

Let's take a look at a richer document here. In this case we have a schedule. As an administrator, even though I don't have special privileges on this document, I can go in and customize the schedule for this content. I can define a vanity URL, change the RunAs user, etc. So as an admin I have free privileges to do some of these things.

However, look at something like a document here that's private. Here I'm logged in as a different user, the publisher user, and you can see that I've defined this content to be visible only to myself. This means that the admin user should not freely have access to this content; this is sensitive content that the admin shouldn't be able to see. And indeed, if I go as the admin and look at that content, this is the view that I get. You can see that I'm still able to manage the settings for that content, but I do not have free access to view the content itself.

And so this is kind of what we're talking about when we say the admin has the privileges to do whatever they need to do on the server but they don't get everything for free. So I am not able to view this content although I can go in and I can add myself as a publisher or as a viewer because I'm able to manage the settings on this content and at that point I would be able to view the content.

And the trick here is that when you take these explicit actions to add or remove yourself on a particular bit of content, all of those actions are captured in the audit log that I referenced earlier. So when I go look at the audit log, you can see that the admin user added themselves as a collaborator on this app, and then removed themselves as a collaborator on this app. So while I'm able to do whatever I need to do to manage the server, any time I go out of my way to take special privileges on an application, that's captured in the audit log, and that's the balance that we try to strike with an admin.


Lastly, if you missed this in our latest release, the 1.4.4.1 release, we also have the ability to download the source code for content. And again, that's only available to users that are explicitly granted collaborator privileges. So I as an admin do not get free access to source code published on the server; however, if I add myself as a collaborator, then I'm able to download the source code for an application. So this is the balance that we tried to strike with an admin, but I think it's important, if you're going to be managing a Connect server in production, that you understand what the privileges of an admin actually encapsulate: what you get for free and what you don't.

So next we'll move on, and a lot of these things I'm just going to cover in rapid fire. The next one I wanted to cover is the default user role. This is a setting managed in the Authorization section, under the DefaultUserRole setting, and it's basically the role that fresh users take when they first sign on to the server. The default right now is publisher, which means that when a user signs up on your server, or logs in using whatever authentication protocol you're using, that user becomes a publisher on the server: they have access to publish new source code. If you want to limit that so that new users coming into the server are just viewers, you can change this configuration setting to viewer. And this is actually subject to change: we've considered making the default viewer, in which case, if you wanted the default to be publisher, you could of course override that here.
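As a minimal sketch, changing that default in the Connect configuration file might look like this (the section and setting names follow the transcript's description; check the Admin Guide for your version, and restart the server after editing):

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
[Authorization]
; New accounts default to "viewer" instead of the current default, "publisher"
DefaultUserRole = viewer
```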

Another tool that you should be aware of if you're managing Connect is the user manager command-line interface. This is a root-only command-line interface that allows you to interact with Connect in a batch way. Right now there's a limited subset of what you can do with this command, but we envision it growing over time to capture more of the interactions you might want to take within Connect. One restriction right now is that the server actually needs to be stopped in order for you to use the command-line interface; that's a restriction that may be lifted in the coming months. You can access the tool at /opt/rstudio-connect/bin/usermanager, and then you can run commands such as list, to list all the users on the server, or alter, to change a user, for instance promoting a viewer to a publisher or a publisher to an admin. You can also dump the audit logs, even in CSV format, so if you wanted to browse the audit logs on your own time, or using your own tooling, you could do that using this user manager tool.
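A hedged sketch of a session with the tool, based on the subcommands named above; the exact flags are assumptions, so consult the tool's own help output for your version:

```shell
# Sketch only: subcommand names (list, alter, audit) follow the transcript;
# flag spellings are illustrative. Run `usermanager --help` to confirm.
sudo systemctl stop rstudio-connect          # the server must be stopped first

sudo /opt/rstudio-connect/bin/usermanager list      # list all users

# Promote a user (flags assumed for illustration)
sudo /opt/rstudio-connect/bin/usermanager alter --username alice --role publisher

# Dump the audit logs in CSV format for offline analysis (flags assumed)
sudo /opt/rstudio-connect/bin/usermanager audit --csv > audit-log.csv

sudo systemctl start rstudio-connect
```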

Next, there's an idea of user locking on the server that you should be familiar with. As of the time of this recording, in early 2017, we don't have a notion of user deletion, and the reason is that there are a lot of open questions around what you should do with a deleted user's content: should you migrate it to another user, keep it alive on the server, or get rid of it? Until we settle some of those questions, we've settled on this compromise of user locking. If, for instance, an employee leaves the company and you don't want them to have access to the server anymore, you can lock their account, which forbids any further login or interaction on the system: they can't publish updates or new content to the server. They also don't count against your license, so they're not going to count as a named user or take up a seat on your license while they're locked. And also, you should be aware that you can rename users. So if your goal is just to free up a username, you can change that user's username and lock the account, and then create a new user account with the username that you desire.

Resource management

So one of the most common questions that we get around Connect is the idea of resource budgeting, or how large should my server be? The difficulty in answering this question is that the requirements depend almost entirely on what your users are deploying. If you just have a couple of simple documents or dashboards that are updated once a day and accessed on your Connect server, you could probably get away with running this on something very small; even a Raspberry Pi could probably handle that kind of workload. However, if you're running very intensive Shiny applications that are doing, say, genomic analysis on multiple gigabytes of data, then your server requirements are going to be much, much larger. Ideally, if you have the luxury of being in a virtualized environment where you can scale a server up or down, that would probably be your best bet. Otherwise, you can run a proof of concept and see what the hardware requirements are given the applications and the types of work that your users are publishing to the server; that's really the best way to get a feel for what your hardware requirements should be.

So in terms of the philosophy of Connect, on-demand requests are largely what we're servicing, and those are serviced best-effort. As a request comes in for a Shiny application, we're going to do our best to spin up that Shiny application and hope that there's enough memory available on the server. But there are some knobs and some tuning that you can do to cap the resources available to particular applications or different use cases.

First of all, around Shiny, there's a notion of Shiny scaling. You can scale Shiny applications in Connect to multiple processes. If you're not aware, R is single-threaded, and Connect can actually load balance a particular application across multiple independent R processes running that same application. This is what's managed in the Performance tab on a Shiny app. As an admin, you can go in and override particular performance settings for an application and set, for instance, the maximum number of processes: how many processes, under the heaviest load, you might be willing to run for this application. Down at the bottom you can see some of the scaling parameters around how many connections should be supported per process. And the load factor is basically how quickly you want to ramp up from the minimum number of processes to the maximum number of processes as load increases.

The minimum number of processes, as the name implies, guarantees that N processes are running for this application at all times. The right answer for almost all applications is zero. However, if you have an application that takes a really long time to start up, and you don't want a user hitting it to have to wait multiple seconds or even minutes for the Shiny application to come online, you might want to set this to one or even two to ensure that there's always a process available for incoming users. If you find that users are setting absurd minimum process values, you can cap this using the Scheduler.MinProcessesLimit setting, which lets you say that no application should ever define a minimum process count greater than X.
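As a sketch, the server-wide cap just described might look like this in the config file (the setting name follows the transcript; confirm the spelling against the Admin Guide for your release):

```ini
[Scheduler]
; No application may request more than 2 always-on ("min") processes,
; regardless of what a publisher sets in the dashboard
MinProcessesLimit = 2
```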

Next we've got a notion of Shiny timeouts. These are currently global settings; eventually we envision that they'll be customizable per application. They define two different parameters. First, the maximum amount of time that you're willing to wait for an application to start. The default is 60 seconds, which means that if a Shiny application takes longer than 60 seconds to start and come online, we're going to assume the process is having problems and give up waiting on it. If you find that you're building Shiny applications that genuinely take longer than 60 seconds to start, you can increase this limit to wait on that process a little longer. The second value is the minimum time that we should keep a worker process alive after it goes idle. What this means is that after the last user disconnects and the process is empty, not serving any Shiny sessions, we're going to wait five seconds, and if no new traffic comes in, we're going to reap that process. That's a pretty reasonable default for average workloads, but again, if you're running very large processes that take a lot of time or a lot of resources to start up, you might not want to reap the process that quickly. You may want to leave it open for multiple minutes, or hours, or even overnight, so that you don't have to shut down and spin up these processes, which might be an expensive operation.
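A hedged sketch of tuning those two windows: the setting names below (InitTimeout for startup, IdleTimeout for the idle window) are my reconstruction of the settings described above, and the values and units may differ in your version, so verify against the Admin Guide:

```ini
[Scheduler]
; Wait up to 5 minutes for a slow-starting Shiny app (default: 60 seconds)
InitTimeout = 300
; Keep an idle worker process alive for 10 minutes before reaping it
; (default: 5 seconds)
IdleTimeout = 600
```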

Next, stepping away from Shiny and into the notion of scheduled reports: we do have a throttle around report concurrency in the latest version of RStudio Connect. This is managed in Applications.ScheduleConcurrency, and it allows you to determine the number of concurrent scheduled reports that you'd ever want to run in parallel. By default this is set to two. So if you have 20 users who all set up their documents to run every night at midnight, rather than trying to run all 20 of those reports at one moment, which might cause resource contention or even stability problems on the server, we're only going to run two at a time, and we'll iterate through as quickly as we can to get all 20 reports executed as close to midnight as possible. If you find that you have capacity for more than two, you can certainly raise this limit to have more processes running at the same moment, or likewise, if you need to tune it down, you can.
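In the config file, raising that throttle might be sketched like this (the setting name follows the transcript; the value of 4 is illustrative):

```ini
[Applications]
; Run up to 4 scheduled reports in parallel (default: 2)
ScheduleConcurrency = 4
```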

Disk usage and storage

The next notion of resources that you might want to manage is around the disk, and there are a few different things that we store on disk. Most obviously, we're hosting the applications that your users are sending us. So first and foremost, we store on disk the content that your users upload to us. We call these bundles, but these are basically the application and any associated data or metadata that the user gives us. We store those in a compressed format, and basically we're storing the exact blob that the user originally uploaded to us. We retain that so that you always have a copy of what the user originally gave us.

Next, in order to actually run the application, we obviously need to unzip those bundles and have them in a directory where we can start executing that code. So you will have one copy of each application laid out in an unzipped folder, where R can run that code and access the data associated with it, if there is any. Next, there's the notion of a Packrat cache. We talked about this in the last webinar, but we use a system called Packrat to recreate the environment, in terms of the R packages, that is available in the publishing environment, that is, the IDE your user is using. We try to recreate that environment on the server using this Packrat system. What that means is that every time a user publishes some code, if they have a particular version of a particular package, we're going to compile that and store it on the server as well. So you're going to have one copy of each version of each package, and that is specific to the R version. If you support multiple R versions, which we'll discuss in a moment, then you might have multiple copies, one per R version. That being said, all of these are shared in the same library, or same cache, which means that if you have multiple users who have published the same version of ggplot2 using the same version of R, you're not storing those copies redundantly; they're all pointing to the same singular copy of that package.

Next we store some metrics. These are usually a pretty small usage of the disk space but you know for instance you saw the RAM and CPU usage that gets stored on disk. And then lastly we store the information around R processes that have executed in the past, the logs associated with them. I wouldn't expect that you'd ever see any disk concerns here unless you have a user who's you know dumping multiple kilobytes of data to the log each time somebody you know touches the slider on the Shiny application or something absurd like that.

So this is just one snapshot of what our beta RStudio Connect server looks like right now. You can see here that we have over 2,000 applications, or 2,000 bits of content, that have been published to the server, and we really do almost no policing on the server in terms of the bundles that are uploaded. If you look into this in a little more detail, it turns out that there are a couple of applications in particular that have multiple gigabytes of data associated with them, which really bloats up the bundles and the unzipped application size. If you're concerned about disk usage and you want to police this a little more tightly, I suspect you'd be able to drive this down, even for a server that has lots of applications. But you can see here that for 2,000 bits of content we have 85 gigabytes of disk usage, and you can see the ratios of where that disk usage is coming from.

So in terms of managing the disk usage and putting a cap around what you want to store, there are a couple of knobs you can tune. The first is around bundle retention, which allows you to throttle the number of bundles retained for each application. As I previewed a moment ago, in RStudio Connect 1.4.4.1 we now have the notion of being able to roll an application back to a previous version, or forward to a more recent version, and that is all predicated on the idea that your old bundles are still alive on disk and that you haven't reaped them. By default we actually retain all bundles forever, and that might not be what you want. You may say that having five copies of an application is plenty, and any version older than five versions ago can be deleted. If that's the case, then you can customize this setting to a value other than zero.
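A sketch of that retention setting, with an assumed setting name; the zero default meaning "keep everything" comes from the transcript, so double-check the exact name in the Admin Guide:

```ini
[Applications]
; Keep only the 5 most recent bundles per application
; (0, the default, retains all bundles forever)
BundleRetentionLimit = 5
```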

Really, though, you may find it more useful than configuring this setting to just introduce a policy, which basically says: ask your users not to publish very large data sets alongside their applications. This is really where large bundles come from. If you have a four-gigabyte CSV file that you want to analyze in some R Markdown document, the naive thing to do would be to bundle that four-gigabyte CSV file alongside your R Markdown code and publish it all as a single bundle. That means that every time you update your R Markdown code, you're republishing those four gigabytes of data, which consumes a lot more disk space than you really need, because your data presumably is not being versioned alongside the code that's analyzing it. As a best practice, you should probably provision your data separately on the server and then publish only the source code that's going to analyze that data.

Next, and I won't belabor this one because it's usually a pretty small consumer of disk space, we do have throttles around how many jobs, or how much process information, we're going to retain. By default we keep up to a hundred processes associated with each application; as soon as we have more than a hundred, the oldest ones start getting deleted, and likewise any process older than 30 days gets deleted off the disk. As soon as either of these constraints is violated, the job is deleted from disk and we no longer have the process logs associated with it. If you're in an environment that's highly audited and you need to make sure that you keep those logs, you can customize these settings to keep older copies of jobs, or more copies of jobs, for each application.

So, the larger picture: a few things that you should consider from an IT perspective. First of all, you can manage the Server.DataDir setting, which by default is /var/lib/rstudio-connect. That's where we store the variable-size data associated with Connect. If you have an NFS share or something larger where you want the bulky data stored for Connect, you can customize that setting and we'll put all the data there. In reality, if you can keep the bundle sizes small, the disk usage should probably be under 100 gigabytes for a medium-traffic server. If you're expecting that you'll deal with large data sets and people will publish large bundles, you might want more than that. But obviously, if you have the luxury of being in an environment where you can have a scalable disk volume, that would be ideal.
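Relocating that data directory might be sketched like this; the mount point is an illustrative name, not a Connect default:

```ini
[Server]
; Move Connect's variable data (bundles, unpacked apps, caches, logs)
; onto a larger volume. /mnt/connect-data is a made-up example path.
DataDir = /mnt/connect-data
```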

But you're going to get the best performance out of a fast disk, so something like an SSD is usually appropriate for the kinds of workloads we're dealing with here. NFS should be fine, although you should consider performance and make sure that it's a performant network share. And do be aware that the SQLite database, which can be pulled out separately from this data directory, must be on local disk; it cannot be on NFS. So everything other than the SQLite database can be shared on NFS.

When you want to run backups for RStudio Connect, you can certainly use a snapshotting system if you have that luxury on your disk volumes. If you don't have snapshotting and you actually want to run a manual backup, here are a couple of directories that you should consider including. First of all, obviously, the /var/lib/rstudio-connect directory, which has all the data we referenced earlier: bundles, applications, process logs, etc. You should also consider including /etc/rstudio-connect, which contains your config file and some other peripheral files that might be used to manage the server. And of course, if you pulled your database out, because of NFS concerns or anything like that, then be sure that you back up your database as well. In order to get a consistent backup, you do need to bring the server offline, run the backup, and then restart the server.
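The manual-backup procedure above might be sketched as follows, assuming a systemd-managed install and the default paths; if you've relocated the SQLite database, add its directory to the archive:

```shell
# Sketch of a manual RStudio Connect backup. Paths follow the defaults
# described in the transcript; adjust for your own layout.
set -e

# Stop the server so the backup is consistent
sudo systemctl stop rstudio-connect

# Archive the data directory (bundles, apps, process logs) and the
# config directory in one timestamped tarball
STAMP=$(date +%Y%m%d)
sudo tar czf "connect-backup-$STAMP.tar.gz" \
  /var/lib/rstudio-connect \
  /etc/rstudio-connect

sudo systemctl start rstudio-connect
```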

Multiple R versions and process configuration

If you were unaware, we do support multiple versions of R. By default, at startup, we're going to scan the most common locations for R, and also look at the PATH, to try to find the versions of R that you have provisioned on the server. If they're not in the most common locations, or if you have R in a custom directory, you can register those explicit R locations, wherever they are on your server. All of this is documented in the Admin Guide at the link that I reference below. What happens then is that every time a user publishes, we're going to do our best to align the version of R that they're using to the versions of R that you have available on the server. There are a few different alignment algorithms available, all of which are again described at the link I mentioned.

Another feature that you should be aware of is the notion of a process supervisor. This is basically a prefix that will precede every R process invocation, defined in Applications.Supervisor. For instance, we provide the example of running nice, which, if you're unfamiliar, is a Linux command that allows you to set the process priority for a given application. So this is a prefix that we're going to run before spawning R: in this case, we're going to say give the process nice priority level 2 and then invoke R. If, for instance, you have a shell script that needs to run before R in order to provision certain resources on the server, you could use this to define a custom shell script that runs before R. There are some caveats around how you invoke this and how it needs to behave, but all of this is documented at the link I provide here.
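A minimal sketch of a supervisor script, assuming the behavior described above: it runs before R and must hand control to the command it was given. The path and priority value are illustrative, and the Admin Guide documents the exact contract:

```shell
#!/bin/sh
# /opt/scripts/connect-supervisor.sh (illustrative path)
# Runs as a prefix before each R process Connect spawns.

# Example provisioning step before R starts: ensure a scratch dir exists
mkdir -p /tmp/connect-scratch

# Lower the R process priority, then exec the original command ("$@")
exec nice -n 2 "$@"
```

It would then be registered in the config file, roughly like:

```ini
[Applications]
Supervisor = /opt/scripts/connect-supervisor.sh
```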

Secondly, there's the notion of customizing the RunAs user. RStudio Connect has a certain user that we're going to invoke R as. The Connect server itself runs as root, but we don't spawn R as root; we spawn R as the rstudio-connect user by default, which is a user that gets created when you run the installer. The primary group for that user is the rstudio-connect group, and all of this is described in more detail at this link. But you can actually go in and customize that. If you define a new Applications.RunAs user globally, you can say: I don't want to run R as the rstudio-connect user by default, I want to run it as someone else. That's totally fine, and you can change it globally. You can also change it per application; that's configured in the dashboard, and you do have to have admin privileges in order to change the setting.

So only an admin can modify the setting, but basically this allows you to go into a particular application inside the dashboard and say: rather than running as the default user, which again is rstudio-connect by default, I want this process to execute as a different user on the server. That allows you to make special system resources, available only to particular users, available to certain applications. Perhaps globally you want all your processes to run as the rstudio-connect user, but there may be certain applications that require special access to disk resources. If you're following that pattern of provisioning data on the server separately from the bundle, perhaps you want to segment off certain data sets and say that only certain applications should have access to them. You can do this using standard Linux file permissions, and then have the R applications that should have access to those data sets run as a user who has access to them on the server.

And do be aware that any user you specify does have to be a member of the primary group of the default RunAs user, which again is the rstudio-connect group. So you can't just go in and specify any Unix user on the system and have them start invoking R. You would need to make sure that those users are members of the rstudio-connect group before they would be candidates to execute R through RStudio Connect.
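The data-segmentation pattern just described might be sketched like this; the account name and paths are made-up examples, not anything Connect creates for you:

```shell
# Illustrative: restrict a provisioned data set to one service account.
# "finance-svc" and /data/finance are hypothetical names for this sketch.

# A RunAs candidate must belong to the rstudio-connect group
sudo useradd -m finance-svc
sudo usermod -aG rstudio-connect finance-svc

# Lock the data down so only finance-svc (and its group) can read it
sudo mkdir -p /data/finance
sudo chown -R finance-svc:finance-svc /data/finance
sudo chmod -R 750 /data/finance

# Finally, in the Connect dashboard (as an admin), set the application's
# RunAs user to finance-svc so its R process can read /data/finance.
```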

You can actually take this one step further by enabling another setting, Applications.RunAsCurrentUser. This does require PAM authentication, and it defaults to false, meaning disabled. But if you go in and enable the setting, and you're using PAM, what that means is that a Shiny application, rather than running as a hard-coded user, or even a custom hard-coded user, will actually run as the user who is viewing the application. Again, this requires PAM, because that means the user has already logged into the system and we know that their username is a valid Linux account on the server, so we can actually start running the R process as them. If, for instance, you're in a Kerberos environment or something like that, where particular resources are available to each user on the system, and you want to know that the Shiny interactions they're having are backed by R processes running specifically as that user, then this is a way to accomplish that. All of this, again, is documented in the Admin Guide at the link I provide here.
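A sketch of the two pieces that combination requires, with names as described above; verify both against the Admin Guide for your release:

```ini
; Per-viewer process identity requires PAM authentication
[Authentication]
Provider = pam

[Applications]
; Shiny apps run as the logged-in viewer's Linux account (default: false)
RunAsCurrentUser = true
```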

We also support PAM sessions, so if you want to use PAM to provision additional resources before spawning R, you can specify a custom PAM service; more documentation is available in the admin guide there.
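A custom PAM service is configured in the same file. A minimal sketch, assuming PAM profiles named `rstudio-connect` and `rstudio-connect-session` exist under /etc/pam.d (setting names per the admin guide; verify for your version):

```ini
[PAM]
; PAM profile used for authenticating users
Service = rstudio-connect
; Optionally run a separate profile, with a PAM session, when launching
; R processes, e.g. to mount or provision per-user resources
UseSession = true
SessionService = rstudio-connect-session
```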

System security

Alright, so then moving on to the last section here, around system security and some of the considerations you should be aware of from a security standpoint. First of all, as I suspect most of us know, HTTPS is very important, and if you're accepting usernames and passwords through your system, then you should almost certainly be using TLS encryption, a.k.a. HTTPS. This is pretty simple to define: in the HTTPS section, you set up the Listen, Key, and Certificate settings to define what port, what certificate, and what key you want to use for HTTPS. You can also set up the HTTP redirect, which will, for instance, listen on port 80 for HTTP requests and forward them on to HTTPS, all self-contained within the server. This is something you should pretty strongly consider, especially if you're accepting usernames and passwords.
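Put together, the HTTPS settings described above might look like this sketch; the certificate and key paths are placeholders:

```ini
[HTTPS]
; Serve TLS on the standard port with your certificate and key
Listen = :443
Certificate = /etc/ssl/certs/connect.example.com.crt
Key = /etc/ssl/private/connect.example.com.key

[HTTPRedirect]
; Answer plain-HTTP requests on port 80 and redirect them to HTTPS
Listen = :80
```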

Another thing you can consider here, around security implications, would be browser security. There are a few different options you can configure, all documented in more detail at the link I provide here: we support HSTS, we have content-type-sniffing protection, you can change the X-Frame-Options header, and you can even define custom HTTP headers that you want included on all responses going out of the server. The best way to verify that you have these settings in place is a great tool that SSL Labs provides: if you go to this link, you can point it at your server, assuming it's public, and it will run a suite of tests, first of all against your SSL certificate, but it will also confirm that you have all these settings set appropriately and that your server is appropriately locked down.

I alluded to this earlier, but we do support audit logs. If you're interested in staying up to date with system changes and the actions users are taking on the server, this is something you should definitely keep an eye on. And if you want to do this in a more automated or customized way, you can certainly consider dumping the logs to CSV using the usermanager tool and then inspecting them with your own tooling.

One other question we get often is about separating staging from production, and this is now supported in Connect, now that we support exporting a bundle. The best way to accomplish this today, and this is something we do want to improve in the future, is to run two different Connect servers. One can be your staging environment, where people publish content they're still working on; then, once it gets QA'd and you're actually happy and comfortable with the content, you can export that bundle and republish it to the production environment. That allows you to separate staging from production, and if you're concerned about resource contention between the staging environment and the production environment, this is the best way to wall off those two things.

One other feature you should be aware of, which is nice if you're going to be managing this: we support two settings that let you display messages to users. Server.PublicWarning provides a custom HTML warning on the unauthenticated landing page, so when people visit Connect and are not logged in, they'll see this public warning. For logged-in users, you can define the Server.LoggedInWarning setting, which shows a warning message on the landing page once the user is logged in. This is a great way to communicate scheduled maintenance windows, or any information you want to get in front of all of your users.
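Those two settings live in the Server section of the configuration file; a small sketch (the messages themselves are just examples):

```ini
[Server]
; Shown on the landing page to anyone who is not logged in (HTML allowed)
PublicWarning = <p>Connect will be down for maintenance Saturday 02:00-04:00 UTC.</p>
; Shown to authenticated users once they log in
LoggedInWarning = <p>Reminder: content on this server is refreshed nightly.</p>
```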

Lastly, the last topic I want to cover is the notion of private packages. Packrat will gracefully handle all the public packages you have, whether they're on CRAN or Bioconductor, or even hosted on GitHub or GitLab or anything like that; these all work well within Packrat without any customization. But private packages are a bit more complicated. When you have a private package, the best way to facilitate this is to use a private CRAN. If you're able to set up a private CRAN instance, and we have documentation for this, it's actually not nearly as complicated as it sounds, then it can host your private packages, your users can install those packages from it, and, within your firewall, Connect will be able to pull packages down from that private CRAN instance and install them just like it does from the public CRAN. All of this, again, is documented at the link I provide in the admin guide. And beyond enabling access from the Connect instance, this is actually a really nice way to version and manage your own R packages internally.
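At its core, a CRAN-style repository is just a directory tree served over HTTP with a PACKAGES index. A minimal sketch of the layout; the paths and the internal hostname are illustrative, and the indexing step assumes R is installed:

```shell
# A CRAN-style repository is a src/contrib directory of source package
# tarballs plus a PACKAGES index file.
mkdir -p /tmp/miniCRAN/src/contrib

# Copy your built packages (*.tar.gz) into src/contrib, then index them:
#   Rscript -e 'tools::write_PACKAGES("/tmp/miniCRAN/src/contrib")'
#
# Serve /tmp/miniCRAN over HTTP; users (and Connect) can then install
# from it like any other repository, e.g. in R:
#   install.packages("mypkg", repos = "http://cran.internal.example.com")

ls -d /tmp/miniCRAN/src/contrib
```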

Additional resources and Q&A

Alright, so that is the tour of the things I wanted to introduce around RStudio Connect. A few additional resources I'll point you to here. First of all, if you haven't tried out Connect yet, you can download and install a 45-day free trial using this link. The admin guide is available; all of the information I've just covered is described and documented there, and the goal of this webinar is really just to highlight the specific points you might want to consider out of the admin guide. If you're going to be managing a production Connect server, it would be well worth your while to spend some time familiarizing yourself with it. If you have an IT bent, we have an IT Q&A page for Connect that answers some common questions we get from IT folks. If you're looking to set up authentication, the authentication details are available here. And lastly, our release notes are available online as well, so if you're interested in checking out the differences between particular versions, or seeing the more detailed changes that take place every time we release a version, all of that is available at this link.

And so that is everything I have, so let me go ahead and take a moment to pull up some of our questions and see if we have any that might be worth covering. Okay, so the question is: does RStudio Connect have the equivalent of an obscured URL with anonymous, view-only access, like a Google Docs share URL? The use case in mind is allowing external clients to view a Shiny app hosted on our RStudio Connect server. So this is not something we support today. Basically everything we do is around explicit authentication, so if you want users to have access to a particular application, you have to explicitly add them to your content; then, when they log in, they'll be able to view it in their content listing, fully enumerated for them. But that is an interesting feature request, and something we should definitely keep in mind.

Yeah, so one question here you should be aware of around self-signed SSL certificates, and I'll point this out here: back in the HTTPS section, it is not a problem for you to self-sign the certificate you use when you set up HTTPS on the server; basically any valid certificate and key pair we'd be happy to host for you. One thing you should be aware of around custom certificate authorities, though: while your browsers may be instructed within your organization to trust a custom certificate authority, when you're publishing from the RStudio IDE, that uses a different set of network connectors than the browser. We're usually using curl, or whatever networking systems are available on your server, or on the desktop if you're using RStudio Desktop. So not only do your browsers need to trust your SSL certificate, but whatever systems you're using from the rsconnect package and the IDE to publish content to the server, which is often curl, also need to be instructed to trust your custom certificates. If you have a custom SSL certificate or an internal CA, be aware that all of your RStudio clients will also need to be instructed to trust that custom CA.
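For curl-based clients, one common approach (among several; see your platform's documentation) is to point curl's CURL_CA_BUNDLE environment variable at a PEM file containing your internal CA chain. The path below is a hypothetical placeholder:

```shell
# Make curl (and tools that use it for HTTP) trust an internal CA by
# pointing CURL_CA_BUNDLE at a PEM bundle. The path is a placeholder.
export CURL_CA_BUNDLE=/etc/pki/tls/certs/internal-ca.pem

echo "curl will read CA certs from: $CURL_CA_BUNDLE"
```

Set this in the environment of whatever process runs the publishing client, so that HTTPS connections to the Connect server validate against your internal CA.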

Can you provide an example of a functional RunAs setup? Yeah, so if you look into our documentation, we definitely have some more details on that. Certainly the most trivial application would be to customize a hard-coded user: you can define whatever user, in whatever group, you want in that custom setting, and that should be pretty straightforward. The more advanced applications, like RunAsCurrentUser, really depend on your environment, and I guess the reason we don't have a more detailed example setup for those is that it depends on how you have Kerberos and PAM configured. But definitely, even if you're doing a trial, even if you haven't purchased the product yet, I'd encourage you to contact support@rstudio.com if you're encountering any trouble, or if you want help reviewing the architecture or working with your IT folks to provision an environment that's going to be successful for your Connect instance; we'd be happy to work with you on that.

But other than that, I think that is about everything I was hoping to cover today, so I think we are about good to wind up.