RStudio - Shiny Server Pro Architecture | RStudio Webinar - 2016

This is a recording of an RStudio webinar. You can subscribe to receive invitations to future webinars at https://www.rstudio.com/resources/web... . We try to host a couple each month with the goal of furthering the R community's understanding of R and RStudio's capabilities. We are always interested in receiving feedback, so please don't hesitate to comment or reach out with a personal message.

Transcript

This transcript was generated automatically and may contain errors.

Okay, so I want to talk to you today about the Shiny Server Pro architecture and give you a bit of a feel for how things are organized and how things are run within Shiny Server and Shiny Server Pro.

All right, so first of all, Shiny Server Pro can be viewed as an extension of the open source Shiny Server package, which we generally just call Shiny Server. And so basically, you can view Pro as kind of enhancements and new sets of features that add on to the open source package. So when you talk about the architecture of Pro or open source, it probably makes sense to start with a discussion about the open source Shiny Server, and then we can kind of grow from there into what Pro adds on to that.

But, you know, I mentioned here a couple of the features that are often most compelling for Pro. So, you know, we do user authentication, whether that's with LDAP or Active Directory. And then also we do application scaling across multiple processes for a single application, which is helpful.

The Shiny package and WebSocket communication

So okay, so to go kind of all the way down the hole, we get down into the Shiny package and I suspect that most of you are familiar with Shiny, but basically Shiny is, as you know, an R package, but what's important to realize is that Shiny speaks WebSocket. And so what that means is that any browser that's compatible with WebSocket can connect directly to a Shiny process and do, you know, the communications there back and forth between those two entities.

So that's great. And you probably notice this when you're developing applications locally. So if you're using RStudio, for instance, and you press run app, you're going to get a little viewer. That viewer is communicating directly to your R process or, you know, if you open in a browser, you're communicating directly with an R process and there's no Shiny server intermediary between those two. And that's because Shiny speaks WebSocket. So it is possible to open a direct connection from a browser into the Shiny process.

So that's great. But there are a couple of limitations on this. First of all, as I mentioned, that only works for browsers that can communicate over WebSockets. And so that rules out Internet Explorer 8 and 9 and some older versions of Firefox and Chrome and things like that.

The other problem is that the Shiny process has to be running at the moment you try to open a connection to it. And so that can be a little complicated in that, as an organization, maybe you have hundreds of Shiny applications that you've developed, and at any given time maybe a handful of them are actually being used and interacted with. The problem here is that, in order to communicate, you would have to be running one process for each of those applications all the time, around the clock. And that creates a lot of problems in terms of resource utilization if you're trying to run these hundreds or even thousands of processes all the time, even though many of them may only be used a few minutes a day.

So there are a couple of limitations to just running Shiny on the bare metal, but it is possible.

What Shiny Server sets out to accomplish

So Shiny Server sets out to accomplish a few goals. The ones that are interesting for this discussion are, first of all, it should start and stop Shiny processes on demand and as needed, and also it should translate non-WebSocket traffic into WebSocket traffic that Shiny can understand. And then finally, it can also map different URLs to particular applications.

So if you were just to run an army of Shiny processes on your bare machine, all of those would have to run on different ports, and so your users would need to enter a port number in order to get to an application, and it would be, you know, a little sloppy. And so Shiny server offers a kind of a cleaner, more convenient way to be able to navigate between the different applications, and we do that by offering a few different hosting modes. So you could just host one application. You could allow users to host their own applications, or you could host an entire directory of applications, and you can do these at different URL subspaces. So it just allows you to kind of more meaningfully manage, you know, many Shiny applications.
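To make the URL mapping idea a bit more concrete, here is a minimal Python sketch of routing a request path to an application by longest matching URL prefix. This is only an illustration under stated assumptions, not how Shiny Server is implemented or configured: the `ROUTES` table, the directory paths, and the `resolve` function are all hypothetical.

```python
from typing import Optional

# Hypothetical route table: URL prefix -> application directory.
# The prefixes and directories are made up for illustration.
ROUTES = {
    "/": "/srv/apps/landing",
    "/sales": "/srv/apps/sales-dashboard",
    "/team/reports": "/srv/apps/team-reports",
}


def resolve(path: str) -> Optional[str]:
    """Return the app directory whose URL prefix is the longest match."""
    best = None
    for prefix in ROUTES:
        # A prefix matches the path itself or anything beneath it,
        # so "/sales" matches "/sales/plot" but not "/salesforce".
        base = prefix.rstrip("/")
        if path == prefix or path.startswith(base + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return ROUTES[best] if best is not None else None
```

With a table like this, users navigate by URL alone and never need to know which port a given process is listening on.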

And so to kind of revisit a couple of these, starting and stopping Shiny processes: again, if you're running this naively, you would need to be running that Shiny process all the time. But because you're running Shiny Server as kind of this long-running, always-on daemon, what that means is that when a user goes to connect into the Shiny process, if it's not running, Shiny Server can hold that request for a second while it goes to start the Shiny process, and then once the Shiny process is ready, it can let the traffic through into that process. So it can broker and manage the communications between applications that may or may not be running at the moment they're requested. And then also, if it notices that all the users have left and that application's been idle for some configurable length of time, then it can go ahead and reap those processes to free up some more resources for other applications.
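The start-on-demand and idle-reaping behavior described here can be sketched as simple bookkeeping. The Python below is a hedged illustration, not Shiny Server's code: the `Supervisor` and `AppState` classes, the `idle_timeout` parameter, and the explicit `now` timestamps are all assumptions chosen to keep the example self-contained, and no real processes are spawned.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AppState:
    connections: int = 0      # number of open user connections
    idle_since: float = 0.0   # time at which the last connection closed


@dataclass
class Supervisor:
    idle_timeout: float  # seconds an app may sit idle before being reaped
    running: Dict[str, AppState] = field(default_factory=dict)

    def connect(self, app: str) -> None:
        """Start the app if it is not running, then count the connection."""
        if app not in self.running:
            # A real server would spawn the process here and hold the
            # request until that process is ready to accept traffic.
            self.running[app] = AppState()
        self.running[app].connections += 1

    def disconnect(self, app: str, now: float) -> None:
        """Drop one connection; remember when the app became idle."""
        state = self.running[app]
        state.connections -= 1
        if state.connections == 0:
            state.idle_since = now

    def reap(self, now: float) -> List[str]:
        """Stop apps with no users that have been idle past the timeout."""
        expired = [
            name for name, state in self.running.items()
            if state.connections == 0
            and now - state.idle_since >= self.idle_timeout
        ]
        for name in expired:
            del self.running[name]
        return expired
```

Calling `reap` periodically is what would free resources for other applications; an app that still has at least one connection is never reaped in this sketch.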

And again, like I mentioned, older versions of Internet Explorer are not going to be able to handle WebSockets natively, and so Shiny Server uses a library called SockJS, which basically allows you to work as if you were working with WebSockets, but it will gracefully fall back to other protocols as needed. And that's actually really important even if you're not planning to support older versions of Internet Explorer, because many networks that use older equipment, designed before WebSockets gained popularity, may end up dropping or blocking WebSocket traffic. We see this quite a bit, especially at universities and academic settings. They for some reason tend to have network equipment that isn't friendly with WebSockets.

So you can kind of end up in the situation where even if you're using all the latest and greatest technology, there may be some router between you and your client that is going to interrupt your WebSocket traffic, and so it's nice to have a technology that will allow you to fall back to something that is supported if needed. So that ends up being a pretty important feature.
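The fallback idea can be illustrated with a small Python sketch: try transports in order of preference and use the first one that works for this particular client and network. This is a simplified assumption about the general technique, not the actual negotiation logic of any library; the `PREFERENCE` list, its order, and the `pick_transport` function are hypothetical.

```python
from typing import Callable, Iterable, Optional

# Transports in order of preference. The names resemble ones used by
# SockJS-style libraries, but the list and its order are assumptions.
PREFERENCE = ["websocket", "xhr-streaming", "xhr-polling"]


def pick_transport(
    can_use: Callable[[str], bool],
    preference: Iterable[str] = PREFERENCE,
) -> Optional[str]:
    """Return the first transport that works for this client and network."""
    for name in preference:
        if can_use(name):
            return name
    return None  # nothing usable; the connection cannot be established
```

For example, on a network whose equipment drops WebSocket traffic, `pick_transport(lambda t: t != "websocket")` would settle on the next option in the list rather than failing outright.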

Okay, and then as we already discussed, being able to map URLs to different applications.

Architecture overview

Okay, so kind of an overview of the architecture. You can envision it looking something like this. So you have a variety of users over here. I don't know if you can see the mouse. Let's try the pen here. You have your variety of users over here, which could be a handful, or we run servers where there are hundreds of concurrent connections from different users around the world, but they're all opening connections into your Shiny server instance, which, again, is kind of this always-on server process that's running as a daemon on your server. And then it's going to open and close these, or start and stop these Shiny applications and kind of proxy the traffic as it goes through from users' browsers into the Shiny applications.

And so what this typically looks like, just to give you a feel if you're going to get down to the network level, the first request, of course, is just for the root URL of some application. Shiny Server would go in and inspect, is that application already online? So let's say it maps some URL to app one. So it would go through and say, is app one already online? If it is, it'll proxy the traffic through. If it's not, it'll start app one and then proxy the traffic through. And then just continue to keep these channels open between the user and the application.

So what this looks like at a network level: the first request is going to be for the root URL of that application. And then that's going to return some HTML that is going to point the browser to different static assets that may be of relevance, CSS files and JavaScript files and things like that. And so it'll open up more connections for all of those files. And then eventually it'll actually open a WebSocket or WebSocket-like connection that will be persistent between that browser and that Shiny application. And that's kind of how the Shiny magic works: via that WebSocket or WebSocket-like channel that allows the user and the Shiny process to communicate back and forth for as long as that connection is open.
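The sequence of requests for a single page load can be sketched in Python as follows. This is a rough illustration only: the `AssetCollector` class, the `request_plan` function, the `"PERSISTENT"` label, and the `websocket/` path are all hypothetical, and a real browser does considerably more than this.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class AssetCollector(HTMLParser):
    """Collect the script and stylesheet URLs that an HTML page refers to."""

    def __init__(self) -> None:
        super().__init__()
        self.assets: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script" and attributes.get("src"):
            self.assets.append(attributes["src"])
        elif (tag == "link" and attributes.get("rel") == "stylesheet"
              and attributes.get("href")):
            self.assets.append(attributes["href"])


def request_plan(app_url: str, html: str) -> List[Tuple[str, str]]:
    """List, in order, the requests a browser might make for one page load."""
    collector = AssetCollector()
    collector.feed(html)
    # 1. The first request is for the application's root URL.
    plan = [("GET", app_url)]
    # 2. The returned HTML points at static assets, fetched next.
    plan += [("GET", app_url + asset) for asset in collector.assets]
    # 3. Finally a long-lived channel is opened (path is hypothetical).
    plan.append(("PERSISTENT", app_url + "websocket/"))
    return plan
```

Only the last entry stays open; it stands in for the WebSocket or WebSocket-like channel that carries the ongoing back-and-forth between the user and the Shiny process.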