RStudio Shiny Server Professional Architecture

Transcript

This transcript was generated automatically and may contain errors.

Okay, so I want to talk to you today about the Shiny Server Pro architecture, and give you a bit of a feel for how things are organized and how things are run within Shiny Server and Shiny Server Pro.

All right, so first of all, Shiny Server Pro can be viewed as an extension of the open source Shiny Server package, which we generally just call Shiny Server. Basically, you can view Pro as a set of enhancements and new features that add on to the open source package. So when you talk about the architecture of Pro or open source, it makes sense to start with the open source Shiny Server, and then we can grow from there into what Pro adds on top of it.

But I mention here a couple of the features that are often most compelling for Pro. We do user authentication, whether that's against LDAP or Active Directory, and we also do application scaling across multiple processes for a single application, which is helpful.

How Shiny speaks WebSocket

So, to go all the way down, we get to the Shiny package. I suspect most of you are familiar with Shiny. It is, as you know, an R package, but what's important to realize is that Shiny speaks WebSocket. What that means is that any browser that's compatible with WebSocket can connect directly to a Shiny process and communicate back and forth with it.

So that's great, and you've probably noticed this when developing applications locally. If you're using RStudio, for instance, and you press Run App, you get a little viewer. That viewer is communicating directly with your R process, and if you open the app in a browser, the browser is communicating directly with an R process too. There's no Shiny Server intermediary between the two, and that's because Shiny speaks WebSocket. So it is possible to open a direct connection from a browser into the Shiny process.

So that's great, but there are a couple of limitations. First of all, as I mentioned, that only works for browsers that can communicate over WebSockets, and so that rules out Internet Explorer 8 and 9 and some older versions of Firefox and Chrome. The other problem is that the Shiny process has to be running at the moment you try to open a connection to it. That can get complicated: as an organization, maybe you have hundreds of Shiny applications that you've developed, and at any given time only a handful of them are actually being used.

The problem here is that, in order to accept connections, you would have to be running one process for each of those applications all the time, around the clock, and that creates a lot of problems in terms of resource utilization. You'd be running hundreds or even thousands of processes all the time, even though many of them may only be used a few minutes a day. So there are a couple of limitations to running Shiny directly on the bare machine, but it is possible.

What Shiny Server sets out to accomplish

So Shiny Server sets out to accomplish a few goals. The ones that are interesting for this discussion are: first, it should start and stop Shiny processes on demand, as needed; second, it should translate non-WebSocket traffic into WebSocket traffic that Shiny can understand; and finally, it can map different URLs to particular applications.

So if you were just to run an army of Shiny processes on your bare machine, all of those would have to run on different ports, and your users would need to enter a port number to get to an application, which would be a little sloppy. Shiny Server offers a cleaner, more convenient way to navigate between the different applications, and we do that by offering a few different hosting modes. You could host just one application, you could allow users to host their own applications, or you could host an entire directory of applications, and you can do each of these at different URL subspaces. So it just allows you to manage many Shiny applications more meaningfully.
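To make the URL mapping idea concrete, here is a minimal Python sketch of longest-prefix routing from a URL path to an application directory. It is an illustration only, not how Shiny Server is implemented or configured; the `ROUTES` table, the directory paths, and the `resolve_app` function are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical routing table: URL prefix -> application directory.
ROUTES: Dict[str, str] = {
    "/sales": "/srv/apps/sales-dashboard",
    "/sales/forecast": "/srv/apps/forecast",
    "/hr": "/srv/apps/hr-report",
}


def resolve_app(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the app directory whose prefix is the longest match for path."""
    best = None
    for prefix in routes:
        # Match whole path segments only, so "/salesforce" does not hit "/sales".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None
```

With a table like this, users reach each application by a readable path on one port instead of remembering a separate port number per application.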

So, to revisit a couple of these. Starting and stopping Shiny processes: again, if you're running this naively, you would need to be running that Shiny process all the time. But because Shiny Server runs as a long-running, always-on daemon, when a user goes to connect to a Shiny process that isn't running, Shiny Server can hold that request for a second while it starts the process. Once the Shiny process is ready, it lets the traffic through. So it can broker and manage the communications for applications that may or may not be running at the moment they're requested.
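The hold-then-forward behavior can be sketched in a few lines of Python. This is a simplified model under stated assumptions: `OnDemandBroker` and `fake_launch` are hypothetical names, and the launcher returns a string rather than spawning a real R process.

```python
from typing import Callable, Dict, List


class OnDemandBroker:
    """Starts a worker for an application the first time it is requested."""

    def __init__(self, launch: Callable[[str], str]):
        self._launch = launch               # hypothetical process launcher
        self._workers: Dict[str, str] = {}  # app name -> running worker

    def route(self, app: str, request: str) -> str:
        worker = self._workers.get(app)
        if worker is None:
            # The request is effectively held here until the worker is ready.
            worker = self._launch(app)
            self._workers[app] = worker
        return f"{request} -> {worker}"


launched: List[str] = []


def fake_launch(app: str) -> str:
    """Stand-in for starting a process; records each launch."""
    launched.append(app)
    return f"worker[{app}]"


broker = OnDemandBroker(fake_launch)
first = broker.route("sales", "GET /")       # triggers a launch
second = broker.route("sales", "GET /plot")  # reuses the running worker
```

In this model the second request reuses the worker started for the first, so `launched` records only one launch for the application.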

And then, if it notices that all the users have left and the application has been idle for some configurable length of time, it can go ahead and reap those processes to free up resources for other applications.
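A rough Python sketch of that idle-reaping rule follows. The class name `IdleReaper`, the `idle_timeout` parameter, and the explicit `now` arguments are assumptions made for illustration; the talk only says the idle period is configurable.

```python
from typing import Dict, List


class IdleReaper:
    """Tracks open connections per app and reports apps idle for too long."""

    def __init__(self, idle_timeout: float):
        self.idle_timeout = idle_timeout         # seconds; configurable
        self.connections: Dict[str, int] = {}    # app -> open connection count
        self.idle_since: Dict[str, float] = {}   # app -> time the last user left

    def connect(self, app: str) -> None:
        self.connections[app] = self.connections.get(app, 0) + 1
        self.idle_since.pop(app, None)           # the app is no longer idle

    def disconnect(self, app: str, now: float) -> None:
        self.connections[app] -= 1
        if self.connections[app] == 0:
            self.idle_since[app] = now           # the idle clock starts here

    def reap(self, now: float) -> List[str]:
        """Return apps idle for at least idle_timeout and forget them."""
        expired = [app for app, since in self.idle_since.items()
                   if now - since >= self.idle_timeout]
        for app in expired:
            del self.idle_since[app]
            del self.connections[app]
        return expired
```

Passing the clock in as `now` keeps the sketch deterministic; a real daemon would read a monotonic clock and stop the process for each reaped application.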

And again, like I mentioned, older versions of Internet Explorer are not going to be able to handle WebSockets natively, and so Shiny Server uses a library called SockJS, which basically allows you to work as if you were working with WebSockets, but it will gracefully fall back to other protocols as needed.

That's actually really important even if you're not planning to support older versions of Internet Explorer. Many networks use older equipment that was designed before WebSockets gained popularity, and that equipment may drop or block WebSocket traffic. We see this quite a bit, especially at universities and in academic settings, which for some reason tend to have network equipment that isn't friendly to WebSockets. So even if you're using all the latest and greatest technology, there may be some router between you and your client that is going to interrupt your WebSocket traffic. It's nice to have a technology that will allow you to fall back to something that is supported if needed, so that ends up being a pretty important feature.
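The fallback idea can be sketched as picking the first transport in a preference list that both the browser supports and the network lets through. This Python sketch is a simplified model, not SockJS's actual negotiation; the transport names in `PREFERRED` are illustrative and the `pick_transport` function is hypothetical.

```python
from typing import Iterable, Optional

# Preference order, best first. The names are illustrative.
PREFERRED = ["websocket", "xhr-streaming", "xhr-polling"]


def pick_transport(browser_supports: Iterable[str],
                   network_blocks: Iterable[str] = ()) -> Optional[str]:
    """Return the most preferred transport that is supported and not blocked."""
    supported = set(browser_supports)
    blocked = set(network_blocks)
    for transport in PREFERRED:
        if transport in supported and transport not in blocked:
            return transport
    return None  # nothing usable; the connection cannot be established
```

The `network_blocks` argument models the router case from the talk: a fully capable browser still ends up on a fallback transport when something in between drops WebSocket traffic.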


So this is, as you probably know, called sticky sessions or sticky load balancing. What that means is that when a user comes into your load balancer, if they've already been directed to a particular Shiny Server Pro instance for some application, they will continue to have all of their traffic, either HTTP or WebSocket, directed back to that same instance. From there, we'll handle the rest of the complexity of making sure that they get back to the right worker.
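A cookie-based sticky balancer can be sketched as follows. This is a minimal Python model, not a description of any particular load balancer product; the `StickyBalancer` class, the cookie name `instance`, and the instance names are hypothetical.

```python
import itertools
from typing import Dict, List, Tuple


class StickyBalancer:
    """Pins each user to one backend instance via a cookie."""

    COOKIE = "instance"  # hypothetical cookie name

    def __init__(self, instances: List[str]):
        self._instances = set(instances)
        self._next = itertools.cycle(instances)  # round-robin for new users

    def route(self, cookies: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
        """Return (chosen instance, cookies to set on the response)."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self._instances:
            return pinned, {}                    # already pinned; nothing to set
        chosen = next(self._next)
        return chosen, {self.COOKIE: chosen}     # pin the user from now on
```

A new user is assigned round-robin and handed a cookie; every later request carrying that cookie, whether HTTP or WebSocket, goes back to the same instance. A cookie naming an unknown instance is treated as a new user.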

The one other piece here is a bit nuanced, and we can talk about the details later. Since we're using encrypted cookies to store the user's authentication information in their browser, you need to be aware that the two Shiny Server Pro instances should probably use the same key to encrypt those cookies. That way both can understand the encrypted cookie stored in the user's browser, regardless of where the user gets balanced in the future.

So, again, we've done this before, and we've seen this model work very well. You do this high availability setup just using sticky sessions: you drop a cookie on the user's browser, either at that subdomain or scoped to a certain path for the application, and then make sure that for the rest of the session, any time they access that application, they are always deterministically routed back to the same Shiny Server Pro instance. And that is it.
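The reason for sharing the key can be shown with a small Python sketch. Note the substitution: this example uses HMAC signing from the standard library as a stand-in for cookie encryption, so it protects integrity only and is not the scheme Shiny Server Pro uses. The `seal` and `unseal` functions and the key values are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def seal(username: str, key: bytes) -> str:
    """Produce a cookie value of the form 'username.tag' (signed, NOT encrypted)."""
    tag = hmac.new(key, username.encode(), hashlib.sha256).hexdigest()
    return f"{username}.{tag}"


def unseal(cookie: str, key: bytes) -> Optional[str]:
    """Return the username if the tag verifies under key, otherwise None."""
    username, _, tag = cookie.rpartition(".")
    expected = hmac.new(key, username.encode(), hashlib.sha256).hexdigest()
    if hmac.compare_digest(tag.encode(), expected.encode()):
        return username
    return None


shared_key = b"same-key-on-both-instances"           # hypothetical value
cookie = seal("alice", shared_key)                   # issued by instance A
seen_by_b = unseal(cookie, shared_key)               # instance B, same key
seen_by_c = unseal(cookie, b"some-other-key")        # instance C, different key
```

An instance holding the same key accepts the cookie issued by its peer, while one with a different key cannot validate it and would have to treat the user as unauthenticated.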
