
Damian Rodziewicz | Scaling Shiny to Thousands of Users | RStudio
From rstudio::global(2021) Shiny X-Sessions, sponsored by Appsilon: in this talk I will discuss how to scale Shiny dashboards to thousands of users. About Damian Rodziewicz: Damian is one of the four co-founders of Appsilon. Before founding Appsilon he worked at Accenture, UBS, Microsoft and Domino Data Lab. Learn more about the rstudio::global(2021) X-Sessions: https://blog.rstudio.com/2021/01/11/x-sessions-at-rstudio-global/
Transcript
This transcript was generated automatically and may contain errors.
Today, in this presentation, I would like to share with you how to scale Shiny to thousands of users. My name is Damian Rodziewicz, and I'm one of the founders of Appsilon. I'm also a technical person, so I work hands-on on a lot of projects that we have. That's why I have first-hand experience in scaling Shiny applications. Be sure to leave your questions. I'd be happy to discuss them, and you can contact me later if you have any further questions to discuss.
So, let me start with a very simple success story that I keep seeing over and over again, and I'm really excited about this. Thanks to Shiny, you and your team are able to build a successful app very quickly. Sometimes, it is just a matter of days. You start having your first clients, customers, users. Very often, these are internal users in your own company. You're helping them better understand their process and go through it much faster. Suddenly, more people are using your application. Very often, we see that people who start using your Shiny application love it from the first day, because thanks to Shiny, you are able to implement new features very fast.
Shiny is super flexible. You just make the changes on the go. Everyone loves the application, and it does exactly what people want it to do, because this is dedicated software that you built for them. Suddenly, you realize that you have a huge number of users, and it starts to become a problem, because some will report that your application is actually slow, and some will tell you that they are blocked and cannot click around your application. This is usually when you decide to look for a solution. These are usually the questions that we are asked. How do I make my app faster? How do I scale my Shiny application? How do I build a scalable Shiny enterprise application?
Asking the right questions before scaling
One thing that I would like you to think about first is that most of those questions start with the word how. We are very used to trying to find the solution as fast as possible. We want to find good tips and tricks. We want to understand what's going on and just apply new techniques. But there are two other questions that are even more important that you should be asking. The first one is why, and the second one is what.
Let me share with you a quick story. There was a client whose car broke down, and they went to the mechanic. The mechanic looked at the car and realized that the engine was broken. They took a hammer and just smashed the engine. Then, suddenly, the engine started working again, and the client said, okay, that's awesome. How much do I pay for this simple service? The mechanic said, you're going to pay $100. The client wasn't happy about it and asked for an invoice to see why exactly hitting an engine with a hammer costs $100. These are the contents of the invoice: $1 for hitting the engine with the hammer, and $99 because the mechanic understands why the engine is broken, understands what the root cause is, and knows exactly what to do to get rid of this root cause.
What: vertical and horizontal scaling
Now, before I jump into why, which I think is the most fundamental question that you should be asking, let's talk about what, so that you have the whole context of what we usually do when we want to scale a Shiny application. When you think about scaling Shiny, there are two groups of actions that you can take. The first one is called vertical scaling, and this is increasing the number of users that one machine can serve. You can do it in two ways. The first is just adding more resources to your machine: more CPU, more memory. The second way is making your Shiny application leaner, so that more users can use it on the same machine. The second group is horizontal scaling, which is just adding more machines.
To put it in context, the first two of these three options, adding resources and adding machines, are actually fairly simple. I'm very often asked how to scale Shiny horizontally, how to have multiple servers connected. Let me tell you this: it is actually very, very simple. I'm going to share resources at the end of this presentation so that you can take a look at the details yourself. The first simple step that people usually take is to ask DevOps or IT whether they can increase the size of the server. They add memory, they add CPU. This right away lets more users use your application. Of course, you have to pay for that. Very often, it is difficult to find additional budget for a bigger machine, especially if you suddenly have 10 times as many users and need a really, really big machine.
The second step is to talk to DevOps or IT and ask them to add more servers. This is as simple as spinning up additional virtual machines if you are using the cloud, or a little bit more complex if you have physical machines that you have to turn on and configure. The good news is that RStudio Connect is super easy to configure. There are just a few simple steps that you need to go through. Then all of your machines are going to run RStudio Connect, and all of them are going to run your application. The only thing it costs you is real money for the machines.
Making your app leaner: the hard part
Now, the third thing that you can do is the most difficult one, and the one that requires you to understand why the application might be slow. This is for you as an R Shiny engineer: to find the bottlenecks, to understand what slows the application down, and to make the application leaner. There are three main things that you can do. Of course, there are many more techniques that you can apply, but the key parts are these. The first is to leverage the front end, to use JavaScript. As you saw in the previous presentations, it's not that complex, and sometimes you don't even need to understand the JavaScript. If you don't use the server to generate HTML and send it back every time something changes, you already save very valuable processor time.
The second one is to extract computations. This also decreases the CPU usage of your application. If your application is doing some heavy computation, think about a way to extract it somewhere else, to use external services and not put all of the pressure on your Shiny application server. The third one, which I think is the most commonly used, is to just use a database. I have heard a huge number of success stories where someone just decided to move all of their data into a database, and suddenly the application was easily scalable up to hundreds or thousands of users. This gives you two advantages: less memory usage and less CPU usage.
Why: understanding Shiny's architecture
This is more or less what you need to do to scale your Shiny application. Now, I would like you to fully understand why we actually have to do this, because when you think about it, RStudio Connect is a great product that allows you to scale Shiny applications up to tens of thousands of users. You might have seen Sean Lopp's video where he shows how this is possible. Also, Shiny is very fast on your local machine when you just run it. Why should it be slower for others when there are just more people using it?
Let me tell you a story based on a very simple application. If you take a look at the code, this is the simplest one that you can write. There is a slider and there is a text output. The only thing that the server does here is get the input slider value whenever it changes and send the value back as output text. Now, what happens behind the scenes when you deploy such an application to the server? The first thing that happens when the user connects is that the browser downloads the HTML generated by your UI function, the CSS, the JavaScript, and all the static files like images. At the same time, the server creates a separate process for you so that you can start your Shiny session there and execute the R code.
Now, the thing about processes is that R is single-threaded. You can specify how many users you want to have per process, but you need to realize that if two users are assigned to the same process, then while the process is doing something for one user, the other user cannot actually do anything. I will show you this in a moment, but what you should understand here is that the server creates a process, or adds the user to an already existing process, to handle all of the operations on your reactive graph. Now, the second thing that happens is that the browser creates a connection through WebSockets, and through WebSockets, data and information are sent back and forth. WebSockets are slightly different from the typical REST API that you know because they allow bidirectional communication, and that's why the server can actually push messages to the browser, which is not really possible with typical applications that have a REST API.
So, the second step is that the browser binds the Shiny inputs and outputs and starts the WebSocket connection with the server. Now, let's say that you are moving the slider and you set it to five. The browser is going to send this value to the server. The server is going to check this value, trigger the reactive computation, and return the resulting value. So, in our example, it is just going to respond to the browser that the text is five. Now, coming back to Pedro's presentation: if instead of just returning a simple value we returned an updated widget, then the server would have to send the whole HTML, unless you use the update functions that Pedro was talking about. If you use an update function, then the browser is smart. It receives only the data that has changed, and the JavaScript side updates the HTML. So, you already gain a lot just by using the update functions.
Long-running computations and blocking
Now, when it comes to the browser and the server, after all of the computations are done, the browser is just waiting for another signal, either from the user or from the server. That was a very simple example. Now, let's take a look at what happens when you actually start doing some difficult computations. Instead of just returning the value, I'm going to perform a long, complex CPU operation based on the input slider value. The same thing happens: you send the value through the WebSocket to the server, and the process that was started for you is busy with the long-running computation that you asked it to do. And the problem is that your user is no longer able to take any action within the application. Everything is gray. It is waiting for the output. And at the same time, new users or existing users that share this process also cannot interact with the server and have to wait for this computation to finish. And this is one of the main issues that we see in applications.
When there are plenty of complex computations happening for one user, the other users are blocked, or the CPU usage is so high that the other users see things going slower for them. To get rid of this problem, you have multiple options. First, extract some computations to the database. I will show you this in a moment. Second, you can use promises, which is a great package by Joe Cheng that allows you to move those computations to a completely new process and frees Shiny from the computation. You can use shiny.worker, which is our package for similar computations. Or you can simply move some of the computations to JavaScript, to the front end.
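The blocking described here can be illustrated with a small, language-neutral simulation. This is only a sketch in Python, not Shiny code. It assumes a single worker that serves requests strictly one at a time, in arrival order, and the user names and durations are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    user: str
    arrival: float   # time the request reaches the process (seconds)
    duration: float  # CPU time the request needs (seconds)


def simulate_single_process(requests):
    """Serve requests one at a time, in arrival order, like a single-threaded process.

    Returns a dict mapping each user to the time between sending the
    request and receiving the response.
    """
    clock = 0.0
    waits = {}
    for req in sorted(requests, key=lambda r: r.arrival):
        start = max(clock, req.arrival)  # cannot start while the process is busy
        clock = start + req.duration     # the process is occupied until this time
        waits[req.user] = clock - req.arrival
    return waits


# Hypothetical workload: user A triggers a 10-second computation,
# user B moves a slider (a 0.01-second job) one second later.
waits = simulate_single_process([
    Request("A", arrival=0.0, duration=10.0),
    Request("B", arrival=1.0, duration=0.01),
])
for user, wait in waits.items():
    print(f"user {user} waited {wait:.2f}s")
```

In this model, user B's trivial request takes about nine seconds only because it shares the process with user A's long computation.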
Memory and the database solution
Now, let's talk about the database. This is the second biggest problem that I see in applications. You might create a successful application that works fine locally. But in fact, the application, in order to run, loads a lot of data into memory. And we need to realize that servers are similar to our own computers. They have their own RAM. They have their own CPU. And RAM is usually around 16 or 32 gigabytes. It's not a lot. Say that in your application you read one gigabyte of data, and then you filter this data to do some actions in the application. Let's see what happens in real life.
You can see a machine here. You can see five users that are connected to RStudio Connect. In this configuration, we create one process for every two users that access our application. This is easy to configure, but for this purpose, let's assume that we have two users per process. Right now, with five users, you already have three gigabytes of data loaded, because every process is like a separate box that contains everything it needs to run your computations. And when you have, for example, 13 users, suddenly you need seven processes and use seven gigabytes. Let's see what happens if you have 26 users. That already takes up 13 gigabytes. And when you think about it, 26 users is not a lot.
So you should be very aware of the fact that your memory usage is going to multiply with the number of users of your application. And you should try to avoid loading too much data into memory. What you can do instead is set up a separate database. And it doesn't have to be a separate SQL server. It could be files on the machine's drive that are accessed in a different way than just loading all of them into memory. You can search online for Christian Ygras' talk about different possible ways of reading data. And even Uber has a separate package that basically reads a lot of files, terabytes of data that you can search. This is just files. But when you have an external database, that database contains this one gigabyte, and you just execute a filter, a query, and you get the result. And here, you have two gains. The first is, of course, that you don't have to load the data into memory. The second is that such filtering is sometimes even faster, because the database is indexed to make the queries very fast.
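The arithmetic behind these numbers can be written down as a short sketch. It assumes that every process loads its own full copy of the one-gigabyte dataset, that a new process is started only when the existing ones are full, and that the memory used by R itself is ignored. The function names are made up for illustration.

```python
import math


def processes_needed(users, users_per_process=2):
    """Number of processes when each one serves at most `users_per_process` users."""
    return math.ceil(users / users_per_process)


def memory_needed_gb(users, data_gb_per_process=1.0, users_per_process=2):
    """Total memory if every process holds its own copy of the data."""
    return processes_needed(users, users_per_process) * data_gb_per_process


for users in (5, 13, 26):
    print(users, "users ->", processes_needed(users), "processes,",
          memory_needed_gb(users), "GB")
```

Under these assumptions, 5 users need 3 processes, 13 users need 7, and 26 users need 13, so memory grows in step with the number of processes rather than with the size of the data alone.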
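As a rough illustration of querying instead of loading everything, here is a minimal sketch in Python using SQLite. The in-memory database only stands in for an external database, and the table, column, and function names are hypothetical.

```python
import sqlite3

# An in-memory SQLite database stands in for the external database here;
# the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 45.5), ("east", 60.0)],
)
# An index on the filtered column lets the database find matching rows
# without scanning the whole table.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
conn.commit()


def sales_for_region(connection, region):
    """Fetch only the rows for one region instead of loading the full table."""
    cursor = connection.execute(
        "SELECT region, amount FROM sales WHERE region = ? ORDER BY amount",
        (region,),
    )
    return cursor.fetchall()


north = sales_for_region(conn, "north")
print(north)
```

The application process only ever holds the rows returned by the query, so its memory footprint depends on the size of the result, not on the size of the whole dataset.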
Putting it all together
So, just to recap: when you have a successful Shiny application, most likely at the start, because it is a prototype, it is going to look like this. There is a server. There is a UI. There is a lot of communication through WebSockets. And the server is doing a lot of computations by itself. The first thing to do is to try to leverage JavaScript, to move some computations to the browser. The second is to use an external server. You can use a Plumber API to create a separate server that is going to do the computations for you and then just ping the main server when the computations are done. You can use a database, which is the simplest way to get a big gain. And then you can scale horizontally by adding servers with your IT team.
Now, the other way to think about it is this: you don't want your application to be a slow chess player. You want your application to be the Forrest Gump of table tennis. You want it to just take the ball and give it back right away. When the front end asks for something, you respond very quickly: hey, yeah, okay, I'm going to do this. Once the front end knows that something has been triggered, you delegate the job somewhere else. And once the job is done, you let the front end know that something has changed.
This is similar to a comparison that I have in my head. You don't want, for example, your mother to come into your room and tell you, hey, now you have to do your homework, and I'm going to stand here and wait until you're finished. You want her to just say, hey, do the homework, and I will be back, or let me know when you're finished. So, this is the way you want to structure your Shiny application.
Now, to sum it up, I would also like to tell you how to do this. There are plenty of resources that you can reach for. We will share the slides with you, so you can click through them and see the different articles. There is a separate section on leveraging the front end, including a very nice book about JavaScript for R. There is a section on extracting computations: the shiny.worker package, the promises package, and the Plumber API, great resources that you can just jump into and start working with. For using a database, I recommend reading the main article from RStudio that goes through every step you need to use a database in your application. And to scale vertically and horizontally, there is a very nice page about scaling and performance tuning in RStudio Connect that gives you an overview of all of the configuration options, especially how many users you allow per process. And there is a separate doc about high availability: how to scale your application horizontally, how to add additional servers. As I said, this is really simple. Don't worry about it. Just try it. You can even log into AWS or Google Cloud, set up three virtual machines, install RStudio Connect, and see how easy it is to configure. That is all from me. Thank you very much. I hope this is useful, and I'm looking forward to the next talk by Marek and Filip.
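The "acknowledge right away, delegate, notify when done" pattern can be sketched in a few lines. This is an illustrative Python sketch, not Shiny code: a worker thread stands in for "somewhere else" (in practice a separate process or an external API), the `notifications` list stands in for messages pushed to the front end, and all names are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

notifications = []           # stands in for messages pushed back to the front end
release = threading.Event()  # lets the example control when the job finishes


def slow_job(x):
    """Placeholder for a heavy computation (hypothetical)."""
    release.wait(timeout=5)  # simulate long-running work
    return x * x


def handle_request(executor, x):
    """Acknowledge right away and delegate the heavy work to a worker."""
    future = executor.submit(slow_job, x)
    # When the job finishes, record a notification instead of blocking here.
    future.add_done_callback(lambda f: notifications.append(("done", f.result())))
    return "accepted", future


with ThreadPoolExecutor(max_workers=1) as executor:
    ack, future = handle_request(executor, 7)
    still_running = not future.done()  # the handler returned before the job finished
    print(ack, "| job still running:", still_running)
    release.set()  # let the simulated computation finish
# Leaving the `with` block waits for outstanding jobs to complete.
print(notifications)
```

The handler returns its acknowledgement while the job is still running, and the result arrives later through the callback, which is the table tennis behavior described above.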
