Resources

The Road to Easier Shiny App Deployments - posit::conf(2023)

Presented by Liam Kalita

We're often helping developers to assess, fix and improve their Shiny apps, and often the first thing we do is see if we can deploy the app. If you can't deploy your Shiny app, it's a waste of time. If you can deploy it successfully, then at the very least it runs, so we've got something to work with.

There are a bunch of reasons why apps fail to deploy. They can be easy to fix, like hardcoded secrets, fonts, or missing libraries. Or they can be intractable and super frustrating to deal with, like manifest mismatches, resource starvation, and missing libraries. At the end of this talk, I want you to know how to identify, investigate and proactively prevent Shiny app deployment failures from happening.

Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.

Talk Track: The future is Shiny. Session Code: TALK-1089

Transcript

This transcript was generated automatically and may contain errors.

Hello everyone, thanks for having me here. Welcome to my talk on making Shiny app deployments a little bit easier. So, imagine you're a passionate data scientist. You've got this great idea for a Shiny app, and you're like, right, okay, let's go ahead and make it. So, you spend a bunch of time creating your app on your development machine, and you might spend some days, weeks, maybe even months in some cases, and you're really looking forward to the end result. And that's a valuable Shiny app. Something that's super useful.

Something that could be maybe used in a business context. Yeah, so you have your valuable app, and it runs super smoothly on your machine locally, and it's a testament to, God, that's an awful picture, all the hard work that you've put in. So now comes the moment when you want to gift your creation to a wider audience, and that means thinking about deployment.

So if you're using Posit Connect, it has a nice easy workflow for deploying and managing apps. It has a great user interface. It has this smooth one-click deploy functionality. The thing is that someone has to maintain this platform. If you don't want to maintain a platform, then shinyapps.io has a very similar workflow. It's maintained by Posit. You don't have to worry about keeping the underlying system updated, but maybe your project has restrictions on data residency. If you're a developer, and you don't care about any of this stuff, you might be in a company where you give your code to another team, and they might look after the deployment, and they might come up with some solution using the open source Shiny Server or ShinyProxy. There are a whole bunch of ways to deploy a Shiny app. Those are just a couple of them. Whatever deployment method you've gone with, on the journey of making this app, there's been a bunch of hurdles that you've had to jump over, and I guess the final hurdle is deployment. So it would be a real shame if the app fails to deploy, because it would be like falling at the final hurdle.

Two personas in the deployment process

So what can we do to help avoid this? Well, who's we in this question? So I like to think that there's two personas involved in the deployment process. Now, these could be separate people, it could be two separate teams, or it could be the same person that does both sides and sees the whole deployment process from start to end. So I like to think that the first persona is a developer, so someone who writes code to create applications, they're responsible for the design, testing, debugging, they work on functionality, user interface and features. And the other persona in the process is IT operations. So these are people who are responsible for setting up, deploying and maintaining applications on servers. Now, yeah, they make sure it's properly configured, hosted and available for the users. So it could be possible that this persona looks after more than just Shiny apps, and they could be maintaining a platform that Shiny apps are deployed to, such as Posit Connect.

So a little bit about me, I'm someone who fits into this IT operations persona, I'm not much of a Shiny developer, as you can probably tell, I don't develop Shiny day to day, but I know enough to make silly cat apps. So yeah, I'm in this IT operations persona because of my role in this company, and that's Jumping Rivers. So I've been with them just about two years, we offer all sorts of services related to the data science field, we offer training in a variety of topics, so primarily for R and Python, but things like Stan, Git, Scala and more. So out of these topics, what I specifically do is probably these three things. So in particular, it's managed Posit services and infrastructure. So that's the service where we aim to take the pain out of creating, managing and maintaining cloud infrastructure and data science tools. And that means that data scientists can get on with the problems that matter most to them. So yeah, I provide support to customers who use those platforms, primarily through email and sometimes video calls. So in this IT ops persona, supporting Connect, I sometimes get queries from clients asking like, hey, can you help me figure out why this app doesn't work? It works on my machine. So yeah, this talk aims to briefly skirt over some of the more common root causes that I've seen in my role.

Reading deployment logs

So how do you know when a car is having problems? If you turn the ignition and it doesn't start, that's probably a pretty good indicator. You might also see some warning lights on your dashboard, that's usually a sign that something is wrong. If you have a bit of know-how, there are a bunch of things you can check yourself under the hood of a car. Similarly, with Shiny app failures, deployment logs give us a peek under the hood of our deployment, which might reveal some information about why it's not working. So you can find deployment logs in different places for each method of deployment. They're like the first port of call when you're troubleshooting your apps. So for RStudio Connect, the deployment logs are nice and easy to find, they're in the top right corner. There's this handy logs tab on shinyapps.io, and sometimes you need access to the server and the logs will be located in something like the /var/log directory if you haven't changed the configuration.

So just like looking under the hood of a car for the first time, looking at logs might be a little intimidating at first. They contain a lot of text, but there are often telltale signs, and if you know the sort of things to look out for, like connection errors or resource bottlenecks or missing libraries, they can all be easy-to-spot log messages.
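As a rough illustration of scanning a log for those telltale signs, here is a minimal Python sketch. The categories come from the talk, but the regular expressions and the sample log lines are assumptions made up for this example; the messages your platform emits may well differ.

```python
import re

# Hypothetical patterns: adjust them to the messages your platform actually emits.
TELLTALE_PATTERNS = {
    "missing library": re.compile(
        r"there is no package called|cannot open shared object file", re.I
    ),
    "connection error": re.compile(
        r"connection refused|could not connect|timed out", re.I
    ),
    "resource bottleneck": re.compile(r"out of memory|cannot allocate", re.I),
}


def scan_log(text):
    """Return (line_number, category, line) for every line matching a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for category, pattern in TELLTALE_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, category, line.strip()))
    return hits


if __name__ == "__main__":
    # Made-up sample log, for illustration only.
    sample = (
        "Starting app\n"
        "Error in library(sf) : there is no package called 'sf'\n"
        "Error: could not connect to server: Connection refused\n"
    )
    for number, category, line in scan_log(sample):
        print(f"line {number} [{category}]: {line}")
```

A scan like this only narrows down where to look; the surrounding lines in the real log usually carry the detail needed to fix the problem.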

Hard-coded secrets

So what are some common pitfalls that I've seen? Well, a common pitfall that I've often seen stems from hard-coded secrets. So they're not great from a security perspective, especially if you're putting code into something like a Git repository. If you imagine something like a database password being put into source code, then you need to remember to remove the secrets and then add them back in. It could be an extra step that's sometimes forgotten. So this is an example database connection with some credentials that might be used for developing locally, and this might be what's required when you actually deploy. So unless you change the code between each environment, you're going to get database connection issues. While this is bad from a security perspective and involves some extra effort, it doesn't necessarily cause deployments to fail, it's just, yeah, when you forget to change them between the two.

So I'm going to go into some niche European driving laws. So in the UK we drive on the left-hand side, and this is how our car headlights are configured. So basically the right-hand side headlight is dipped so that you don't blind everyone in the oncoming lane. Now the rest of Europe drives on the right. I don't know why these pictures are oriented this way. I wish they were just the same way. Slightly annoying, but that's where we are. As you might expect, the lights are the opposite way around in the rest of Europe. So driving a UK car in, say, France without adjusting your headlights is illegal, because you would be blinding everyone. Some fancier cars have the functionality to switch the configuration at the press of a switch, and some other cars don't have that functionality, and that means they can't easily adapt when you, say, go to France or something like that.

So hard-coded variables kind of limit the configurability at deployment time. To get this sort of European, UK car headlight switching capability, you can do this with your code. Replace your variables with calls to environment variables. Now, this code could be committed to a Git repository and you won't be showing any sensitive values, and yeah, you wouldn't have to change the code between your dev and prod environment.

So you might be thinking, where do you actually set the values if not in the code? Well, here's Posit Connect. It's a little bit small, but the section I want to point out is on the right-hand side, and this is this Vars tab, and this is the section for setting environment variables, and here at the bottom, we already have our environment variables set. We've got the names of the environment variables that were in the code. When I click edit, I can only set a new value. It doesn't show the old value. The variables are stored securely from other users of the platform. So this in particular is good for things like personal API keys and the like.
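The switch from hard-coded credentials to environment variables might look something like the sketch below. It is written in Python for illustration (in an R Shiny app the corresponding call is `Sys.getenv()`), and the variable names `DB_HOST`, `DB_PORT`, `DB_USER` and `DB_PASSWORD` and the default port are assumptions for this example, not the ones shown on the slides.

```python
import os


def database_settings(env=os.environ):
    """Build connection settings from environment variables instead of literals.

    The variable names are hypothetical; use whatever names your deployment
    platform is configured with.
    """
    required = ["DB_HOST", "DB_USER", "DB_PASSWORD"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Failing early with a clear message is easier to spot in the logs
        # than a database connection error later on.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "host": env["DB_HOST"],
        "port": int(env.get("DB_PORT", "5432")),  # assumed default port
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

The same code then runs unchanged in dev and prod; only the values set in each environment differ, and nothing sensitive ends up in the Git repository.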

Now, with shinyapps.io and Shiny Server, it's not quite as swish. You need to have this .Renviron file in your app directory to set the environment variables.

Availability of external resources

So this point on environment variables kind of leads me nicely into my next point, which is more the root cause of this actual problem, which is the availability of external resources. So just as cars need fuel to run, your app probably relies on external data. So your app might rely on APIs that allow external systems to communicate and interact with your Shiny app. You might pull data from a database to populate some nice visualizations. Similarly, you might pull resources from some sort of remote file or something like that. If these resources aren't available, then it's kind of like your app's fuel source isn't available. So some common causes for this are things like incorrect credentials, which I've just covered. It could also be things like connectivity and things like firewall rules. For example, with shinyapps.io, in order to access a database that's behind something like a company firewall, you need to whitelist some shinyapps.io addresses in the firewall, and you need to ask a sysadmin or someone who is in charge of that.

Another thing I've come across is permissions of the user in the database. So you want to ensure that your database user that's specified in the Shiny application has the necessary permissions to perform the operations. These might be read-only, and if you try and do a write operation in your Shiny app, then obviously that's not going to work.
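One way to separate connectivity problems (firewalls, wrong host or port) from credential or permission problems is a quick reachability check run from the deployment environment. The Python sketch below is only an assumption about how such a check could look: it tests whether a TCP connection can be opened and nothing more, so a `True` result says nothing about whether the credentials or database permissions are right.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False
```

For example, `can_reach("db.example.internal", 5432)` (a hypothetical host and port) returning `False` from the server but `True` from your laptop points towards a firewall rule rather than the app's code.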

Dependency issues

The next reason apps fail to deploy is probably the most common, and it's issues with dependencies. So libraries, packages, and files that make up your application. Just as there's a bunch of parts that fit together to make up a car, it's crucial to ensure that you have a clear understanding of the dependencies that your app requires, and that they're included in the deployment process. So it's kind of like maintaining a detailed car parts list: if someone wants to build a car from that list and something's missing, then the car's probably not going to work. The rsconnect package and its writeManifest function can be pretty helpful with maintaining things like this, so it's great for R package dependencies and files, but it doesn't help when they rely on things like underlying system dependencies. So many R packages contain code written in C or C++ for performance reasons, so if you use these, you need to have the development tools installed on your deployment environment. Certain packages might need system utilities, such as wget, curl, or git to download data and interact with external resources. If you plan on using R packages for generating PDF documents, you need to have something like a TeX distribution installed.
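For the system-level dependencies that a manifest does not capture, a small check run on the deployment environment can show what is missing before the app does. This Python sketch assumes the utilities are expected to be on the `PATH`; the list of tool names is only an example, with `pdflatex` standing in for "a TeX distribution".

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    # Example list based on the utilities mentioned in the talk.
    absent = missing_tools(["wget", "curl", "git", "pdflatex"])
    if absent:
        print("Missing system tools: " + ", ".join(absent))
    else:
        print("All listed system tools were found.")
```

This only covers command-line utilities; missing shared libraries or compilers usually show up as errors during package installation instead.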

Quarto seems to be all the rage at the moment. I'd love to know how to use it, but my role doesn't really stretch much more than installing it. So yeah, I think I've drilled home that there's this obvious point that if something, a dependency is missing, that's the reason the app doesn't work.

Resourcing and system configuration

So another critical consideration is the resourcing on the environment in which your app's deployed. So just as a Formula 1 car needs a suitable engine to perform well, your Shiny app needs appropriate computing power and memory to operate effectively. So obviously a lack of appropriate resources would lead to a sluggish performance. It's kind of like imagine this car had a small engine, it wouldn't perform as well in a race.

So system resources. Just as you can tune a car's engine to run optimally, as a developer you can tune your app's code to use fewer resources, like CPU cycles and memory. I think Joe covered that pretty well in the previous talk.

Sometimes there are platform configuration settings related to memory. So there's this max app image size config setting in the Connect settings. So if your app uses large assets like images and data files and dependencies, and it exceeds this total size, then obviously it's not going to work.
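A quick way to see whether an app directory is getting close to a size limit is to add up its files before deploying. The Python sketch below is an approximation under stated assumptions: it measures the uncompressed size of the directory on disk, which may not be exactly what the platform measures, and the limit is a value you pass in yourself rather than one read from the platform.

```python
import os


def directory_size_bytes(path):
    """Return the total size in bytes of all regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # skip symlinks to avoid double counting
                total += os.path.getsize(full)
    return total


def exceeds_limit(path, limit_bytes):
    """Return True if the directory is larger than `limit_bytes`."""
    return directory_size_bytes(path) > limit_bytes
```

If the check fails, the usual suspects are large data files or images that could be fetched at runtime rather than shipped with the app.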

So collaborating with IT operations to ensure that your app's deployment environment has the appropriate resources and it's configured correctly is vital for having a successful deployment.

Proactive approaches

So I'm going to quickly touch on some proactive approaches you can take. I don't have the time to really go into them in any depth. So one effective approach is continuous integration and continuous deployment. So that involves automating the process of deployment. So any time you make any code changes, it will, you know, as with any automation, will save time. You don't have to redeploy it every time you make a code change. So yeah, this will help save time.

Another strategy is containerization, and that involves packaging your app and its dependencies into isolated environments known as containers. This approach ensures consistent behavior across different environments from, like, development to production. So, like, you can encapsulate, like, your environment variables and things like that all into the Docker container. Well, I haven't even mentioned Docker yet, but that's a tool that you can use for containerization.

Yeah. Moreover, monitoring and alerting systems play a pivotal role in catching issues before they escalate. So, yeah, just as modern cars come equipped with sensors that, you know, come up with lights on your dashboard, monitoring tools can notify you of things like performance anomalies or resource usage spikes and errors. And, you know, alerts can help you take action if, like, it reaches a certain threshold or something like that.

Recap

Yeah, so just to recap some of the things I've covered in this talk, if you can take away at least one thing that I've mentioned that you haven't heard of before and you can do some Googling around it, then I guess I'd be really happy. Yeah, and you can look out for it in your next Shiny app deployment. So, yeah, as I said, like, logs will give good clues and knowing where to find them is good knowledge to have. I went into hard-coded secrets, probably a bit more in-depth than the others. But most often the different types of dependencies, either external or internal, are the main reason why deployments will fail. So, thanks for listening. If anyone's got any questions, I've got some contact details on the right-hand side if you wanted to email me about anything. And we've got some blogs at the link on the bottom. Yeah, thanks for listening, guys.

Q&A

We do have time for a couple of questions. And your first question is, do you think there is anything that the Shiny library and Posit developers could do differently to make the deployments easier for IT administrators?

I guess I'm a little, yeah, I feel like a bit of a deer in headlights. Can you repeat the question, please?

Sure, so the question is, could the developers that create Shiny, or the Posit developers in general, do something differently, could they help you somehow, you know, to make the deployments easier?

I guess things like the manifest and like the rsconnect package help make deployments a lot easier. Yeah, I guess like guidelines to provide to developers and things like that would be, yeah, beneficial.

Okay, you have a second question, which is, one thing that can be frustrating is differences between dev and prod environments that do not raise errors in the logs. So, do you have any tips for diagnosing these, so you don't have any errors, you know, for the dev environments in the logs, but you do for the production environments?

So well, I guess you have errors in the logs for the prod environment. So I guess, yeah, looking through the logs, I guess scanning them for keywords and maybe some of the sort of bullet points that I've mentioned. Google, Google's a good, yeah, Google's a good place to look.

I guess one strategy is to make the prod system as similar to the dev system, you know, as possible, then you would catch these things earlier.

Yeah, yeah, that's a pretty obvious one, yeah, I guess, I should have said.

So let's thank all the speakers of this session.