Resources

Marcus Adams | Putting a GMP Shiny App into Production | RStudio

Full title: Not The App We Deserve. The App We Need: Putting a GMP Shiny App into Production

In February 2020, the Digital Proactive Process Analytics (DPPA) group within Merck’s manufacturing division officially launched into production a Shiny app that automates the creation of Continuous Process Verification (CPV) reports. That’s right – the almighty, mysterious, coveted production. From a technical perspective, the app is nothing particularly special (other than getting LaTeX successfully installed to support the use of R Markdown). Users enter a few parameters and out pops a PDF with a series of statistical analyses of a product’s quality testing data. The R blogosphere is filled with examples of similar Shiny apps. What mattered was that the app was in production, and furthermore that it was approved for GMP use. This meant these reports could be submitted to the FDA and other regulatory agencies. This meant the data could be used to support product release decisions. This meant Merck’s engineers were about to save thousands of hours per year in compiling data, generating charts, and calculating summary statistics. This was the app manufacturing sites needed.

Most of the work in getting this app into production was not implementing the top-level features. Sorry, no discussion of fancy statistical process control methods here. Instead, this talk will discuss some of the many things the development team (none of whom came from a software development background) needed to learn in order to create a robust, secure, and maintainable production application.

About Marcus: Marcus Adams is an Associate Director, Engineering at the biopharmaceutical company Merck. He earned his BEng and MS in Chemical Engineering from the University of Delaware and Villanova University, respectively. His more than a decade of experience at Merck spans the biopharmaceutical spectrum and includes experience in pre-clinical PK/PD modeling, product commercialization, in-line technology support, procurement, and vaccine distribution technology development. Currently, he works as part of the Digital Proactive Process Analytics team, leveraging Merck’s Big Data Platform in the development of manufacturing information data models, report automation tools, and integrated-systems analysis applications. His professional interests include effective digital visualization, reproducible research/analysis, and convincing his coworkers of the diverse, flourishing world beyond Microsoft Excel.

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Hi, I'm Marcus Adams. I work in Merck's Pharmaceutical Manufacturing Division as part of the Digital Proactive Process Analytics Group. This is our mascot, Rex. See, like many things for our prehistoric friend, we were finding that the data, we could see it, was often just out of reach. And that made us sad. Now we work to get that data into the hands of our scientists and engineers at our manufacturing sites around the world.

I am incredibly honored to be here with you today. I've had the privilege of attending RStudio Conf in the past, and I've always found it to be an incredibly enriching experience. You get to hear about exciting new ways people are using the R language. You get to have great conversations with wonderful members of the R community. And you get to take bets on who will have the longest line at the book signing event. Hint, in 2020, it was not Hadley and Chukow. Unfortunately, I can't help you with any of these things. In fact, I'm about to show you possibly the world's most boring Shiny app. But it is a Shiny app in production.

Here it is. Text input box, paste in our search. We could have uploaded it, but that's too fancy. Load it. Check the entered text. Great. Click a button and wait some more. I could sing for you, but my go-to karaoke song is Rap God, which is six minutes, and we shouldn't wait that long. Great. It's done. Click a button. Download the PDF. Here it is. Here's our report. Here's our table of contents. Quality attributes. Skip to the first one. There's our control chart. Some run rules. Summary statistics. And histograms. Look at the distributions before and after a process change.

And I know. Why am I showing you this? This is R Markdown 101. Creating a PDF report? Are you kidding me? That's not what's important for us. What was important for us was that this is saving tens of thousands of hours each year, the equivalent of millions of dollars in productivity. More important is that engineers and scientists can now use those hours for much higher-value activities instead of just doing rote data pulls and visualization. Most importantly, this is what our sites needed. When we went out to one of our largest manufacturing sites, in Singapore, they told us, we can't even think of supporting an advanced analytics project. We need to get resources off of required activities like CPV. CPV is the continuous process verification report you just saw. And that report is considered GMP. If you're not from pharma, that means it adheres to good manufacturing practices. In other words, this is the kind of data and reporting that we would submit to the FDA and other regulatory agencies around the world. This is the kind of data and reports we would use to make release decisions that can have an impact on millions of patients around the world.

Most importantly, this is what our sites needed. When we went out to one of our largest manufacturing sites in Singapore, they told us, we can't even think of supporting an advanced analytics project. We need to get resources off of required activities like CPV.

What it really takes to reach production

And so when we think about this, we think production. We think production plus a high level of regulation. While the features may be few and it may seem boring, it's sort of like a Transformer, though: there's more than meets the eye. And Fred Brooks, in his iconic The Mythical Man-Month, describes this. He says there's this tar pit, and you start out with a program in the upper left-hand corner. And this is your core features. This is what you crank out in a weekend. But this is not what you deliver to your end users. If you want to make a programming system, integrating that with databases and your authentication server, that's going to take three times as much work. And if you want to make it a programming product, documenting it, testing it, generalizing it, that's also going to take three times as much work. And then if you want to move to a programming systems product, what we might consider a production app, that's going to take nine times the amount of work. So your original program, that one-ninth, that's only such a small part of the end application. The other 89% is what it takes to create a reliable, secure, and maintainable app in a production environment.

Now, if you're not an app developer or you don't come from a computer science background, you might not be familiar with the bulk of what it takes. We certainly weren't. We're just a bunch of chemists and chemical engineers, and we're not the exception. In fact, a Burtch Works study in 2019 found that only 21% of data scientists come from a computer science background. The rest of us come from places like statistics, business, economics, or the natural sciences. Now, in true Pareto fashion, luckily that 21% has done a lot of the heavy lifting for us. Still, to get to a production app, we had to learn a lot of things, and I don't have time to share them all. Instead, I'm just going to share three of them with you today.

Code like a team

First, no excuses: code like a team. Before this, a lot of our scripts and apps we were able to develop by ourselves. They were small enough that we could handle that. But when you multiply your workload by 9x, you're going to need some help. And when you're working with others, it's really good to remember the golden rule. Coding with others is sort of like when people come to visit your house: you've got to clean up and put on real pants.

And first and foremost, you're going to use version control. And you may have heard of Git before. You may have seen it with packages on CRAN, where the source code is. But I think what doesn't get enough attention is the workflow and the branching strategies. You don't want to be working on a single branch, because I guarantee you, if you do, somebody will make a commit that will break the code, and it will be five minutes before you're supposed to give a demo. Now, there are many options, but for us, we went with a modified Git flow. And this means that at the lowest level we have feature branches. And this is where we're doing a lot of the work. We're doing the experimentation. Things can break, and that's fine, because they're isolated. But then we merge them back into our development branch. This is the functioning application. This is our working copy of the app. And as we test and develop that more and it's ready, we then merge that into our master branch. And that master branch is what gets deployed into production.
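As a sketch of that flow, the day-to-day commands look something like this, walked through in a throwaway repository (branch names, commit messages, and the developer identity below are invented for illustration):

```shell
set -e
# Throwaway repo to demonstrate the modified Git flow described above.
repo=$(mktemp -d) && cd "$repo"
git init -q
deploy=$(git symbolic-ref --short HEAD)   # master or main, depending on your git config
git config user.email "dev@example.com" && git config user.name "Dev"
git commit -q --allow-empty -m "initial commit"
git branch develop

# Feature branches: isolated experimentation, safe to break.
git checkout -q -b feature/histograms develop
git commit -q --allow-empty -m "add histogram panel"

# Merge back into develop, the working copy of the app...
git checkout -q develop
git merge -q --no-ff -m "merge feature/histograms" feature/histograms

# ...and once tested and ready, develop merges into the branch that gets deployed.
git checkout -q "$deploy"
git merge -q --no-ff -m "release to production" develop
git log --oneline | head -n 3
```

The `--no-ff` merges keep an explicit merge commit per feature and per release, so the history shows exactly what went into each deployment.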

And to get there, you have to document, test, and absolutely repeat. Roxygen makes it so easy to code and have that documentation right there in the same place, and then you can render it in a much more user-friendly version. Testing is also a form of documentation that gives examples of how to use your code. And of course, both of these together allow people in the future to make changes to your code much more rapidly and know that they aren't breaking core functionality. And you may be familiar with unit testing. These are packages like testthat, RUnit, and tinytest. But there's also even more testing beyond that. There's UI testing. Get users to actually try your application out, see if they can do something useful with it, or automate some of that with shinytest. And then there's the one that probably gets the least attention: load testing. For us, a lot of our scripts and applications were running on local computers. We had one user at most. But when you start to put something in a production environment, you're likely going to have concurrent users, and your application is going to run differently with 10 users than it will with one. And it might run even more differently with 10,000 users. You can absolutely scale up to 10,000 users, but you definitely want to test it before you do.
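To make the document-and-test idea concrete, here is a small sketch: a hypothetical run-rule helper (invented for illustration, not the actual PPXQC code) with roxygen comments next to the code, followed by a check that in a package would live in a testthat or tinytest file; a bare stopifnot shows the same idea without dependencies.

```r
#' Flag points beyond three sigma (a classic control-chart run rule).
#'
#' Roxygen comments like these live right next to the code and render into
#' the package's help pages.
#' @param x numeric vector of measurements
#' @return logical vector, TRUE where |x - mean(x)| > 3 * sd(x)
beyond_3sd <- function(x) {
  abs(x - mean(x)) > 3 * stats::sd(x)
}

# In a package this would be a testthat/tinytest test; stopifnot shows the idea:
x <- c(rep(1, 50), 25)   # fifty stable points and one wild excursion
stopifnot(beyond_3sd(x)[51], !any(beyond_3sd(x)[1:50]))
```

Tests like this are exactly what caught the scales incident described later: the code's expected behavior is written down, so a dependency change that silently alters output fails loudly.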

And to get there, you really want to take advantage of R's natural coding structure. Take advantage of the fact that functions, packages, and Shiny modules naturally lend themselves to the division of labor. One person can focus in on the details of a function, and it's abstracted for everyone else on the team; they just need to know how to call that function. And that's exactly what we did with our application. At the top level, we have the CPV reporting app. And of course, that relies on a bunch of common packages, right? dplyr, purrr, and of course, Shiny. But for us, we also created four custom packages. We have the one that connects to our analytics platform, Mantis DBC. We have CPV Reporter, which actually compiles everything into that PDF. We have Accelerator, which will take the data and output it in a reusable format if people want to do additional calculations. And then we have PPXQC, which does the calculations in the report: it creates the control charts, calculates the run rules, and creates the histograms. And the beauty of this approach is that once you've put all this effort into creating this code, you've industrialized it, you've tested it, you've documented it, you get to reuse it for other applications. As much as we have users who use the reporting app, we also have tons of data scientists and users in our manufacturing division who are now much more easily able to connect to our analytics platform. And I get emails about this probably once a week.
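That layering is what a package's DESCRIPTION file declares. A sketch for a package like PPXQC might look like this (every field below is illustrative, not the team's actual metadata):

```
Package: ppxQC
Title: Control Charts and Run Rules for CPV Reports
Version: 0.1.0
Imports:
    dplyr,
    ggplot2
Suggests:
    testthat
```

Declaring dependencies here, rather than with ad hoc library() calls, is also what makes the version-resolution story in the next section visible and manageable.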

Managing environments

Next, I want to talk about environments. See, US President Woodrow Wilson once said, if you want to make enemies, try to change something. Data scientist Marcus Adams says, if you want to make errors, try to change environments. Because when you talk about a production environment, that implies that there is also a non-production environment. And you may have more or fewer than what's shown here. You have dev, test, prod; you may also have a QC environment. But the important thing is that you want to separate where you're making changes from where people are actually using your code. You don't want to be testing things out, tweaking things here and there all the time. When you're in a production environment, you have to have a certain level of stability, right? And as you move from development to test to production, you want things to be more controlled, more tested, more stable. People have to be able to rely on this app to be there and work in a consistent manner.

And I'll say something that's a little more contentious. There are no production apps. There are only apps in a production environment. And for that production environment, you have to define its requirements. And those requirements come from how you're going to use your application. Now, maybe you have loose requirements and low expectations, kind of like going to a karaoke show. But for us, using it in a GMP manner, we have to have things like audit trails, timeouts, and change control. My point is, though, you want to get away from this abstract, mythical land of production. That will lead you to arguments about whether R is ready for production. Instead, move to the very specifics, where you can start to articulate exactly how you're going to satisfy these requirements. And you can put your focus on the industrialization and generalization of your code.

There are no production apps. There are only apps in a production environment.

The seven circles of R reproducibility hell

And more importantly, you can start to tackle the seven circles of R reproducibility hell. I know what you're asking. You're thinking, why does Beelzebub have an R tattoo across his chest? I don't know. I would have assumed the Dark Lord would be more closely associated with Python, but maybe he lost a bet to the archangel Gabriel. I don't know. But here we are. And as Dante begins his descent, he first comes across the code version. That is the code you write, and it's maintained by Git. Or, if not by Git, this is your final version, or version_final_final_final_2.0. And that top-level code itself will rely on certain versions of packages. These are your top-level packages. Is it dplyr 1.0 or dplyr 0.8? And each of these packages, in turn, has dependencies, right? And these are all the names that you don't really ever pay attention to as they scroll by when you install a package: BH, farver, scales. But each of them also has a version it requires. And those all run on top of an R language version. Are you using R 3.5 or R 4.0? There are breaking changes between those. And moving down, there's the runtime environment. And I think this is the most underappreciated one. Certainly you want to manage environment variables, things like proxy settings. But you also have to think that there are differences between running in interactive mode and running in batch, when you don't have access to the console. It makes a difference whether you're running your application out of RStudio or running it on Shiny Server. And then from there, you also have external dependencies. These are your system libraries. Think about trying to install rJava: you have to have the JDK installed as well. Or if you've ever had the pleasure of trying to install LaTeX to make sure you can generate those PDF reports. And then the innermost circle is the operating system. Code runs differently between Windows and Mac, and between Mac and Linux, and Linux and Windows. That's why some packages don't work on all operating systems. A simple example would be that they use different path separators, backslash or forward slash. But then there are also more subtle things, like how they render visualizations.

And I can tell you that we faced every one of these circles on our journey. But I'll give you one example now. See, here are some wonderful control charts. On the left, we have the version on the development server. And on the right, we have the version on our test server. Identical code, but no errors, no warnings, and, most importantly, no resplendent dark green line that shows the mean of the data. Now, I want to point out that this is a "dark green" line, with a space. And that's important, because officially there shouldn't be a space there. However, we were using this code for years before we even deployed it to the development server. And it worked. So you've got to be asking yourself, how does a single space pose such an existential threat to our application?

Well, like most things in life, timing is critical. You see, when you specify version requirements for your packages, everything downstream of that sits in a kind of quantum superposition: the actual versions that will get installed aren't resolved until you actually install those packages. Our PPXQC package relied on ggplot2, of course, which in turn required scales. Now, before November 19, 2019, that resolved to scales 1.0, which could handle our "dark green" with the space. After November 19, 2019, it resolved to scales 1.1, which changed how that string was translated into a numeric value to represent the color, and inadvertently broke backwards compatibility. We just so happened to deploy to test in late November 2019. Now, we caught that because we ran tests, like we should. But I'll tell you, Packrat wasn't sufficient to catch this. We tried. What we really needed here were CRAN snapshots, either from Microsoft's CRAN snapshots, RStudio Package Manager, or even internal CRAN snapshots. And as the cherry on top of this, and a testament to really great package maintainers, this problem only existed for about a month. By mid-January, it was solved. And so the moral of the story is: don't deploy early. Let the package maintainers fix everything first.
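As an illustrative sketch of the snapshot approach (the URL and date below are examples in the style of Microsoft's CRAN snapshots, not the team's actual settings), pointing R at a dated repository collapses that superposition, so every machine resolves the same versions of transitive dependencies like scales:

```r
# Point R at a dated CRAN snapshot so that every environment resolves
# identical package versions -- including transitive dependencies.
# URL and date are illustrative (a pre-scales-1.1 snapshot).
options(repos = c(CRAN = "https://cran.microsoft.com/snapshot/2019-11-01"))
getOption("repos")[["CRAN"]]
# install.packages("ggplot2")  # would now pull scales as of 2019-11-01
```

An internal CRAN-like repository or RStudio Package Manager achieves the same thing with a different URL; the key point is that the date, not install timing, determines the versions.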

Running without direct intervention

Now, lastly, I want to say: if you can't run your app without direct intervention, it was never an app to start with. Your app will suffer the slings and arrows of outrageous fortune. And by that, I mean crazy input you never expected users to enter. You know, like 12 digits of precision. Your code has to be robust enough to survive the thousand natural shocks the app is heir to. And it has to do it without your direct intervention. But just because you can't directly intervene doesn't mean the app can't collect information for you. First and foremost, log. There are plenty of packages out there that help you with this: futile.logger, log4r, logging. It's one thing to log the errors and the crashes and the stack trace when that happens, but you also want context. Your app is kind of like a kid, too. You don't want your teachers calling you only when something bad happens. You want to know the good things that happen. You want to know how people are using it. What are they submitting? What are the queries they're looking for?
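A dependency-free sketch of that idea follows; the log_event helper and its messages are invented for illustration (futile.logger, log4r, and logging do all of this far better), but it shows logging context and successes, not just crashes:

```r
# Minimal hand-rolled logger: timestamp, level, message, appended to a file.
log_event <- function(level, msg, file = "app.log") {
  line <- sprintf("%s [%s] %s", format(Sys.time(), "%Y-%m-%d %H:%M:%S"), level, msg)
  cat(paste0(line, "\n"), file = file, append = TRUE)
  invisible(line)
}

# Log the good things too: what users submit, which queries they run...
log_event("INFO", "report requested: product=ABC123, years=2018-2019")

# ...and capture errors with their message instead of dying silently.
tryCatch(stop("database timeout"),
         error = function(e) log_event("ERROR", conditionMessage(e)))
```

In a Shiny app, calls like these would sit at the top of each observer or download handler, so the log tells the story of a session even when no one was watching.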

And you know what? It really is kind of like your child, because you put a lot of work into it. And so, like a good child, also make it call home. Use things like Google Analytics to let you know who is using this app and where. Google Analytics tags are fairly simple: you copy some JavaScript into your UI code, and then you get to see all these analytics about how your code is being used. You can see peak volumes. You can see what time of year the application is being used. Maybe you want to scale up your production servers. And lastly, you may actually want to have your application adapt to its environment. There are some great packages out there that allow you to do just this: the config and secrets packages. As a simple example, for us, we needed to show a banner in our dev and test environments that said, this is not for official use. But more complex for us was password management between the environments. You see, each of our environments had a service account, and that service account had a password that connected to a different database for each environment. Our dev environment should not have access to our production database, and our production environment should not have access to our dev database. Furthermore, it was a security requirement that we could not store these passwords in clear text.

And I'll say this: credentials and secrets management is such a deep topic. I am by no means an expert on it, so I'll only share what we did, because that solution was inspired by a hallway conversation I had with the R podcaster himself, Eric Nantz, at R/Pharma 2019. When our client, the UI, contacts our server, we look at the domain, the URL requested, specifically the subdomain. And this is great, because we actually have multiple production servers sitting behind an application load balancer. And so by looking at that subdomain, is it shiny-dev, shiny-test, or just shiny for the production environment, we can then use the config package to grab the correct parameters from a YAML file. Specifically for us, that is the ID of the service account, the database to which we should connect, and the name of the secret in our vault that holds the password. See, we've encrypted those passwords in the vault, and with the servers' individual private SSH keys, they can decrypt that password and use it to go grab the data from the correct database. And this is really useful, because we can encrypt the password and make a commit to our Git repository. Only a few developers need to know those passwords, and only those servers, when we deploy them, should be able to access those passwords. To rotate the passwords, we just make a pull request with the updated password, and that can be deployed out to the proper server.
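The subdomain lookup can be sketched in a few lines of base R. The host names and environment labels below are illustrative; in a real Shiny server function the host would come from session$clientData$url_hostname, and the resulting label would key into config::get() to pick the service account, database, and secret name from the YAML file:

```r
# Map the requested subdomain to an environment name (names are illustrative).
env_from_host <- function(host) {
  subdomain <- strsplit(host, ".", fixed = TRUE)[[1]][1]
  switch(subdomain,
         "shiny-dev"  = "development",
         "shiny-test" = "test",
         "shiny"      = "production",
         "unknown")   # fall-through default for unexpected hosts
}

env_from_host("shiny-test.example.com")
```

Deriving the environment from the requested host means the same deployed artifact behaves correctly on every server behind the load balancer, with no per-server code changes.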

Closing thoughts

And I get it: password management is not exciting. But deploying is. It is totally worth it. And I admit, my head jerks every time I hear the word production in a presentation title. But getting there is not all natural language processing and random forests. Hopefully you realize by now that most of getting there is not. What I've talked about today only scratches the surface. Just know, though, that you can use R and Shiny in a production environment. We are not even the first to do so. Just last year, Heather and Jacqueline Nolis at RStudio Conf talked about how T-Mobile does it. We are just yet another example of how you can do it. And if you need to, name-drop. We are a Fortune 500 company, and we are using it for a very important reporting process.

Look, putting a Shiny app into production is kind of like playing the old computer game The Oregon Trail. You're not guaranteed to succeed. It might not be pretty, and it might be hard, but it can be done. Others have done it before you. You just need to figure out how to do it. And of course, avoid dysentery. But that's good advice even beyond Shiny applications. And like those pioneers crossing the Rocky Mountains, we too have crossed over, figured out how, and reached the top of the mountain of production. And as we look out over this vista, I'll leave you with these words. You can't always code what you want, but if you build sometimes, you might just find you code what you need.

You can't always code what you want, but if you build sometimes, you might just find you code what you need.

Thank you for your time. I hope you got something you needed from this talk, and I hope you enjoy the rest of the conference. Thank you.