
Aymen Waqar | Building an iPad dashboard using plumber & RStudio Connect in Pharma | RStudio (2020)
As companies become aware of the need to embrace data-driven solutions, R has gained huge momentum in recent years. Getting insights to users has become a very important part of a data scientist's work. As our world has advanced, there is a need to build not only web applications, but also mobile applications that are available offline. We would like to share with you how, within months, we went from nothing to a production-ready application in healthcare that handles 500 concurrent users. There are plenty of challenges to solve, including restricted environments, internal processes, and user availability. We will show you how to overcome them and iterate fast, navigating complex infrastructure and integrating with a proxy architecture to serve applications to end users in a compliant manner. With RStudio Connect and plumber you can deploy a scalable REST API that feeds insights to your users. This allows you to go one step further and implement native applications for tablets and smartphones. With the right tools, mindset, and priorities, you can achieve personal success by introducing a digital transformation within your organization, starting with something as small as converting a business-critical Excel file that is slow and difficult to edit and maintain into a robust application. Step by step, your organization will evolve and become empowered by your insights, uncovering even more untapped potential.
Transcript
This transcript was generated automatically and may contain errors.
Good afternoon, everyone. Thanks for joining. Today, Damian and I are going to talk about building native solutions using technologies like RStudio and plumber, in the context of pharma and what that entails. Really quickly, my name is Aymen Waqar. I work at Astellas Pharma, and my colleague Damian is VP at Appsilon Data Science. Together, we're going to share as much as we can about the work we have done at Astellas. First, I just want to get through the legal disclaimer that we're supposed to share with you.
The pharma business context
So before I dive deep into what native solutions look like and some of the technology behind them, it is important to understand the landscape that we operate in within pharma and what that means. So I really wanted to share some of the business context around the drivers of the industry. If you look at the pharma industry, there are some key drivers. Some are legal, innovation, and R&D; some are political and social. But when you dive from the pharma industry down into one particular company and one particular brand, you also see quite a few drivers, such as cost, prevalence of disease, and supply-side factors. But one of the key ones that we want to focus on is the customer. And the customers in pharma are usually three: the physician, the payer, and the patient.
So I work on the commercial side. While there are many factors that influence these three customers, what we want to focus on is the physician, which is the health care provider. There are multiple factors that influence a health care provider's decision to prescribe: email campaigns, direct-to-consumer campaigns, social programs. But one of the most effective channels is always a sales representative. Just to give you an understanding, a sales representative in this case is a person who physically visits a physician's office and details that physician, sharing information about the medicine, the drug, and some of its advantages, and how it can add value, prolong life, and save patients' lives.
So in this slide, I want to share the overall landscape of drivers at each level. The biggest one is obviously the industry, but we're trying to be more granular and look at just the sales rep. What are the different drivers that ultimately lead sales reps to make the decisions that they make when they're actually on the ground, in the field? I also want to highlight that within commercial, the data science and machine learning departments are working at probably some of the most granular levels, doing forecasting, budget planning, and strategic recommendation, trying to optimize some of these drivers at the most granular level so that we can give the business the most actionable insights and they can then make the best decisions.
But what I want to highlight is that as large as you might think your department is within a pharma company, you're still making a very small contribution to the total number of pieces of information that the business uses to make a strategic decision. Ideally, we would want to move the company towards being a more data-driven or analytically driven company when it makes these strategic decisions, but we're not there yet. So it's important to know that even if you have a very large team, you're making a very small contribution to the pieces of information they're going to use.
Choosing the right method of delivery
Therefore, it is very important for your team to get the method of delivery right. What I mean by that is, to use an analogy: if you're a supplier of a product, you spend a lot of time on processing and packaging, creating the right optimized package, and then you choose a method of delivery, in this case a truck, to deliver that package to your end customer. When you choose the right method of delivery, your product arrives on time and meets expectations, you get positive feedback and trust in the supplier, and that results in positive growth. However, if you choose the wrong method of delivery, your product is either delayed or doesn't meet expectations, there's a lack of trust, and ultimately a lack of adoption of what you're trying to sell.
The same thing applies to your data science group, right? So within our team, we're making a lot of investments in hiring and expanding our analytical and predictive analytics team, focusing a lot on data and ETL processes, automating that end-to-end data science flow, and packaging all these insights into visualizations and products, which could be a PowerPoint slide or an Excel chart, and then sharing them with our stakeholders so that they can absorb these insights and make actionable decisions. When the right method of delivery is chosen, you see that your insights are delivered at the right time, they're actionable, and there's high adoption. There's also trust and credibility built between your group and your stakeholders within the company.
But when you don't think about the method of delivery, you could have a lot of investment made into figuring out what the best model is, and you spend a lot of time iterating over that, but you didn't think about how to deliver those insights to a non-technical person. In this case, it could be a sales representative who's supposed to make decisions based on your analytically driven insights, but the insights are all lost in some sort of report that they can't make use of or act on. In those cases, you'll see that there's a huge adoption issue. And when that happens, it kind of calls into question the existence of your department to begin with.
So it's actually better to have a mediocre data science solution with the right method of delivery than a more complex one where investments are made in every part of the process but you miss on how you plan to deliver the insights. That can lead to lack of adoption and a lack of trust in whether these analytics even mean anything for your end customer.
So if we go back and deep dive into the sales rep, and we think about how to optimize the method of delivery so that we can activate some of the analytics and insights that data science teams produce, one approach is a data science team collectively working together, sending insights to your customer. And when it reaches your end user, in this case the rep, whose behavior you're trying to change, you want to take an omni-channel approach where data is shared and connected across all of their devices. It's real-time and synced, so that you can send the predictive insights at the right time and the right frequency. It is shared in a small-dosage fashion, because you're trying to change their behavior to take your recommendation over their gut instinct.
So you have really designed, you know, a set of tools here to really activate your insights to actually change the decision for a sales rep. And when that happens, some of the outcomes here we would notice is you have higher engagement. You have also, you have kind of like reinforced the strategies that are coming from headquarters and finally delivering to the end user that's on the ground. And we also see this interaction effect that happens between one of the drivers, which is the sales rep. But now because you have given them all of these tools that are interconnected, they're also able to have a way to interact with other channels.
So, for example, if you're a sales rep, you go to a physician's office and you have a conversation. You can also absorb some of their concerns the physicians are having about patient, you know, adherence to the medicine or some of the problems their own patients are facing, and then you can walk out of that office and quickly fire off an email campaign that gives the physician the right content that they were concerned about that's also compliant and approved. So you have this multi-channel interaction that's happening because of your approach and how you're delivering some of these insights.
So I think the key thing here is the business context around pushing the right information to the driver, in this case, which is the sales rep, at the right time to influence their decision through small nudges. So, you know, based on where they are, what they're doing, who they're seeing, you can actually nudge them through a phone or through your iWatch and kind of tell them that here's the next best action or here's the next best physician to call and why, and provide them that context. So you've connected your data science team in-house, internal, all the way down to the field that's out there. So I think for us, you know, when we were doing this kind of exercise, you know, a native solution was the right method of delivering this in this situation to really activate the insight. So I'm going to pass it over to Damian to kind of cover some of the technical aspects.
From Shiny POC to native app
Thank you. So usually the best tools actually start with a very simple POC, and R and Shiny are just the best tools to start quickly, start early, and iterate fast. I would like to share with you a development approach that can lead you to having an offline application that is fully native and uses the R technology that you're already using to build your models. Usually when you start with a simple R Shiny POC, you can iterate very quickly because of Shiny's capabilities. Then at some point, with the feedback of your users, you can create an application that is ready to go to production. And when you deploy the application and scale it well, this is the right moment to think about extracting an API. The great part about it is that this API can be used not only in the native application, but actually anywhere you want. And this builds a whole ecosystem of applications that you can build around your data.
I'm going to talk about the native iPad application that is just one part of the whole ecosystem. And the great thing about it is that actually you can reuse the backend, the JavaScript, and the stylesheets that you created in the first iterations. I would like to focus on three elements that you need to actually build a successful application. First one, I will talk about scaling. Then I will talk about plumber API. And at the end, we will talk about the native application.
Scaling with RStudio Connect
So let's start with scaling. In order to scale your Shiny application and be successful in production, you need to follow very simple rules to have a good application architecture. First of all, try to extract the computations from the server. As you know, R is single-threaded, and you need to make sure that you don't put too much load on the server while it is running. So you can use the database, you can run the ETLs, you can extract the processes away from the app and just trigger those processes and go back to the main thread. Also, you should make the Shiny layer thin. Shiny is a great middleware that connects the frontend with the backend, and this is great for communication. Then you should leverage the frontend. JavaScript is a perfect tool: it allows you to create great interactions, and sometimes you don't even need to communicate with the backend to do some amazing actions on the frontend.
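The first rule above, extracting heavy computation from the main process, can be sketched with the future and promises packages. This is a minimal sketch under the assumption that you use those two packages; the talk does not name a specific async library, and the `go` input is a hypothetical button.

```r
# A minimal sketch of moving a heavy computation off the single-threaded
# main R process so the Shiny session stays responsive.
# Assumes the 'shiny', 'future', and 'promises' packages.
library(shiny)
library(future)
library(promises)

plan(multisession)  # run futures in separate background R sessions

server <- function(input, output, session) {
  output$result <- renderText({
    input$go  # re-run when the (hypothetical) 'go' button is pressed
    # Run the expensive work in a background session and return a
    # promise; Shiny resolves it without blocking other users.
    future_promise({
      Sys.sleep(5)             # stand-in for model scoring / ETL
      "computation finished"
    })
  })
}
```

The main session only triggers the work and goes back to serving events, which is exactly the "trigger those processes and go back to the main thread" advice above.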
So when you have the right architecture, it's time to deploy your application to RStudio Connect. RStudio Connect is great not only because you get authentication, but because it automatically scales all the processes behind the app. So it is capable of serving as many users as possible on one single machine. But of course, sometimes it is not enough to have just one machine serving all your users, especially if you have an application that has 1,000 users. And you saw last year that Sean was talking about scaling even to 10,000 users. That is not possible on a single small machine.
So in order to do that, I suggest preparing very well first. We often use Ansible, which allows you to automate the process of installing all the software and deploying it on RStudio Connect. So when you have bare metal, or you just spin up an instance in the cloud, you can run one single command to provision everything, install all of the requirements, and it just works. When you have that, you can create an architecture that will allow you to serve a lot of users. You can scale as much as you want; you just add additional servers. The only thing you need to remember is to have a load balancer with sticky sessions in front of them.
Now, the thing that excites me the most when it comes to RStudio Connect is the fact that not only can you deploy a Shiny application here, but actually you are able to deploy the plumber API. So once you build the whole infrastructure and you serve a Shiny application within your team or outside, you are now able to just add simple plumber API endpoints, and they are going to just work on the same infrastructure just next to your Shiny application.
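Publishing a plumber API next to an existing Shiny app can be sketched with the rsconnect package. This is a sketch under the assumption that a Connect server has already been registered with rsconnect; the server nickname and directory below are hypothetical, not from the talk.

```r
# A minimal sketch of deploying a plumber API to RStudio Connect with
# the 'rsconnect' package. The "my-connect" server nickname and the
# "api/" directory (containing a plumber.R file) are hypothetical.
library(rsconnect)

deployAPI(
  api    = "api/",        # directory that contains plumber.R
  server = "my-connect"   # server registered via addConnectServer()
)
```

Once deployed, the endpoints live on the same infrastructure, behind the same authentication and load balancing, as the Shiny application.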
plumber API
So let's look at the plumber API. Probably most of you already know plumber, so I'm just going to briefly tell you how it works. plumber is a package from RStudio, and it allows you to simply turn a function into an HTTP API. So when you look at the example here, you have a function that takes arguments a and b and returns the sum of the two. By adding simple annotations in the comments, you are able to create an endpoint that, when the whole project is deployed, you can just call, for example with curl, passing the data as a POST to get the result.
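The slide example described here can be written out as follows. This is a minimal sketch; the file name plumber.R, the /sum path, and the port are conventional choices, not specified in the talk.

```r
# plumber.R -- plumber turns an ordinary R function into an HTTP API
# endpoint via the special #* annotation comments above the function.

#* Return the sum of two numbers
#* @param a The first number
#* @param b The second number
#* @post /sum
function(a, b) {
  as.numeric(a) + as.numeric(b)
}
```

You would serve it locally with `plumber::plumb("plumber.R")$run(port = 8000)` and then, as mentioned in the talk, call it with curl: `curl --data "a=4&b=3" http://localhost:8000/sum`.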
Building the native iPad application
So now that we have the whole infrastructure and the plumber API endpoints that serve the data, we come to probably the hardest part: the native application. The problem is that we are not able to build a whole native application with Shiny. But as advanced Shiny developers, you have probably already worked with JavaScript as well. So the good news is that you can leverage your JavaScript skills, or the skills within your team, to create a native application using the React Native library. React Native is a library created and maintained by Facebook, and it allows you to create fully native applications using just JavaScript.
So let me quickly tell you how React works by itself. React creates an HTML structure within the web browser, and it waits for any events in the JavaScript event loop. Let's say the action is that we want to change the word "world" to "React". What React does behind the scenes is create the new version of the HTML structure that we want to apply, check what the difference is between the two, and apply only the diff to the result. So you do not re-render everything; you just change one simple component within your application.
Now, what is great is that when you learn React, you will be able to use React Native right away, because it works exactly the same way. It has a JavaScript event loop running on the iPad, on the iPhone, on any Android device, and it does exactly the same thing: it checks what has changed and applies the change, but to a real native component.
So, to show you the final architecture you would have for such an offline solution: on the left, you have the scaled architecture, with the plumber API deployed on RStudio Connect alongside the Shiny application you have been using; on the right, you have your iPads running React Native, showing the fully native application that syncs data with your plumber API endpoints. Now, what is great about this is that when the device goes offline, it still works, because there is a local store that you can save to, and it is up to you to write the code to make it sync well with the endpoints. So whenever you have people who are traveling, for example, and don't have much Internet, they are not going to see a lot of interruptions, because they can just go offline and use the app whenever they want.
So that's the technological part of building such a solution, but it is not the most difficult part. Making it happen in the business is actually much harder, as probably all of you know. So I'm going to pass it on to Aymen to talk about the rest.
Making it happen in the business
Thanks, Damian. Yeah, so just the last two slides, I know we're almost out of time, but one of the things I really wanted to share, that you can't really Google or find in a tutorial, is how you implement such a large-scale application that is used by hundreds and hundreds of users on a day-to-day basis. I've put it out there on the slide, but I think it is important to make sure that you establish some sort of steering committee and get senior leadership bought into your idea. In my experience, one of the best ways to do that was always to create a prototype by yourself in Shiny: a working application that they can feel and touch, to sell your vision and communicate the problem that you're trying to solve for them. So you can review all of this later, but I will still end with the same thing: you've got to choose the right method of delivery so that you can really activate the analytics and insights that your data science teams are investing so much time in. Thank you.
