Resources

Posit Meetup | Shatrunjai Singh, Aetna | R In Insurance

Operationalizing algorithms using Shiny and Flask

Presentation by Shatrunjai Singh

Abstract: An algorithm is only as valuable as its adoption. Speed to value, repeatability, and low-cost solutions can dramatically reduce software and services budgets and free up valuable dollars for other activities. Open-source tools such as Shiny (R) and Flask (Python) have made the creation and deployment of data science-based web applications convenient and manageable. In the healthcare data science world, we routinely wrap sophisticated statistical code into such web-based point-and-click solutions. In this talk, you will learn about real-life examples of how one can rapidly operationalize intricate algorithms using web app frameworks.

Bio: Shatrunjai 'Jai' Singh is a Lead Data Scientist at Aetna, a CVS Health company, and specializes in data mining, predictive modeling, and data visualization. His work has received several awards, including from the American Heart Association and the Epilepsy Foundation. He won the Tableau Chart-Champion in 2016 and was included in the '40 under 40' for innovation by LIMRA International.

Q&A here: https://community.rstudio.com/t/meetup-recording-operationalizing-algorithms-using-shiny-and-flask/102463

Jun 28, 2021
55 min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Thank you very much, Jai, for your time today. I really appreciate it. And I'll turn it over to you.

Thank you, Rachel. Let me quickly share my screen.

All right, perfect. So my name is Jai Singh. I work at CVS Aetna. Aetna is a big health insurance company here in the US, and we were acquired by CVS, which is a big pharmacy company, also in the US. I work in data science. I've been in insurance for the last six years, and I've been in data science for more than a decade, based out of Boston. Today I will be talking to you about how we have been using Shiny to build these rapid prototyping frameworks, which help us take data science and operationalize it. And I'll be showing you some of the apps that we have been using in our day-to-day work.

But before I begin, as always, a quick advertisement: if you are looking for a new job, or if you are interested in opportunities in data science, please do consider us. We are a Fortune 5 company, one of the five biggest companies here in the US, with great benefits and work-life balance. The other thing is that our executives have a very strong data focus; they love making decisions based on data, so there's a lot of visibility for data science. And we are hiring very, very rapidly all across the US, for all different roles, from analysts all the way up to senior directors. If you're interested, you can always Google CVS data science and look at the opportunities, but a better way is to just email me. I can shortcut you through most of the process and get your foot in the door pretty quickly with interviews.

Rapid prototyping with web apps

Alright, so rapid prototyping is essentially taking whatever analysis you're doing, and if you feel like you will do it more than twice, then you build a web app out of it. This tends to help you in a lot of different ways. One of the biggest things it helps with is that now you can share the minimum viable product with a bunch of different people and they can iterate through it, and that will help you get the most efficient and optimized analysis. It also makes it very repeatable. You can make a GitHub page, but if you make a Shiny app, it's easier for people who are not R, Python, or SAS users to play with your analysis and quickly get to the endpoint faster.

So that's why you see rapid prototyping as a very common theme across different industry paradigms. People in statistics will be familiar with CRISP-DM, the cross-industry standard process for data mining, where you start with the data, you perform data pre-processing, you build a model, you evaluate how well the model is working, and you deploy it. Then usually you go back to the business and they tell you everything you've done is completely wrong, and you start from scratch. So CRISP-DM, design thinking, agile: you'll see they're all very circular. Essentially, what they're trying to do is get to the minimum viable product faster, and then test and learn.

And to do this, you can use all these different frameworks. In data science, you can build web applications. If you look at the history of web applications, it started with HTML in the 1990s; probably some lonely grad student in some computer science department made this up (probably not). Over the years it has evolved, and close to 2000 we got these web applications made in Java. In 2014, we got R Shiny, which changed a lot of how web apps are made. It was developed by Joe Cheng. Essentially, it removes the need for expertise in HTML, CSS, and JavaScript, and you can build applications which are data science heavy and give them an interface which is very, very pretty: a GUI which looks good and is easy to navigate.

Now, over the past few years, you have a bunch of different web app frameworks like Flask, Dash, and Streamlit, but I still prefer Shiny, and I'll tell you why. There are trade-offs, so you can consider building your web applications in different frameworks. You can use Flask, which is Python based, and you can mix Flask with JavaScript to make a web application. This is used throughout a lot of different companies; a lot of what you see on websites is built this way, often with Django. Dash is pretty popular. Tableau is more for dashboarding.

Then you can compare all of these different frameworks on all of these different measures. The three that I find most important are: first, Stack Overflow support. If you are a data scientist, you know how important it is to have support and be able to find help online, and I find that Shiny tends to have a lot of help online just because it was first to market; a lot of people have failed doing the same things that you are trying to do, and that tends to help. Second, it is popular with data scientists, just because R is one of the two biggest languages for data science; you will speak the same language, and somebody who comes after you will be able to maintain the apps that you have created. And finally, it's free of cost, which is pretty important. You don't want to pay $3,000 for a software package which, in the end, if it doesn't add to the free tools that are out there, has essentially cost the company 3,000 bucks.

Now, if you're not familiar with when to use web apps, which I highly doubt if you're in this meetup: you would use web apps when you want to revisit the same code for different projects. If you're trying to do something similar across different projects, you can make a web app out of it. Also if you want to introduce a new technique; I've seen this work pretty well. One of the apps that I'll show you today is the comorbidity analysis, a technique I had published a paper on in academia but that wasn't used in industry too often, and making a web app tends to help with that. It leads to faster adoption of whatever you're trying to sell.

And finally, the most critical one: if there is an analysis that you do that has multiple steps, and that analysis is pretty common, so multiple groups within your organization are using a similar framework and then getting different answers, or different variations of the same answer, and presenting them to the business, this leads to a lot of confusion. If you standardize your whole data science process and you build a web app that everybody can use, that tends to shorten the time to analysis, and it also helps with standardization overall. Usually, web apps are more useful when you need slightly more than just dashboarding. If you just need dashboarding, then the best things are Power BI and Tableau.

Build vs. buy

So what do you do when you're trying to decide whether you should build a web application in house or outsource it to somebody else? There are seven different things you can consider. Cost: if you're trying to buy it, that is, get somebody else to build it for you from outside the org, it will be more costly. Customizations: you might be able to get some customizations, but additional customizations will cost you even more. The biggest one is knowledge gain: you will not have the knowledge gain, somebody else will. Buying will save you on man hours, build time, and support. If you build it in house, which Shiny allows you to do very quickly with these rapid prototyping frameworks, it will be cheap, essentially free; the only cost is the man hours that the data scientist requires. It will be very customized, and you will have the knowledge gain. It will take some time for your department to build these apps, and you are the support if you build the app.

The clustering and profiling app

All right. Having said that, today I will show you some of the apps that I've built for CVS and how they have been used. The first one is called the clustering and profiling app. It's a web application that does market segmentation, which is pretty standard: you're trying to find groups of people that are similar to each other in a diverse population. I'll show you a comorbidity analysis app, which is useful for finding comorbidities. In health insurance or health care, a comorbidity is when you have more than one health condition; suppose you have diabetes and hypertension, then you have two conditions, so it's a comorbidity. I'll also show you a propensity score matching app. This comes from causal inference: if you do not have a randomized control trial, so it's not randomized, you have to use quasi-experimental methods, like matching.

Essentially, what you're trying to do is create a synthetic control group, and to find controls who are similar to your test group, you have to match them on different criteria, the different features. To do that, you can use this app called the propensity score matching app; this is more causal inference. I'll also quickly show you the EQUAL tool that I built for a competition within my company. The competition was called ABC, and they judged you on how innovative your app was, how much money it will save the company, and how good looking you are, so obviously I was in the top five. EQUAL stands for eliminating quantitative unfairness in algorithms. Now, you might have heard of how predictive models might be propagating bias: because algorithms or models are built on real-life data, if there are historical biases in the data set, a model will not only propagate them, it will accentuate them.

So the first app, the clustering and profiling app. The big problem was that multiple teams were performing customer segmentation, and they were all using slightly different methodologies, slightly different variations. They were getting similar results, but not quite, and then they were presenting them to different business units, which was causing confusion. This comes from the fact that people have been trained in market segmentation, but everybody uses a slightly different route and presents their results slightly differently. People are either R users, Python users, or SAS users, and they use different little things like imputations or capping that are slightly different, and the end result tends to be different. This confuses everybody.

Now, the clustering and profiling app helps you get to clustering insights faster. It's a web application developed in R Shiny. At CVS Health, we have an RStudio environment where you can publish these apps, and we also have support for Python and SAS. The app works in four steps: it performs exploratory data analysis, it does the actual clustering, it profiles the clusters, and it gives you insight generation, so it'll look at your results and tell you what the groups are and what they look like. Again, it's an agile framework: you can rapidly iterate through multiple things and change whatever you like.

Alright, so this is how the app looks. You can come in and take a quick tour, which will introduce you to the different parts of the app. Essentially, you can import your data and explore it: get summaries, missing values, correlations. You can find clusters in your data using different clustering methods; we prefer k-prototypes for mixed data. You can then profile the clusters you found. So, for example, if you found three different clusters in your data set, you can profile them and quickly find that these clusters differ in age and their risk.

And then you can also get more detailed insights: you can find out what the key variables are that triage this risk. Say the data consists of people who are either below 50 or above 50. If they are below 50, they go in the first cluster; if not, they can be further bifurcated by risk and go into clusters two and three. So you can use decision trees to form these cluster rules.
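Tree-derived rules like the ones just described can be captured in a tiny function. Here is a minimal sketch in Python (the app itself is built in R Shiny); the cutoffs and cluster numbers are illustrative assumptions, not the actual values from the CVS app:

```python
def assign_cluster(age, risk_score, risk_cutoff=0.5):
    """Toy version of the decision-tree cluster rules described in the talk.

    NOTE: the age-50 split and risk_cutoff are hypothetical values
    chosen for illustration only.
    """
    if age < 50:
        return 1          # members below 50 fall in the first cluster
    # members 50 and above are further bifurcated by risk
    return 2 if risk_score >= risk_cutoff else 3

print(assign_cluster(age=35, risk_score=0.9))  # young member -> 1
print(assign_cluster(age=62, risk_score=0.8))  # older, high risk -> 2
print(assign_cluster(age=62, risk_score=0.1))  # older, low risk -> 3
```

The point of surfacing rules like this is that a business user can read them directly, without understanding the clustering algorithm that produced them.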

One thing that I've learned by making these apps and launching them to a small audience is: the more explanations you can give, and the more intuitive it is, the better. That's why using JavaScript here for all of these little helper boxes tends to be very useful, and just giving directions up top is very useful for people who are not familiar with your app. Also, uploading a toy data set is highly recommended. If you just directly ask people to upload their own data, they get confused, so I put in three or four different toy data sets from different sources that you can use.

I'll show you clinical data, since I work for Aetna. So let's look at this data set, which is fake: it is a data set of members with different diseases and different characteristics. You can pick the different variables that you want to include in your analysis. Up here, you can quickly explore the data and get summaries: what percentage of zeros exists within your data set, how many levels each factor has, what kind of variable each one is, and it'll give you missing data summaries as well. This is pretty powerful: you can quickly get summaries and download all of this in whichever format you want, so if you want a PDF, you can quickly download it as a PDF.

It'll show you correlations very quickly, so you can find which variables are highly correlated with each other. If you spend days in an inpatient facility, you will obviously have very high medical costs; that's why you see high correlations there. You can view whatever raw data you have uploaded. You might also want to remove variables which are very highly correlated; you can remove those variables right here, and going forward they'll be excluded from the rest of the analysis.

Then you can go into the clustering analysis, where you decide the number of clusters, a very common clustering 101 technique, using the elbow method. Then you can choose whatever method you want for your clustering algorithm and just quickly click to cluster. For example, if you use three clusters, it gives you this result. You can also change the number of clusters according to business needs: say the business says three is too little, we need something between four and six. If you've done this before, you know that market segmentation is not a pure data science exercise; it's a mix between what the business wants and what you can give them. So you can very quickly iterate on what you want.
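The elbow method mentioned above can be illustrated with a minimal, pure-Python 1-D k-means. This is a toy sketch, not the app's actual clustering code (the app prefers k-prototypes for mixed data); the data values are made up so the elbow is obvious:

```python
def kmeans_1d(xs, k, iters=20):
    """Minimal 1-D k-means (Lloyd's algorithm) with a deterministic,
    evenly spaced initialization. For illustration only."""
    xs_sorted = sorted(xs)
    if k == 1:
        centers = [sum(xs) / len(xs)]
    else:
        centers = [xs_sorted[round(i * (len(xs) - 1) / (k - 1))]
                   for i in range(k)]
    for _ in range(iters):
        # assign each point to its nearest center
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        # recompute centers (keep the old center if a group empties)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    wss = sum(min((x - c) ** 2 for c in centers) for x in xs)
    return centers, wss

# three well-separated groups of fake "member" values
data = [1, 2, 3, 20, 21, 22, 40, 41, 42]
for k in (1, 2, 3, 4):
    _, wss = kmeans_1d(data, k)
    print(f"k={k}  within-cluster SS={wss:.1f}")
```

Plotting the within-cluster sum of squares against k, it drops steeply until k=3 and then flattens: that bend is the "elbow" the app shows you when you pick the number of clusters.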

So now you have your four clusters, you've done everything, and now you want to understand your clusters. This is where the decision tree, the CART algorithm, comes in. What we've done here is build a decision tree trying to predict the different clusters and how you get to them. The different clusters are shown in different colors: you have cluster number one, cluster number two, cluster number three, and cluster number four here. If you have an inpatient score of greater than 35 and age less than 59, then you will fall in cluster number four. So now you've got the rules as well: you can identify that there are four clusters in my data, and these are the rules that get me to the population which will be cluster number four.

This tends to be very powerful. Finally, when you're done with your analysis, it downloads an Excel file with all the results and a formatted PowerPoint. So not only have you completed your analysis, it gives you a formatted Aetna PowerPoint with Aetna custom fonts, charts, everything. Now you can just take this analysis and show it to your business unit, and they will know what you are talking about. You've completed a segmentation exercise in minutes, which usually takes our business units, working alone with their own code and iterating through the different steps, weeks to do.

We've used this for different projects within the org; some of the examples are here. The one project that I will talk about is a meditation app that we did a project with. This meditation app was sent out to a bunch of different people, and some of them got engaged, that is, they downloaded the app and signed up. We wanted to see what different segments exist within the people who downloaded this meditation app. Using the app, we found that there are four different segments, and using the insights part of the app, we found that you can use four variables to understand how these segments are made.

The first variable is gender: the first three of the segments are 100% female, the last one is male. Then it's risk. In health insurance, you define risk as how sick a member is, how risky they are in the next one year. Within the females, you can bifurcate on risk: one of the segments is high risk, two are low risk. Within the female low-risk segments, you can bifurcate on income: one of them is higher income and one of them is lower income. So now you understand that the people who are downloading this app fall into these four buckets, and in the next iteration of the app, you can send out emails which are more customized. You can target females who are older and lower income differently from males who are low risk and low income, and send out these creatives accordingly, and hopefully you'll get a higher engagement rate.

So this is how this app quickly led to us getting a more engaged population, more people meditating in general, and you do see impact.

Now, the app was a hit, and the way we can call it a hit is that we can measure it on three things. First, what users tell us: things like how many people liked it on our internal social media page, or what the customer satisfaction score was, and so on. Second, things the customers do not tell us, but we know: we put little trackers within the app which count the number of people who viewed the app, uploaded their data set, ran their analysis, and downloaded the end result. We can use that as a metric to say, hey, this app is actually making a lot of impact, because 81 people have downloaded their entire project results and used them. And then there are the unknown unknowns: people might have used this app and gotten inspiration for other projects, and quantifying that impact is a little harder.

Comorbidity analysis app

So the next app I want to show you is a comorbidity, or multimorbidity, analysis app. As I said, a comorbidity is two diseases: if you have two diseases, you have a comorbidity; if you have more than two diseases, it's multimorbidity. The figure on the left shows that if you have a heart condition, or if you have cancer, then your life expectancy tends to decrease with the number of additional health conditions you have. So if you only had heart disease, your life expectancy after you first got heart disease was, say, 20 years. Every additional disease, say diabetes along with it, starts decreasing your life expectancy, and on average your life expectancy decreases by two years for every additional condition you have.

Now, the analysis I'll show you uses something called market basket analysis, or association rules. People in retail must be familiar with this analysis. Essentially, imagine you're sitting at the cashier's register at Walmart and you see people make these different transactions. In the first transaction, they bought an apple, a beer, sugar (or something, I'm not sure), and meat. Given all the different transactions, you can calculate three essential metrics. The first is support: four out of the eight transactions contained an apple, so the support is 0.5. Then you can define confidence: the people who bought an apple and a beer, among the total number of people who bought an apple. So here you would find how many people who bought an apple also bought a beer; that is the confidence. Finally, lift is just how much more likely you are to buy a beer if you bought an apple. If you have a lift greater than one, that means there is a positive association.
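Those three metrics are easy to compute directly. A small Python sketch with made-up basket data (the item names and counts are illustrative); here support is reported for the item pair, confidence is P(beer | apple), and lift is that confidence over the baseline rate of beer:

```python
def rule_metrics(transactions, a, b):
    """Support, confidence, and lift for the association rule {a} -> {b}."""
    n = len(transactions)
    n_a = sum(1 for t in transactions if a in t)
    n_ab = sum(1 for t in transactions if a in t and b in t)
    n_b = sum(1 for t in transactions if b in t)
    support = n_ab / n             # P(a and b)
    confidence = n_ab / n_a        # P(b | a)
    lift = confidence / (n_b / n)  # P(b | a) / P(b)
    return support, confidence, lift

# toy basket data, loosely echoing the talk's example
baskets = [
    {"apple", "beer", "sugar", "meat"},
    {"apple", "beer"},
    {"apple", "milk"},
    {"apple", "beer", "diapers"},
    {"milk", "bread"},
    {"beer", "diapers"},
    {"bread", "meat"},
    {"milk", "sugar"},
]
s, c, l = rule_metrics(baskets, "apple", "beer")
print(f"support={s:.3f} confidence={c:.3f} lift={l:.3f}")
```

With this fabricated data, apple appears in 4 of 8 baskets (item support 0.5, as in the talk), and the apple-to-beer rule comes out with lift above 1: buying an apple makes buying a beer more likely than the baseline.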

You might have heard of stories where they've used association rules, like diapers now being kept with beer. It's because association rules showed that on, I think, Tuesday or Wednesday nights, people tend to go out to Walmart and buy beer with diapers: these are parents who are just trying to get diapers and think, okay, I need a beer too. And these rules are used either to put the things together, or to put them on opposite ends of the store. So if Walmart found out that you buy beer and diapers together, they'll put beer at the front of the store and diapers at the very back, so you walk through the whole store and buy more things. That's how they use association rules in retail.

Now, you can employ the same logic in healthcare as well. You can define support as the number of people who have disease one and disease two, over the whole population: if you have 100 patients, how many of them have hypertension and diabetes? Confidence is the number of people who have the two diseases divided by the number of people who have the disease of your interest: of the people who are diabetic, how many of them also have hypertension? And the lift is how much more likely you are to have hypertension if you already had diabetes: if your lift is more than one, you're more likely; if it is less than one, you're less likely.
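The same metrics apply to a binary disease matrix like the one the app's toy data uses. A sketch with fabricated rows (not real member data, and not the app's code); a lift above 1 means the two conditions co-occur more often than chance:

```python
# each row: one member's binary disease indicators, as in the app's toy data
members = [
    {"diabetes": 1, "hypertension": 1, "hyperlipidemia": 1},
    {"diabetes": 1, "hypertension": 1, "hyperlipidemia": 0},
    {"diabetes": 0, "hypertension": 1, "hyperlipidemia": 1},
    {"diabetes": 0, "hypertension": 0, "hyperlipidemia": 0},
    {"diabetes": 1, "hypertension": 1, "hyperlipidemia": 0},
    {"diabetes": 0, "hypertension": 1, "hyperlipidemia": 1},
]

def lift(rows, d1, d2):
    """Lift of d2 given d1: P(both) / (P(d1) * P(d2))."""
    n = len(rows)
    p_both = sum(r[d1] and r[d2] for r in rows) / n
    p_d1 = sum(r[d1] for r in rows) / n
    p_d2 = sum(r[d2] for r in rows) / n
    return p_both / (p_d1 * p_d2)

print(round(lift(members, "diabetes", "hypertension"), 2))  # -> 1.2
```

A value of 1.2 here says that, in this toy population, diabetics are 20% more likely than average to also have hypertension.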

So, again, in the app, we put in our generic exploratory data analysis part, where you look at missing data, correlations, things like that. But specifically, the use cases are that you can answer questions like: what are the most common multimorbidities in the population? What are the most expensive multimorbidities in the population? And so on. So let me quickly show you the app.

Alright, so here's the app. Again, it's generic: you can use this app for all of these different parts of comorbidity analysis. You can also find literature here on PubMed, and if you have any questions, you can send them to me. I uploaded a toy data set again; this toy data set has thousands of members, along with all the different diseases that they have. So if you have, say, HIV/AIDS, then your row will say one, otherwise it will be zero. It's just binary data for all the different members.

Then you can very quickly find things like: what are the most common comorbidities in my data? You just sort by support, and you see hypertension with hyperlipidemia. So high cholesterol together with high blood pressure tends to be the most common; in fact, we see this combination in 58% of members. That's a combination of two. What if you're interested in a combination of three or more conditions? Again, you can very quickly sort on this, and you see that diabetes with hyperlipidemia and hypertension is the most common combination of three diseases, and so forth.

Another question you might want to use it for is: who are our most expensive members? By most expensive, we mean the top quartile of all medical costs in the next one year. So if you are in the top 25% of cost in the next one year, what conditions would you have? Suppose we want to see conditions from two all the way up to five: very quickly you can find that if you have ischemic heart disease with hyperlipidemia and hypertension, you will be very costly for us in the coming one year.

So here you'd use confidence. The most costly conditions are heart failure and renal failure, in general. With renal failure, you have to go on dialysis, so it's more expensive; and if you've had heart failure, you're more likely to have certain heart procedures, which are very expensive. That's why they come out on top.

I do want to show you one cool thing. I do have some documentation, but the cool thing that I added here is a chatbot. People often have simple questions that you have to answer again and again, and a way around that is to use a chatbot function. I use the Dialogflow API; Dialogflow has been acquired by Google. It uses natural language processing: you can specify certain web pages that it can scrape for information, and then it looks at keywords and gives you replies from those web pages. You can also input your own answers. So you can actually chat with it.

So you can ask the simple questions that you had, for example: what's the maximum size of file that I can upload? What do I do if I get this error? Things like that. Making a chatbot is easier than having a document that people have to scroll through. And then, as I said, you can add admin tracking here: I can track who has been using it and for how long.

Propensity score matching app

Alright, so the propensity score matching app. As I said, propensity score matching is a causal inference technique. If you do not have a randomized control trial, and you only have a treatment group, and you want to estimate the average treatment effect, then you have to find people who can be a synthetic control. For every member in the treatment group, you have to find a matching member who is very similar and can be used as a control.

You can use this analysis to do different things; you can find baseline characteristics. I apologize if this sounds a little foreign to you, but this is for experimental analysis; it's very useful when you're trying to run quasi-experiments. The usual process of running these experiments is: first, you find baseline characteristics, that is, at baseline, how different is my treatment group from the control group? Then you match them: you find people who are similar to the treatment group as matching controls. And after you've found them, you try to quantify whether you have been able to remove all the differences that were there before matching.
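The matching step can be sketched as greedy 1:1 nearest-neighbour matching on propensity scores. This is an illustrative simplification, not the app's actual algorithm; the scores and IDs below are made up, and in practice the scores would come from a logistic regression of treatment on the baseline covariates:

```python
def greedy_match(treated, controls, caliper=0.1):
    """Greedy 1:1 nearest-neighbour matching on pre-computed propensity
    scores, without replacement.

    `treated` / `controls` map member id -> propensity score.
    `caliper` is the maximum allowed score difference for a match.
    """
    available = dict(controls)
    matches = {}
    # match the hardest cases (highest propensity scores) first
    for tid, ps in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        cid = min(available, key=lambda c: abs(available[c] - ps))
        if abs(available[cid] - ps) <= caliper:  # respect the caliper
            matches[tid] = cid
            del available[cid]                   # without replacement
    return matches

treated = {"t1": 0.80, "t2": 0.55, "t3": 0.30}
controls = {"c1": 0.78, "c2": 0.52, "c3": 0.10, "c4": 0.33}
print(greedy_match(treated, controls))
```

After matching, you would compare the covariate balance between the matched groups, which is the "quantify the remaining differences" step the talk describes.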

Yep, and this is the app I created for propensity score matching. You can perform exploratory analysis, look at covariate distributions, perform the propensity score matching, iterate the analysis, and download formatted results. Again, you can upload your own data or use a toy data set, find missing data, and so on. And then you can finally do the propensity score matching, which runs the whole analysis, and download your final results. I also have a propensity score matching chatbot, very similar to the one I showed you before.

And in the last five minutes, I do want to talk about one other tool, which is EQUAL, which...

sorry to interrupt, but just because you were showing that chatbot right now, I know there were a few questions on that. And one of the questions was, can we have real time customer service through this tool? Or is it just based on NLP?

Sure. So the first part of the question: whether we can have a live customer rep behind this chatbot. I have not done it, just because I was trying to build this app and hoping to automate all of the questions that people have. But in Dialogflow from Google, you can add a third-party integration where, if a question the customer asks does not match any of the pre-listed things the chatbot can answer, it can ping you at a certain address and let you know: hey, transfer this person to a live rep. So you can do that in Dialogflow, and a lot of companies actually use that functionality. The second part was: can I share the code? For these apps, I cannot share the code, because it belongs to the company now. But I have a lot of other Shiny apps which use very similar methodology. If you go to my GitHub (Shatrunjai), you will find all my apps, the different Tableau dashboards that I built, and some of the Flask apps that I built as well. Feel free to message me for anything that you're looking for.

The EQUAL bias detection app

Alright, so EQUAL, which stands for eliminating quantitative bias in algorithms. This is an app that I built for an internal competition, and what it does is try to remove bias from predictive models. At CVS, it's now our number one priority to make sure none of our models have any sort of bias in them; all our models go through this check, and we make sure that all the models are fair and equitable. The first step when you come in is to answer three questions, which help us find out what metrics you should be looking at for fairness. If you've heard of fairness metrics, there are things like disparate impact, parity metrics, and so on, and answering these three simple questions helps you identify which metrics are important for you.

The app will pull all the data on the members you're interested in. There are 11 protected attributes here in the US: you cannot be discriminated against on age, gender, national origin, religion, handicap status, pregnancy, medical and behavioral health conditions, and so on. It'll pull all of this data and then check all the different models we can build; predictive models can be regression or classification. It will check all the different steps, pre-processing, during the model build, and post-processing, for bias, and then it will save your results at a specific place and give you an HTML or a Word document. Again, EQUAL is built as a Shiny app with JavaScript, so it requires zero coding and is language agnostic: it can work with both R and Python models that have been uploaded.
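One of the fairness metrics mentioned, the disparate-impact ratio, is straightforward to compute. A minimal sketch with hypothetical model decisions (this is not the EQUAL implementation); the 0.8 threshold is the common "four-fifths rule" of thumb:

```python
def disparate_impact(outcomes):
    """Disparate-impact ratio: selection rate of the least-favoured group
    divided by that of the most-favoured group.

    `outcomes` maps group name -> list of binary decisions (1 = favourable).
    """
    rates = {g: sum(d) / len(d) for g, d in outcomes.items()}
    return min(rates.values()) / max(rates.values())

# hypothetical model decisions for two groups (fabricated for illustration)
decisions = {
    "group_a": [1, 1, 1, 0, 1, 1, 0, 1],  # selection rate 0.75
    "group_b": [1, 0, 0, 1, 0, 0, 1, 0],  # selection rate 0.375
}
ratio = disparate_impact(decisions)
print(round(ratio, 2))  # -> 0.5, below the 0.8 rule of thumb: flag for review
```

A check of this shape would run at each stage (pre-processing, model build, post-processing), which is the kind of per-step bias audit the app performs.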

Alright, while this loads, maybe I can just switch to the last slide, which again is: thank you, and I'll take any questions you have. And again, if you are interested in working in data science with us, email me at my work email, singhs22@aetna.com. And if you have any other questions on code, any suggestions for the apps or dashboards you're trying to build, or if you want to collaborate on something, please email me at my personal email, which is just my first name.

Alright, so this is how EQUAL looks. You can come in and take, again, a very quick tour of the different parts of the app and what they do. You can explore your data, evaluate the bias in your data set, and find out whether your input data or your models are biased. And then you can also eliminate this bias: it uses techniques in pre-processing, post-processing, and the model build, things like adversarial debiasing, to remove bias from your data set. It follows some of the same steps that I've shown previously, where you can explore your data, customize your metrics, look at the different bias metrics, mitigate them, and download all your results as well.

Q&A

So I'll stop there and take some of the questions that we have.

Awesome, thank you so much, Jai, that was great, and so cool to see all the applications too. I know people are still putting in some questions in Slido, and if anything goes unanswered in the time we have today, I can definitely send those over to Jai and get the questions answered too. But the top question so far, which had been upvoted a few times, was: what kind of process do you use for updating the application? Do you have any sort of defined deployment process?

So the deployment process that we use is through RStudio Connect. We have a prod version, the production version, and a dev version. The process we use is: first we build an MVP, a very simple version of the app which just does the basic end-to-end analysis, and we upload it to the development server. We give it to a small number of people and perform test-and-learn: we ask them to use it the way they traditionally would and give us hints on where they got stuck and what we can improve. This usually takes one month. Then we take all the feedback from them, try to put the most critical elements into the beta version, and launch it to a slightly bigger population. If there are no big hiccups, we move it from the dev version to the prod version, where we launch it throughout the company. We recalibrate all our analysis every one and a half months early on, and after the first six months we use a six-month iterative calibration cycle. So every six months we revisit it: see how the data changed, how the analysis changed, whether packages have changed, and whether the app is still working as it was intended to work in the original version.

Another one of the questions that was upvoted is: how do you actually quickly set up that tour of the Shiny app? I think it was in the first Shiny app that you showed.

So that's through a package called shinyBS; you can use its tooltips and these introduction walkthroughs. There's another one called introJS. IntroJS is a JavaScript library, and a lot of these JavaScript libraries you will find wrapped as R packages, so you don't have to write JavaScript code; the package will add it to your UI. So you can Google things like "shinyBS" and "introJS R package" and you will find the packages you can use to set up the quick-tour functionality.

Another question, Jai, was: how do you position Shiny versus Tableau? Like, what would you use Tableau for, and when do you decide that it would have to be in Shiny?

So actually, that's a great question. Just a little background: I was a big Tableau user. I actually won the Tableau Massachusetts competition three years ago. But Tableau has limitations. The one limitation is that you cannot do advanced analytics in Tableau. You can do visualizations, you can slice and dice data, but the most advanced data science you can do in Tableau right now is regression: you can fit a simple line to your data set, and that is the limit of what you can do. But if you want to use some of the algorithms, for example the segmentation or the comorbidity association rules that I've used here, Tableau cannot do that. It can make your data look pretty, but it cannot do that analysis.

And to do that analysis, you will have to use either R or Python with Flask. Within R and Python, I feel like most of the research community, people like me who were in research, tend to publish all of our research as R packages. So you would usually see things coming out of research going into R first, and if they're very useful for industry, there will be a Python package that copies that R package. So with Shiny or Flask, you can use the latest and greatest advanced analytics along with visualization. The visualization might not look as neat as Tableau's, just because Tableau has been designed by professionals, but it will still be more advanced: you can do these more complicated analyses in R.


Another question, and I know we have just a few more minutes here, but an attendee asked: hey, Jai, have you encountered scenarios where you want to run clustering on big data sets, upwards of 10 million rows, from within Shiny?

So the one thing I would say is that most of the Shiny apps that I've shown you tend to run on smaller data sets. For bigger data sets, with the limited knowledge and use that I have had, scalability tends to be an issue later on. Now, recently, there have been new packages that you can use to
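The recording cuts off before the speaker names the packages, so the following is only one plausible approach, not his recommendation: on the Python side, scikit-learn's MiniBatchKMeans processes data in fixed-size batches, which keeps memory bounded even for tens of millions of rows. The data below is randomly generated purely as a stand-in.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Stand-in for a much larger table; mini-batch k-means fits on small
# batches at a time instead of holding one full k-means pass in memory.
rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 4))

km = MiniBatchKMeans(n_clusters=5, batch_size=10_000, n_init=3, random_state=0)
labels = km.fit_predict(X)
print(labels.shape)  # one cluster id per row
```

In a Shiny or Flask app, one would typically run a job like this offline or in a background process and have the app read the precomputed cluster assignments, rather than clustering 10 million rows inside a request handler.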