Volha Tryputsen | R in Janssen Drug Discovery Statistics

Transcript

This transcript was generated automatically and may contain errors.

Alright, thank you. Hello, everybody. I'd like to join Marlee and the other presenters in thanking RStudio, Phil, and others for putting this session together. This is my favorite community, R in Pharma, and I'm glad to be here today. And also thank you for the opportunity to talk about R in Janssen Drug Discovery Statistics. So my name is Volha Tryputsen, and I work as a principal statistician in the Department of Translational Medicine and Early Development Statistics at Janssen. So what do we do at Janssen? Well, at Janssen, we are guided by our mission of blending heart, science, and ingenuity to profoundly change the trajectory of health for humanity. And we discover and develop drugs which treat or help to prevent some of the world's devastating diseases.

Preclinical drug discovery overview

So how do we discover and develop drugs, right? Well, the process is quite complex, but it can roughly be broken down into two parts: preclinical drug discovery and development efforts, and clinical. And I think clinical drug development gets a lot of the spotlight, whereas the preclinical phase is not as commonly talked about. So I'd like to take a couple of minutes to give you, in a nutshell, what it really means to discover a drug. It all starts with choosing which disease you are after. After you do that, you identify the target that you're going to try to pursue. Once you have that in place, you screen possible compounds, or even engineer a new compound, that are going to bind to your target. And once that is done, you basically study that relationship, or that activity, that is happening between your compounds and the target. You might want to optimize it and look at it from many, many angles. But what happens at the end is that you select a subset of compounds that you're going to move forward into another set of studies.

And once you have your subset of compounds, you further test them for efficacy and safety using in vitro experiments, which are experiments outside of a living body, such as testing those compounds in tissue and cell lines. And then there are also in vivo experiments, where you test your compounds in a living organism. Once that is done, you're basically ready to move your final compound into the NME phase, where NME stands for new molecular entity. And that sort of completes your preclinical drug discovery process. But as you can see, it's very, very involved and very multifaceted.

So I work for the drug discovery statistics group, which supports all of these different drug discovery efforts across different therapeutic areas, whether it's neuroscience, oncology, immunology, infectious diseases, or vaccines. And yeah, we do a lot of work with the scientists. So what our department does is bring mathematical and statistical rigor into building this drug discovery pipeline, and also into decision making.

How R drives the statistics workflow

Right. And so how do we do it? Well, we take the body of data from a lot of studies that biologists have conducted. One of the aspects for us is obviously designing those studies and working on design of experiments. But the heavy task for us is also to analyze all of the data. So we put the data into this sausage-making machine called statistics. We add some magic to it and voila, we get a p-value. Right. Of course, I'm joking. It's not quite as it is on this slide. Instead of magic, we obviously have a lot of math and statistics within our sausage-making machine. And the p-value is one of the things, but it's never the only one. We accompany all of our findings with a lot of different tables, figures, graphical representations, and a lot of different reports. But where I'm trying to go with all of this is that we have this process in place for analyzing this data, and what's really great is that this whole process is driven by R. In the drug discovery statistics department at Janssen, we heavily rely on R and RStudio products. And this is what my presentation is about.

So how do we use R capabilities, and what do we do with R in drug discovery statistics? Well, the most important one is our portfolio support. We use R extensively to analyze a lot of our studies. When I was thinking about how best to present what we do with R, I thought about splitting all the stuff that we do into these four different spheres. First and foremost is our general practices. The majority of us use RStudio, and we use it locally and we also use it off the server. The RStudio Server capabilities are great because they help us share our projects and our different study analyses between ourselves within the group, and help us collaborate in this manner. But we've also been tapping lately into the powerful computing capabilities of RStudio Server, for our analyses but also for Shiny app development.

We all use Git in Bitbucket. Bitbucket is our local GitHub, as we call it, which of course ensures traceability and transparency of everything that we do, but also plays a big role in collaboration and code sharing. We extensively use R packages. We create our own, whether internal or external, for organizing our own code, of course, but also for sharing our methodologies with others, which speaks to the rigor and innovation that R packages bring to the whole drug discovery statistics space.

We have also started packaging the standardized analysis workflows that we use to create Shiny apps into packages as well. That's, again, an easy way to share your code, but it also makes it easy to troubleshoot or to bring in changes that might need to be implemented if the need arises. We also use R packages for packaging templates for R Markdown, which I'm going to discuss next.

The open-source nature of R provides us with immediate access and the opportunity to share our cutting-edge and state-of-the-art statistical methodologies, and it also facilitates innovation and collaboration.

Some notes on future potential. I think we can keep building a larger R community, not just within our Janssen pharmaceutical space, but also going into other Johnson & Johnson sectors, whether it's medical devices or consumer finance, et cetera. So that's always a great thing to do. I think there are ways to work on sharing R code, again, across the enterprise, not just within sub-communities. I think that would be something great to do in the future. And then enhancing the visibility of internal R packages. This is something I also feel is important, and we can do better there. And with that, I'm going to conclude and say thank you to some of my colleagues, who let me bounce some of my ideas for this presentation off them. And once again, thank you, RStudio and the people who organized this R in Pharma conference, for inviting me to speak. And here I'll conclude. Thank you so much.
