Pharma Open-Source Packages and the R Validation Hub | A conversation with Aaron Clark

Posit's Director of Life Sciences, Phil Bowsher, sat down with Aaron Clark, Senior Principal Clinical Scientist at Arcus Biosciences, to discuss various topics within the open-source clinical reporting space including Aaron's career in pharma, the R Validation Hub, the open-source package riskmetric, and more. More about Posit's work in the pharma space: https://posit.co/use-cases/pharma/

Transcript

This transcript was generated automatically and may contain errors.

Hi Aaron, thanks so much for coming and chatting with me here at the Posit Conference.

Yeah, thanks for the invite. Happy to be here.

Absolutely. So I love to sit down with people doing awesome work in the pharma space and just talk about the journey they went through to get here. And I'd love to start with your background. And what was it when you jumped over into the pharma world? Because looking at your profile, you started off at Meijer, I think.

I did, yeah. Wow, you did some deep digging, I bet, to find that. I did, yeah. So I actually have a master's in biostatistics, but for some reason I landed a job in the retail industry right after college. And I worked there for a couple years, and then I actually got into the radiology industry. So I worked for the second largest radiology group, which is based out of Grand Rapids, Michigan, where I'm from. And I didn't really work on a ton of clinical data there, it was more like business data. But then I just got an opportunity to work for a good-sized CRO on the East Coast. And I was a functional service provider, an FSP, to Biogen for several years, like four years. And there, I got to work on a lot of new open-source initiatives.

They were pretty open-source-minded with different projects they were working on. And that's when I really got to learn a lot about pharma and dive into the space full force.

Embracing open source and Shiny

You know, it's amazing to look back at that time and see the work that you did with Shiny and embracing so many of the open-source tools like Git and version control. What was that like for you to go from commercial software into the world of open-source?

Yeah, it was a lot different. I enjoyed every moment of it. It was eye-opening. But yeah, I did use Git in previous roles, but it was never in a public sphere. So it's good because you can get feedback on what you're doing. I never got quite the code review that I got working internally on projects. But once you are part of an initiative or a project that's working towards a unified goal with a good, dedicated team, then you get all that code review, which is really good to develop you as a programmer and as an individual in this new domain, so in the pharma space.

One thing that was really popular when we first did the R in Pharma conference in 2018 was the idea of modularizing Shiny apps. And you did a lot of that work early on. What was that like?

Yeah, actually, yeah. So we did. And honestly, Maya was a big proponent of that. She was the leader in that space. She was like, hey, there's this new thing coming out called Shiny modules, or hey, there's this new package coming out called Golem. And she's like, I think we should do these things because she knew the people that were working on them. And she's like, this is going to be the future of Shiny development. And so from there, yeah, we just started learning from square one. So before the Mastering Shiny book was out, we were trying to figure out how do you do these things. And it worked out well, I think.

Joining the R Validation Hub

Yeah. And to bridge that into community initiatives that have been such a big part of people that you've worked with, like Nate and Bob. And what bridged you into supporting some of the public groups like the R Validation Hub?

Yeah, it was really probably just quality support from my manager. So Bob Engel, like you mentioned, shout out to Bob, for encouraging us to go work on those initiatives. So he would tell us like during work hours, so not in our free time, but during work hours, like after you're done with your other project work, go and support like any open source initiative that you want. So he gave us like total freedom to go and choose something. And there were some other folks working with the R Validation Hub within our organization. So I said, well, what else do they have going on? And so I found the risk assessment application, which is a Shiny app that helps sort of extend the functionality of the riskmetric package, which we could talk about in a little bit.

And it was very comfortable for me to do that because I have a lot of experience with Shiny. And I didn't have to know, you know, I didn't have to have a ton of domain expertise to like kind of dig in and start working right away. And so that's kind of how I found the R Validation Hub.

And I feel like the R Validation Hub has been so critical in the adoption of open source in the drug development space, helping to establish understandings around validation and this risk-based approach. For people watching, why don't you give a little bit of background around the R Validation Hub and the work that they've done?

Okay. Yeah. So the R Validation Hub, I think they came out of the R Pharma conference back in 2018, if I'm not mistaken. And it's a collaboration among approximately 10 or so pharma orgs that are supporting the adoption of R in a regulatory space. Oftentimes they are including the FDA in what they're doing so that they're getting good quality feedback from some sort of regulatory agency. And a lot of organizations sort of subscribe to our content. And I would say the primary driver, at least early on, was our white paper that we published. And so the white paper basically says that every organization is going to have a different definition of what validation means. And they're going to have different risk tolerances towards what is considered a quality piece of software or a not quality piece of software.

So in that vein, the R Validation Hub says that, hey, there's certain quality metrics that we can all sort of define and not necessarily agree upon, like which ones are more important to you. But we can use these to sort of define our own level of risk. And then from there, you can use that as evidence to document that your software that you're using for your clinical trial analysis is reliable.

The riskmetric package and risk assessment app

And I think one thing that's so awesome is that the R Validation Hub generated this white paper that helps provide general information around this risk-based approach. But then the organization took it a step further and created tools that help organizations implement that white paper. So the riskmetric package, the risk assessment app that you work on. So tell people about those two.

Yeah, I would say the cornerstone to the white paper is definitely the riskmetric package. So it's essentially an R package that helps you evaluate other R packages, like software quality based off of certain domains. So like the package maintenance sort of domain, the community engagement, and then also like the rigor of testing that's performed. So to give an example, I think that for testing, there's metrics that the package will define, such as code coverage. So that's a big one, especially for a lot of regulatory environments is they want to make sure that the software that you're using has been tested really well and that the security is top notch. So riskmetric will allow you to do that. It'll also tell you like, you know, how many downloads does this package have per year? Is it really popular? Does it have reverse dependencies? So are other packages depending on this package? And then software maintenance best practices are stuff like, does it have good documentation? Is there a website? Are there vignettes that describe how to use the code?
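Editor's note: to make the idea of domain-based metrics concrete, here is a minimal Python sketch of how per-metric scores might be combined into one risk number under organization-specific weights. It is not riskmetric's API or its scoring formula (riskmetric is an R package). The `MetricScore` class, the `risk_score` function, the metric names, and the numbers are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricScore:
    """One assessed metric for a package (hypothetical structure)."""
    name: str     # e.g. "code_coverage"
    domain: str   # e.g. "testing", "community", "maintenance"
    score: float  # assumed scale: 0.0 (worst) to 1.0 (best)


def risk_score(metrics, weights=None):
    """Return 1 minus the weighted mean of the metric scores.

    Metrics missing from `weights` get a weight of 1.0, so each
    organization only has to list the metrics it cares about more or less.
    """
    weights = weights or {}
    total_weight = sum(weights.get(m.name, 1.0) for m in metrics)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    quality = sum(weights.get(m.name, 1.0) * m.score for m in metrics) / total_weight
    return 1.0 - quality


# Made-up scores for a single package, one metric per domain.
example_metrics = [
    MetricScore("code_coverage", "testing", 0.5),
    MetricScore("downloads", "community", 0.9),
    MetricScore("has_vignettes", "maintenance", 1.0),
]

equal_risk = risk_score(example_metrics)
# An organization that cares most about testing triples that weight.
testing_focused_risk = risk_score(example_metrics, {"code_coverage": 3.0})
print(round(equal_risk, 2), round(testing_focused_risk, 2))
```

The same package looks riskier to the organization that weights test coverage more heavily, which is the point made in the white paper discussion: the metrics are shared, but the risk tolerance is each organization's own.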

And I feel like that package helps provide the metrics and then to help provide some reporting and clarity to IT and people throughout the organization, you created a Shiny app that exposes some of this information. And so organizations can say, these are the packages that we've included and these are the things that we want information about. And what's it been like to maintain and support that package?

Yeah. Yeah. So it's been a long-running effort. Like there's been tons of contributors over the years that have been sort of building these tools up. I should have mentioned that from the get go. So I came in when the app was already created and I've just, as an individual contributor, started to, you know, knock out some GitHub issues. But the app has been really helpful for organizations to implement that risk-based approach within their organization. So for the Shiny app, there's ways you can configure it to align with your organization's goals. And then, like you said, print out a beautiful, you know, PDF report that you can deliver to the person who governs the GxP environment within your organization to say, hey, look, I did my due diligence. I went and I assessed this package. I think it will or will not be a good candidate for our GxP environment to do, you know, whatever intended analysis that you're seeking to do.

And I feel like communicating with quality teams and QA is such an important initiative and getting information to them about these packages and tools is a really awesome thing that your team is working on.

Yeah, definitely.

Contributing to open source via GitHub issues

You talked a little bit about the issues that you solved on GitHub for the app. I just think that's such a cool way to bridge into open source. And I feel like I hear that so often. And what was that experience like for you?

Yeah, yeah. So at any time in history, there's been a plethora of GitHub issues open on our repo or the riskmetric repo. I actually looked earlier today and there's like about 70 open issues. So there's no shortage. And some of them are, you know, great for novice contributors. And some of them are a little bit more detailed, a little bit more complex.

And so there's really something for everyone. But that's how you get started. So you start picking off, you know, one or two and you say, I think I can manage this. And that's how you learn the code base of the project. Like, it can be slow and painful. And sometimes you need to reach out to some of your co-contributors to say like, hey, how is this module working? Or like, where are these inputs coming from? Or where, like, where's the, there's a database in the back end. So like, where's the database schema that's defining the structure? So then that's really your sort of first step into like a project like that. And we try to hold, you know, people's hands as much as we can for new contributors. We do a lot of onboarding for them.

And I feel like this contribution has really paid off. You won the best app at the Shiny conference last year.

Yeah, in 2023. Yeah, we won best app as voted on by the attendees of the conference. And yeah, there's even a prize attached to that.

Upcoming workstreams and the Regulatory R Repo

That's fantastic. I feel like the R Validation Hub has different projects that they're working on. Anything else coming down this year, next year you're thinking about?

Yeah, so we do have a number of workstreams. So there's riskmetric, risk assessment, the application, and then there's also a communications workstream. And so they are a group of individuals who are really spearheading, like, finding those validation experts from each organization. And we've done a number of like case studies with these orgs to basically lay out like, how are you guys approaching validation within, you know, Pfizer or Merck or GSK? And we sort of catalog those. And we've been collecting data sort of over time. We first published it in 2022, our first sort of round of case studies. And we've collected those data to basically help drive consensus for the industry. So that's like a really powerful example of, you know, the communications workstream doing something for the R Validation Hub.

I feel like there's also work happening in the Regulatory R Repo group. Do you want to talk about that a little bit?

Yeah, so there's a group called the Regulatory R Repo, and they have been working on something that's pretty exciting. It's a CRAN-like repository. There's currently a proof of concept that's stood up, and it's working as sort of a minimum viable product. But the idea is that you can restrict which packages are available in your R session. So for example, if you want to go and only use packages that subscribe to a certain quality metric, like, you know, it could be testing coverage greater than 60%, or has, you know, over a thousand downloads per month. So you can actually filter on those criteria using this CRAN-like repository, and the packages that pass are the only packages available for your R session.
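Editor's note: the filtering idea can be sketched in a few lines of Python. This is not the Regulatory R Repo's implementation, which the conversation describes only as a proof of concept. The `PackageRecord` class, the `filter_packages` function, the package names, and the index contents are hypothetical. The two thresholds come from the examples mentioned above, and requiring both at once is an assumption made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageRecord:
    """Quality metadata for one package in a repository index (hypothetical)."""
    name: str
    test_coverage: float    # fraction of code covered by tests, 0.0 to 1.0
    monthly_downloads: int


def filter_packages(index, min_coverage=0.60, min_monthly_downloads=1000):
    """Return the names of packages that exceed both thresholds.

    Only these names would be exposed to the analysis session.
    """
    return [
        pkg.name
        for pkg in index
        if pkg.test_coverage > min_coverage
        and pkg.monthly_downloads > min_monthly_downloads
    ]


# A made-up index with three packages.
example_index = [
    PackageRecord("pkgA", test_coverage=0.85, monthly_downloads=5000),
    PackageRecord("pkgB", test_coverage=0.40, monthly_downloads=20000),
    PackageRecord("pkgC", test_coverage=0.90, monthly_downloads=200),
]

allowed = filter_packages(example_index)
print(allowed)
```

Here `pkgB` is excluded for low test coverage and `pkgC` for low download counts, so only `pkgA` would remain available.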

Reflections on PositConf and the R Pharma Summit

That's fantastic. You're here at PositConf. What talk are you looking forward to, or have you seen so far?

Honestly, I really enjoyed the R Pharma Summit. So the summit was great, and it was my first time doing anything in person with R Pharma. I've always attended the virtual conference, but the in-person portion was great, not only because I got to see everyone's faces, but in addition, I noticed how instead of just hearing talks from people who have solutions for things, we spent a great deal of time on known issues or known topics of discussion that don't have solutions yet.

And so it was fun to get into small groups and sort of brainstorm, like, hey, how can we make some progress towards this goal over the next six months to a year? And hearing some solutions and creating some light action items to sort of start making progress in that direction.

And just like always, validation is a popular topic at the R in Pharma events. We've been talking a lot about the R Validation Hub and the work you've done in Shiny. Do you do work with other languages or other projects you've been thinking about coming down?

Honestly, no. I'm mostly an R programmer through and through. Although at the R Validation Hub, we do have our ears and eyes open to working with other open source languages in the future. But me personally, I've just been primarily R.

Getting involved with the R Validation Hub

What's the best way if an organization wants to learn more or get involved in the R Validation Hub? Are there meetings for them to attend or people to reach out to?

Yeah, so we do have community meetings every quarter. And so you can actually subscribe to those by going to our website. pharmaR.org is our website. And actually all of our products are listed there, like the white paper or all of our risk-based packages, like riskmetric, risk assessment. But you can sign up for our mailing list and we will send you calendar invites to community meetings. And that's really the best way to sort of get involved at the start.

Awesome. Well, thanks so much for sitting down with me. I have so much respect for the R Validation Hub and the work that's happened over the last six, seven years there. Thanks a lot and see you at next year's conference.

All right. The pleasure is mine. Thank you, Phil.