Open Source in Clinical Reporting | A Conversation with Ben Arancibia at GSK

Transcript

This transcript was generated automatically and may contain errors.

Ben, welcome to Posit Conference, thanks so much for coming, it's great to see you as always. Great to see you. You came in for the summit on Sunday, you and some colleagues, awesome, well I'm so glad we got a chance to connect and I really want to highlight the awesome work that you're doing, your team's working on, and just let the community know about the things you're doing.

But I think like, before we jump into that, I'd love to take a few steps back into like how it all started, like how did you get into pharma, what did you do to Bridget and of all things like Shiny too? Yeah, yeah, for sure. So how I got into pharma, it's a weird story. So I'd been a data scientist for a long time, I'd worked in consulting and different things like that, and had lived that consulting lifestyle, you know, fly out Monday morning to some place and then fly home kind of Thursday night.

So I think you and I have similar backgrounds with IBM and everything. And one day GSK, they reached out and they said, look, we're trying to build a data science capability, we'll teach you pharma, you teach us data science, and you'll be part of what's called the ESPRIT program, which is like this executive leadership training. And you do some rotations, and we'll find you a job eventually in the organization. So that was kind of my entrance into pharma.

Like what really appealed to me, why I wanted to work in pharma was, I had a lot of great tools and toolkits with data science, open source, cloud platforms, things like that. But I hadn't found kind of like that passion area, that passion industry. And a lot of my family, so like my mom, her brothers and sisters, her parents, they've all had cancer. At some point, I'm going to have cancer probably. So being able to work in an industry to actually show impact, I don't need to know the science, but I can, you know, work on the pipeline or help people move assets through our pipeline to, you know, actually have impact on those patients, really hit close to home. And that's kind of really what appealed to me and why I wanted to join the pharma org.

Bringing agile practices to GSK

It's such an awesome story of bringing data science into pharma. And the work that you were doing before was GIS. GIS, cybersecurity, yeah, just trying to solve problems. And it's amazing to see the influence that you're having in the community and at GSK on work like agile and better practices around software development. Did you bring that with you into that role?

Yeah, so when they initially hired me, the idea was for me to really focus on helping them think through like how to build our SCE, our scientific compute environment, so cloud architecture and things like that. And kind of the way it played out, it was, there's actually a big niche for like, how do people actually think about user design? How do you think about actually coming up with values within development teams to interact with users? How do you think about, okay, we built something, but just because you build something doesn't mean it's going to be adopted. So how do you actually think about putting in those support structures, but also at the same time using those support structures as feedback to improve our data products? So that was kind of the niche I found.

So eventually just kind of building out tools, you know, obviously I still spent a lot of time coding up things, working with different people, but at the same time, really leaning on support structures in order to understand what users need. Because at the end of the day, with this big transition that we're making, we have to make our users feel the least amount of friction possible in order to be able to adopt our different open source tools.

And you work with such a rockstar team. I mean, Christina Fillmore and Andy Nicholls and Ellis and Becca. What is that like to have such an amazing crew around you? I'm lucky. I mean, they make the job easy because I think what is really crucial is it's not only having individuals who are able to go out and talk to people, but it's like who can actually take that user feedback and then turn it into a reality. And I think being able to pair teams together, not only like a very solid engineering team, but also a very solid kind of user feedback support, you know, consulting or voice of the customer, if you will. Being able to merge those two together is what allows us to really make those very solid data products with such a small team. Because at the end of the day, we are not big, but we're small and agile.

The Accelerate R program

Can you talk a little bit about what was the spark for the Accelerate R team and how that came about? Yeah, sure. So Accelerate R is one of my, I guess, things I'm most proud about. I've talked about it here at Posit. I've talked about it in some of the other conferences. But I think the thing that we saw and realized is we were doing tons of workshops, tons of classes on R. And then all of a sudden, no one was using it. And we were trying to figure out for a long time, why is no one using R?

What we realized is a user or an individual within our Biostats organization, they would take a training, they would do a workshop, but they actually wouldn't get the ability to use our open source tools until 12, 18 months later on in their study. And I can't remember what I had for breakfast. No one's going to remember what to do with the tidyverse or anything 18 months down the road.

And so the spark was really thinking about not training in the sense of how do we train people but thinking about training and then learning. And I think that's a really important concept because I think training is really easy. You find some documents, you create the documents, you create whatever it is, and people can go and do it. But how does someone learn? How does someone learn while trying to deliver against deadlines? And that is a really important question to think about because we're not university students. We're not college students. We don't have the ability to spend a semester to learn something and then go apply it. We have to learn on the fly.

And starting to think about our user in that different way about how does someone actually learn within an organization is really crucial. And so that's why we started Accelerate R. We go and sit with a clinical study team. It's a very intense eight to ten weeks where we're literally training them and then we're using those eight to ten weeks to learn about what's wrong in our workflow to then build a tool during those eight to ten weeks. So it's not only us supporting people, teaching, having them learn, and then upskilling those capabilities as we make our transition, but it's also giving us the feedback that we need to figure out what are the actual tools instead of us making guesses. And I think that balance is really what's leading to a lot of success for us.

And I think it's such an impactful way to get feedback from your users, because the output of that has brought forth some really important packages like tfrmt and slushy where you identified challenges and issues that your user base had and instead of trying to find a workaround or something from the community, you said, hey, let's build this and solve it for them. Exactly. And I think the thing that is great is because of our time-boxed engagements, we have to get something out. So like slushy, I think Becca was like, all right, let's do this. And she built it maybe in like a month, because since we have that time box, we don't have long periods of time where we can go away and think about it. We have to deliver it and we have to see the impact. And being able to see the impact quickly and then continuously get that feedback again is another input for how we make really strong data products.

And I think it's something that speaks so much to me because groups will reach out to me to do a workshop and I always tell them, look, I'm happy to do a workshop for you. And I hope that it sparks further learnings, but really what's needed in-house is a competency center or some type of group that manages the change that's happening in the drug development space. Absolutely. And your team just tackled that. And I joke all the time that last year your talk was about the downside of workshops, which is so funny because I do so many workshops, right? But I do always preach the side of it needs to be part of a center that your group manages that you think about how do we take people from the commercial software into the open source side.

Exactly. And that's the other thing that we love about it is we try to eat our own dog food. So as soon as we kind of finish an iteration, we try to get some of those individuals to start contributing to the open source. So like, for example, the team that we did slushy with was an oncology team, and then they started to directly contribute to admiralonco based off some of those lessons and some of those learnings that we had in an Accelerate R iteration. So like being able to not only train, not only upskill, but also figure out a way to inspire individuals to start contributing to the open source. It just makes everything a lot more solid, if you will, based off that.

Like at the end of the day for pharma, what we want to focus on is the science. We don't get a competitive advantage from who has a better tabling package or anything. At the end of the day, we like to focus on science, and if this allows people to focus more on the science, then it's a big win.

You know, it's a topic that's come up so much in my interactions with pharma: how do we standardize things? And it seems like one of the lowest hanging fruit in that space is around TLGs and standardizing on the standard reports that they make. There's a new initiative, which was originally called Falcon, and has now moved into Cardinal, I believe. Have you seen that or been part of it? A little bit. I've seen where it's going. I think if it works for your organization, use it. I think the beauty of some of our open source communities like ASA openstatsware or pharmaverse is there's a real strong push for everyone to contribute and then you figure out what works for your organization. And I think that's kind of the beauty of these package ecosystems. Your ability to take something that is fully kind of modular and figure out how to plug it into your workflow is what's great about it.

And so to me, I think if Falcon or some of these very, not rigid, but some of these ideas on how it is that we standardize, if that works for you and your company, great. If it doesn't work for you and you want to use something else that's available out there that someone is maintaining, great. And it really depends on what it is that works best for you. I think it's a great story because pharmas are all so different in the way that they process things. And it's basically saying, here's an ecosystem of packages that you can pick off of that are reflective of the processes that you have. Exactly.

You know, one thing I've always thought with GSK is that, you know, Andy Nicholls being critical on the R Validation Hub and the R Validation Hub white paper being such a critical piece of pharma on the open source side, you know, I'm sure that must have helped get things underway for GSK. For sure. I think being able to clearly define what we mean by validation is the most crucial thing. I mentioned it in my talk a little bit earlier, but one of the things that I remember early on, back before everything was validated, is like, what does code execution mean? Does it mean a .exe file? Like is it literally an executable file or does it just mean you can run the code interactively? And I think being able to figure out and define these are the things we care about, reproducibility, traceability, the ability to say, yes, we trust this. We trust the outputs that are going to come from these tools that we're using, and that is really crucial. And being able to take that framework and then apply it internally. So we apply it in a certain way at GSK. Another company can take those ideas and apply it for their QA department, but being able to say, yes, these are the things that we care about, I think is a big step forward for us to really feel comfortable with how we deliver and how we trust what we create.

Well, I have absolutely enjoyed chatting with you. You've got a lot of the conference left. Any talks you're looking forward to attending? The thing I love about coming here is being able to see not only what other pharmas are doing, but what everyone else is doing. And I think one of the things that I learned coming into GSK is we can learn a ton from financial services. We can learn a ton from the United States Geological Survey. Being able to take lessons learned from people's individual organizations and then think about what it is that they did and kind of put our own twist on it is always a great way for us to keep innovating and pushing things forward. If we just kind of stay within our internal pharma land, we're all going to kind of say a lot of similar things. But being able to see that cross-industry problem solving is really crucial for how we continue to innovate.

I think it's fantastic. You get to mix with different people, diverse groups, and different industries here. And hopefully you take a lot back to your team. Hopefully. Yeah. Well, thank you so much. I've had a great time, and enjoy the rest of the conference. And hopefully we'll connect. Any other conferences coming up this year you're going to go to? My hope is to get to the R/Pharma conference. Unfortunately, I won't be at PHUSE EU, but I'll potentially be at PHUSE. And then, you know, wherever things pop up. And Orlando. Oh, of course. Yeah. Orlando in March after a Northeast winter, I'm ready to go there. Sign me up. Exactly. Exactly. All right. Thank you, Ben.