Resources

Johnson & Johnson's Open Source Journey with R in Clinical Trials

Learn how Johnson & Johnson embraces R and open-source tooling in clinical trials. Abstract: Embracing open source has been a transformative journey for J&J, fueled by a commitment to collaboration, transparency, and innovation. Our R journey began with a mindset shift—transitioning from decades-old solutions to actively contributing to and harnessing the power of open-source communities. We navigated the challenges of cultural adaptation while cultivating a programmer-friendly environment that promotes sharing and continuous learning. We enhanced our technical capabilities by open-sourcing key projects focused on R and strengthened our ecosystem and industry partnerships. This webinar reflects on our experiences, lessons learned, and the strategic impact of embracing R. Speakers: Sumesh Kalappurakal, Tadeusz Lewandowski, Nicholas Masel, and Mark Bynens. Learn more about Posit's work in Pharma: https://posit.co/use-cases/pharma/


Transcript

This transcript was generated automatically and may contain errors.

Hello, everybody. Welcome to this awesome Posit Johnson & Johnson webinar today. My name is Phil. I'm going to help moderate and be around for the Q&A later today in the last 10 minutes of the webinar.

I've got a really awesome session today to highlight the journey Johnson & Johnson has gone through over the last five or six years. Their impact on the open source drug development community has been quite profound. I mean, if you look at the beginnings of the pharmaverse, of R/Pharma, of the R submissions pilot group, it just goes on and on with Johnson & Johnson. You can list another five to ten groups that they've helped support, and it's really been amazing that in 2018, 2019, they said, look, we're going to focus and train our statistical programming teams, our statisticians, in open source and R.

And here we are in 2025, and they're doing submissions in R, and they're going to take you through today what that journey has been like. So first, we're going to start things off with Sumesh, who's going to talk about the leadership at Johnson & Johnson and the vision for creating the platform and the environment for doing submissions in open source. We'll also have Tad talk about the statistical programming and the data engineering side. Then we'll have Mark talk about the change management and program management that's gone in. And then we're also going to focus on the statistical programming and package development with Nick as well.

And so really excited for this webinar. We'll have a chat going. Feel free to post questions. We'll try to get to those during the webinar, and then we'll tackle a good chunk of those in the Q&A at the end. So I think we're going to kick things off with Tad.

Introduction and disclaimer

And we are happy to present our J&J open source journey in R, from idea to implementation. I will be presenting today together with Sumesh, Mark, and Nick.

So just a quick disclaimer before we go into more detail: the insights we will be sharing today are our own and come from our own personal experiences, even though we are here presenting J&J. This journey is especially reflective of our clinical and statistical programming group.

Now the next one, the introduction to the R journey, and we'll let Sumesh talk about this. So let me take over. Thanks, Tad, for covering for me. First of all, let me thank Phil and the Posit team for giving us this opportunity to share our R journey in our clinical trial operations.

I was going to start with a disclaimer, but Tad covered it for me, so thank you very much again, Tad. Again, the title says J&J open source journey in R, but this journey is more specifically about how our clinical and statistical programming group has implemented R, so it's very specific and narrow. And again, I'm Sumesh Kalappurakal. I head the technology solutions group within the clinical and statistical programming team. Our mission is all about harnessing innovative technology to support the portfolio needs.

To start with, our journey is not different from that of other colleagues who have navigated similar paths. But every journey has its unique challenges and experiences, which we will be sharing with you in this webinar.

So the first thing I want to say is that even a great idea won't move forward without support from the leadership. We are very grateful to our leaders at all levels, who were very supportive of this initiative, and also to our colleagues across various departments who have collaborated with us along the way.

A special thanks, or a special shout-out, to our IT team and statistics group for being with us on this journey together. When we embarked on this journey, we realized how valuable it is to team up with other pharma colleagues, industry experts, partners, solution providers, and also some nonprofit consortiums who have a similar vision of promoting open source, especially in R. And folks like you, being part of the open source community, have been truly inspirational for us and a game changer in keeping our motivation high.

When we talk about this adventure: we started in 2019, when we pitched this idea to our senior management with a clear vision and plan. And of course, you all know that every plan needs adjustments due to unforeseen or unknown challenges that surface when you discover something new. We ran into such challenges, and I won't say it was a smooth ride; it was a bumpy ride for us too. But we acted quickly with a one-team mindset and were able to overcome those challenges.

We have an interesting agenda, so Mark, if you want to share the agenda slide. Today we will start with Nick covering the roadmap of how we went from ideation to implementation, followed by Tad, who will be sharing how we built an open source mindset and culture, and how we leveraged existing work instead of reinventing the wheel. Then Mark will cover the best practices and lessons learned from this journey. Then Nick will come back and wrap up the presentation with some future opportunities. And finally, we will open up for some Q&A.

The R journey roadmap

And with that, I will hand over the floor to Nick to share the roadmap.

So yeah, as Sumesh mentioned, we started in 2019, but I'm going to go one year earlier. So taking us back to 2018: eggs in the U.S. were $1.49 a dozen, and we were all eagerly awaiting season eight of Game of Thrones, much to our disappointment when it did finally come out in 2019.

And some of you might not know this, though those internal to J&J might be aware: in 2018, we had our first J&J Shiny Day. Shiny Day is exactly what you would imagine it to be, right? It's a one-day conference where we really showcase Shiny capabilities. We have app developers come and show what they've done and what's possible with Shiny.

So I attended this, and it was really my first experience with a lot of R, and with Shiny as well. And I just left absolutely hooked. My head was buzzing with ideas of things we could do to apply this within our stat programming group.

So for me, this was really the precursor to joining the overall J&J stat programming journey in 2019. And this was really focused, to begin with, on exploring: what can we do with R? That group of people that left Shiny Day excited to apply these things continued on with this journey, started looking at what's available, what's out there, how we can use it, and really experimenting. This quickly led to the first Shiny applications that we released. These were more operational-type things. So we're not thinking of clinical study data at this point, but more about taking something that was done in Excel, for instance, and using a Shiny application to streamline that process.

Well, we quickly saw value here, right? And we wanted to do more and more. More use cases came out of the woodwork, and we quickly realized we don't have the people with the skills to do this. So how are we going to scale this moving forward? Training became a hot topic, and we spent a lot of time thinking about how we were going to train. We went through a train-the-trainer process, so we came out with a group of trainers, and those trainers were then out there meeting with teams, helping them learn the basics of R: base R, the tidyverse, this sort of thing.
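To give a flavor of the kind of thing such basics training covers (this is our own illustration, not J&J's actual training material, and the data here is made up), a typical first exercise contrasts base R with the tidyverse on the same task, such as summarizing a variable by group:

```r
# Made-up demographics data for a training exercise
dm <- data.frame(
  USUBJID = sprintf("SUBJ-%02d", 1:6),
  ARM     = c("Placebo", "Placebo", "Drug A", "Drug A", "Drug A", "Placebo"),
  AGE     = c(54, 61, 47, 58, 66, 50)
)

# Base R: mean age per treatment arm
mean_age <- tapply(dm$AGE, dm$ARM, mean)
mean_age
# Drug A: 57, Placebo: 55

# Tidyverse equivalent (requires dplyr):
# dm |> dplyr::group_by(ARM) |> dplyr::summarise(mean_age = mean(AGE))
```

Seeing the same result from both styles helps new R users connect what they already know to the idioms they will meet in shared code.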

So we've already got 2019 popped up on the screen here, or sorry, 2021. That led naturally into 2021: we had a lot of momentum going, and R was an initiative at this point, but it wasn't yet a flagship initiative. And really, all that means is we have many initiatives; we just give a little higher priority to the specific things that we really think are important to the business and our process.

So with that came investment and infrastructure, and Sumesh already mentioned this and I can't mention it enough: another shout-out to JJIT, and specifically Satish Murthy, for everything he has done to help here. So we had our compute environment, which was still relatively new, and we added on the R piece, and really the open source piece, but with a heavy focus on the R side of things. And as Sumesh mentioned, we said, hey, we're involved: we're working with the pharmaverse, we're working with the R Consortium, but we need more, right? We need to get more engaged. We need to understand in more detail what other people are doing and what we can learn, so maybe we can avoid mistakes that others have made and be as successful as we can.

Yeah, going to 2022, so super excited here. Again, momentum is building. We open sourced our first package, and that was a big deal. Looking back at 2018, being a package developer and open sourcing something seemed impossible, right? Like these were mythic, godlike people who were able to do this. Doing it ourselves demystified it a lot, and it was very exciting. And really, I guess the thing I'd like to share is that any of you can do this. There's nothing really wild or crazy about it. But for us, we were super excited. We open sourced our first packages, and we wanted to give our stat programmers the tools to really be successful as they began to use R. They'd had some training, but how do they now apply this to their everyday job?

And really, I guess the thing I'd like to share is that any of you can do this. There's nothing really wild or crazy about it.

So this is where the code catalogs come in. We have an R code catalog available for people to bring scripts down into their studies. And we specifically selected some early adopters to try these things out before rolling them out to all the trials. These early adopters helped us find a lot of things we needed to address and fix. So if any of you are listening: thank you, thank you, thank you for being the guinea pigs and for being patient with us as we improved our processes and tooling in this regard.
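To make the catalog idea concrete: an entry can be as small as a well-documented, reusable helper that study teams pull into their programs. The following is our own hypothetical sketch, not an actual J&J catalog script, showing the classic "n (%)" cell used in demographics tables:

```r
# Hypothetical catalog-style helper: format a count and percentage the way
# a demographics table typically displays them, e.g. "12 (40.0%)".
fmt_n_pct <- function(n, big_n, digits = 1) {
  pct <- formatC(100 * n / big_n, format = "f", digits = digits)
  paste0(n, " (", pct, "%)")
}

fmt_n_pct(12, 30)    # "12 (40.0%)"
fmt_n_pct(7, 24, 2)  # "7 (29.17%)"
```

The value of a catalog is less in any single function than in every study using the same, reviewed implementation instead of re-deriving it trial by trial.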

So up to now, we've been focusing a lot on static outputs for our stat programming team. But at the end of 2022, we also began to think about building a solid interactive framework. We had Shiny and things happening prior to this, but we were really trying to address the question: how do we maintain these Shiny apps long-term? If everyone is just writing Shiny apps however they want to write Shiny apps, how are we going to keep these things running year over year? So we started looking toward having some sort of framework, and putting guidance, guardrails, and expectations in place for how these things are developed.
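One common guardrail of that kind, shown here purely as an illustration and not as J&J's actual framework, is requiring app code to be written as namespaced Shiny modules, so pieces stay composable and maintainable across apps and years:

```r
library(shiny)

# A minimal namespaced module: every input/output ID goes through NS(),
# so the module can be reused several times in one app without ID clashes.
histogramUI <- function(id) {
  ns <- NS(id)
  tagList(
    sliderInput(ns("bins"), "Number of bins", min = 5, max = 50, value = 20),
    plotOutput(ns("hist"))
  )
}

histogramServer <- function(id, data) {
  moduleServer(id, function(input, output, session) {
    output$hist <- renderPlot(hist(data, breaks = input$bins))
  })
}

# Usage sketch:
# ui <- fluidPage(histogramUI("age"))
# server <- function(input, output, session) histogramServer("age", faithful$eruptions)
# shinyApp(ui, server)
```

A convention like this is exactly the sort of expectation a framework can enforce, so that an app written by one team remains readable and runnable for the next.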

All right, jumping into 2023, we're feeling really confident at this point. We feel like we've got static covered; we've got trial teams actually using the tools and scripts we've built in their everyday jobs. But the elephant in the room was still submissions. We were confident based on the external initiatives that were happening, what others were doing, and what we were hearing as we attended these open source working groups, but we hadn't done it ourselves yet. So that was the elephant in the room, and a big focus for us in 2023 was putting our internal processes in place to really nail down how we're going to do submissions. That started with identifying specific trials to focus our energies on for submissions with R.

As we had done previously in 2021, where we built infrastructure for the static side, we invested more in infrastructure on the interactive side within our compute environment, our GxP compute environment; again, another shout-out to JJIT. And one last thing really wrapped up maybe the last quarter of 2023, or a little more than that, and it's easily lost in all of this, right? We're talking about a lot of things over a six-year span or so.

That is, we really took a step back and looked at what we had done so far, what our strategy had been and what we had achieved, but also looked back at the industry to see what had changed openly across the industry and whether we needed to make any course corrections. So we had an exercise where we looked at what was available, and I'm very happy to say that we actually made a change. We decided to shift our strategy from developing J&J-specific tooling and frameworks to looking at the open source frameworks that were out there and available, and made the decision to move over to existing open source frameworks.

So jumping into 2024 with that decision in mind, that's exactly what we started doing. We started working on those open source packages to make sure they fit J&J's needs. We developed a new script catalog. We also had some new standards come out at that time to support the new packages we had decided to use. And the fruits of our labor paid off: all of that submission work and trial teamwork from the previous years led to our first hybrid submissions last year, in 2024. We actually had four submissions in 2024.

And by hybrid, I mean the data sets were done in SAS, and then all of the outputs, or TLGs as we call them, were produced with R, with a caveat, an asterisk: we told the teams, hey, if you run into an issue producing something in R, you can of course go back to what you had traditionally done. So we were really excited to see the elephant was no longer in the room. We completed our first hybrid submissions in 2024, and now in 2025, already March, we're working to make R, for the static side anyway, business as usual.

And to some surprise, maybe, for those in the audience, we're now exploring admiral. We focused very heavily on static and interactive results generation, not really the data, but we're now looking to take that next step earlier in our process and create data sets with R. And lastly, we said we were going to adopt those open source frameworks, and that includes the interactive frameworks; we now have our first adopters using the J&J version of those interactive frameworks.

Building an open source culture

As you can see, there is a lot that open source actually provides, and I want to take a close look at the culture of this collaboration. There are lots of opportunities here. First of all, J&J benefited from this: thousands of hours had already been spent on frameworks and other solutions that we could reuse and build on. On the other hand, there is a great opportunity to improve the existing packages and make robust solutions out of them. And I will come back to this one, because it's really important that improving on something brings a lot of added value.

And the last is really sharing expertise, because among you, the audience, there are lots of experts, and together we can work on even better solutions. But one thing I want to bring over from personalized medicine: does one size fit all? Meaning, are we, as a pharma industry, making the same efforts, but differently? What is the impact on our collaboration in this space? Because if we look at the example of the open source packages under the pharmaverse, we can try to build on this together.

And I'm here today to bring you two successful examples of this collaboration. One: thanks to our internal team, who implemented TrueType font support in RTF for R tables, something I will showcase in a minute. The other solution, which our internal team also brought as an idea and which was then implemented in the community and released less than two weeks ago, is the so-called transform module output. So let me go through those two examples.

First of all, for the static output: as I said, implementing TrueType font support might have a big impact on your organization and the way you implement things. On the left-hand side is an example of the demographics table from the TLG catalog. With our internal efforts, we made it produce the same numbers while looking different. Regardless of what the format will be, whether RTF or another format, companies may have different needs for the look and feel of those outputs. So this is an example of how our work contributed to the community.

But more exciting is actually the interactive part. Here you can see a screenshot, which maybe you've seen before, of the teal association module between two variables, race and a biomarker. Currently there are certain options you can use. But with the extension that has just been released, you can also add things without changing the module, rebuilding it as an ad hoc module, or adding completely new functionality. You can now do this, and this is just a trivial example: adding a new footnote here. It could be any other decoration to the existing module, so that it looks the way you need it in your company, or the way you'd like your stakeholders to use it.

This, of course, can go much further. In this example, it's exactly the same teal association module, where you can even change the type of the output if you like. So this is still the association between race and the biomarker.

So, of course, based on this, we might think again: does one size fit all? And I believe that with these extensions, with your work and your contributions, you can bring different new capabilities and functionalities, so that together we can build one solution that really fits all of us in the future. As an example, we can go even further: you can think about adding the Show R Code feature that will create this TrueType font output in the future.

Lessons learned and best practices

Yeah, as Nick already explained, our exciting journey has been more than just coding. It has been a voyage filled with learning and growth. And as I unwrap our lessons learned, I remember the wise words: experience is the teacher of all things. So now let's look at six key takeaways. The first, which you're already seeing, is training and knowledge development.

Foundational knowledge is crucial, and investing in comprehensive training tailored to a diverse audience ensures that everyone, from statisticians to statistical programmers, has a solid grasp of R. Also try diverse learning resources: by utilizing a mix of online courses, workshops, and dynamic coding sessions, you can cater to various learning styles, boosting engagement and understanding. I'd also really like to give a shout-out to our statistics colleagues here for helping us tailor some of those trainings.

Now, on to infrastructure and scalability. Understand that a strong infrastructure is essential for moving to R: packages are upgraded or added, and interactive use cases grow, so we really must ensure that our infrastructure can keep pace. A big shout-out to IT, especially Satish Murthy's team, for their fantastic support in helping us make the right infrastructure choices.

When it comes to standardization, remember that consistent practices create smoother collaboration. Using the same frameworks and packages, as well as standardized R code repositories, improves not only code quality but also collaboration across teams, because everyone is basically using the same building blocks and the same tools. That really helps all the teams in using R and Shiny.

And I think this is one of the most important lessons: the power of the open source community. That's also where you all really come into play. The vibrant R community offers a treasure trove of packages, resources, and support, which really accelerates learning and fosters innovation.

Now, what about change management? Well, identifying early adopters is key for broader acceptance of R. For example, when they showcase its interactive potential, like Tad already did, it really sparks interest and encourages participation across the organization. Phased implementation works best: gradual rollouts for different compounds or studies allow a smoother transition, and because you do it gradually, there is also the opportunity for iterative improvements.

And finally, risk mitigation and scaling strategies. Pilot programs are invaluable: pilot programs using R help us identify challenges early on, providing crucial lessons as we scale up. Integrating R into our daily routines is also essential for a smooth transition to what we call business as usual, with continued support and resources.

I think that wraps up the lessons learned effectively. So, now let's explore some best practices.

First, an obvious but really important one, already mentioned a couple of times, is the area of training and knowledge development. It's really important to establish a structured training program tailored to different audiences and various learning styles. This program should be designed to be updated as new tools and techniques emerge within the R ecosystem. And we also need to train our teams on the new standard ways of working with R and on the packages to be used there.

Next, let's move on to culture and engagement. It is crucial to foster a culture of curiosity and to really embrace that open source mindset. We should encourage team members to explore both static and interactive R applications while promoting a hands-on learning approach. By adopting an open source mindset, we can not only leverage the vast array of resources available in the community but also contribute back to that community, enhancing collaboration and innovation.

Now, let's talk about documentation and infrastructure. It's essential to document your infrastructure choices thoroughly: for instance, which validated packages can be found in which container. Additionally, building a knowledge-sharing platform is key. Having an R portal, for example, is a very important step toward a one-stop shop for all information related to the R journey you're on, and toward sharing that with your audience.

In terms of feedback and continuous improvement, we need to create feedback loops, because implementing feedback mechanisms really helps us refine our processes and infrastructure based on what we learn and on the needs of our teams. We should also monitor and assess how well R is being adopted, to find ways to enhance our approach. So a little bit of metrics is also very helpful.

Now, moving on to demonstrations and adoption. Providing compelling use cases that highlight the benefits of moving to R, showcasing both static and interactive projects, is really important. Creating interactive demonstrations alongside static examples can inspire broader use among the teams, illustrating how both methods can complement each other in various contexts. And the Shiny Days that Nick mentioned can also really help showcase the adoption of R.

And lastly, regarding change management: we need to plan for the change and make sure the business is ready to move to R. Having a plan in place to deal with resistance is key, I would say, and addressing any concerns proactively can ensure a smoother adoption of R.

So, to wrap up this lessons learned and best practices session, I want to stress that success really comes from ongoing effort and resilience. Every step you take gets you closer to your goals, so embrace the challenges, I would say, because they are key building blocks on your road to success. Adopting R and an open source mindset is a journey that takes time, patience, and dedication. The initial learning curve might seem a little steep, but the value of your investment is really huge.

Success really comes from ongoing efforts and resilience, so every step that you take gets you closer to your goals, so embrace the challenges, I would say, because they are key building blocks on your road to success.

So this approach not only encourages collaboration and progress for everyone involved, but also boosts your individual skills. Creating a shared environment where we build knowledge together is something that I really value and am really enthusiastic about.

Future opportunities

So there are tons of future opportunities, and when I was thinking about this, I made a quick list of ten with almost no effort, but we decided to focus on a few of them.

Really, a big one is continuing to expand external collaboration. As we've mentioned throughout the presentation, we're heavily involved in many things, but we're always looking for new areas to contribute to, whether that's us learning or us giving back. Because if you remember back to the roadmap, one of the first things was us joining openly and seeing what had already been done. We learned quite a bit from others, and we're happy to continue to share our journey through things like today, with that exact intention: to help others just as we learned from those before us.

One thing I'd like to call out as an example of things we're continuing to work toward and collaborate on is a PHUSE project that has opened up: Teal Enhancements for Cross-Industry Adoption. And this year PHUSE is doing an EU version of the CSS (Computational Science Symposium). For years they've done this in the US, but this will be the first time in the EU, and this group will be giving several presentations, doing open workshops, and really working to show the EMA what's possible with interactivity in teal, but also how this is coming together at an industry level.

Looking at the second one here, supplemental interactive submissions. I put "supplemental" very intentionally, because I think to start with, when we think about submissions, this is not necessarily going to replace things, but is a stepping stone: adding it as an additional tool to help reviewers quickly understand the story of what's been happening. So this could be standalone HTML files, or also the work we're seeing with Shiny for submissions within the R Consortium Submissions Pilot 4 working group, which is talked about a lot. If you're not familiar, I'd highly recommend checking that out if you're thinking about interactivity for submissions.

And then lastly, Gen AI continues to be a hot topic, and exciting news seems to come out daily, but we're exploring how we can bring this to stat programmers within the IDE they're working with today. That's really our focus as far as Gen AI goes, because there are a lot of directions we could go here. So we're working to build internal modules that we know will follow our J&J standards, to provide good quality responses back to our stat programmers. And we think this is really going to be useful as our stat programmers continue to grow and learn: really treating this as that personal SME that is never too busy to answer your question, where you get a response immediately. That's what we're looking for.

Really treating this as that personal SME that is never too busy to answer your question and you get that response immediately, that's what we're looking for.

And at next week's PHUSE CSS event, which I think someone called out in the chat, Sumesh will actually be representing J&J during a Gen AI for code generation panel discussion. So if you're at US Connect, go and check that out.

Q&A

Yeah, I think we have a little more time. Maybe we can open up the floor for Q&A before I give a closing remark. I see a lot of questions coming in on Slido; thanks for posting those questions.

Let me address the first one, if that's OK with everyone, on package validation. That's a very interesting topic that people repeatedly ask about: tell me more about package validation and things like that. So let me explain how we did it at J&J. And again, it's not necessarily the right way to do it, but it's the way that works for us, and you may be able to leverage it.

There is an R Consortium working group, the R Validation Hub, which has put out a guidelines document that is a really great place to start. So we started looking at that approach, and basically, as we were juggling multiple balls, we thought it would be much better to partner with one of our vendors to see how they could help us navigate that package validation process. So we built a process with the vendor and with our IT team to create a CI/CD pipeline and run the validation process through it. As we have rolled out this R environment in a containerized fashion, we decided to release two containers per year, with the validated packages included in the container. So that is basically the approach we took at this point: we work closely with one of our vendor partners to validate the packages and deliver them to us, and then we use our CI/CD pipeline through JJIT to push them into a container release.
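A simple way to picture the "validated container" idea, shown here as our own base-R sketch rather than J&J's actual pipeline, is a manifest of package versions that a CI/CD job checks the image against before release. The package names and versions below are hypothetical:

```r
# Hypothetical manifest: packages (and versions) validated for a container release.
# "stats" ships with R (its version equals the R version); "notarealpkg" is a
# deliberately missing package to show a failing check.
validated <- c(stats = as.character(getRversion()), notarealpkg = "9.9.9")

check_manifest <- function(manifest) {
  installed <- vapply(names(manifest), function(p) {
    tryCatch(as.character(utils::packageVersion(p)),
             error = function(e) NA_character_)
  }, character(1))
  data.frame(
    package   = names(manifest),
    required  = unname(manifest),
    installed = installed,
    ok        = !is.na(installed) & installed == unname(manifest),
    row.names = NULL
  )
}

report <- check_manifest(validated)
report
```

In a real pipeline this kind of check would run inside the candidate container, and the release would be blocked unless every row comes back `ok`.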

So there were a few questions. One was around package validation, which, Sumesh, I believe you already addressed nicely. There were two others. One was: has the FDA accepted the submissions already? So far there have been no findings coming back from regulators questioning the use of R, but these submissions are still ongoing. So I guess the answer is: to be determined. We're very excited and eager to hear. The other one was: there were four submissions, so what clinical areas were they in? Unfortunately, with these being ongoing and not publicly available, I don't think we're able to share that information at this time. But when they have completed and we are able to, believe me, we'll be the first to let everyone know.

So one question about SAS: how important was it to match the formatting of SAS outputs in R? Did you find yourself recreating basic versions of what SAS had done in the past? Yes, there are some challenges with matching the SAS output. So of course we tried. We also had some R code where new outputs were created. But we tried to match as much as possible.

There is also something to mention here, I think: the CAMIS project. Some of the statistical results or procedures you use are a little bit different in SAS compared to R, and that is where CAMIS really comes into play. So I really advise you to go to CAMIS to see whether there are differences between the different outputs. But we tried to match as much as possible.
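To make that point concrete, here is a small illustration of how two legitimate percentile definitions give different answers on the same data — exactly the kind of SAS-versus-R difference CAMIS documents. The example is in Python purely for demonstration: NumPy exposes the same quantile definitions, where R's default `quantile()` is Hyndman–Fan type 7 ("linear") and SAS PROC UNIVARIATE's default (PCTLDEF=5) corresponds to type 2 ("averaged_inverted_cdf").

```python
import numpy as np

x = np.arange(1, 11)  # the values 1..10

# R's default quantile definition (type 7) is NumPy's "linear" method.
r_default = np.percentile(x, 25, method="linear")

# SAS PROC UNIVARIATE's default (PCTLDEF=5) matches R's type 2,
# which NumPy calls "averaged_inverted_cdf".
sas_default = np.percentile(x, 25, method="averaged_inverted_cdf")

print(r_default, sas_default)  # 3.25 3.0 — same data, two first quartiles
```

Neither answer is wrong; they are different definitions, which is why comparing outputs procedure by procedure, as CAMIS does, matters during a migration.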

Maybe I will take the question about open source: what are the comparative advantages for a large industry to adopt it? I was hoping I could already show that, with all of our efforts, we can build one solution that serves all of us. We are all aware that there are slightly different solutions and processes, especially IT solutions, that have been developed over the years. With these sets of different solutions, we can fit them in, or replace and improve them over time. Of course, that requires some small changes and adaptation. But the core thing is that we are still using the same algorithms, the same methods and methodologies, to calculate whether a compound works or doesn't work.

All of those things are common. And looking at what is happening right now, even with the safety standards that were released — I might be guessing, but I think it would be great if we started to speak the same language, so that a safety output looks the same across the industry. Probably there are no big differences. Those are big opportunities and advantages of working together, and of moving away from what, for at least the last 25 years that I can see, has been an industry used to working in silos on its solutions. Now we can work together, pool our efforts, and build more robust solutions that might change the way we operate in the future.

So we have a Slido going with Q&A. I'd encourage everyone to jump over there and upvote questions that you would like us to address. I'm going to start tackling a few of those here with the team. We've got about 15 minutes, maybe about 12 or so, and then Sumesh is going to wrap things up.

The next one: someone's curious about the infrastructure. I suspect this is where we need Satish, but I was curious if there's anything the team wants to add about the platform and infrastructure you use internally. Yeah, I think we can touch on it from the business side, but Satish is the right person to give more details. And we are not hesitant to share that with the industry; actually, we have done so in multiple venues.

We are using AWS, actually, Amazon WorkSpaces, and we have containerized Posit Workbench and Connect to deploy the Shiny applications. As I said, it's a revolutionary way of thinking, more containerized, so that the validated packages go into that container. We have planned two container releases per year so that we can use them for regulatory submissions while keeping that open-source nature in a somewhat controlled fashion. We also have a development area where users can download packages from CRAN and do all their exploration. So that flexibility is also available within the GxP environment.

I do think it's probably worth mentioning, too, that Satish did a webinar back around 2019 talking about the infrastructure, so if you look online, you should be able to find that. Satish was also on the Data Science Hangout with Rachael, where he talked about the infrastructure as well. So there are a lot of great resources for you to find.

The next one is about validation and the perspective of working with QA. Anything additional the team would like to discuss on that topic, or was that addressed? I think I touched on validation, but specifically for the QA and TQ organizations within the company: sometimes open source is new to them when we bring it up. So we need to sit with them and educate them. What is the challenge? What does open source mean? How do you ring-fence open source within your own environment?

Again, I would give the credit to Satish. He was practically spoon-feeding them all the details they needed to understand open source, because within J&J this is one of the bigger groups to have implemented open source at scale for submissions. So we had to explain. We had to sit with the legal team, sit with QA, and walk through all of this. And they do understand it: the world is moving, the industry is moving towards it, and they support us. If there are additional gaps we need to fill, we will take care of it. So it comes down to how you educate them and how you explain it to them. If you need any support on that, feel free to reach out to us, and we can explain how we handled the situation. Every situation is unique, but you can take our approach and tailor it to your needs.

I think there's a good one here on reproducibility, and also on connecting to your clinical data stores using open source for clinical trials. Is that something you could chat about? Sure, yeah. Ensuring data integrity and reproducibility — it's back to infrastructure, really. Sumesh touched on it, but the way we're doing this is through containerization. Packages come in: users say, hey, I need package XYZ. We evaluate the risk of that package, we mitigate that risk, and we put all the testing and paperwork in place to show that the environment is operating as we expect with that software. Then we release that container, based on the quick process I just walked through. And that container is there: we can spin it back up and reproduce outputs with it at any point in time. Then you can think, OK, we released the container today; what new packages have been requested? Let's work towards the next container, and we'll have another release later in the year.

And in each run of our tables, listings, and figures, we also have a traceback of what we're using, not only from the package perspective but also from the data, the inputs. We have a nice tree we can display to really see all the assets that were needed to produce that TLF. And just to close: for reproducibility, you always need the environment, the data, and your code. Within our environment, you have all of those elements, so you can always come back and reproduce a result. Everything is validated, and can therefore also be reproduced.
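The traceability idea described above — recording, for every output, the environment and the exact inputs it was built from — can be sketched roughly as follows. This is a minimal illustration in Python (the team's actual tooling is R-based, where `sessionInfo()` and renv lockfiles play this role); the file names and manifest fields are assumptions for the sketch, not J&J's implementation.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path

def run_manifest(input_paths):
    """Fingerprint one output run: the software environment plus a
    SHA-256 hash of every input dataset, so the output can be traced
    back and reproduced later."""
    return {
        "language_version": sys.version.split()[0],
        "platform": platform.platform(),
        "inputs": {
            str(p): hashlib.sha256(Path(p).read_bytes()).hexdigest()
            for p in input_paths
        },
    }

# Hypothetical example: hash a small input dataset and store the
# manifest next to the output it describes.
Path("adsl.csv").write_text("USUBJID,AGE\n001,34\n")
manifest = run_manifest(["adsl.csv"])
Path("demo_output.manifest.json").write_text(json.dumps(manifest, indent=2))
```

Together with a frozen container image, a manifest like this gives you the three reproducibility ingredients the speaker lists: environment, data, and code.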

It says: can you comment on the use of R and OpenShift tools in the pharmacometrics space at J&J? Pharmacometrics is using R as well. For J&J at least, they have their own platform to do their work, which is also a GxP platform. The packages they use are a little more tailored towards pharmacometrics, and the platform is a little more integrated with other tools. But yes, they use R, they use open source very heavily, and they also make use of Workbench.

Next question I'm going to jump to: what tips could you provide to help someone convince their organization to jump into open source and embrace R and tools like Shiny? Yeah, I can take a stab at it. It's not one size fits all; it's based on the use cases. These tools will be helpful even when we go with Shiny and all the advanced visualization and BI tools; people still love using Excel, and we can't avoid that, right? So it is the use case that you need to bring in.

And basically, Shiny has a lot of advantages if you're working on statistical reporting and visualization. The R ecosystem has a strong statistical analysis side, so you don't need to switch to a different language. Do your back-end coding, modularize it, plug those modules into your Shiny framework, and publish it; that's a nice combination.

I also saw someone asking about Power BI, Power Apps, and the like. These are great tools, actually, don't get me wrong, but they have specific use cases where they thrive. For use cases like ours — DMC reports, safety reports, CSR reports — I would say this combination is really, really good, and this is why we are going in that direction.

One question people are upvoting is around the ROI and business outcomes you've achieved on the R journey. The ROI is yet to be fully determined, but we can already see that the return on investment in open source is huge, and there are opportunities we still need to explore and bring to the table. It's not simply that R is cheap because it is open source and free; in an enterprise setup, there is a cost and a price tag for everything to run in a GxP-compliant way. But there is definitely a benefit from the open-source community standpoint I was talking about: it's a very big community providing that support and momentum.

And, as Nick and Mark mentioned, the R Consortium is there, PHUSE is there, R/Pharma is there, the pharmaverse is there. Everyone is working towards a common goal of promoting R and supporting you. So the return on investment is definitely there, but we can't share any numbers on this call. Definitely in later calls we can come back and say where we were five years ago and where we are now, and project from that.

To piggyback off that, somebody is saying: hey, we see the importance of this, and we know that to do it we have to build a community. But how do you deal with people who are new to this, or maybe a bit anxious about embracing open source, or who have used other tools? From my experience and the feedback I've seen, the new people coming into our organization are really open to these solutions. They adapt quickly to the new tools; they don't have a history with other statistical software, and they learn really fast. So I would say adoption from the newcomers is fantastic.

We also know there are different profiles of new programmers on the market, so we need to adapt to the market as well. And slowly let them experience R, let them experience interactive Shiny applications, and see the benefits there. Have enough support for them: give them a project, a question, or a problem that needs to be solved in R, let them experiment with it, and walk through those use cases with them. That really helps. You do it together, and the community, internal as well as external, plays a big role.

And if a pharma company were new to open source, starting today the journey you tackled five, six years ago, are there areas you would suggest they start with, low-hanging fruit for getting up and running? Yeah, I think it's about partnership: don't reinvent the wheel; try to use the available resources in the open-source community and partner along with them. That will be the way to go. If we had to start again today, the approach we would take is slightly different from the one we took when we started the journey in 2019. I would jump into those partnerships, collaborate, see what the industry and the community are offering, leverage a lot of those things, and go from there.

When it comes to training, Nick or Mark and the team, what strategies have worked well? What would you recommend people focus on when training on the open-source side? Yeah, I would just say training isn't everything, right? You have to think about training and then really applying it, trying it, using it, and committing yourself to doing that. If I were going to sum it up in a sentence or two, that would be it.

Yeah, like Nick said, knowledge is one thing: training and knowing how to do it. But you also have to apply it, of course, and get enough experience that it sticks, whether in your own studies, in your daily work, or in a side problem that needs to be solved. First get the knowledge, then apply it. What we have seen work is people applying it regularly; that's how it sticks.

Closing remarks

Well, I know we're a little over time. So Sumesh, Tad, Mark, Nick, thank you so much for this amazing webinar. Sumesh, I know you want to wrap some things up, so maybe over to you to finish the webinar for today. Yeah, thanks, Phil. Again, I hope you enjoyed the webinar and the great questions, and we will respond to those questions.

And I believe you have some takeaways from this webinar. One thing I want to leave with you — you might have heard this proverb a million times: if you want to go fast, go alone; if you want to go far, go together. So my ask would be to challenge the status quo, make some bold moves, and engage in partnerships. The open-source community is behind you to support you all the way, so feel free to reach out to us. Some of us will be attending the PHUSE US Connect next week in Orlando, and we will also be attending the PHUSE CSS. And useR! is coming up in North Carolina. Please connect with us and ask your questions; we will be happy to respond. Thank you for the opportunity, Phil. Back to you.

Awesome. Thanks, everybody, for coming. Keep an eye out for a blog post where we'll answer some of the remaining questions. And with that, we'll see everybody next time, probably at a webinar this fall.