Resources

Data driven decision making in pharma | Ning Leng @ Roche-Genentech & Jing Huang @ Veracyte

We were recently joined by Ning Leng, Global Head, Data Sciences Acceleration at Roche-Genentech, and Jing Huang, SVP of Bioinformatics and Data Science at Veracyte, to chat about embedding modernized data science solutions in pharma to enable data-driven decision making.

Links mentioned in the chat:
- R Consortium R Submissions Pilot 4 documentation portal: https://rconsortium.github.io/submissions-pilot4/
- Bay Area Biotech-Pharma Statistics Workshop: https://www.bbsw.org/
- San Francisco Bay Area Chapter of the American Statistical Association: https://sites.google.com/view/sfasa-org/home?authuser=1/
- Dahshu Data Science Symposium: https://dahshu.wildapricot.org/
- DS Hangout Survey/Feedback: https://docs.google.com/forms/d/e/1FAIpQLSdqXKPCPOwC7tSLygc43gf2ezkNO4bhM4TAgjUQMeoPe_GOmg/viewform

Speaker Bios:

Ning Leng: Ning Leng is the ad-interim Global Head of the Data Science Acceleration Enabling Platform, under Roche Product Development Data Sciences. Ning joined Roche-Genentech in 2016 as a statistician and worked on both early and late phase oncology development, with a special interest in utilizing diverse data sources and advanced methodologies to generate insights for personalized healthcare. Since then she has been driving a number of internal and cross-industry projects on modernizing data science solutions in pharma. Prior to joining Roche-Genentech, Ning obtained her PhD in Statistics from the University of Wisconsin-Madison and worked at the Morgridge Institute for Research.

Jing Huang: Jing Huang holds a B.A. in Statistics and Probability from Peking University and a Ph.D. in Statistics and an M.S. in Epidemiology from Stanford University. With over 20 years in the biomedical field, her research focuses on statistical methodologies in clinical trial design, genomic analysis, and machine learning. As SVP of Bioinformatics & Data Science at Veracyte Inc., she oversees bioinformatics pipelines, algorithm development, and statistical analyses for product development. Jing has co-authored over 30 peer-reviewed articles, with more than ten thousand citations, and is a co-inventor on over 20 patent filings. She actively promotes data science through volunteer work, including as founding president of DahShu, 2024 president of BBSW, and chapter representative of the American Statistical Association San Francisco Bay Area Chapter (SFASA). In 2023, she was elected a Fellow of the American Statistical Association in recognition of her outstanding contributions to the medical research community in the field of statistics, her numerous statistical innovations in genomic tests, and her exemplary leadership and community service to the profession.

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
- Website: https://www.posit.co
- LinkedIn: https://www.linkedin.com/company/posit-software

The Hangout is a gathering place for the whole data science community, held every Thursday at 12 ET, to chat about data science leadership and the questions you're all facing. To join future data science hangouts, add it to your calendar here: https://pos.it/dsh We'd love to have you join us in the conversation live! Thanks for hanging out with us!

Aug 7, 2024
58 min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Hi everybody, welcome to the Data Science Hangout. I'm Rachel Dempsey. I lead Customer Marketing at Posit. This Hangout is our open space to hear what's going on in the world of data across different industries and connect with others facing similar things as you.

So we get together here every Thursday at the same time, same place. So if you're watching this as a recording and want to join us in the future, there's details to add it to your calendar below. And just make sure it adds it for 12 Eastern time so you can join us live. And I know from our Hangout survey, people really enjoy getting to connect with others here in this Hangout space. So if you are interested in connecting with others, I want to encourage people to say hello in the chat, maybe introduce yourself, your role or your base or if you want to share your LinkedIn, it's a nice way to make friends here in the Hangout.

We're all dedicated to keeping this the friendly and welcoming space that you all have made it and love hearing from you no matter your years of experience, titles, industry or languages that you work in. Something I sometimes forget to add here is if you're hiring, please feel free to share those roles in the chat as well. And also 100% OK if you just want to listen in today, although we love getting to hear from you live.

So there's three ways that you can ask questions or provide your own perspective. So first, you could raise your hand on Zoom and I will call on you to jump in. Second, you could put questions in the chat and just put a little asterisk next to it if you want me to read it. Maybe you're in a coffee shop or walking your dog or something. And then third, we have a Slido link, which I'm sure Curtis shared already, where you can ask questions anonymously as well.

So with that, I am so excited to introduce our two co-hosts today. So we have Jing Huang joining us, SVP of Bioinformatics and Data Science at Veracyte, as well as Ning Leng, Interim Global Head of Data Sciences Acceleration at Roche Genentech. And Ning and Jing, I'd love to have you both kick us off with having you introduce yourself a little bit and your role, but also something you do outside of work, too.

Introductions

Very nice to meet everybody. Hope you can hear me okay. My name is Ning. I currently work at Roche Genentech, based in South San Francisco, and I'm part of this huge data science organization under Roche Genentech. We have about 900 people in our organization working on clinical trial data, everything related to basically patient, human-level data: clinical trial data, real-world data, biomarker data, et cetera. Currently, I'm the Interim Global Lead of a sub-team of this large organization called Data Science Acceleration. We are about a 40-person team, and we are focusing on this transition to open-source software, to modernized solutions and infrastructure, et cetera.

And outside of work, I like outdoor activities. Maybe one fun fact: earlier this year, we did a Machu Picchu four-day, three-night hike, which was amazing.

Sure. So my name's Jing, and I work for Veracyte. Veracyte, if you don't know, is a much smaller company compared to Roche, but we are a public company. We specialize in providing diagnostic insight using genomic data. All of our models are machine learning models that use genomic data as an input and then give critical information to patients and physicians to better manage the patients. And we have about a dozen products on the market, fully regulated, and all that.

So as a senior VP, I deal with data almost everywhere, all the time: biostatistics, clinical studies, and, as I mentioned, our machine learning models are the core engine of our products. My team develops all those machine learning models and the bioinformatics that go through sequencing and all that to derive the genomic input to feed into training of those machine learning models. We also do all the experimental design and assay optimization. So literally everything, everywhere about data, we work on, we lead.

Outside work, I like outdoorsy stuff too. We just came back from a small trip to Europe: a couple of days in Vienna, a couple of days in Budapest. I also love gardening, although my passion is disproportionately higher than my skill set, so there have been a lot of dead plants along the way. But nevertheless, there are also ones that are thriving.

Really happy to be in this forum and get to know you all. One more thing I do need to say: I see it's recorded, so I do want to mention this will be my opinion, and my opinion only; it doesn't represent Veracyte or any of the non-profits I work for. Oh, last thing: we also love to work in non-profits, just like Ning. That's how we met each other. We were both leaders in BBSW, the Bay Area Biotech-Pharma Statistics Workshop. And if you're interested, we can tell you more about it. We have our annual conference coming up in October.

Data-driven decision making in pharma and diagnostics

Sure. I can kick it off. So what I realized is that data science and statistics are fundamental and crucial for product development. Everybody knows that, right? Either in drug development, like in Ning's world, or in diagnostic development in my world. However, we can do much more, right? We can help almost every step of the business, and we should, right? Nowadays, data is so readily available, from the supply chain to how to be more effective in patient claims.

So there are many more things, and a smart business decision should be one made based on data and evidence, right? And we have a unique advantage of sitting at that juncture, with access to the right data and with the skill set to derive the important data insights to help better the business. So I feel it's an opportunity, but also a wonderful responsibility, for us to make our business better and benefit the patients.

Yeah, totally. Totally agree, Jing. Maybe just to dive a little bit more into the pharma space. Exactly as Jing mentioned, I think traditionally statistics and data sets have played a very important role in the pharma industry for many years. And if we look at a traditional job description, a lot of the emphasis is on how we design a trial: once we have a successful phase one or phase two, how we design a confirmatory trial based on the scientific findings, et cetera.

But nowadays, I totally agree with Jing that there is so much data out there and so much we can do to influence our stakeholders. For example, we have manufacturing data. We have site performance data, in terms of enrollment speed and data quality. And we have competitor data, research data, and real-world data to let us know whether the control arm assumptions we had from earlier phase trials still hold in the current situation. And also for reimbursement: in some cases there is a statistically meaningful difference, but is that magnitude of improvement sufficient for reimbursement in certain countries, et cetera? So there is so much we can do by bringing different types of data, different modalities of data, together.

Accelerating data science

Ning, your job title is something around accelerating data science. So what does accelerating involve? How do you accelerate things?

Yeah, that's a very good question. So my team is really focused on data science software and data science infrastructure. In the past, reflecting on what Jing and I said, we oftentimes followed a trial in a more linear way, and there were very well-defined outputs to be generated: tables to be generated, graphs to be generated. But now, with the richness of data on one side, and with the opportunity of open-source tools, automation, and advanced technologies on the other, there is a way for us to automate many of these things. That's one thing my team is focusing on.

So if we have very good standards in a certain disease area, then we can automate a lot of things, so we don't need to do this more transactional type of analysis repeatedly. On the other side, with all those modern technologies, modern databases, and also tools and methodologies, there is a great opportunity for us to bring different data together. At Roche, we do have a big initiative trying to build this large curated data set by combining data from different molecules together, so that we have a very rich data set to help us do this reverse translation for the next generation of drug target discovery, et cetera.

Getting buy-in for data-driven strategy

I can chime in with some experience, right? So first of all, it's usually easier to do quick wins, right? Whether it's small or big, quick is important, right? And also, hopefully, at least the initial investment is small, or you can phase it, right? Otherwise, if you say, hey, I need $3 million for three years before you can see the result, at least for a small company with shallow pockets, the answer is like, oh, OK, maybe later, right? So you want small wins.

And also, you have to translate, or at least speak, in the language the leadership is more open to listening to. For example, our company went through a leadership change. Previously, the leadership had more of a science background, so convincing them of the scientific breakthrough and how that directly benefits patients was important, right? It was easier to get buy-in. But recently, our leadership has a much stronger finance and business background. The end goal is always to benefit patients, but you have to come in from operational efficiency: how quickly you can reach more patients, increase the volume of the tests, right, from that angle, to get buy-in. So I found that quick (not necessarily small) wins to build the trust, and speaking the language that the leadership is open to and more familiar with, are very important to get into the strategic space and have data insight impact.

I totally agree, Jing. I actually have a recent example where my biggest learning was basically being specific, being concrete, and identifying those quick wins that actually drive the impact. The example I want to share started maybe a year or two ago. Within Roche, we have this very strong interest in making our trials more inclusive. We realized that nowadays, because we measure more stuff, clinical trials have longer and longer inclusion criteria. And that means you have a smaller and smaller patient population. So basically, we really wanted to make our trials more inclusive, and then use data to pressure test whether this is really a concern or just a hypothesis.

So in the very beginning of the initiative, we experienced a problem someone called the nodding head problem. When we presented this concept, everybody agreed with it, but we didn't see the needle being moved. We didn't see an actual difference in those studies. People all agreed with the concept, but they didn't know what action to take. So what has been very helpful is that we provided this white-glove service to one particular study team, sat together with them, and pressure tested all the inclusion and exclusion criteria, using real-world data to really pressure test that.

And we came up with some very concrete suggestions. The next step was basically to look at all the protocols in other trials with similar molecules and see whether they included similar language. And then we showed the management team: look, we have this number of trials with exactly the same language, and there is no strong justification for including this language. What additional evidence do you want to see for us to remove this language from all those trials, and how many more patients can we potentially enroll by removing it? I found that was much more helpful than the kind of hand-waving statement saying that we all should optimize our inclusion and exclusion criteria.

Hiring for data science roles

Yeah, so because we are a huge organization, with about 900 people globally, you can imagine that we definitely have people who are more like generalists. There are people we want to be equipped to take different roles, et cetera. But we also hire very, very specialized talent: for example, imaging talent, real-world data experts, biomarker or genomic experts, et cetera. So I will say that for our hiring, for junior roles, especially when we hire master's grads, we don't really look into a specific specialty. What we look into is more curiosity, ability to learn, collaboration skills, et cetera. But for those more specialized roles, we may look into more senior people.

Yeah, I can add a little bit. It's definitely a similar theme for our company, which is much smaller. We are definitely very focused on the business need, and it can change not only from year to year but quarter to quarter. Sometimes there is a very focused need in a specific area, and we foresee that it will last at least two to three years, for example. There is a big need for assay optimization right now, and then we need bioinformaticians who are just extremely familiar with RNA sequencing assays, right? And then there are other needs in different functions where we need someone who wears all kinds of hats and is fluent across areas. Then it is less about a specific skill; there is just a general focus on data science or data engineering.

Breaking down data silos

Again, it actually echoes back to the quick win. Usually, just like Ning said, if you come up and say, oh, we need to standardize, we need to connect, people will nod their heads, but they're not going to really commit. It's like, yeah, sounds like a good idea, no reason not to play along, but whatever, right? But usually what we do, and it's out of necessity for a smaller company, is a concrete project, right? It turns out that in our company, we want to really leverage the resources and talents across sites. We went through several acquisitions in the last couple of years, and we really diversified our talents, and we realized, oh my god, there are resources and talents at this site, although the project was initiated by this legacy company.

And just enabling that talent to access that data is a concrete problem you have to solve, instead of just naming it. So basically, it's almost bottom-up. Start with a project to say, hey, see how much efficiency we can create instead of hiring an outside consultant, blah, blah, blah. We have internal talent and resources to tackle it together and break silos, and then that becomes an ecosystem. You get a lot of buy-in from cross-functional stakeholders, and then it's time to maybe start an initiative. That's how we came about it.

Yeah, Jing, I totally agree. I think starting from something concrete matters. And Jing, you and I also discussed a little bit whether it's a matter of bringing the data together or a matter of bringing the people, the data scientists, together. I resonate with that a lot, because Roche is a huge company, and it's impossible to bring all the data together. Even integrating data from clinical trials, which seems like a straightforward task, is not that easy.

On the other side, I feel like a quick win, to Jing's point, is maybe bringing the data scientists together. For example, in Roche, in pharma, we have data scientists wearing different hats. There are people who are really, really good at clinical trial design and clinical trial data analysis. There are people really good at real-world data analysis. There are people really good at genomic analysis. And sometimes we see that each person is only representing their own specialty, and then they only share their results with the stakeholders, for example, clinicians.

And then for the clinicians, it could be confusing if they see some results from real-world data, some results from clinical trial data, and some results from biomarker data. We all know that there is no 100% consistency across different data sources: all models are wrong, and no data source is perfect. So how to triangulate and navigate through those data sources is not an expertise of the clinicians, of our stakeholders. Actually, it is our expertise. It's an expertise of data science to take all those diverse results in and find an interpretation while acknowledging the limitations of each data source.

And I can maybe add one thing. It may sound mundane, but when we say "bring data together," it is very worthwhile to figure out, for each stakeholder, what that means. And you will be very surprised that it may mean very different things. One person may mean we have to standardize literally into one database. Another person may mean, oh, I just need the data to be linkable, conjoined, whatever; I don't care where they sit. It could be even looser, as Ning said: I just need different data scientists at different sites to be able to access the data so they can analyze it. So figure out what "bring data together" means to each person, and understand those differences.

And then, as a data science leader, you come in for that specific problem to say, what makes the best sense for this particular project? Right. Then you educate all the cross-functional stakeholders to say, I know you initially thought it means this, but for this project, because of ABC, we really need to do either just this or more of this, right, to enable the project to go forward. That really helps establish your technical leadership and strategic leadership, because it is very important for the cross-functional leaders to feel they have been heard. If instead we just rush into our own assumptions, they would be like, what? This is not what I thought bringing data together even means, right? So spending the time at the beginning on this understanding and alignment will benefit you tremendously down the road.

Addressing the nodding head problem

I can also chime in, right? It depends, right? If it is really just, oh, bring the awareness, bring data intelligence across functions without a need for a concrete output, nodding heads are great. Then you follow up with a quick survey: oh, 90% support, right? That just wraps the project up. It's a great success; we increased the awareness. However, if it's an urgent project, I think instead of relying on other people to be the strategic leader, we have to step up and do more than the technical things. I encourage my team members, including myself, to have action items prepared beforehand. We say, if you agree, please do this, this, this by this week, right? Or if you need to review, here's the timeframe where we can answer questions, blah, blah, blah. But we expect you to take this action by this time if you agree with the general plan.

If you have other things you need to sort out or clarify, whatever, we are here to help, right? And then you have a framework, a timeframe. Then you start nudging people: you said you're going to deliver in three weeks; that's what we decided in this meeting, followed by meeting minutes; where's the progress, right? So you set up the framework that helps make everybody accountable instead of just nodding heads. You prepare ahead of time what is required, or implied, by nodding your head.

Yeah, I totally agree. I also feel like sometimes when I talk to my team, I see some of my team members, including myself, with the perception that we will present a problem in a meeting and, magically, someone in the meeting will chime in and find a solution. But 90% of the time, it won't happen, because the people in the meeting only think about this specific problem for 30 minutes, an hour, and you have been thinking about it probably for weeks and months. And, to Jing's point, if you don't have a very clear expected outcome or expected next step for the project, no one in the meeting will come up with a better plan.

Automating workflows and managing change

Yeah, in my mind, when you talk to people about their day-to-day job, there are definitely things they enjoy doing and things they don't enjoy doing. And oftentimes, the things they don't enjoy doing are the ones we try to automate: the repetitive tasks, the things that can be standardized, et cetera. So I totally agree with you; there is a fine balance there. It's about aligning with people, making them realize that automating certain tasks allows them more time to work on the things they enjoy, and also assuring people that in the portfolio, there is a sufficient amount of work that will be more interesting and more impactful than doing a repetitive job every day.

Totally agree. I think, you know, from a personal level, make sure it's more enjoyable. And definitely in both Ning's and my case, we just have way more work than we can handle. And I think from a business level, right, there are two ways. One is the proactive, or offensive, side, right? By automating a lot of the tedious, repetitive work, we will be more innovative, right? We'll be more at the cutting edge and more competitive in our own business. And we'll be able to use our insight, instead of just the technical capability of doing repetitive work, to create not only more work for ourselves but more data insight value for the company, right?

And also from a defensive point of view: if, just coming out of fear, you say, oh, I don't want to lay off people, so let's not automate, you become obsolete. You become much less efficient compared to your competitors in the same space, right? Then the company will suffer. Versus thinking about being replaced by a more modern technology: if you're a competent data scientist, you should be able to create more value than just doing repetitive work.

Open source and FDA submissions

Sure. Yeah. So I think my example may be more toward the deliverable, not so much about the design. I know you mentioned synthetic control, et cetera. Within Roche there are also a lot of colleagues working on that, working closely with FDA on those more modernized designs and getting buy-in for them. For myself, my involvement is more in the deliverable, the actual filing of, say, a phase three trial of a product.

So basically I've been working in this space trying to enable the industry to use open-source languages, and especially the R language, for FDA submissions. And I would say it is a process. It's definitely a process. Exactly as you said, I think from, I don't know which year, maybe Mike remembers which year, I think it was about 10 years ago, FDA had this guidance saying that FDA does not require any specific statistical software for drug filing. However, in reality, every single company is using commercial software for their filings, and we don't see the needle being moved. So it's similar to the nodding head problem.

So we started the R Consortium R Submission Working Group probably four years ago. The idea was to have publicly available examples showing that sponsors can actually use open-source software to do drug filing. Similar to a previous example, we had a connection at FDA; on the FDA side, they also have people interested in open source in their group. So we identified those people, collaborated with them, and did those pilot filings in the public space, showing people how to do that, and also learned what the best practices are.

The other thing I learned in those cross-industry collaborations: it's a very good learning experience to understand what's in it for them, what's in it for FDA. Because at FDA, they have people who want to use open-source languages, but they also have limitations in their systems, et cetera. There is so much learning in those cross-industry collaborations for us to realize how we can make their lives easier when they review the application.

Yeah. So, I mean, I have to do this whenever I see Ning, because she and the R Consortium Submissions Working Group, I think, are really transforming the future of submissions to regulatory authorities. The great thing about it is that it's bipartisan: it's the industry and the regulators both looking at this together and saying, what shape do we want this to take in the future?

Maybe not to derail the topic, but I can share some learnings there. So basically we are wrapping up our pilot number three right now. And I think one learning, going back to James' point, is, again, start small. When we did pilot number one, there was so much unknown, so the scope was really, really small. For pilot number one, we decided to only do an experiment to submit four tables and graphs, and we did not do any data set submission. We wanted to test out different tools, et cetera. It was really just feasibility testing at that time. And I think that was great, so that we could wrap up pilot one in several months and get a formal FDA letter saying that this is feasible.

After that, I think we became bolder and bolder. Our pilot two was actually submitting a Shiny app, an interactive app, to FDA so that they could redeploy the app on their system to do their review. Our pilot three adds the data set component, to show that the data sets generated by the open-source language can also reproduce what the commercial language produced. And now we are looking to pilot four, where we will, again very boldly, introduce a container and also a WebR component, which will be really, really interesting. So the journey has been really nice, with this iterative learning, and I see our goals become bolder and bolder as we get to know each other better.

Yeah. I talked about this at PositConf last year: over 30 years in industry, we've gone from paper submissions to PDF submissions, which are essentially the same thing but with hypertext links. But what I think we're getting to with WebR and with the submission pilots is something utterly different, really fundamentally different. And I think that's really exciting, because it's getting into the 21st century. The pharma industry is typically known for not moving terribly quickly, but I think this is an example where we are.

Infrastructure and languages

At Roche, right now, we are finally moving to a cloud infrastructure. Previously in pharma, oftentimes we used commercial software, proprietary systems, et cetera. So it's a little bit of a black box sometimes. And you can imagine that it's really hard to add new features when new AI models come in or new languages pop up, like Julia, et cetera. It's really hard to make them work on a commercial platform or together with a commercial language.

Yeah. So we are in this process of moving to an AWS-based infrastructure. And on that infrastructure, we enable data scientists to use whatever language they find appropriate for their project: R, Python, Julia, et cetera. We are in the middle of that. This year, in Roche product development data science, we are hoping to move 90% of our active molecules to this new platform. And I hope next year, once people settle in with the platform, we will see more innovations and more automated solutions.

Yeah. We're in a different regulation space, I would say not as rigorous as drugs, and that provides us some flexibility. So R and Python have always been the building blocks, the fundamental languages we always use. And then we use specific languages to tackle, for example, sequencing, or sometimes imaging analysis. We just use whatever is the best fit and the most advanced, right? That can change from year to year as well. We are really agnostic to the language we use.

And infrastructure-wise, there is definitely a move from on-prem to cloud. Interestingly, to add to the complexity, because of our recent acquisitions, not only do we leverage different cloud platforms, there can also be regulatory constraints, like GDPR restrictions, since we acquired two European companies. What data can flow from the U.S. to Europe versus Europe to the U.S., and how to control that in AWS, can be a surprisingly complex problem. So we are definitely also on that journey.

Balancing business needs and data science innovation

I can speak to this, because our product is machine learning models. I will say that in the healthiest organizations, and I certainly believe Veracyte is one, it's never one or the other. It's a joint decision, right? Usually it's an iterative process. One example: our commercial folks will come and say, we continuously hear feedback that it would be really helpful to add a feature to our product. Or they will say it would be really helpful to add A and B and C to our product all at once. Then we'll say, hey, guess what? From a data science point of view, A needs six months of development before you can deploy. B needs about two years. C, we are uncertain about; we need a phased approach. What is the investment? Remember, finance people are in the leadership. What is the investment in each? What is the ROI? What is the uncertainty? Let's decide as a team, right?

And then, you know, that's the initial evidence we provide with our expertise, right? How long will it take to develop? How large is the investment? And then commercial comes back to say, guess what? Now that I see A, B, C, each will give us this much more revenue, or volume increase, or stickiness from the customer. And with those two pieces of information, we decide. Sometimes it's a selection: let's just do A because it's so quick. Or they will say, let's try A and B at the same time and hire more people, because it sounds so important. So it's really a dynamic, cross-functional decision for our business.

Yeah, I agree with Jing. I also feel like it's rarely A versus B, functional interest versus business need; often it has both components. At Roche, our culture actually encourages a lot of grassroots effort. We encourage people to innovate in their daily work, so there are a lot of ideas popping up. And oftentimes those ideas come from an actual study need. And to the earlier point, there are also the things people feel bored by, like the repetitive work.

And a larger question is that some of those needs are tied to only one particular project or study, while others are applicable across the whole portfolio or across different projects. So what our leadership team tries to do is still encourage innovation, but if they see an innovation that can be applied as an amplifier across a large number of projects, they give that project a push.

Taking an example from myself: when I joined Roche about eight years ago, I was a statistician working on clinical trials, doing design and clinical trial reporting. And at that time I realized there was an opportunity for us to adopt open source languages and do better code sharing, code standardization, et cetera. It was kind of a hobby project for several years, until maybe three or four years ago, when Roche decided to double down on this strategy of adopting open source tools and platforms. And then it became my day job. So there is definitely an incubation period, but at our company, we encourage all the new ideas from grassroots efforts.

Career advice

I can start. This is about communication, just to set the context. We're usually very technical people, but the audience, especially for strategic decisions, is usually not technical. Sometimes we want to be so rigorous that whatever we claim, we humbly add 20 caveats, right? Do the 20 caveats really need to be said in the meeting? So I always say, really think about the audience. It's rarely about what you're about to say or what you said; it's about what they hear, right? What do you want to get out of that conversation? Then structure your communication. Be more confident on the positive claim if there is one, and maybe put the caveats in the notes section.

You are no longer a student. You don't have to demonstrate your capability; we hired you, and that's already enough. You should demonstrate what the result means. What is the next step? How can it move the business forward? What is the impact to each of the cross-functional leaders, beyond the technical caveats and technical details? That's always the communication tip I give people who work as data scientists or in data analytics in an industry setting.

I really like that. And I know from the BBSW conference, we also talk about this on my side. Basically, to James' point, the main goal is to enable decisions, to enable actions. It's not showcasing our capability, that I can generate 200 outputs or build a really complex model. It's really about enabling the next decision.

And I remember reading a book called The Manager's Path. One section talks about how to manage your one-on-ones with your manager. Earlier in my career, I always treated my one-on-ones as reporting meetings, just a laundry list of things I did. That part really emphasized treating every one-on-one as an opportunity to ask your manager to help you with certain things, kind of like managing up: to ask them to take certain actions to help you, or to share certain context that can help with your work. I found that really insightful.

And maybe another thing to add, also learned from BBSW: I learned so much from those nonprofit organizations. It's kind of like finding a network of mentors. BBSW gave me the opportunity to meet a number of leaders in the Bay Area, and everybody had their own perspective and their own career journey, people working in different types of companies and different types of groups. And I found that the opportunity to get mentorship and coaching from a diverse group of people has been really, really helpful. Because everybody has their blind spots, and what your manager or your leadership tends to believe may be slightly different from what other leaders believe. It's really helpful to get those diverse perspectives.

Community resources

So Ning and I both work on BBSW, the Bay Area Biotech-Pharma Statistics Workshop. If you just go to bbsw.org, you'll get there. We have our exciting annual conference coming up at the end of October in Foster City in the Bay Area. If you happen to be in the Bay Area, that is, I'm biased, but I still say, one of the best conferences. Not only do we talk about cutting-edge methodologies, we now have a data science track and a statistical track, so we cover broader topics. And we also have soft-skill sessions, with an emphasis on leadership coaching. Many people come back to say that not only are the technical and soft-skill sessions helpful, it's also the best way to network. We have a lot of opportunities during breaks to network with each other and really broaden your horizons. And the food, I think, is otherworldly.

So that says something. It always helps to have yummy food when you network. That's one. We also have SFASA, the Bay Area ASA chapter; sfasa.net is another non-profit I'm working on, also with a Bay Area focus. If you are not in the Bay Area, don't worry. There's another one called dahshu.org, D-A-H-S-H-U dot org. That's a data science community, a global community. We have monthly seminars, and we have just restarted our annual in-person conference, in Michigan in May, after the pandemic. So that's another one where you can connect and contribute. If you have a topic you want to present, go for it. It's a platform for every person in the data science community.

Yeah, just on BBSW: if you are not in the area, I'm also co-hosting the BBSW meetups. We have probably three or four virtual meetups every year, and the topics vary a lot, from data science topics to soft-skill topics to disease area knowledge topics. So you're more than welcome to join the Zoom meetings. And if there are any topics you're interested in, please feel free to share them with me, and we can try to reach out to the experts in the field.

When projects don't go as planned

Maybe I can get started. I feel like the really hard problem is to stop a project when it is no longer promising or no longer relevant. If you ask me for practical advice, I would say that if you feel like something is not right, then probably something is not right; speak up and have an honest conversation with leadership to see whether you want to stop it and how to stop it. Sometimes stopping is not easy, especially for a cross-functional project. And if nothing works, my advice is to try to move to another project. Try to find a more impactful project if you really don't see value in this one.

Very true, and it echoes the earlier point about managing up. This is a perfect discussion to bring to your manager. I always tell my team, and I practice this with my manager: don't always just do the status report and give me the good news. I can read that on a Confluence page, right? Use me as a thinking partner. Bring the questions. And as a data scientist, one important but sometimes awkward question you don't want to ask is: is this still a data science question to solve? Even if everything works well on the data science side, putting the resources and time aside, say I can give you that data science answer right now. Would that solve the fundamental problem? That may help people think, because otherwise the human psychological tendency is to just keep going, not admitting the project is already beyond the point of saving. You may be asking small questions when there is a larger problem there.

Thank you all so much for joining us today, and for all the great questions for Ning and Jing. This has been awesome. I've loved this conversation, and I have so many lessons to go and think about for my own role too. I really appreciate you taking the time to join us.

Our pleasure. Thank you. Very nice to meet everybody. Yes, indeed. Let's keep in touch. Have a great rest of the day.