
GenAI and Pharma: Learning as we go | Episode 1: Copilot

In this episode, Phil and Cole discuss using Copilot in clinical trial submissions. Due to an industry-wide shift in pharma, statistical programmers are transitioning to open-source tooling in drug development. This skills gap that occurs when moving to open source is one area that can be aided with generative AI tools. In this episode, Cole and Phil show some of the features and functionality of a popular tool in the data science space, Copilot. More about open source in Pharma: https://posit.co/solutions/pharma/


Transcript

This transcript was generated automatically and may contain errors.

So today, I think what we're going to tackle over the course of a couple weeks is looking at really practical ways that people can use GenAI. And I think where we're going to start is with Copilot and taking a look at that, right?

The main thing that I'm hearing from groups and people in leadership is how do we embrace this tool to help with the migration and transition of more programmers to the open source space, and specifically in the drug development space, using R for late-stage clinical trials. And so I think there's some stuff there where this can help with some of that. It's not perfect, but I do think there's a piece to this that's pretty cool. So we're going to show some of this today on how people can use Copilot to create some Shiny apps.

How Copilot works in an IDE

I love it. All right, so why don't you open up RStudio? I think the big thing about Copilot, everybody's used to the ChatGPT interface, right? Like, where you just chat back and forth with a superhuman artificial intelligence thing. Copilot is much more recommendation-driven, but you can actually talk to it.

There's a couple of interesting things here because I think when I show people Copilot for the first time, one of the light bulbs that goes off is that the way they prompt it or interact with it is through these comments. And so that's what's kind of interesting, that comment notion. And the Q&A thing is a nice way to make it into a chatbot.

I think that context is important, right? Like, I think when I show people this for the first time, their mental map is trying to attach it to, like, the experience I've had interacting with ChatGPT in a web browser. And I think one of the big things that I would hone in on here is that we're a code-first data science company, right? Like, we believe that the work that you're going to do is impacted with code. You can take that and build workflows. It's repeatable.

So we have this pretty heavy prerequisite that you're going to be working with code, right? And to do that, like, that happens in an environment where you're programming, right? And so that's an IDE, right? Like, this integrated development environment, which is RStudio. So having a tool that's integrated into where the code-first data science happens is a huge piece of the puzzle for me. And so that's why I like the fact that you're not running two screens, where you've got ChatGPT on one screen, you're asking your questions, getting the results and copying them over. Like, this is just happening naturally as you're inside of the script, you know?

So I think like what's important to show here. And maybe it's even worth backing up. It's like you just created a Shiny app. I'm always impressed that Copilot works too, mostly on the first try. Like I get something that works.

But the cool thing I love about pairing this stuff with an IDE is the way that if like Copilot makes a silly suggestion, like let's do with ggplot2. Let's see if it makes a nice ggplot2 recommendation.

You know, that's another thing I've noticed is that like to get it to do it, you have to like interact with it in a way that's like, oh, okay, I'm going to generate the code now. But like, for instance, like the fact that ggplot2 is not loaded and not installed, the IDE is what's picking that up, right? Because like, if I try to run this, we're going to have problems. There's no package called ggplot2, but the IDE recognized that. And so like you, it's all just tools that you get them to work together, right? So the IDE is helping me install this package.
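The missing-package situation described here can also be checked from code. This is only a minimal sketch, not what the IDE does internally; the variable names `pkg` and `installed` are made up for illustration.

```r
# Assumption: "ggplot2" stands in for whatever package the generated code needs.
pkg <- "ggplot2"

# requireNamespace() returns TRUE or FALSE instead of stopping with
# "there is no package called ..." the way library() does.
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  message(pkg, " is available, version ", as.character(utils::packageVersion(pkg)))
} else {
  message(pkg, " is missing; install it with install.packages(\"", pkg, "\")")
}
```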

I think something else that's pretty cool here is when I was playing with it yesterday, and I said to create a Shiny app with ggplot2 and the diamonds data, it launched the Shiny function for the package, then it wrote the app. And then I went back to the top and it realized that I hadn't loaded ggplot2. So it prompted me to do that, which is pretty cool. So it's, like, assessing as you're coding what's happening in different sections. And so there was another section where I did something up above, it detected that, and then it put the reference to it in the next function.

That's awesome. Kind of like a linkage here that happens, which is pretty cool. But that's the key. Like if I mess up, you know, part of this, the IDE is going to pick that up. And so it's not necessarily going to give me, Copilot is not going to give me the solution necessarily, unless I go to the right line, then it does.

And so, yeah, Copilot is definitely much more of an assistant, whereas with something like ChatGPT it would be, like, hey, write this app for me. But it's nice that you can iterate on what you're doing.

Who benefits most from Copilot

Because what I usually tell people is, like, I'm under the impression that Copilot seems to be really good for intermediate to advanced users that are looking to enhance the coding that they're doing. And I think there's a great video from R/Pharma last year, where Devin Pastoor was doing a workshop and he used Copilot to write, like, a complex regular expression, right? That's a pretty good way to have Copilot assist you. And just like what you're doing now is, like, hey, you know, help generate this plot that I want to make. You know, I think if you're a beginner, it's going to be a bit tricky, especially if it spits out something that's error-prone, you know.

Yeah, exactly. That's the thing: you've got to know what to do with, kind of, its stuff.

Is it happy with me? Hey, look at that. Carat and price. Turns out as you get a higher carat diamond, it gets more expensive. Who knew? Although there are some one carat diamonds that are really expensive because of their purity. Ask me how I know.

Um, good times. So what else should we do to this app? Here's what I would do. Why don't you start over? All right. So Cole, we're going to create a Shiny app using GenAI. And let's see if we can get it to work on the first try. So nice, simple, clean instructions here. So Shiny app, ggplot2 and diamonds.
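For readers following along without the video, here is a rough sketch of the kind of app such a comment prompt could lead to. It is not the code Copilot produced in the episode: the prompt wording, the `cut` dropdown, and the object names are assumptions made for illustration.

```r
# Prompt comment of the kind typed in the episode (exact wording is an assumption):
# create a shiny app that plots the diamonds data with ggplot2

library(shiny)
library(ggplot2)

# UI: one dropdown to pick the cut, one plot output
ui <- fluidPage(
  titlePanel("Diamonds: carat vs. price"),
  selectInput("cut", "Cut", choices = levels(diamonds$cut)),
  plotOutput("scatter")
)

# Server: filter to the selected cut and draw the scatter plot
server <- function(input, output, session) {
  output$scatter <- renderPlot({
    ggplot(diamonds[diamonds$cut == input$cut, ], aes(x = carat, y = price)) +
      geom_point(alpha = 0.3)
  })
}

# Build the app object; printing it or calling runApp(app) launches it
app <- shinyApp(ui, server)
```

In an interactive session, `runApp(app)` opens the app so you can check whether it really does work on the first try.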

Oh, look, I need to install Plotly. Thank you, IDE. Let's install Plotly. We're taking a Shiny app that we made with GenAI. And we started off with ggplot2. Now we're going to use Plotly for it.

Plotly's play with ggplot and ggplotly was really brilliant. It's brilliant. One line of code. Adds so much power. I think for simplicity's sake, it's probably one of the more powerful, like, you know, visual analytics type tools that exist in R. Because ggplot2 is such an effective and simple package. And then to have another layer of, like, simplicity on top of that, that even accelerates it, it's pretty, pretty stinking cool.
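The "one line of code" is `ggplotly()` from the plotly package. A minimal sketch, assuming the same carat and price plot as before; the subsetting step and the names `small`, `p`, and `interactive_p` are illustrative choices, not something from the episode.

```r
library(ggplot2)
library(plotly)

# Take every 50th row so the interactive widget stays light (illustrative choice)
small <- diamonds[seq(1, nrow(diamonds), by = 50), ]

# An ordinary static ggplot2 scatter plot
p <- ggplot(small, aes(x = carat, y = price, colour = cut)) +
  geom_point(alpha = 0.5)

# The one line: convert the static plot into an interactive plotly widget
interactive_p <- ggplotly(p)
```

Inside a Shiny app, the matching pieces would be `plotlyOutput()` in the UI and `renderPlotly()` in the server, in place of `plotOutput()` and `renderPlot()`.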

Carson, he does good work with this. And there's a Plotly book, too.

So, like, I don't know. Like, what if now you wanted to add, like, a gt table underneath this? Would you do that?

I do love the interaction between the IDE and Copilot. Like, the IDE is more focused on my syntactical correctness, and my environment, and all that stuff. Formatting, that kind of stuff. And then Copilot brings a lot of the content, right? Like, how do you monkey with a thing? What is the syntax for creating a gt output?
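As one possible answer to the syntax question, here is a hedged sketch of a gt table built from the diamonds data. The summary columns and the name `summary_tbl` are assumptions for illustration, not what appeared on screen.

```r
library(ggplot2)  # only for the diamonds data
library(dplyr)
library(gt)

# Summarise price by cut, then hand the data frame to gt() for presentation
summary_tbl <- diamonds |>
  group_by(cut) |>
  summarise(n = n(), mean_price = mean(price), .groups = "drop") |>
  gt() |>
  tab_header(title = "Diamonds by cut") |>
  fmt_currency(columns = mean_price, currency = "USD", decimals = 0)
```

To place a table like this underneath a plot in a Shiny app, gt provides `gt_output()` for the UI and `render_gt()` for the server.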

So, somebody was asking me yesterday, if they have, like, a Shiny app, and the Shiny app is using, like, another function in another file, or if there's something else that's, like, related to that work that's being done, does Copilot go and assess those other files? Or is everything in this one file? I know that that's part of, like, yeah, that's part of the product story there. Like, how do you build the context and stuff like that? I mean, clearly, something was happening, because it literally, just a second ago, built a whole copy of my other file with all the comments removed. So, like, clearly, it has memory somewhere.

Building a Shiny dashboard with prompts

I think there's a piece of this, like, that comes up for me, where people are, like, I really want to use this, but it needs the context of, like, the other files. Yeah, it's totally confused. I mean, it's pretty confusing. It's pretty cool, though. It's building me a Shiny dashboard.

Because I think the thing that I hear from the pharmas is they're, like, we really want to innovate with Shiny. And we want our programmers that are using other tools and other languages to come in and be able to start innovating with Shiny. But they don't always have the background with Shiny. So, can this, you know, propel them into a place where they can start building apps?

I would say yes, because I literally did nothing except type tab. And this is what I got. Type tab. You know what's amazing? Type tab, type tab, type tab.

Now, for somebody coming into here, where do you think the challenges are going to be? For someone to just come in here, they don't have a background with Shiny. I think it's intimidating looking at this and being like, is this right? I don't know. And that's where you really do. You need the, and I think this will be a good plug for, like, the chat-based stuff. Because the chat-based stuff can generate this whole thing. All it, like, you don't even need to press tab, right? It's just enter once.

And there's, like, some things where you can, like, you know, copy this and give it back to the model. And basically, hey, like, how would I change this? Or change this other thing? So, you have, like, some iteration that you can do back and forth of, like, can you explain, right? Explain what is going on here?

And so, you can do things like chat with this stuff, right? That's where two tabs, one for plot, one for widgets. So, like, you can totally dig into this and understand. I think, honestly, a lot of the things, the most intimidating thing is starting.

In martial arts, our sensei often says the hardest belt to get is the white belt. Because most people will never get that. People always think black belt's the hardest or, you know, like, the fifth degree or whatever, right? But it's, like, the white belt is the one that most people never get. And so, it's the same thing, right? Like, starting is the hardest part.

But then you just need to have the good kind of, like, okay, well, what are the ways I get more information when I need it? You've always got the search engines and Stack Overflow. But now, you have this other tool in your pocket where you can just chat with all the code bases on GitHub, and the model that Microsoft trained on them.

There's, like, two things I'll add here. Like, one, I was in a workshop showing people, like, tools within AI. And this programmer came up to me who was using SAS. And she said, look, like, this has made it much more approachable to me to take that first step. Like, I've been on the fence trying to think, like, how do I learn this? Where do I get started? But now, having, like, a tool in here that helps you get started and using something that you can prompt and ask questions about is a pretty interesting piece of this puzzle.

But I would say, like, I still think, like, watching you here and working with this myself, like, this is a good tool for, like, I sort of know what I'm doing or I know what I'm doing, right? I think when I was talking to that person about using GenAI to help them take that first step, what they were referring to was working with ChatGPT and chattr inside the IDE, which is a little bit more natural to, like, the prompting of, like, a UI that someone has. And I think, like, we'll cover that in a separate video.

But I think, like, right now, like, the Copilot piece is a pretty powerful story for people to see. And if you're looking to get up and running and you do know how to program or maybe you've even programmed in another language, like, hopefully they see that this is pretty approachable and they can do it with just a couple of different comments that they add.

And I think the next step here is, like, those people need to have an internal community or, like, a competency center or a champion or somebody that then they can go to and say, look, I built this app, and it's close to what I want. Right? This is what I want now.

Getting started with Posit Cloud and Copilot

So actually, Google, if you don't mind, Posit Cloud and Copilot. Like, there's a really good article that shows, like, here's how you get up and running with Posit Cloud and Copilot. And so, like, when I've been showing people this, like, they take my workshops and then afterwards, they're like, how do I keep using this? I'm like, you can use it. Right? Like, just go create a free account on Posit Cloud. It's free. There's no cost to it. And you can learn R. And there's primers there. There's, like, examples that you can pull online. There's the R for Data Science book. And then they can just activate Copilot right inside of Posit Cloud, which is pretty cool, you know?

But I still think there needs to be a human element. Like, especially if you're a stat programmer, you've been working with SAS for 10 years, and you do build a Shiny app, you need to go to someone and say, hey, does this make sense? Or, hey, I'm getting this error. What do I do? And I think that's one reason why I like the GSK team so much, is that they have that type of support internally.

Or we have this Posit community, too. Yeah, I mean, there's a lot of really amazing stuff online. But I think, like, what you said that resonated with me so much is the whole, like, you know, the thousand mile journey begins with the first step, right? Like, you got to do that first step. And I think the more we can make things approachable, I think the more people will migrate into the world of open source, you know?

Yep, totally agree. All right. So this is Copilot. Copilot helped you create, I don't know, two or three, four Shiny apps in about five minutes. So where do you see this going, Cole? Do you think, like, if you had to guess a year from now?

Where GenAI tooling is headed

What's weird to me is years are going by so fast. Like, it feels like it was yesterday that ChatGPT, like, fell on the scene and everybody was like, what? That was a year ago. So I think it'll simultaneously go further than we think and less, not as far as we think, you know? A year from now it'll look very similar, but it'll also be probably a lot better in quality and the interfaces will continue to get better of, like, how you're interacting with it inside of an IDE, that kind of thing. But 10 years from now, I have no idea. That's a lot will change in 10 years, hopefully.

Could you see, like, a day where they prompt to create the Shiny app and then they prompt to say, now test this, right? Oh, yeah. Yeah, that's already happening today. People are totally doing that. I think the tooling is kind of janky for doing that, but there definitely are, like, those types of things. And, like, there's stuff like, yeah, write me some unit tests because I don't like writing unit tests.
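To make the "now test this" idea concrete, here is a small sketch of a unit test for Shiny server logic using `shiny::testServer()` with testthat. The `toy` data frame, the `selected` reactive, and the test description are hypothetical and stand in for whatever a generated app would contain.

```r
library(shiny)
library(testthat)

# Hypothetical toy data so the expected values are easy to verify by eye
toy <- data.frame(
  cut   = c("Ideal", "Ideal", "Fair"),
  price = c(500, 700, 300)
)

# Server logic under test: filter rows by the selected cut
server <- function(input, output, session) {
  selected <- reactive(toy[toy$cut == input$cut, ])
  output$n <- renderText(nrow(selected()))
}

# testServer() runs the server function without launching a browser
test_that("selected rows follow the cut input", {
  testServer(server, {
    session$setInputs(cut = "Ideal")
    expect_equal(nrow(selected()), 2)

    session$setInputs(cut = "Fair")
    expect_equal(selected()$price, 300)
  })
})
```

Asking a GenAI tool for tests in this shape still leaves you responsible for checking that the expectations are the right ones.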

And then what about, like, taking action? Like, do you see a world where they'll be like, okay, I just created a Shiny app. I used the GenAI to create the unit test and now I'm ready to deploy it. Do you see a day where they'll be like, hey, deploy this for me, too? Yeah. I mean, if you think about, like, we're building tools that help with deployment to Connect, right? And so it could very much be like, oh, yeah, like, we trained this model, this LLM, on your admin docs. And so it knows how to use the Posit CLI or it knows how to run the deployment in the IDE, you know, like, those types of things. That's really where I think a lot of the work is for people like us, is building those interfaces. They call it an AI, what, agent, I think. Agent, yeah. It's a phrase where, like, it can take certain actions. And so, like, you could imagine, yeah, having, like, all right, I want to deploy this now. And it's like, you know, it saves you having to click on that little blue publish button over here in the IDE. It just does it for you.

Using Quarto notebooks as learning tools

So one thing I would show, Cole, before we wrap up today is one thing that we're exploring with R/Pharma is a workshop in October that is basically taught by GenAI, right? And so one thing that you and I did for a workshop a couple of weeks ago is we created a bunch of notebooks that basically had the prompts pre-built, right?

So, like, one thing I think would be cool here, because I do think the story and a big part of, like, what I've been telling people is you can use GenAI and Copilot to create Shiny apps pretty quickly, you know? What about, like, opening up a Quarto document and in those chunks, you can have, like, a more traditional workflow? So, like, so let's say you want to create a Quarto document and go to the, like, the source version real quick.

And so let's just do something like a real basic workflow, which is typically, like, import some data, build, you know, tweak the data a bit, and then build a visualization. So, like, something that's taking steps, right? So you can have, you know, if you hit Ctrl+Alt+I, or I don't know what it is on a Mac, but you can insert some code chunks. So, like, this top one's going to be, yeah, like, get the data, right? And then the next one might be, like, okay, so we've got the data. Let's actually, like, use dplyr to edit the data, you know? That's what it's doing here. It's filtering it to ideal diamonds.
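The three-step workflow described here could look roughly like the sketch below, with each block corresponding to one code chunk in the Quarto document. The prompt comments and object names are assumptions; only the steps (get the data, filter to Ideal diamonds with dplyr, build a visualization) come from the conversation.

```r
library(ggplot2)
library(dplyr)

# Chunk 1 prompt: get the data
data("diamonds", package = "ggplot2")

# Chunk 2 prompt: use dplyr to keep only the diamonds with an Ideal cut
ideal <- diamonds |>
  filter(cut == "Ideal")

# Chunk 3 prompt: plot carat against price for the filtered data
p <- ggplot(ideal, aes(x = carat, y = price)) +
  geom_point(alpha = 0.3) +
  labs(title = "Ideal-cut diamonds: carat vs. price")
```

Keeping each step in its own chunk makes it easier to run and inspect one suggestion at a time before moving on.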

Because I think, like, I think what's cool here is that, like, oftentimes what you see is, like, a more agile method of creating Shiny apps where, like, people start off creating, like, the ggplot2 visualization. And then they create a dashboard using Quarto with that, right? And maybe they start adding Plotly. And then their users are, like, hey, that HTML-based report was really cool, but I really want to look at this geographic region, or I want to look at these, you know, patients, right? Yeah, so you get into Shiny.

But I think there's, like, also an important story here that as you're working on your data and building, like, that beginning phase, like, you can do that in a Quarto document. And the GenAI piece can help you here, too. And I could see a world where, like, what we were doing is, like, internally you build these notebooks that have these prompts kind of built in for you that people can learn from. And so I imagine on GitHub or in repositories or classes, like, you'll see a day and age probably coming pretty soon where there'll just be empty notebooks in places. Like, hey, you want to learn about my new package? Here's an empty notebook. But the prompts are in it, go run this. Right? And I think that's going to be a cool thing. And so I think it's going to be interesting to see what level of, like, engagement that provides with, like, learners, you know?

Yeah, no, exactly. Yeah. That's awesome.