
Joe Cheng - Summer is Coming: AI for R, Shiny, and Pharma
Abstract: R users tend to be skeptical of modern AI models, given our weird insistence on answers being accurate, or at least supported by the data. But I believe the time has come—or maybe it’s a little late—for even the most AI-cynical among us to push past their discomfort and get informed about what these tools are truly capable of. And key to that is moving beyond using AI-enabled apps, and towards building our own scripts, packages, and apps that make judicious use of AI. In this talk, I’ll tell you why I believe AI has more to offer the R community than just wrong answers from chat windows or mediocre code suggestions in our IDEs. I’ll also introduce brand-new tools we’re developing at Posit that put powerful AI tools within reach of every R user. And finally, I’ll show how adding some AI could make your next Shiny app dramatically more useful for your users.

Resources mentioned in the talk:
- Slides: https://jcheng5.github.io/pharma-ai-2024
- {elmer}, call LLM APIs from R: https://elmer.tidyverse.org/
- {shinychat}, chat UI component for Shiny for R: https://github.com/jcheng5/shinychat
- R/Pharma GenAI Day recordings: https://www.youtube.com/playlist?list=PLMtxz1fUYA5AYryl4t2mtqBngqWDrnMXJ

Presented at the 2024 R/Pharma Conference
Transcript
This transcript was generated automatically and may contain errors.
All right. Hello, everybody. We are so excited to have you for the first day of R/Pharma for 2024. Thank you, Harvey, for those wonderful opening remarks. And it is my sincere pleasure to be able to introduce our first keynote speaker, who honestly, for this crowd, definitely does not need an introduction. He was actually one of our very first presenters when R/Pharma began many years ago. And I can speak for many of us, I'm sure, when I say we were completely amazed by the directions that we can take Shiny in the pharmaceutical space. But now, it's even much more than that. So I'm thrilled to introduce the Posit CTO, architect of Shiny, Joe Cheng. So Joe, the floor is yours. Thank you for being here.
Thank you. Hello, everyone. I am super pleased to be here talking to you about Shiny. R/Pharma is one of my favorite conferences. And it is with sort of mixed feelings that I come to you talking not only about Shiny, but also about AI. I have a feeling that a lot of you might be a little skeptical of this topic. I myself have been quite skeptical about this topic in the past. And it's easy to be skeptical about gen AI these days. I mean, the kinds of claims that people are making about what LLMs can do are just so fantastical, not to mention the much more science-fictiony claims about what's about to happen in the next couple of years. It's all a lot to take.
And the effective accelerationist movement is, to me, pretty cringey. If this is not something you've heard of, don't Google it. You don't need to spend your time like that. And meanwhile, in the real world, what we're seeing when we use these tools is that they have all these limitations: ChatGPT is constantly telling us plausible falsehoods, and Copilot writes code that often just doesn't work or doesn't make sense. Not to mention all the AI-generated nonsense that has taken over Facebook, in comments and in posts and in images. It's all enough to ask, is this even the future that we want to be running towards?
And I have sort of changed my mind, and actually have a pretty optimistic view about the usefulness of these AI tools. In terms of the Gartner hype cycle, the plateau of productivity that's coming for LLMs, I think, is going to be pretty incredible. And I think of it like when the iPhone came out: if you just thought of it as a phone, then it's not actually that exciting. Judging LLMs by just your experience of ChatGPT is similar. What was interesting about the iPhone is that it put compute, a touchscreen, sensors, and connectivity in everyone's pocket. And it was hard to tell in the beginning what specific things the iPhone was going to be really exciting and useful for, but you knew it was going to be something.
And I sort of feel that way about LLMs, not when viewed through the lens of ChatGPT, but when viewed through the lens of their APIs. I just feel like every programmer on earth who cares to can now have access to a magic function that approximates human reasoning. It may not reason perfectly, but the kind of reasoning that it does has never been available to programmers in the past. And now it is available to all programmers. And I think that interesting things are definitely going to happen.
Demo: AI-powered Shiny dashboards
So I want to set this up by starting with a demo, showing some of the things that my team and I have been doing with this magic function in the cloud, and using that as a little bit of motivation for the next section, where we'll talk about how this actually works: as R programmers, how do we call LLMs? And then finally, we'll get a little bit into this conversation: we know these things are cool, we know that they're powerful, and hopefully I'll demonstrate that they're easy to use. What are some implications for how we can use them responsibly, and not trick our users and stakeholders with incorrect answers that are very beautifully presented?
So let me start by showing a demo. I'm going to start with a very sort of general example that has nothing to do with pharma, because it is a data set that I'm more comfortable talking about. This is just a restaurant tipping data set. I showed a version of this app earlier this year. And what you see on the right-hand side, filling out most of the page, is a very typical Shiny app, a very typical Shiny dashboard. And you have some metrics at the top, you have a view of the raw data, and then a scatter plot and a ridge plot at the bottom. And in any normal Shiny app, you would expect on the left there to be some controls, some sliders, some drop-downs that let you filter the data on the right. And in this case, we've replaced all that with a chatbot.
So this chatbot is powered by GPT-4o. And I can ask it to filter the dashboard the same way that I would want it to be filtered in a normal Shiny app. So I can say, show me only bills over $20 on Sundays. I ask the chatbot to do that filtering for me; it recognizes that I'm making a filter request and performs that filtering for me. Now, notice at the top, there's the SQL query. It says SELECT * FROM tips WHERE total_bill > 20 AND day = 'Sunday'. This is important for a couple of reasons. Number one, this is the chatbot showing me its work. I'm able to look at this SQL query and determine whether I think it is correct. And if it's correct, then I can trust it.
Because the second part of this is the only way this chatbot has the ability to alter any of the rest of this Shiny application is through this SQL query. It has no direct ability to affect any UI on the screen. And that's really important because if we were relying on this chatbot to draw all these pictures for us, to fill up this table with data, I personally in 2024, I wouldn't trust it to do that. I mean, I think that is way too much risk of hallucination and you would end up with numbers that are subtly wrong. This way, it's really only the SQL query that I have to verify looks correct. And it turns out that the best foundation models these days are quite good at writing SQL queries. And it is almost invariably right.
The final feature I want to show you of this particular dashboard is these buttons next to each of the plots. So this is a Plotly plot and it's fully interactive. But maybe I'm not used to reading these kinds of plots; I'm not sure what a loess is and I'm not sure how to interpret this trend line. So I can click this button, and what it will do is take a screenshot of the plot, send it to GPT-4o, our large language model, and ask it to explain what it sees. And if you want, you can ask follow-up questions.
So those are a couple of quick demos of just one idea for how we can use this technology together with Shiny. And the code is available in these two public repos. The second one is actually slightly simpler and easier to read than the first one. But both of these demos were written to be easy to fork. So if this technique looks interesting to you and you'd like to try applying it to your own data and visualizations, please take a look at those repos.
Using LLMs from R: Elmer and tool calling
So, I come bearing great news. I come bearing gifts in the form of packages. Number one, there's this new package called Elmer that's just a few weeks old, created by none other than Hadley Wickham. It is a package for R for working with chat APIs. The pros are it is easy. I mean, I'm going to show you in a second. I believe that it's the easiest LLM API client in any language. Or at least tied for easiest. I don't think there's anything easier out there. Secondly, it's powerful. It's designed for exactly the capabilities that we thought were really important for people in R and in Shiny, which is multi-turn conversations, streaming, async, and tool calling. And we'll talk about most of those things.
And it is compatible. You can use it with OpenAI. That's the company with the models behind ChatGPT. You can use it with Anthropic, which is the company behind Claude, Google Gemini, AWS Bedrock, and open source models as well. So, it is designed for us to be able to expand the set of models as new models and new APIs arrive.
And I just want to acknowledge that there are a bunch of other packages that are out there, including a bunch of them that are already on CRAN, that have other people's takes on working with chat APIs from R. None of them do exactly the set of things that we needed in order to make this work for Shiny, especially the async part was a huge pain to implement and was very, very important to us.
So, here's how you use Elmer. To get started, the most difficult part is that if you're using one of these commercial models, you need an API key. If you're using OpenAI, for example, you need an OpenAI API key and for this environment variable to be set. Doing so is pretty easy: you just go to the OpenAI site and create a key. You do need to pay for it, but I think $5 will easily get you started for, like, a week of experimentation. It's surprisingly cheap when you start programming against these things.
So, you need to get that key and assign it as an environment variable, probably in a .Renviron file. There are instructions on the Elmer site for how to do this part. But once you've done that, you can simply call chat_openai to start a conversation using an OpenAI model. In this case, we're using GPT-4o, which is their best general-purpose model. We can also provide a system prompt, which is sort of the ground rules that describe how you want this chatbot to behave in this session. And in this case, we want it to be terse; don't go on and on like ChatGPT usually does.
So, the first question I ask it using this chat method is: when was the R language created? And it says the R language was created in 1993. And then I can call chat again, and that actually continues the conversation. This chat object automatically keeps track of the previous questions and answers. So this follow-up question I'm asking, who created it?, would not make sense if Elmer were not keeping track of the previous part of the conversation. But it is, so it knows that I'm referring to who created R, and it says Ross Ihaka and Robert Gentleman.
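Put together, the flow just described looks roughly like this. This is a sketch based on elmer's documented API (which may shift while the package is pre-CRAN), and it assumes OPENAI_API_KEY is set in your .Renviron:

```r
# A minimal sketch of the elmer workflow described above.
library(elmer)

chat <- chat_openai(
  model = "gpt-4o",
  system_prompt = "You are a terse assistant."
)

chat$chat("When was the R language created?")
#> The R language was created in 1993.

# The chat object keeps the conversation history, so a follow-up
# question can refer back to earlier turns:
chat$chat("Who created it?")
#> Ross Ihaka and Robert Gentleman.
```

The outputs shown are the ones from the talk; your exact wording will vary from run to run, since the model is not deterministic.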
The next thing that Elmer does for you is, I think, one of the lesser-known things that LLMs can do, but definitely the most exciting to me: a feature called tool calling. What tool calling lets you do is give new capabilities to an LLM by writing functions, in our case R functions. LLMs are just models sitting on a server somewhere. They don't have the ability to execute code. They don't have the ability to access the internet. They don't even know what the current time is, or what kind of server they're running on. They are just static models, trained on data up to October 2023 in the case of GPT-4o. So by adding our own tools, our R functions, we can make them a lot more powerful and give them capabilities.
So the way we do that is we write R functions and then we expose them to the LLM. And once we've done that, then when we're speaking with the LLM, the LLM can decide to call these functions with whatever arguments it chooses. And then whatever the response is that's returned, it can integrate that new information into its responses if it wants to.
So just to give you an example of that, one thing these chat APIs definitely cannot do is tell you the current weather. So the first thing we want to do is create a function in R that can get the current weather. And there's an open-meteo package that does a really awesome job of this. So we'll write a friendly little wrapper around it and put some roxygen documentation in there. So get_current_weather takes a latitude and longitude and returns the current weather in JSON form. Secondly, now that we have that function called get_current_weather, we need to tell our chat object about it. So once again, I'm creating an OpenAI chat using GPT-4o, and then we're calling the register_tool method and telling it about this function.
Now that we've given this tool to the chatbot, we can ask it, what's the weather at Fenway Park (which is a baseball stadium in Boston)? Under the covers it does whatever it needs to do, but it comes up with an answer and replies to us. And we can peek under the covers a little bit by taking that chat object and printing it to see what happened. So the user asked, what's the weather? The assistant called get_current_weather with the latitude and longitude, because the LLM happens to be quite excellent at taking place names and turning them into lat/longs.
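The tool-calling flow described above can be sketched like this. The weather function here is a stub for illustration (the real demo wraps a weather API such as Open-Meteo); the tool() and type_number() helpers follow elmer's documented pattern, but treat the details as illustrative:

```r
library(elmer)

# Stub for illustration; the real version would call a weather API
# and return the current conditions as JSON.
get_current_weather <- function(latitude, longitude) {
  '{"temperature_c": 12, "conditions": "cloudy"}'
}

chat <- chat_openai(model = "gpt-4o")

# Describe the function to the model: a description plus a typed
# description of each argument.
chat$register_tool(tool(
  get_current_weather,
  "Gets the current weather for a given location.",
  latitude = type_number("Latitude in decimal degrees."),
  longitude = type_number("Longitude in decimal degrees.")
))

# The model turns the place name into a lat/long, decides to call
# the tool, and folds the result into its reply.
chat$chat("What's the weather at Fenway Park?")
```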
Now, that's a very simple example, but there are all sorts of things we can do once we have the ability to add tools. What we just saw was fetching data: you could have them search the web, or call an API like a weather API, and give them all sorts of access to information that they don't normally have. You could have LLMs perform calculations. LLMs are extremely bad at math for various reasons, and by giving them tools, you could have them write R or Python code to execute. When you use chatgpt.com, it's actually doing that for you quite routinely.
LLMs could also have tools to call other gen AI models. For example, you could hook up an image generation model like Stable Diffusion or DALL-E and make it available to a chatbot using tools. And you can also provide tools that let LLMs take actions. In the context of a Shiny app, you can let an LLM modify a reactive value. That's what happened in my sidebot demo: I was having it update a reactive value holding the current SQL query that we want to apply. Or you could have it navigate to a page or a tab, or all sorts of things you could imagine a user doing in a Shiny app, you can let an LLM do.
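In a Shiny server function, letting the model drive a reactive value takes very little code. This is a hypothetical sketch in the spirit of the sidebot demo, not its actual source; the function and prompt are made up for illustration:

```r
library(shiny)
library(elmer)

server <- function(input, output, session) {
  # The only piece of app state the model is allowed to touch:
  current_query <- reactiveVal("SELECT * FROM tips")

  update_dashboard_query <- function(query) {
    current_query(query)  # downstream outputs react to this change
    "ok"
  }

  chat <- chat_openai(
    model = "gpt-4o",
    system_prompt = "Translate filter requests into SELECT-only SQL over the `tips` table."
  )

  chat$register_tool(tool(
    update_dashboard_query,
    "Applies a SQL SELECT query that filters the dashboard.",
    query = type_string("A DuckDB SELECT statement.")
  ))

  # Every plot and table reads current_query(), so a single tool
  # call re-filters the whole dashboard.
}
```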
shinychat and putting it all together
Now there's the question of how do you take an Elmer chat and bring it into Shiny and create a chat interface? And the answer is this shinychat package. Neither Elmer nor shinychat are currently on CRAN, but they both will be within the next couple of weeks. But for now you can install it from GitHub. So shinychat provides a very simple chat bot UI for Shiny for R. And it is designed to work really well with Elmer, but also hopefully to work with any other chat clients that might emerge in the future that people might want to use with Shiny.
I'm not going to go too deep into this, but just to show you the order of magnitude of effort it takes to create a chatbot: this is a complete example of a Shiny chatbot. You have your UI saying, here's where I want the chat to appear. You create your chat object, just like I've demonstrated on the last few slides. In this case, the system prompt is, you're a trickster who answers in riddles. And then any time the user submits input, we ask the chat object to start streaming the answer, and we use this chat_append function to send it into the UI.
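Spelled out, that complete example looks roughly like this, adapted from the pattern shinychat's README describes (assumes the GitHub versions of both packages and an OpenAI API key in your environment):

```r
library(shiny)
library(shinychat)
library(elmer)

ui <- bslib::page_fillable(
  chat_ui("chat")  # here's where I want the chat to appear
)

server <- function(input, output, session) {
  chat <- chat_openai(
    system_prompt = "You're a trickster who answers in riddles."
  )

  # Whenever the user submits a message, stream the model's reply
  # back into the chat UI as it is generated.
  observeEvent(input$chat_user_input, {
    stream <- chat$stream_async(input$chat_user_input)
    chat_append("chat", stream)
  })
}

shinyApp(ui, server)
```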
Putting this all together, I've now shown between Elmer and tool calling and shinychat, these are all the pieces that you need in order to implement something like those side bot demos from the beginning. So you create a Shiny interface using shinychat, you converse using Elmer, and then you register tools to control parts of your Shiny app.
Using LLMs responsibly
Okay. I'm really running low on time, so I'm going to fly through this, probably the most important part: using LLMs responsibly. Also, I don't come from a pharma background, so I don't have a lot of pharma-specific answers here. But I do want to talk about some of the general ways we've been thinking about these problems. So there are a lot of dangers. The most obvious: incorrect and unverifiable answers. These things are really uneven at reasoning. Their math and coding capabilities are not bad, not good, uneven, which is maybe the most dangerous of all. The results are not interpretable, because these are giant black boxes, or at least you have to go through a lot of effort to make them somewhat interpretable. They're definitely not reproducible. There is a seed you can set on OpenAI, but no seed will guarantee that in a couple of years you'll get the same result from the API. You absolutely will not. And there are data security and privacy issues.
So incorrect and unverifiable answers are the most important. And right off the bat, it's okay to say no. It's okay not to use AI if you need your answers to be deterministic and correct. But if there are cases where AI is worth the slight uncertainty, then design your AI workflows to keep a human in the loop if the answers need to be correct. Give the user the ability to inspect not just the answers, but also the method used to get the answers, like you saw with the SQL in my Shiny example. If possible, use software to verify the LLM's answers. For example, when LLMs generate code, sometimes there are syntax errors. You can use software to detect syntax errors very easily and send those back to the LLM, and often it can auto-recover very nicely.
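As a concrete instance of "use software to verify": R's own parser can catch syntax errors in model-generated code before anything runs, and the error message can be fed back to the model for another attempt. A minimal helper, as a sketch:

```r
# Returns NULL if the code parses cleanly, or the parser's error
# message if it doesn't (suitable for sending back to the LLM).
check_r_syntax <- function(code) {
  tryCatch({
    parse(text = code)
    NULL
  }, error = function(e) conditionMessage(e))
}

check_r_syntax("mean(x, na.rm = TRUE)")  # NULL: parses fine
check_r_syntax("mean(x, na.rm = TRUE")   # returns a parse error message
```

Parsing only catches syntax errors, of course; semantic mistakes still need a human (or a test suite) in the loop.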
And identify use cases with squishy answers: use cases where you want to use software, but the answers are not inherently very quantitative or objective. Creating summaries, or certain kinds of sentiment analysis, feel like really excellent uses for LLMs for the most part. And finally, we want to always be clear with the user that AI is being used and that the answers might not be correct. As tempting as it is to just put up our AI features and pretend they're perfect, we should really be clear with users that they should be paying attention and checking the work.
The last one I really want to talk about is data security and privacy. And this matters for every large corporation for sure. Every company, including Posit, is loath to send queries that might contain our proprietary code, our private data, or any other kind of trade secret to these startups that we don't really have legal agreements with, and whose terms of use are kind of unclear and can change over time. So I just want to point out that there are a couple of alternatives. Number one, the most obvious one, is open models: models that you can deploy on any server of your choosing, or on your MacBook, like the Llama family of models. A lot of people think of this first when they think of data security, but I will warn you that I have found these models so far to be a lot less smart, especially for the kinds of things we've been trying to do on the Shiny team.
However, I think maybe the most hopeful answer is that AWS and Azure have hosted models, hosted Anthropic and OpenAI models, including their latest and greatest. And they do have guarantees that, you know, if you're trusting them with your databases already, if you're trusting AWS with your S3 data, with your Postgres data, with your, you know, what have you, then there's really probably no reason why your corporate IT or legal should have any problem with using AWS hosted LLMs as well.
Okay. Thank you. That is what I have. And here are the links if you want to learn more about Elmer and shinychat.
Q&A
Well, thank you so much, Joe. And rest assured, we budgeted time for this. So we got a lot of questions that I'd like to pick your brain on as we go. And certainly feel free to send them in the chat. I'm going to talk about a few of them here. I've been keeping track. I got a little notepad here. So we'll fire away here.
First, Ian Wallace is asking, would you trust it to generate, say, a Reactable output or a ggplot2 output in, say, a Shiny app, if it was showing you transparently the function and the code it was using to do it, much like you were showing with the SQL queries in the demonstration apps? What are your takes on that? Yes. Really good question. So I think there's a couple of answers to that. I would probably trust it to get that code correct 80% to 90% of the time with Claude. Actually, if you're limiting it to a single output, or you know specifically the kind of output that you want, maybe even 95% to 100% of the time with Claude 3.5 Sonnet.
That being said, if the people who are visiting your app are not people you trust, if they're people from outside your organization, you are having a potentially malicious person write a prompt that's going to go to an LLM that's going to return code that you are then going to execute on your server. So that is the reason why my application was so specifically targeting SQL. And SQL itself is not completely without risk. But it is far, far, far lower risk than, hey, please write me some R and Python code.
So I do think that is something that a lot of startups, a lot of companies are doing. But when you do that, you do have to be a little bit careful that you either trust all of your users or that you run your R or Python in some kind of more sandboxed environment so you don't have the risk of, you know, malicious code doing nasty stuff.
That being said, I think what I'm more excited about is having an LLM generate not Reactable code directly, but some kind of data expression, some kind of declarative DSL describing what kind of Reactable it wants. For example, I don't want it to write ggplot code; I want it to write a Vega specification, because the chances of doing damage are much lower. That being said, in that particular case we tried it and it doesn't work very well. But we're working on it. We really want it to work. And you could definitely imagine that Reactable and gt could have versions of their functions that say, please give me a JSON blob, instead of writing raw code.
There was a related question on that from Leonard, going back to the SQL query aspect of it. Would you suspect that a malicious user would be able to manipulate these AIs into producing, say, SQL injections? Or do you think that's an issue that's not really relevant here? Yeah, that's a good question. For this particular example, we're actually loading the data into memory as a data frame and then using DuckDB. So, number one, I think I marked it read-only, so DuckDB will actually not let you inject any changes.
And secondly, maybe the most interesting part of these Shiny apps is not the R code, but the prompt. And here I think I tell it, please only do SELECT queries. I personally have not been able to trick it into doing a non-SELECT query, and I also have not been able to get it to call any dangerous functions in the SELECT query. So it might be possible. And I think if you were doing something with really sensitive data, or against a live database, you definitely want to do your own due diligence on this. And of course, always use database credentials with the least privileges you can get away with for that situation. So in this situation, even if you're connecting to a live database, you should definitely use read-only credentials.
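One way to implement the in-memory, read-only setup Joe describes, sketched with {DBI} and {duckdb}; the SELECT-only check here is a naive illustration of defense in depth, not a complete defense, and tips_df is assumed to be your data frame:

```r
library(DBI)
library(duckdb)

con <- dbConnect(duckdb())
# Registering a data frame exposes it as a read-only virtual table;
# the model's SQL can query it but cannot modify it.
duckdb_register(con, "tips", tips_df)

run_model_sql <- function(con, sql) {
  # Belt and suspenders: refuse anything that isn't a SELECT statement.
  stopifnot(grepl("^\\s*SELECT\\b", sql, ignore.case = TRUE))
  dbGetQuery(con, sql)
}

run_model_sql(con, "SELECT * FROM tips WHERE total_bill > 20")
```

Against a live database, the same idea applies at the credential level: connect with a read-only account so the database itself enforces the restriction.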
Yes, you do have the ability to programmatically get the chat history, and sometimes that's quite important for some scenarios. If you're writing a Shiny app and you want to, say, generate an R Markdown report that reproduces all the analysis you see on the screen, and you also want to include the conversation that led to this set of inputs, that's not a feature that's built into Elmer. But all you have to do is take that Elmer object and say, give me the messages, and then you decide how you want to transmit that to the person on the other end.
Yeah, Claude 3.5 Sonnet is my go-to, and supposedly it's gotten even better in the last few days; they just released an update. I will say I have not used Google Gemini in anger, but I use GPT-4o and Claude a lot, and for coding tasks, I always go to Claude first.
Well, I will say, for people who are not able to use those services because of your very good corporate IT security reasons, there are a lot of people at Posit working on commercial products who can't send any of their code to those models. So there have been people using local models with Ollama (O-L-L-A-M-A), which makes running local models pretty much approximately as easy as Docker Desktop. You really can get started in about five minutes, and people have had some success running Llama 3.2 locally with various plugins for their IDEs. So you can definitely do it. Claude's better, but you can totally do it.
If you haven't heard of Shiny Assistant, this is our wrapper around Claude that was designed specifically to help you. There's a blog post about it. It's specifically designed to, number one, answer questions about Shiny, but number two, actually write Shiny apps for you. You can do it for Shiny for R or Python, and ask one of these questions. It doesn't have to be one of those questions; it can be any question you want. Not only will it answer, but it has this integrated Shinylive environment, and you can go back and forth collaborating with the model. It often makes mistakes, and some people's mentality is, this thing made a mistake, I'm out. And other people are like, this thing made a mistake, I wonder if I can get it to actually do it right. And I gotta say, people who take the latter approach have been remarkably successful doing super cool things with Shiny Assistant. I definitely encourage you to give that a try.
So, yeah, we're probably going to wrap up real quick here, Joe. But one last parting word from you. For those that are still very new to this space, and I imagine a lot of our audience are, do you have recommended resources that people can go to, on top of, of course, the packages like elmer and shinychat, to really feel like they're getting up and running with these key concepts? Yeah, I think we are working on, or Hadley is working on, more introductory material for Elmer right now. So I think you can expect a vignette literally within the next few days.
But what's there already is, I think, pretty helpful. And the other one I would recommend, unfortunately, is in Python. Jeremy Howard did a keynote talk at posit::conf, I think, two years ago. Oh, no. 2023, maybe? Anyway, Jeremy Howard, posit::conf keynote. And I think he did a really excellent job of just distilling down the information. Instead of talking about the theory and the math, it was: if you're a hacker, if you're just someone who wants to write code and wants to know how to interact with LLMs, he really did a good job of boiling down how they work programmatically, and how tool calling works, and all those things in a little more detail than I was able to go into today. Excellent. Well, on behalf of everybody here, we thank you so much for this enlightening and inspiring presentation. So Joe, thank you so much for presenting to us at R/Pharma today. Thank you.

