Resources

AI and Shiny | Keynote ShinyConf 2025

ShinyConf #ShinyConf2025 This is a talk from ShinyConf 2025: https://www.shinyconf.com Abstract: As LLMs continue to get more and more capable, you might find yourself wanting to build web applications that make use of LLMs. Shiny is a great tool for doing this. I will talk about some of the tools we've developed at Posit to help you build applications that use LLMs, and I'll show you how simple it is to get started building your first AI chat application with Shiny. On the flip side, LLMs are great tools for helping to build Shiny applications, and I will show the AI-driven tools we've been working on at Posit that can help you build Shiny applications more quickly and with more fun.

Jul 10, 2025
59 min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Welcome back. We're closing the day with a keynote from someone whose work is behind many of the tools we use every day. Winston Chang is a principal software engineer at Posit. He's the creator of R packages like R6, profvis, shinytest, and fastmap, and a contributor to key projects including Shiny, ggplot2, and devtools. He's also the author of the R Graphics Cookbook and leads development efforts on Shiny for Python and Shinylive. You've probably noticed I've skipped a few names I always struggle to pronounce. Winston, it's great to have you with us. The floor is yours.

All right. Thank you very much, Paweł. Okay, let me see if I can share my screen here. Okay, great. Thanks for having me. I'm going to be talking about Shiny and AI today. I just want to start off by saying this is a very exciting time. I've been working with computers and coding for some time now, and there have been periods where I've been excited about projects and other periods where I've been kind of bored with things. But AI and large language models are the most interesting and exciting technologies I've worked with. It's been really fascinating to learn what they can do, and it's been fun to use AI tools to build things. So at Posit, we've been putting a lot of energy into AI in the last year. For the Shiny team in particular, AI has been our main focus, because we want our users to be able to build applications that use AI. We recognize how important AI is right now, and that importance is only going to grow as time goes on.

Okay. So here's the plan for this talk. First, I'm going to talk about the tools we have for building Shiny applications that use AI. Then I want to talk about how LLM conversations work; understanding this is important for getting the most out of your AI tools. Then I'm going to cover AI assistants for building Shiny applications and, for fun, using AI assistants to build Shiny applications that use AI.

Tools for building AI-powered Shiny apps

Okay. So for building Shiny applications that use AI, there are really two parts of the software that you need to know about. There's the back end: the libraries that communicate with the LLMs. And there's the front end: the chat UIs for Shiny. For the back end, there are many options for LLM libraries in various programming languages, but in R, the one we recommend is ellmer. Development on ellmer was started last fall by Hadley Wickham. In the R world, there are not that many options for libraries that talk to LLMs, and Hadley has some strong ideas about how APIs should look, and they're usually pretty good ideas. So he started working on ellmer, and that's what we recommend people use in Shiny to talk to LLMs like OpenAI's and Anthropic's.

On the Python side, we've developed chatlas. In general, there's a lot of development activity happening around AI in Python, so there are several options for LLM communication libraries there. But none of them were quite at the right level of abstraction for what we wanted. Some of them are pretty low-level, like the libraries that OpenAI and Anthropic have put out, and then there are very high-level ones like LangChain. We didn't feel those had quite the right level of abstraction for what we wanted to do and for what we wanted our users to be able to do. So we started working on chatlas. Carson Sievert on the Shiny team has been leading development on it, and the API for chatlas is heavily inspired by ellmer, so there are quite a few similarities between the two. We think these are at a good level of abstraction and simplicity for people to use and to do interesting things with.

Okay. Now for the chat UIs. These are things we've added to Shiny, or in R, in a separate package. In R, there's the shinychat package, and in Python, we've put these components directly into the Shiny package as `ui.Chat`. So this is what it looks like: a pretty typical chat interface. We're constantly working on making it better; this is a very active area of development for us.

I'm not going to talk about these in depth, because there have been several talks about them already. Carson Sievert gave a keynote this morning (for me, a very, very early morning) about these packages and building UIs with them, and I believe David Rapp also talked about some of these things. But I just want you to know they exist and what their names are, so you can look them up and learn more about them.

How LLM conversations work

So now the next thing I want to cover is how LLM conversations work. I think this is very important if you want to get the most out of the AI tools that you build. Okay, so let's start with how the communication works. When you want to ask an LLM a question, you make an HTTP request with a prompt to an LLM server. That might be at OpenAI or Anthropic, or maybe you're running something locally. That server sends the text to the LLM, and then it responds with the answer. Now, one thing you might find surprising is that the server is stateless: it doesn't remember anything about the conversation when you send something to it later. And that is surprising. How can you have a conversation with someone that has no memory during the conversation of what you're talking about? How does that work? I will say that this does have a lot of useful properties, though. It makes it much easier to program against an LLM, because you don't have to manage synchronizing state between the client and the server. Every time you send a request, it's just a new request; the server doesn't have to know anything about the past.

Okay. So let's look at some code. This is what a chat request looks like, using the curl command-line tool. Of course, you can do the same thing in R or Python, but I'm using curl here because it's relatively simple. You send a request to this web API, along with some headers, and then this is the interesting part: the payload. You tell it what model you want (in this case, OpenAI's GPT-4), and then you give it these messages. The first message is the system prompt; here, we're saying "You are a terse assistant." The system prompt, in general, is an instruction from the application author that the user typically doesn't see, and it guides the LLM's behavior. So there's that part that's behind the scenes for the user, and then there's the user's message, which in this case is "Write a haiku about data science."
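The request described above can be sketched with only Python's standard library. The field names follow OpenAI's chat completions API; the API key is a placeholder, and nothing is actually sent here.

```python
import json

# OpenAI-style chat completions payload: a hidden system prompt plus the
# user's visible message. The server keeps no state between requests.
payload = {
    "model": "gpt-4",
    "messages": [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "Write a haiku about data science."},
    ],
}

# This is the JSON body that would be POSTed to the chat completions
# endpoint, along with an "Authorization: Bearer <API_KEY>" header.
body = json.dumps(payload)
```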

All right. So that gets sent to the server with the LLM, and when the LLM finishes computing its response, it sends back a message like this. The interesting part here says the message role is "assistant", with this content. (I've reformatted it, so it's technically broken JSON, but it looks nicer.) "Numbers whisper truth. / Insights bloom like dawn." Okay, that's a really nice haiku.

Now, let's say we want to continue the conversation and take another turn. So the user, or rather the application, sends another message to the LLM. We start off with the same two messages we sent last time, then we add in the assistant's response, which is the haiku, and then the user's next query: "Now make it funny." So we're sending the entire message history to the LLM. When it gets that, it'll respond with something like this: message role "assistant", content "Charts and graphs all day. / AI thinks cats are croissants. / Blame the intern's code." So that is how it works. Every time you make a request to the LLM, you have to tell it what you said and what it said in the past. And that's how the server can be stateless.
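That turn-taking can be sketched as plain list manipulation: one growing `messages` list on the client side, resent in full on every request (the haiku text is abbreviated here).

```python
# The full history is resent on every request; nothing lives on the server.
messages = [
    {"role": "system", "content": "You are a terse assistant."},
    {"role": "user", "content": "Write a haiku about data science."},
]

def take_turn(messages, assistant_reply, next_user_message):
    """Append the assistant's last reply and the user's next message,
    returning the complete history to send in the next request."""
    messages.append({"role": "assistant", "content": assistant_reply})
    messages.append({"role": "user", "content": next_user_message})
    return messages

take_turn(
    messages,
    "Numbers whisper truth. / Insights bloom like dawn.",
    "Now make it funny.",
)
# messages now holds all four turns: system, user, assistant, user.
```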

All right. So to recap: a conversation with an LLM is not a conversation between two stateful beings. If you're talking to another person, that's two stateful beings talking to each other; you both remember what was said, and you each have some mental state that persists through the conversation. With an LLM, you have a mental state that persists, but the LLM does not. Every time you send a new message to the LLM, you're also including the entire previous conversation, and the LLM responds with a completion that makes sense given the prior context. That's how these LLMs work.

Now, the chat interfaces you see in the ChatGPT web application or Claude provide the illusion of a conversation with a stateful being, but that's not really what's happening behind the scenes.

Demo: 20 questions chatbot

All right. Now I'm going to show you a little demo that makes this really clear. This is a little AI chatbot I wrote with Shiny to play 20 questions. The chatbot says it will think of an object, and I'll ask yes-or-no questions trying to figure out what the object is. So I'll use the voice recognition on my computer. "Is the object found in a house?" Yes, it's found in a house. Great. "Is the object found in a kitchen?" "Is it a utensil?" Yes, it's a utensil. I'm doing great here. "Is it a spoon?" "Is it a fork?" Yes, it's a fork. Congratulations on guessing. Would you like to play again?

Now, at every step in that conversation, I'm sending a new request to the LLM and giving it the whole message history. Well, there might have been some state at the very beginning (who knows what's happening inside the LLM) where it had some idea of what it was thinking about, but that was not necessarily the same LLM instance I was talking to later. I added a little thing here that recorded the communication. Here's the first message I sent: "Is the object found in a house?" It said yes, and the model here was GPT-4o. Then I asked, "Is the object found in the kitchen?" and this model is Claude 3.5 Sonnet. So I actually sent a request to a different LLM with the same continuing message history. For my next question, I sent it to yet another LLM, GPT-4o mini, and it gave me a response that made sense. And yet another LLM for the next step. I kept doing that, just rotating through these four LLMs, and we got what seems like a perfectly reasonable conversation. But it's not like a conversation with a person, where the person has something in mind that persists through the whole conversation. Every step of the way, the LLM is sort of making it up.
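Since the only state is the client-side history, each turn really can go to a different model. A minimal sketch of the rotation in the demo: the model names are illustrative, and `ask` only simulates the request rather than calling any real API.

```python
from itertools import cycle

# Hypothetical rotation of models; the demo rotated through four real
# hosted models in the same way.
models = cycle(["gpt-4o", "claude-3.5-sonnet", "gpt-4o-mini", "some-other-model"])

history = [{"role": "system", "content": "Let's play 20 questions."}]

def ask(question):
    """Send the full history to whichever model is next in the rotation."""
    model = next(models)
    history.append({"role": "user", "content": question})
    # A real implementation would POST {"model": model, "messages": history}
    # to that provider's API and append the assistant's reply to history.
    return model

first = ask("Is the object found in a house?")
second = ask("Is the object found in the kitchen?")
# Each question goes to a different model, yet the conversation stays
# coherent, because every model sees the same complete history.
```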


Tool calls and prompt engineering

All right. So I hope that really drives the point home. Okay, the next concept that's important to know about is tools for LLMs. LLMs themselves are just text-in, text-out machines; that's really all they can do by themselves. It's the stuff built around them that can do things, and one of the mechanisms we can use is tool calls, also known as function calls. These tools are a way for LLMs to interact with the world.

The application, the app that you write, sends the user's request to the LLM along with a list of tools. The tools look like functions: get weather for a location, fetch a URL, list files, control lights. So along with the user's request, your application tells the LLM, "Here are some things you can do, functions that are available to you." Let's take a quick look at a request that has tools. We send the user message: "If it's cloudy in Minneapolis (that's where I live), turn on the lights." And we send along these two tools. This one's called get weather; here's its description, and these are its parameters, like location. This is how the LLM knows what the tool can do and how to invoke it. We also give it this other one, control lights, with a description saying it controls the lights in the user's home, and a state that is either on or off.
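The two tools could be described to the model roughly like this, in the JSON-schema shape that OpenAI-style function calling uses. The exact wrapper keys vary by provider, so treat this as an illustration rather than an exact payload.

```python
# Tool descriptions the application sends alongside the user's request.
# Each one names a function, describes it, and declares its parameters.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City and region"},
            },
            "required": ["location"],
        },
    },
    {
        "name": "control_lights",
        "description": "Control the lights in the user's home.",
        "parameters": {
            "type": "object",
            "properties": {
                "state": {"type": "string", "enum": ["on", "off"]},
            },
            "required": ["state"],
        },
    },
]

request = {
    "messages": [
        {"role": "user",
         "content": "If it's cloudy in Minneapolis, turn on the lights."}
    ],
    "tools": tools,
}
```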

Now, when the LLM receives that request, it'll probably want to find out the weather in Minneapolis. If it decides that's a useful thing to do, it'll reply with a tool call: it wants to call get weather, passing in the location "Minneapolis, Minnesota, USA". So, to recap the steps: the application sends the user's request along with a list of tools, and the LLM responds with a tool call if it thinks one is appropriate. If the application (this is your R or Python code) receives a tool call, it executes the function. So there's some function you've written to fetch the weather from a weather API. The application sends the result back to the LLM along with the message history, and the LLM responds with either text or another tool call. In this case, it'll find out it's cloudy today, so it'll probably make another tool call to turn on the lights. The application is responsible for actually doing those things, for actually turning on the lights. And this cycle repeats as long as there are any tool calls remaining.
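That request/execute/respond cycle can be sketched as a loop. Everything here is a placeholder: `call_llm` stands in for the HTTP request to the model, and the tool functions fake their side effects.

```python
def get_weather(location):
    # Placeholder: a real tool would call a weather API.
    return "cloudy"

def control_lights(state):
    # Placeholder: a real tool would talk to a smart-home hub.
    return f"lights {state}"

TOOLS = {"get_weather": get_weather, "control_lights": control_lights}

def run_conversation(history, call_llm):
    """Keep executing tool calls and sending results back until the
    LLM responds with plain text instead of another tool call."""
    while True:
        reply = call_llm(history)      # placeholder for an HTTP request
        history.append(reply)
        if reply["role"] != "tool_call":
            return reply["content"]    # final text answer
        result = TOOLS[reply["name"]](**reply["arguments"])
        history.append({"role": "tool_result", "content": result})

# Scripted stand-in for the LLM: it first asks for the weather, then for
# the lights, then answers in plain text.
script = iter([
    {"role": "tool_call", "name": "get_weather",
     "arguments": {"location": "Minneapolis, Minnesota, USA"}},
    {"role": "tool_call", "name": "control_lights",
     "arguments": {"state": "on"}},
    {"role": "assistant",
     "content": "It's cloudy, so I turned the lights on."},
])
answer = run_conversation(
    [{"role": "user", "content": "If it's cloudy in Minneapolis, turn on the lights."}],
    lambda history: next(script),
)
```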

All right. So that is how LLMs can interact with the world through tool calls. This is supported by both chatlas and ellmer.

Okay. Now, prompt engineering. I'm just going to give a really brief overview here. By prompt engineering, we usually mean modifying the system prompt for the LLM. One of the things we can do with this is customize the behavior. You can tell it: "Speak like a pirate," or "If the user asks data questions, provide R code," or "Extract a recipe from text and return it as JSON." Those are the kinds of things you might do in your system prompt to customize its behavior.

But you can also teach the LLM new things. By "teaching," I just mean for that one response; it's not really learning things long-term. One way to teach LLMs is something called prompt stuffing, which is basically just adding documents to the system prompt.
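Prompt stuffing is just string concatenation: paste the document into the system prompt ahead of any user messages. A sketch, with a short stand-in string where the real README text would go:

```python
# Stand-in for a real document; in practice you would read the package's
# README and vignettes from disk or fetch them from the repository.
readme = "ellmer is an R package for chatting with large language models..."

system_prompt = (
    "You are an assistant that helps write code for the ellmer package.\n\n"
    "Here is the README for tidyverse/ellmer:\n\n"
    + readme
)

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Create a simple chat app."},
]
# Every request includes the stuffed document, so the model can answer
# questions about a package that postdates its training data.
```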

I'm going to show you one way to do this with ellmer Assistant. When we first started using ellmer internally, it was new for everyone, and we wanted to make it easier for users to use ellmer and to create chat applications, either with Shiny or without. So Joe Cheng wrote this app called ellmer Assistant: "I'm ellmer Assistant. I'm here to answer questions about ellmer and shinychat, or generate code for you." So let's say: "Create a simple chat app for the console." This will be a chat app without Shiny. It's emitting some code and some text, showing us how to do this. Now, the interesting thing is that the LLMs themselves, from OpenAI and Anthropic, don't know about ellmer; their training data stopped before ellmer was released, even for the most recent models. And yet we have a chatbot here that knows about ellmer and is helping us write ellmer code. How did we do that? Well, the ellmer README is basically copied and pasted into the Assistant's system prompt. Let's take a look at what that system prompt looks like. It starts: "You are an assistant that helps write code for ellmer, our package for interacting with LLM APIs." There's more text here, but then down here is the README for tidyverse/ellmer, copied and pasted in. There's a lot of stuff here; I think some of the package's vignettes are in here too. So there's quite a bit of information, and all of it is sent to the LLM when the user makes a request. That is how it's able to answer the user's questions about ellmer.

And because the documentation is good, it helps the LLM give good responses. So that's another reason to write good documentation: it's not just for other humans, it's also for LLMs, so they can ingest it and then help people write code.

Another example of this is brand.yml. Garrick gave a talk about this today, I believe. He has created a prompt for brand.yml specifically for LLMs. This one's actually quite a bit shorter than the ellmer one. To use it, you just copy this code block and paste it into the chat interface, or use it as a system prompt. It contains the information about how to use brand.yml. This is really great, because if you're building tools that need to know about brand.yml, all you have to do is copy and paste this in, and it's actually not too big. So that's how you can teach LLMs more information than they know from their regular training.

Okay. So those are some of the concepts I think are very important for working with these LLMs, and I hope that helps demystify them a bit. For me personally, it also helps me view them as less scary. They're just tools for helping us do things: very interesting, very powerful tools, but not magic. It's not like there's some robot sitting there thinking about how to dominate the world. When you feed it some text, it's going to give you some text back. Although the things you can do with just that are actually quite impressive, so that is something interesting to think about.

AI tools for building Shiny applications

All right. Next up, let's talk about AI tools for building Shiny applications.

Now, I want to say that I think Shiny is at the right level of abstraction for AI-assisted coding of data applications. What do I mean by that? What is the right level of abstraction? Well, even if the AI is generating the code, that code still has to be checked by a human to make sure it's doing the right thing. The LLM tools for generating JavaScript and React code are really good; they can do very impressive things. But even so, if you're being at all responsible, you have to be able to verify that the code works. And if your goal is not to create the flashiest, most customized interface, but to get a data project done, then all of that is sort of noise, because you'd still have to verify it all works. That's something extra you would have to learn: if you're using AI to write that kind of code, you still have to understand JavaScript and React and all that, and vet all that code, but that's not necessarily the important thing you're trying to get done. So for data scientists working on data applications, Shiny is a good balance of readability, so you can verify what the AI has done, and flexibility, so you can do a lot of the interesting things you want to do.

Okay. So let's take a look at the AI coding landscape. There are the web chat interfaces, which I'm sure everyone here has used, like ChatGPT and Claude on the web. If you're using those to help you write code, it's a bit of a manual process: you might have to copy and paste your code to get it in there, and if it generates some code, you have to copy and paste it to get it out. So those web interfaces are very useful, but getting stuff in and out of them is somewhat manual.

There are also command-line tools that you use in a terminal, like Aider and Claude Code. With those, you tell them what you want, and they'll go look at all your source code and make a bunch of modifications. Those are some pretty interesting tools. Then there are coding assistants in code editors. Probably the most well-known of these is in VS Code, which these days ships with GitHub Copilot. I was going to say I think you need a subscription, but I do believe there's also a free tier for using Copilot. The RStudio IDE can also use Copilot if you have a Copilot subscription. And there are a bunch of VS Code extensions, so you can see a lot of this is happening around VS Code: extensions like Continue, Cline, and Codeium (a lot of things that start with a C), and forks of VS Code like Cursor and Windsurf. Those forks are basically derivatives of VS Code itself, created because the VS Code extension mechanisms were not flexible or powerful enough for the deeper AI integration they wanted. And also, possibly, Positron at some point in the future, which I can't say too much about, but I thought I'd mention it.

Now, what are some of the limitations of these tools, specifically for working with Shiny? Well, one important thing is that current LLMs are not good at Shiny for Python. Part of the reason is that there's just not that much training data for Shiny for Python. R Shiny has been around for over a decade, so when these companies scrape the web, there are a lot of examples they can learn from, but not so much for Shiny for Python.

They also don't know the latest features and libraries. These LLMs don't know about ellmer and chatlas and the Shiny chat UI components. Typically, when new LLMs are announced, their training data has a cutoff about six months earlier. The companies scrape the web to gather a huge amount of data, then they start training the models, and that training takes time; during that time, they're not feeding more data into the models. And even if Shiny had new features released, say, two months before the training data cutoff, the LLMs might know something about them, but that's still not enough time for a lot of training data to accumulate, because people would have just started using those new features, so there wouldn't be much code out there to use as training data.

Shiny Assistant

So really, these limitations are more about the LLMs than about the tools built on top of them. But they're important limitations. So last summer, we were looking at this and thought, okay, maybe we can build something that works better for Shiny. Out of that came Shiny Assistant. I don't know if everyone here has seen it, but this is Shiny Assistant. It's a chatbot, but it's also a little more than a chatbot. Let's have it write some R code. I can ask it to create a hello world app, and it will move over to the side and open up Shinylive. Shinylive is a version of Shiny where both the server and the client portions run in the browser; in this case, R is actually running in the web browser, not on a server somewhere. Okay, so here it created a hello world app, but this text is not being escaped properly. We can go through and tell it to fix some of these things, and tell it that it's displaying this in a funny way. So this is what Shiny Assistant can do.

And we can ask it to iterate on the application it created. So what are the parts of Shiny Assistant? Well, the most critical component is having a good LLM model. Right now that's Claude 3.7 Sonnet. When we first built this last summer, it was Claude 3.5 Sonnet, and that was the first LLM model that was good enough to have a decent chance of generating decent Shiny code. That was a big step up when it came out.

The next really important component is a customized system prompt with information about Shiny. As I mentioned earlier, these models by themselves don't know very much about Shiny for Python, and last summer they knew even less; a lot of the time, they'd say Shiny for Python doesn't exist. So we had to customize our system prompt with information about Shiny for R and Shiny for Python. It's also integrated with Shinylive, which is this whole area here on the right, under the blue header. Shinylive is really nice for this because it has a code editor built in, so when the LLM generates code, you can go and edit it yourself. A lot of code generation tools generate code that you can't edit, so if you want to make even the slightest change, like changing a color, you have to ask the LLM to do it, when sometimes it's just more efficient to do it yourself. So it comes with a code editor, it runs the application, and all of that happens in the browser sandbox. That's all very efficient, and it's safe: the browser sandbox is something that's been battle-tested for security at a level that would be very, very hard for any other kind of system to replicate. People are using web browsers all the time and trying to hack them, so the browser sandbox is very secure.

Now, there are some limitations to Shiny Assistant. First among these: it still doesn't write perfect Shiny code. If you use Shiny Assistant, I'm sure you've experienced this; we just saw it now, where it didn't escape those HTML tags and just stuck them in there. Oftentimes you can ask it to fix something and it will, but sometimes it won't. That's something that perhaps newer LLMs will do better, and perhaps we can improve by working on the prompts we give them. There are also data privacy concerns. Right now, all of the code and data in a Shinylive app is sent to Anthropic's API when you ask it to make changes to the code. If I asked it to do something right now, it would send this whole app's code to the Anthropic API. It currently sends all of the files in here (you can have multiple files) to Anthropic. We could improve that and change that behavior, but for simplicity, that is how it works right now.

There are also some limitations because Shinylive runs in the browser. I was telling you about the advantages of that before, but there are limitations too. First, it can't access local resources like files and databases. If you have data on disk, Shinylive applications are not able to access it. You can copy and paste the generated code into your regular code editor, save it on disk, and run it like a normal Shiny app to get around this, but then you're taken out of that really quick iteration cycle with Shiny Assistant. It's also a little awkward to modify existing apps, because you have to copy and paste your code into Shiny Assistant's code editor for it to know what the code is, and then copy and paste your code back out. There are also restrictions on network requests, and this one is particularly important for AI applications, because we can't use the LLM libraries that make network requests. In the Python world, OpenAI and Anthropic have libraries for talking to their LLMs, and in R, ellmer does that directly. Although it's possible for a browser to send an HTTP request to those LLM servers, these packages import networking libraries that can't run in the browser, so we can't use them to talk to LLMs.

For that reason, Shiny Assistant doesn't know about the Shiny chat developments. We could hypothetically update the prompt to know about the Shiny chat components, but we haven't, because it would just be disappointing: it would generate that code for people, and then it would not work in the browser when they're using Shiny Assistant on the web. I also need to mention Appsilon's Flow. Peter Strzanko gave a talk about this today, which unfortunately I was not able to watch, since I was working on some last-minute touches to these slides. I know they have their own way of dealing with some of these issues, but I'm not able to comment on it intelligently. I thought it was important to mention it here, though; I've seen some of what they've posted about it on LinkedIn, and it looks really interesting.

Shiny Assistant for VS Code

Okay. So, that was Shiny Assistant for the web. Now, we've also been working on something called Shiny Assistant for VS Code and Positron, with Positron support coming soon. All right, let me give you a little demo of this.

Okay. So, this is what it looks like when you're using it. Shiny Assistant for VS Code actually works inside Copilot. So if you use Copilot for VS Code, you open up your chat panel here and type @shiny, and let's ask it to build an app for us: create a simple AI chat. Okay, let's tell it which language we want; let's do R. Oops. Oh, yes, sorry, that is a bug in the Shiny extension that was recently introduced, and I will fix it soon. Okay. So now it's asking us where we want to put this application. Do we want to put it in this chat-app subdirectory? Sure.

Okay. So, it gives us some instructions about what we need to do. We need to install some packages; I actually already have those installed. It tells us to create a .env file with API keys in it; I've already done that as well. And then it generates the code. Let me move this a little wider here. Now, after it generates the code, you can view the changes as a diff. If it all looks good, and since it's creating the file from scratch, it's fine to apply the changes. Then, let me make this a little smaller, and we can run the application by hitting that play button.

Oh, you know what? I think it's missing something. Well, let's see if it can fix this. This is something that happens. I knew I was taking a risk by doing a live demo here, but let's give this a try. I hope this works. I'll just paste that error message in there; that's one way you can work on fixing issues with AI. Yep, it correctly diagnosed the problem: we needed to add the bslib package. All right. These are the changes, and I'll just apply them and run it.

Okay. Oh, man. So, it cannot find the env var. Oh, sorry, I didn't read the instructions carefully enough. I actually have to copy this .env file to that directory. So, copy, paste. All right, third time's the charm, please. Oh, my gosh.

All right, here we go. Okay, let's see. I know the solution is here, but I'm going to see if it can figure this out as well. Okay, so it's telling us where to get the API key, and then, okay, I guess I have to give it a little hint here. So this is it figuring out what it needs to do: it has to load this .env file. Okay, so there we go. This is our AI chat app. A little more work than I was hoping for, but it's here. So, let's see: tell me a very short story. Okay, so this is working. It is talking to GPT-4o from OpenAI.
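As an aside, the .env pattern the assistant keeps pointing at is simple: the file holds KEY=VALUE pairs, and a loader puts them into environment variables at startup. Packages like python-dotenv handle this for you; here is a minimal hand-rolled sketch of the core idea in Python, with the file contents and key name made up purely for illustration:

```python
import os
import tempfile

def load_dotenv(path):
    """Minimal .env loader: put KEY=VALUE lines into os.environ,
    without overwriting variables that are already set."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # skip blank lines and comments
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo with a throwaway file; the key name and value are fake
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("# secrets\nEXAMPLE_API_KEY=sk-example-not-a-real-key\n")
    env_path = f.name

load_dotenv(env_path)
print(os.environ["EXAMPLE_API_KEY"])
```

Real loaders also handle quoting and variable interpolation; this sketch only shows why the file has to sit where the app process can find it.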

All right. So, you can build these AI chat apps this way, and you can also ask it to do some customization. So let's talk to it: make it speak like a pirate and play the game 20 questions. I also want the system prompt to be loaded from a Markdown file on disk. All right, so it's going to think. Now it's creating a system prompt for us. That's actually really nice; sometimes it can be hard to get started creating a system prompt. So I'll just apply those changes. There's the prompt. It's also making some changes to our app.R file. Okay, let's apply those changes too.

All right. Wow, it's even colored this for me. That's kind of cool. So now you can play 20 questions with this. I'm not going to do that right now, but I want to mention that once you have it in this state, with the system prompt in a separate file, I can modify it by inserting documentation about other software projects. So if I wanted to teach it about some new software packages I'm working on, I could just copy and paste them in here and tell it to answer questions about that. And you can also see there's not that much code here. A bunch of this is just styling, so the actual code for the chatbot itself is quite simple. You can create your own custom chatbots this way if you want to teach them about topics they don't know about. All right, back to the slides.
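Keeping the system prompt in a separate Markdown file works because, under the hood, a chat is just a list of messages with the system prompt first. A rough Python sketch of that assembly step (the function name and file layout are my own for illustration, not Shiny's or ellmer's API):

```python
import tempfile
from pathlib import Path

def build_messages(system_prompt_path, history, user_input):
    """Assemble the message list sent to a chat LLM API:
    system prompt first, then prior turns, then the new user message."""
    system_prompt = Path(system_prompt_path).read_text()
    return (
        [{"role": "system", "content": system_prompt}]
        + history
        + [{"role": "user", "content": user_input}]
    )

# The prompt lives in a file you can edit without touching the app code
prompt_file = Path(tempfile.mkdtemp()) / "prompt.md"
prompt_file.write_text("Speak like a pirate and play 20 questions.")

messages = build_messages(prompt_file, history=[], user_input="Hi!")
print(messages[0]["role"], "->", messages[0]["content"])
```

Because the prompt is read from disk on each assembly, you can paste new documentation into the Markdown file and the chatbot picks it up without any code changes.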

Okay. Now, to install it, here's how it works. In VS Code, go to the Extensions panel: click on this icon for extensions, type in Shiny up here, click on the Shiny entry that pops up (it should be at the top of the list), and then click Install. It should install the latest version of the Shiny extension for you. It does require a GitHub Copilot subscription. Then, to use it, make sure Copilot is in Ask mode; that's a very new feature they just added to the interface within the last week. And then type @shiny and ask it to do things.

Okay. So, how is it different from the web version of Shiny Assistant? Well, it uses a GitHub Copilot subscription; that's a very important difference, and I know many people have one already. One thing that comes along with that is that Copilot has an intellectual property indemnity provided by GitHub and Microsoft, which is important for many organizations. The Shiny app can use local resources like files and data on disk, and it can use network resources. The Assistant can modify existing apps, and it knows about new Shiny features like AI chat and brand.yml. Comparing it to other coding assistants: it's installed as a VS Code extension like many of them, and it integrates with Copilot; you just type @shiny. You don't need a separate subscription or API keys for OpenAI or Anthropic. The system prompt is customized with Shiny knowledge; that's what makes it work, and it knows about the latest Shiny developments. But it's also not as agentic as some AI coding assistants, which can go out and read a whole bunch of files and change a whole bunch of things. Shiny Assistant can do a little of that, but not as much as the tools built for that specific purpose.

Okay. All right. So, that's it. You know, we've talked about writing Shiny apps that use AI, using AI tools to build Shiny applications, and how these conversations with LLMs work. So, I hope you all learned a lot from this and are, you know, excited to use Shiny with AI. And thanks for listening.

Q&A

Thank you very much, Winston. We have plenty of questions from the audience. We might lose some context, so I kindly ask the question authors, if we struggle to answer, to put some more context in the chat. Let's start with Mateo: does this increase the token cost, because each request compounds? Yes, that is definitely the case. Each time you send a request, you're sending a larger and larger payload to the server. That's just how these APIs work, and it's unavoidable. There are techniques people use, though. For example, you might ask an LLM: hey, summarize the previous steps in the conversation for future use. The LLM can do that, and then you use that summarized version later on.
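The compounding is easy to see with a toy model: because the full history is resent on every turn, the total tokens transmitted grow roughly quadratically with the number of turns. A quick Python illustration:

```python
def total_tokens_sent(turn_sizes):
    """Each request resends the entire history so far, so the total
    tokens transmitted is a sum of growing prefixes."""
    total = 0
    history = 0
    for size in turn_sizes:
        history += size   # this turn's new tokens join the history
        total += history  # the whole history goes over the wire again
    return total

# Ten turns of 100 new tokens each: 100 + 200 + ... + 1000 = 5500 tokens
# billed in total, even though only 1000 tokens of new text were written.
print(total_tokens_sent([100] * 10))  # 5500
```

This is why the summarization trick mentioned above helps: replacing a long prefix with a short summary shrinks every subsequent payload.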

Okay, extending on this question: if we have this big payload in the system prompt, is it also included in the total token cost? Yes, that definitely counts as well. But Anthropic has a prompt caching mechanism where you can cache prompts, like a conversation up to a certain point. If it sees another conversation that starts with that same content, it'll draw it from the cache, and it costs, I think, about a tenth of what it normally would. OpenAI does some caching as well; their mechanism is a little different. But yes, if you have a very large system prompt and you keep sending it over and over again, the cost can add up.
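As a back-of-the-envelope sketch of why caching matters, here is a toy cost model where a cached prefix is billed at one tenth of the normal input rate, the discount mentioned above. The price used is a made-up example; check your provider's pricing page for real numbers:

```python
def request_cost(prefix_tokens, new_tokens, price_per_token,
                 cache_discount=0.1, cache_hit=True):
    """Toy cost model: on a cache hit, the cached prefix is billed at a
    fraction (here one tenth) of the normal input-token price."""
    prefix_rate = price_per_token * (cache_discount if cache_hit else 1.0)
    return prefix_tokens * prefix_rate + new_tokens * price_per_token

# Hypothetical pricing: $2.50 per million input tokens, with a
# 50,000-token cached system prompt and 500 new tokens per request.
price = 2.50 / 1_000_000
print(round(request_cost(50_000, 500, price, cache_hit=False), 5))  # no cache
print(round(request_cost(50_000, 500, price, cache_hit=True), 5))   # cache hit
```

With these numbers the cached request costs roughly a tenth of the uncached one, which is why large, stable system prompts are a good fit for caching.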

Okay. Zuhai was asking: is there a limit to how far back into the conversation it can look? Yeah. The state-of-the-art models these days have a limit of about 128,000 tokens for OpenAI and 200,000 tokens for Anthropic. And a token is, typically, a part of a word; it translates to maybe around four or five characters on average. So there's a limit to how far back you can go. And even if all of your information fits in that context, the LLMs are not perfect at recalling it. The Google Gemini models, especially, have context lengths of one or two million tokens, and my understanding is that when you get up to that size, recall of the material in those prompts is not always very good.
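A common rule of thumb, consistent with the four-to-five-character estimate above, is roughly four characters per token for English text; the real count depends on the model's tokenizer. A quick estimator sketch (the function names and default limit are illustrative):

```python
def estimate_tokens(text, chars_per_token=4):
    """Very rough token estimate from character count; the true number
    depends on the model's tokenizer."""
    return max(1, len(text) // chars_per_token)

def fits_in_context(text, limit=128_000):
    """Check an estimate against a context window (128k is the OpenAI
    figure mentioned above; other providers differ)."""
    return estimate_tokens(text) <= limit

sample = "word " * 1000  # 5,000 characters
print(estimate_tokens(sample))  # 1250
print(fits_in_context(sample))  # True
```

For exact counts you would use the provider's tokenizer (for example OpenAI's tiktoken library), but an estimate like this is enough for sizing prompts.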

This might relate to Josh's question: as the conversation continues, building a longer history, is there increased potential for hallucinations? I don't think so, in my experience. It might not remember everything in that history perfectly, but I don't think that makes it more likely to just make things up. Although when hallucination does happen, it is kind of alarming, like when chatbots make up URLs that don't exist, and sometimes you don't realize that until you're well into a project and go: oh, wait, this is all completely made up.

Right. Stephen asks: how do you deal with preventing accidental exposure of API keys? Well, that's actually one more reason we didn't have Shinylive, in the web version of Shiny Assistant, talk to these LLM providers. The API keys should be stored on the server, and the server, the one running Python, goes and talks to the LLM endpoint. It's common practice for the server side of all sorts of applications to hold various secrets, like API keys, and you treat these in just the same way. But you don't send those API keys to a web browser, because if you sent your API key to your user's web browser, they could just use it for whatever they wanted, and you don't want that.
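In practice this means the server process reads the key from its own environment and never forwards it to the client; something along these lines, where the variable name and key value are just examples:

```python
import os

def get_api_key(var_name="EXAMPLE_API_KEY"):
    """Read an API key from the server's environment. The key stays in
    the server process; only the LLM's responses reach the browser."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Put it in the server environment "
            "(for example via a .env file), never in client-side code."
        )
    return key

os.environ["EXAMPLE_API_KEY"] = "sk-example-not-a-real-key"  # demo only
print(get_api_key()[:3] + "...")  # log a redacted form, never the full key
```

Failing fast with a clear message when the variable is missing also avoids the kind of "cannot find the env var" surprise from the demo earlier.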

Absolutely. James's question: Llama 4 Scout has a 10 million token context window. Could you, in theory, pass a GitHub repo into the system prompt and have the model understand the entire repo? Yeah. There are tools that help do this; one of them is a JavaScript project called Repomix, and people do that sort of thing. I haven't used Llama 4 Scout myself, but again, my understanding is that at very long context lengths, models won't have perfect recall. They will learn something from it, though. If you do that and then ask some questions about the repo, it will probably be able to answer them.
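The repo-flattening idea itself is straightforward to sketch: walk the tree, concatenate the text files with per-file headers, and paste the result into the prompt, which is roughly what Repomix automates (plus filtering, token counting, and more). A minimal Python version, where the extension list and header format are arbitrary choices:

```python
import os
import tempfile

def flatten_repo(root, extensions=(".py", ".R", ".md")):
    """Concatenate a repo's text files into one string with per-file
    headers, roughly what tools like Repomix produce for LLM prompts."""
    chunks = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path) as f:
                chunks.append(f"===== {rel} =====\n{f.read()}")
    return "\n".join(chunks)

# Demo on a throwaway two-file "repo"
repo = tempfile.mkdtemp()
with open(os.path.join(repo, "app.py"), "w") as f:
    f.write("print('hello')\n")
with open(os.path.join(repo, "README.md"), "w") as f:
    f.write("# My app\n")

flat = flatten_repo(repo)
print(flat.splitlines()[0])  # header line of the first file
```

A real tool would also skip binaries and respect .gitignore; the headers matter because they let the model attribute each chunk of code to a file path when answering questions.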