Resources

Small boosts here and there - Simon Couch

Rather than writing an entire R package or carrying out a data analysis in one fell swoop, I'm interested in large language models (LLMs) doing things for me that I don't like to do: tedious little refactors, transitioning from deprecated APIs, and templating out boilerplate. Chores, if you will. This talk introduces chores, an R package implementing an extensible library of LLM assistants to help with repetitive but hard-to-automate tasks. I'll demonstrate that LLMs are quite good at turning some 45-second tasks into 5-second ones and show you how to start automating drudgery from your work with a markdown file and a dollar on an API key.

Oct 11, 2025
14 min

image: thumbnail.jpg

Transcript#

This transcript was generated automatically and may contain errors.

So, this is a talk about the sort of other end of the spectrum of turning 45-second tasks into 5-second tasks with LLMs. If you're the sort of person who in the past has been frustrated by this process of steering LLMs and correcting their errors, this might be the time to revisit and explore what it feels like for LLMs to be very good at small, specific tasks.

So for an example, let's imagine we're writing this function in R. It's a silly little wrapper around Sys.getenv(): we grab the value of an environment variable, and if that environment variable isn't set, we raise an error that says we can't find it.
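The transcript doesn't show the actual code, but a minimal sketch of the kind of wrapper being described might look like this. The function name and error message here are assumptions:

```r
# Hypothetical sketch of the wrapper described above; the real
# function's name and exact error message may differ.
key_get <- function(name) {
  value <- Sys.getenv(name)
  if (identical(value, "")) {
    # Sys.getenv() returns "" for unset variables, so raise an error
    stop("Can't find environment variable `", name, "`.")
  }
  value
}
```

Calling key_get() with the name of a set variable returns its value; an unset name raises the error.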

So Matthew queued this up really nicely. What do we not like to do? Write documentation. In my gig as an R developer day-to-day, writing documentation is something that I do all the time. There are many parts of writing documentation that I think are pretty interesting, like eliciting the connections between different functions in my packages and the packages that I'm interfacing with. But there's a lot of parts of it that are pretty boring, like templating out boilerplate.

Introducing the chores package

So today I'm going to be introducing the chores package, which helps you automate hard-to-automate, sort of fuzzy tasks using a markdown file. So in this example, I have this key_get() function, and I need to start writing Roxygen documentation.

So in this video, I highlight the function, press a key command to pull up a small Shiny applet, select the Roxygen helper from inside that applet, and a template of the documentation begins streaming in. Something like this already exists inside of RStudio and will be coming soon to Positron: the "Insert Roxygen Skeleton" command, which gives you bare @param name and @param error_call entries. We can get a little further along with the chores package, because the LLM can actually infer the formats of the arguments to that function.
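For reference, the kind of templated output being described might look like the following roxygen block. This is an illustrative sketch, not the actual output from the demo:

```r
#' Get the value of an environment variable
#'
#' @param name A single string: the name of the environment variable.
#' @param error_call The environment from which the error should be
#'   signaled when the variable is unset.
#'
#' @returns The value of the environment variable, as a string.
#' @export
```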

So this prompt is sort of tuned to give me the least amount of documentation possible while still being thorough enough that I don't want to delete anything. I don't want to read through a bunch of stuff that's not exactly how I would write it. This isn't complete documentation, but it's just complete enough that all of it seems reasonable and I don't want to delete any of it; no reading through slop, if you will.

So this is a talk about the chores package. The chores package is intended to help you with repetitive, hard to automate tasks. They're sort of like smart RStudio snippets.

How chores works

So let's talk a little bit about how chores works. If you were tracking along with Hadley's keynote yesterday, you might actually find this somewhat intuitive. So the first thing that you do as a user is make a code selection; in our example, I selected the full body of that key_get() function. And then I trigger an add-in and select a helper.

So I highlight the function, and once you've installed the chores package, you'll have an RStudio add-in available to you. That add-in can be configured to any key command; unfortunately, it would be kind of naughty for me to just set that up for you, as much as I wish I could as a developer.

And so once I press that key command, I have a small applet. You can see there are three default selections in the chores package, called testthat, roxygen, and cli. These are references to three different tasks that I do almost every day in developing R packages. But you can add a helper to this list by just creating a markdown file.

And so, next: each of these helpers corresponds to a system prompt. Hadley introduced this idea yesterday; a system prompt lets you steer the behavior of a model to do something specific. So in this example, I'm saying: you're a terse assistant. I'm going to show you some function code. Just respond with Roxygen documentation, and so on. The nice part about working with chores is that once you've typed this out, interacting with the package isn't like a chat at all. You're literally pressing a key command, and your interaction with the LLM is done.
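A helper's system prompt lives in a plain markdown file. It might read something along these lines, an invented example rather than the package's actual roxygen prompt:

```markdown
# Roxygen helper

You are a terse assistant. I will show you some R function code, and
you will respond with a roxygen documentation template for it. Infer
argument types from how each argument is used in the function body.
Respond with only the roxygen comments: no explanation, no code fences.
```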

Then that prompt, plus the selection you've made inside of your code editor, is sent to the LLM. So if you're an ellmer user, it looks sort of like this: you initiate a chat object. The chores package is compatible with any model that's compatible with ellmer, so that could be Anthropic's Claude, OpenAI's GPT models, and so on. Then the code that you select inside of your editor is sent to the model, the model responds with, in this case, some Roxygen documentation, and finally that response is written directly into your source editor using the RStudio API.
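In code, that round trip might look roughly like the following. This is a sketch assuming ellmer's chat interface, with an abbreviated placeholder prompt; actually running it requires an Anthropic API key:

```r
library(ellmer)

# Create a chat object carrying the helper's system prompt
# (abbreviated here for the example)
chat <- chat_anthropic(
  system_prompt = "You are a terse assistant. Respond only with roxygen documentation."
)

# The code selected in the editor is sent as the user turn; the
# model's response streams back and chores writes it into the
# source file via the RStudio API.
selection <- 'key_get <- function(name) Sys.getenv(name)'
response <- chat$chat(selection)
```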

So the interesting thing about this particular part of how this package works is that this is all happening in R code. So if you're an RStudio user, this works in RStudio. If you're a Positron user, this works inside of there. There's an implementation of the RStudio API that several other IDEs have made. And so if you're interested in whether this package works there, you can just try it, and it might.

Use cases and custom helpers

So I called out that this package supports a couple of things that I do every day as a package developer. But for any repetitive task in your day-to-day work in your IDE: if you can describe it in a markdown file, and the context sufficient for resolving a given request is just a code selection, then you can make that happen with chores.

So the first one that I showed you was templating out Roxygen documentation. You can also transition erroring code to use the cli package. So if I have a call to stop() or message() or warning(), sprintf(), take your pick, any of those can be converted to this more modern interface that the tidyverse team has been transitioning to. And the same goes for testthat code. So a few years ago, the third edition of testthat was released, and it was so painful to convert packages over to it. We're really happy to have this as a resource now.
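As a concrete illustration of the cli transition, here's the kind of before-and-after such a helper would produce. The specific check and messages are invented for the example:

```r
# Before: a base R error
check_positive <- function(n) {
  if (!is.numeric(n) || n <= 0) {
    stop("`n` must be a positive number, not ", class(n)[1], ".")
  }
  invisible(n)
}

# After: the equivalent using cli's structured errors and
# inline styling
check_positive_cli <- function(n) {
  if (!is.numeric(n) || n <= 0) {
    cli::cli_abort(
      "{.arg n} must be a positive number, not {.obj_type_friendly {n}}."
    )
  }
  invisible(n)
}
```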

Choosing a model

So again: if you can describe a coding task that you do all the time in a markdown file, and all the additional context needed to say "this is the specific instance I'm working on this time" can be encapsulated in a code selection, then you can make it a chore.
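Per the chores documentation, scaffolding a new helper is a one-liner. A sketch, with the argument values invented for the example; check the package docs for the exact interface:

```r
library(chores)

# Scaffold a markdown prompt file for a new helper. "replace" means
# the helper's output replaces the selected code (as opposed to being
# prefixed or suffixed). The slug is a made-up example name.
prompt_new(slug = "boilerplate", interface = "replace")
```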

This is the part of the talk where I tell you that you have to put your credit card information down to use my R package, which is like the lamest part of starting to work on language models now. But hopefully soon, this won't be the case. I'll talk a little bit right now about what the landscape looks like. Once you download the chores package, you will have to make the decision of what model you want to use to power it.

Okay, familiar territory, we're on the XY plane. On the X axis, we have the price per 100 refactorings. So if I generate Roxygen documentation 100 times, this is the price that we're looking at. And on the Y axis, we have an evaluation score. So this is based on an evaluation of models specifically for sort of being the engine of the chores package. And there are two important pieces for a model to be able to effectively carry out chores. One of them is that they respond very quickly.

So kind of the current frontier of LLM research right now is reasoning. But when models take two or five or ten seconds to reason, you're just waiting in your IDE for something to happen, which is an unpleasant experience; for this use case, the snappier, non-reasoning models tend to perform better. And then, also, models need to be able to follow instructions really well. So I had that super long system prompt where I said: here's a bunch of examples of how you can carry out this task; be terse; just do that. There are a lot of models, especially local Llama models right now, that will just, like, describe rlang to you inside of your source editor.

So we're looking at how well different models do on this eval. You can see in the top right, we have Claude Sonnet 4 coming in something like $1 per 100 refactorings. So especially compared to these more agentic processes that we're starting to see these models capable of, this is like pretty cheap in comparison. But cheap is relative to your perspective.

In a sort of close second is Gemini 2.5 Pro. It's a thinking model, but it's actually pretty snappy, so it does well on this eval. And there's a sort of drop-off from there in terms of price. There are two points here for Gemini 2.5 Flash: one of them is the thinking version, one is the non-thinking version. The one that does better and is cheaper happens to be the non-thinking one; that's just a chart crime.

You might notice that there are no local Llama models on this chart. I'm doing my best. We're so close; we're almost there. I really wanted to be able to mic-drop here with "you don't need to put your credit card info down," but we're not quite there yet.

Earlier this week, OpenAI dropped GPT-OSS, which is a model that I can run on this laptop. And actually, on the correctness side of the evaluation (can it write good Roxygen documentation?), it nails it. But it takes something like 30 seconds per response. So, almost there.

The other way to get around putting your credit card information down to try my R package is to use GitHub as a provider. They have a sort of generous free tier. The exchange is that any information you submit to them is theirs, and they're going to do whatever they want with it. So it's only free in that sense.

Here is the more screenshot-able table. So, I did write this package, and as I was writing it, I was using Anthropic's Claude, which is the default model when you call chat_anthropic() with ellmer. If you just want to get a sense of whether this is something you're interested in using, put $0.50 on an API key and give it a whirl. I think that will give you the best results and show you what's possible there.
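Configuring which model powers chores happens through an R option pointing at an ellmer chat object. The option name below follows the chores README at the time of writing and may change; verify against the current docs:

```r
# In .Rprofile: tell chores which ellmer chat to use as its engine.
# chat_anthropic() defaults to a current Claude model.
options(.chores_chat = ellmer::chat_anthropic())
```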

Again, with chat_github() you have access to some really good leading models; usually as OpenAI puts out new GPT releases, they're available on GitHub within a few days. For an ultra-low-budget but higher-privacy option, you can use GPT-4o-mini from OpenAI. Before I went out of office five days ago, this was the newest snappy model from OpenAI, and of course five business days is like a decade in LLM time. Don't use Llama for now, sorry.

If you want to keep track of this process, where I'm trying to find cheaper models and local models that we could run on our laptops to do these sorts of smaller tasks: I'm developing an evaluation called the chores eval, which is what generated the data behind the graph I showed you a second ago.

So if you'd like to learn more, the chores documentation is a good place to start. You can install the package from CRAN using the regular degular install.packages(). This repository, github.com/simonpcouch/usr-25, has the source code for these slides as well as some links out to various resources. I'll just end by saying I'm super stoked to be here. This is my first useR!, and it came together sort of last minute to be able to come out here, so I'm super grateful to be here and having so much fun already.