Resources

How to build a model annotation tool with FastAPI, Quarto & Shiny for Python

Gordon Shotwell, Senior Software Engineer at Posit, walks through an end-to-end machine learning workflow with Posit Team. This demo gives you a robust pattern for hosting and sharing models on Connect before you deploy them to a customer-facing system. It covers:

* How and why to use an API layer to serve and authenticate internal models
* Why you should wrap your APIs in Python packages
* Using Shiny for Python to build a data annotation app
* Using Quarto to retrain models on a schedule

Agenda:
11:00am - Quick introduction
11:02am - Demo
11:40am - Q&A session (https://youtube.com/live/zhN8IZUBCAg?feature=share)

Timestamps:
1:27 - Quick overview of the text classification model used in this example
2:15 - Overview of the people who will need to use the model (modellers, leadership, data team, annotators, other systems)
4:11 - Why APIs before UIs is a good rule
5:57 - What about Python packages?
8:23 - Advantages of using an API here
9:18 - Big-picture overview of the workflow
11:17 - FastAPI on Posit Connect (Swagger interface)
15:55 - How this model will be used (authorization by validating the user)
19:00 - Building a delightful user experience by wrapping the API in a package
25:07 - Quarto report for the leadership team showing model statistics & deploying to Connect
26:34 - Retraining the model by scheduling the Quarto doc on Connect
28:37 - Shiny for Python app for annotators (people checking whether the model is producing correct results & helping improve it)
35:28 - Overview / summary of this machine learning workflow

Helpful links:
GitHub: https://github.com/gshotwell/connect-e2e-model
Anonymous questions: pos.it/demo-questions
If you want to book a call with our team to chat more about Posit products: pos.it/chat-with-us
Don't want to meet yet, but curious who else on your team is using Posit? pos.it/connect-us
Machine learning workflow with R: https://solutions.posit.co/gallery/bike_predict/

This session was part of the Workflows with Posit Team series (Wednesday, November 29th at 11am ET). We host these end-to-end workflow demos on the last Wednesday of every month - no registration is required, simply add them to your calendar using this link: pos.it/team-demo. If you ever have ideas for topics or questions about them, leave a comment below :)

Nov 29, 2023
40 min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Hey everybody, thanks so much for joining us today for our Workflows with Posit Team session. If this is your first time joining us, we host these workflows on the last Wednesday of every month, so you can always go back and check out the recordings - I think we've had about seven of them so far. Today you will learn from Gordon Shotwell, Senior Software Engineer at Posit, who is joining us to show how to use Posit Connect as an end-to-end Python platform for hosting internal machine learning models.

I'll be hanging around in the background here, so if you have any questions, feel free to put them into the chat. I'll also put a Slido in there where you can ask anonymously, but with that, thanks again for joining us and I'll turn it over to Gordon.

My name is Gordon Shotwell. I'm a Software Engineer at Posit and I mostly work on Shiny for Python, but before I worked at Posit, I worked in industry building tools for training, deploying, and monitoring machine learning models and I learned a couple of patterns with Posit Connect that I think are particularly useful and I wanted to go through some of those today.

So let's just get started by taking a look at this modeling code. This is a very simple, maybe even primitive in this day and age, text classification model that classifies text based on the words in the sentence to determine whether it's about electronics or not. And here we have the modeling code: break the data into a training and test set, fit the model, and then print out accuracy and some other model statistics.
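For readers following along in text, a minimal sketch of this kind of modeling code might look like the following. This uses scikit-learn, and the toy data, names, and model choice are illustrative assumptions, not the demo's actual code:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Toy data standing in for the real annotated dataset.
texts = ["new laptop battery", "great mystery novel", "usb charging cable",
         "poetry collection", "wireless mouse", "history of rome",
         "hdmi adapter", "vegetarian cookbook"] * 5
labels = [1, 0, 1, 0, 1, 0, 1, 0] * 5  # 1 = about electronics

# Break the data into a training and test set.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42)

# Turn text into word-count features, then fit a simple classifier.
vectorizer = CountVectorizer()
model = LogisticRegression().fit(vectorizer.fit_transform(X_train), y_train)

# Print out some model statistics.
acc = accuracy_score(y_test, model.predict(vectorizer.transform(X_test)))
print(f"accuracy: {acc:.2f}")
```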

And the question here is, say I'm happy with this model and I want to share it with other people, how should I do that? And in particular, how should I share it with all the different groups of people that might need to interact with either this model or the data that underlies this model somehow?

Identifying the different user groups

So this is the way I would break it up in this imaginary situation where I'm building a model and I want to share it with other people: I can think of five groups that need to use this model. The first group is me - the modelers, me and the team of people who actually generated this model. The second group is leadership, who might want to get model statistics or have a sense of how this model is performing on the latest set of data.

Third one is the data team. These might be people who use code to score the text that they're encountering in their own data sets. And then finally, we have these other two groups. One is a group of annotators. This is a group of people who might be annotating data to provide me with more training data and improve the quality of my models. And there's this last group that's just kind of other systems that might come up. You know, other people using other programming languages, maybe somebody who's working on the website might need to call this model for some reason.

And all these groups have different needs and, importantly, different interfaces. So modelers and the data team, they're probably both using Python. Leadership, they probably just want to see a static site. You know, they don't need something super interactive, they just need to see something that gives them a little printout of statistics. And finally, we have these annotators, and they're probably not technical, so they're going to need more of a full-featured web app. And these other systems are going to interact with some programming language that maybe we don't know now or won't know about in the future.

APIs before UIs

One of my most important principles when thinking about these types of systems is this rule: APIs before UIs. When we look at this group of people, we see that there are a few different groups using something you might call an API - a code-first interaction with this product. We have these Python users and these other systems, which might use some type of code. And the reason why I think APIs before UIs is such a good rule is that code interfaces are way easier to build than GUIs.

They just have fewer parameters. You don't need to worry about centering something or making the UI kind of intuitive. You can just build the code and have people interact with the code. Because they're easier to build, they're easier to change and iterate upon. This means that if you're kind of working mostly on an API, you're going to be able to make more changes more quickly to center in on the right interaction, the right data model, and the right type of user flow for your particular product.

And my experience doing this for a number of years is that a good API will usually create a good UI, both in terms of the simplicity of the code and also how intuitive it is. Once you've kind of centered on that right API, the right interaction for your problem, the GUI that serves that type of interaction is pretty obvious most of the time.

The opposite is not true. I've seen a lot of times people build a GUI first, and in doing that, they just throw different things into some programming data model or some type of interface. Trying to build an intuitive code interface using those same data structures can be really awkward. If you've ever tried to analyze web data as a data scientist, this is why you get that big giant blob of JSON that's not particularly well structured for your purpose, right?

So if you ever have the option of building the code interface first, before doing any GUI work, I really recommend doing it. That doesn't always happen, but in this case we do, right? We have these three groups of people who are interested in code. They're also probably our earliest stakeholders, so we might want to serve them first.

Why not just a Python package?

Okay, so one way of distributing this model is to use a Python package, right? I could put this model in a Python package and have that package be installed from something like Posit Package Manager or Nexus or some other package hosting system. But this creates a few problems. And I would say there's three main problems.

The first is that packages are really hard to authenticate and authorize. Usually when you look at internal repositories, PyPI-like repositories, they don't have particularly sophisticated security controls on who can download those packages. But the bigger problem is that once those packages are downloaded from those repositories, they're just in the wild. There's no real way for us to control who has access to them. We can't create fine-grained controls, for example giving some people access to one Python function in the package and other people access to another one. And if the individual human who just downloaded that package shares it inappropriately or leaves their laptop open and unattended, anything that's in there, including the model or any data in that package, is going to be exposed, right?

So if we're working on a model that maybe includes some sensitive intellectual property or might encode some sensitive data or expose some sensitive data, that's going to be a really big problem, right? We want to have the ability to authenticate the model when it's called. And the second big problem with packages is that they always require user input to update. So if I were to distribute this model in a Python package just by itself, if I ever changed the model or updated it or changed some part of how that model is called, I would need to ask all of my downstream users to update their installation.

And they would need to do that probably in a lot of places. They may need to update it in all the different virtual environments they have on their computer. If they deployed assets to Posit Connect, they would need to update all of those. It becomes an enormous hassle. And the final reason is that a package would require Python, right? Most of our users are comfortable using Python, but some of them maybe aren't. Maybe I have some R users at my company. Maybe there's a Scala group that wants to call this model. And adding Python to their build environment is maybe going to cause some problems.

Hosting the model as an API on Posit Connect

An alternative, and what I'm going to choose here, is to use an API. Basically, host this model as an API on Posit Connect and have some wrapper functions that call that API. And this is going to give me a couple of major advantages. First is that I'm going to solve all three of the problems that I just expressed, right? I'm going to be able to update that model in one place by re-uploading it to Posit Connect. I'll be able to authenticate access to the API, both the top-level domain and also I can authorize any of the endpoints that people call. And finally, it's going to be language independent. Every programming language has the ability to wrap and call APIs.

So I'm going to go through kind of the overall diagram here, and then I'm going to take it piece by piece and show you the code that I developed for these projects. So this is kind of the big picture of what I'm trying to do. And I have kind of three groups here. So the leadership, I have data scientists who might use both an R or Python package, and then I have these annotators. And I'm also showing the interface that these people are going to be using. So this leadership group, they're going to be looking at a Quarto, a rendered Quarto document. These data scientists are going to be interacting through a Python or R package with an API, and then the annotators are going to be interacting with a Shiny app.

The API in turn is going to call a data store, and I'll show you what that is, but I would recommend probably this should be a database that sort of lives outside of Connect. But everything else can exist on Connect. And the reason why I really love Connect for this purpose is that it gives me a kind of safe space that I'm able to deploy all these different types of assets for all these different users without needing to have a lot of conversations with either DevOps, security, or any kind of regulatory situation at my company.

Since I have Posit Connect set up, there's a one-time investment of getting it through all the different layers of approval that you need to implement something like this. You can secure it in a way that is compliant with all of your company's policies, but once you've done that, you can just deploy these things to the product. That makes it a really useful prototyping solution, both for small things like a Shiny app or a Jupyter notebook, and for these larger interconnected systems. I can validate that this whole thing works with just me, the data scientist, doing the work - I don't need to involve other people in standing up the infrastructure.

Walking through the FastAPI

All right, so this is a lot. We're going to start with the API, and I'll show you its endpoints. You can see this is a FastAPI, and it has a few different endpoints: I have the ability to append training data to the dataset, I can score the model, I can update the model, I can get the last updated date, and I can query the data. What's nice is that this is interactive Swagger documentation, so it gives me some information about what each endpoint is expecting.

In this case, it has no parameters, and it just gives you a successful response or gives you a string response. And I can try it out. So I can just click execute, and it's going to give me both the curl command that I used to call this, the request URL, and then also the response for this particular thing.

So I chose FastAPI, and FastAPI has a few great benefits. One is that it's very fast, but in my case I don't particularly care about this being a fast API - it's a very simple one, so most frameworks would be fast enough. What FastAPI really gives you is this: a clear example of what data type the endpoint is expecting. In this case, it's expecting a list of JSON objects, each with keys for text, annotator, and annotation. The first two need to be strings, and the last one needs to be a Boolean.

And the way that is produced, if I go over to my API code here, is by creating these pydantic models that tell the API what it should expect. Here I'm defining this training data class, which is just a very simple key-value model with those three keys and three value types. Then, when I create my Python function, I'm able to refer to it: I'm using a type annotation to say this data argument expects a list of training data elements. And that list of training data elements is what generates the expected value you saw in the documentation.

Not only will it generate the documentation, it will also validate the payload for me. So if somebody sends a malformed payload to this API, it's going to give them a 422 error, which means the API was expecting a particular form of data and you didn't supply it. You get an informative error without me, the programmer, needing to write any of that validation. I don't need to do anything - I just declare in the FastAPI type annotation what I'm expecting, and I get that behavior for free.

So I'm not going to go through this code in too much detail - this is going to be a pretty high-level, architectural-planning kind of walkthrough. But one thing I did want to point out is that I'm storing all this data in an on-disk SQLite database, and that's mostly just to keep everything on Connect without building up too many dependencies. If you were doing this in reality, I would recommend pulling that out to a separate datastore.

So to go through some of these endpoints a little bit: I have append training data, which takes a list of those training elements and appends them to my SQLite datastore. I have a model scoring endpoint, which opens up the latest model that's been uploaded and returns a single score - it takes a bit of text and returns that score. I have the ability to update this model, so there's an update model endpoint, which takes both the machine learning model and the vectorizer that turns your text into the right features. And finally, I have a query data endpoint, which just takes a query and returns its result, plus a couple of test endpoints to let me check that the API is working, and things like that.
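The on-disk SQLite datastore behind endpoints like append and query can be sketched with just the standard library. The table name and schema here are assumptions for illustration:

```python
import sqlite3

def append_training_data(db_path, rows):
    # rows: list of dicts with text / annotator / annotation keys,
    # matching the API's training-data payload.
    with sqlite3.connect(db_path) as con:
        con.execute(
            "CREATE TABLE IF NOT EXISTS training_data "
            "(text TEXT, annotator TEXT, annotation INTEGER)"
        )
        con.executemany(
            "INSERT INTO training_data VALUES (:text, :annotator, :annotation)",
            rows,
        )

def query_data(db_path, sql):
    # The demo's query endpoint similarly just runs a query and
    # returns the result.
    with sqlite3.connect(db_path) as con:
        return con.execute(sql).fetchall()
```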

Authorization with Posit Connect

The way this model is intended to be used, for the most part, is that I want to allow some people to upload models, some people to append data, and other people to score the model or query it. This is the notion of authorization, right? I want to authenticate everybody who's seeing this API, but I only want to authorize certain people for certain endpoints. And I'm able to do that with this validate access function here. This function is basically asking: is this user in the access-control list? And if not, raise a 401.

And that user is produced by this get current user function, which just looks for the information in the header that Posit Connect passes along. What that allows me to do is get the person who's accessing or viewing this API and respond with either "you can access it" or a 401.
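A plain-Python sketch of that authorization logic is below. The header name, JSON shape, and user list are assumptions for illustration - check the repo and the Connect documentation for the real mechanism:

```python
import json
from typing import Optional

# Hypothetical access-control list of users allowed to hit privileged
# endpoints like update_model or query_data.
MODEL_ADMINS = {"gordon"}

class APIError(Exception):
    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code

def get_current_user(headers: dict) -> Optional[str]:
    # Posit Connect can forward the logged-in user's details in a
    # request header; the header name and JSON shape used here are
    # illustrative assumptions.
    raw = headers.get("RStudio-Connect-Credentials")
    if raw is None:
        return None
    return json.loads(raw).get("user")

def validate_access(user: Optional[str], allowed: set) -> None:
    # Is this user in the access-control list? If not, raise a 401.
    if user not in allowed:
        raise APIError(401, "You don't have access to this endpoint.")
```

In the real FastAPI app this would be wired in as a dependency on each privileged endpoint, so the 401 is returned before the endpoint body runs.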

So the result of that is that I have a couple of endpoints which are available to everybody, like score model. If you were to open this up and try out score model, you would be able to do that. You could say, "I am a computer." I think it's probably not going to give me a very good score here. But here you see I'm getting a 200 with the score from that model. If you were to try to update the model or query the data, though, you're going to get a 401 error.

Wrapping the API in a Python package

Okay, so that's mostly the API. And this has a couple of benefits. The main ones we've just talked about: it allows me to authorize these features, and it allows me to keep my model in one place, so that everybody who calls the API is getting the right, up-to-date model. But it's not particularly user-friendly, right? And the main reason why it's not user-friendly is that most data scientists are not actually that comfortable interacting with APIs directly, I would say. They don't have in the back of their mind what all these HTTP status codes mean. And in order to get any kind of validation that they're calling the API correctly, you have to send something back and forth to the API, which can be a little bit difficult.

What we really want to do is surface errors easily and give people a lot of hand-holding to help them call and interact with this API properly. Let's go back to our diagram here, right? Usually when you're building APIs, you want to make them predictable and pretty strict, but not that friendly. It's not that important that they give you great error messages or guide you along a happy path. And in order to provide that hand-holding, those ergonomics that you want when you're building a delightful user experience, it's usually a good idea to wrap the API in a package of some kind.

So I have my Python package here, and this package has just one class. This is the way that I personally really like writing API wrappers: I have an init method that populates the top-level URL and the headers. This is really useful - for example, if I wanted to change this URL, I'd only have to change it in one place, or if I wanted to develop against a locally running API, I could set the URL to localhost. And then I typically have methods for each HTTP request type that I'm going to use. In this case, I'm only using post and get.

And the reason why I do this is because otherwise you end up writing basically the same code over and over again in all of your particular functions. So I usually have a post and a get, and if I used other methods in my API, I would write those too. The purpose of these methods is to first use that request type to call the API with the right authentication headers and keyword arguments, and then to handle the responses. In this case, if it's a 401, I want to surface a Python error that says you don't have access to this endpoint. And if it's not a 200, just give them the error code and say, unexpected status code - it's not your fault, it's the API's fault.

So I usually start off by writing those high-level methods, and then write more particular methods that give people a little bit more hand-holding. In this case, I have this upload data method that uploads a pandas data frame. What it does, first of all, is check that the data frame has the right columns. Rather than sending it off and getting back some HTTP error saying this is the wrong data type, I want to stop that before I send it to the API, before I do that post. So I check that the data frame has the right column names, and if it does, I serialize it into a dictionary and post that to the endpoint. This is how I would upload a data set of text and annotations from code.
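That upload method might be sketched like this - the column names follow the training-data schema described earlier, and the endpoint name is an assumption:

```python
import pandas as pd

REQUIRED_COLUMNS = {"text", "annotator", "annotation"}

def upload_data(api, df: pd.DataFrame):
    # Validate locally, before the HTTP call, so the user gets a
    # friendly Python error rather than a 422 from the API.
    missing = REQUIRED_COLUMNS - set(df.columns)
    if missing:
        raise ValueError(f"Data frame is missing columns: {sorted(missing)}")
    # Serialize to a list of dicts matching the API's expected payload,
    # then post it. `api` is a wrapper object with a post() method.
    return api.post("append_training_data", json=df.to_dict(orient="records"))
```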

I have another similar function, query data, which takes a SQL query, sends it, and returns a data frame. And then I have two more methods. One is upload model, which gives me some type annotations saying this is the type of model and the type of vectorizer I'm expecting, writes those models to disk, and then posts those two files up to the API. And finally, I have a method to score the model, which is very simple - it just hits the score model endpoint with the text - and a last updated method that just tells me the last updated time.

And the reason why this is a kind of nice user experience is that if I am then training the model, for example, I can have something like this at the top of my function here. I have just an API, I instantiate that API wrapper class. And then when I'm getting new training data, getting the up-to-date training data, I'm going to use the API query data method. So I don't really need to know that this is an API. I don't need to worry about making sure I find my key in the right place. Assuming I'm storing it in the environment variable that I'm supposed to, it'll pick it up in the right way. And I don't need to think that much about, you know, like what the HTTP response is or turning some JSON into a data frame in the right way. This will kind of handle that all for me.

So the API is handling being the central source of truth for where that model is. And then the Python package is building in some sort of developer ergonomics to make my user's life a little bit easier.

The Quarto training report

Okay, so let's give this a try. This is the Quarto document, and I'm just going to run it piece by piece. So I've got that running, and it's going to do a bunch of things and output some model metrics - it gives me AUC and a couple of different metrics. And then I'm going to try uploading this data. I uploaded the data and got a 200, so the two model files have been uploaded to the API, right? For the data scientists, this is a really good interface: I'm able to get updated training data from the API, train my model, and deploy my model without leaving the interface that I want to use, right?

And so here in terms of where we're sitting, we have this data store, we have this API, this Python package, and that kind of serves the data scientists. If we wanted to build an R package, we would do it in more or less the same way using httr or something to wrap the API. So the next group of people is let's tackle this leadership group. And because we've used Quarto, we've actually kind of already done that. So this is a, maybe a really basic report, but it is a report that shows me, you know, what is the AUC and the various different model statistics that I might be interested in for this particular training run.

So how I could do that is I could just deploy this Quarto document. So electronics model training. This is the model training report that I'm running. So this is both training the models, exactly the code, that same code I just showed you before, and sort of assuming this is maybe like for a data science manager who might be sort of familiar with this code and want to sort of make sure that the people who are training this model are doing it correctly. And on the output, it's sort of generating a little bit of just basic scores. You can enrich this by turning it into a Quarto dashboard or adding additional plots or things like that. For this case, just keeping it very, very simple.

And whenever this code runs on Connect, it is also uploading a model, right? Since it's uploading the model, we can turn this really easily into a regular retraining service by scheduling it. So if I go over here, click to schedule this output, and say I want to retrain this model monthly, then every month it's going to retrain the model and get the latest data. Maybe there's some data that's a little bit time-sensitive or time-dependent, and we want a recently trained model serving these endpoints - I'm able to do that. So if I go ahead and save this, it's going to run every month and generate these training statistics every month. And I can also have this emailed to anybody in my organization.

So that gives you the ability to keep people informed and also notify them when that training has happened. If we wanted to make this a little bit richer, we could also add an error check for certain qualities of this model. For example, I could check that the accuracy was not degrading, and if it was degrading, raise an exception instead of doing the model upload. That would give you very basic model monitoring for how your training runs were doing.
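A sketch of that guard - the threshold and function names are hypothetical. On Connect, the raised exception would fail the scheduled render, which is what surfaces the problem:

```python
ACCURACY_FLOOR = 0.85  # hypothetical acceptance threshold for this model

def upload_if_healthy(api, model, vectorizer, accuracy: float):
    # Refuse to publish a model whose accuracy has degraded below the
    # floor; raising here stops the scheduled Quarto render and skips
    # the upload, giving very basic model monitoring.
    if accuracy < ACCURACY_FLOOR:
        raise RuntimeError(
            f"Not uploading: accuracy {accuracy:.2f} is below {ACCURACY_FLOOR}")
    api.upload_model(model, vectorizer)
    return "uploaded"
```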

Okay, so fairly easily, we've knocked out this top part of the group, right? We have this Quarto document that's running and posting the model to the API. Our leadership team can read that Quarto document. And we have a good code interface for our data scientists, which both gives them the most up-to-date model and authenticates them to the API in general, authorizing that particular user. So for example, if one of these data scientists had their laptop stolen, the first thing we would do is revoke their Posit Connect key. And if we revoke that key, they wouldn't be able to access any of this modeling infrastructure. But at the same time, we have all of the nice error messages and good ergonomics of a well-designed Python package.

Building the Shiny annotation app

So let's take a look at this last group. I'm going to spend a little bit more time on the code in this one because I think Shiny is maybe a newer thing, maybe a little bit less familiar to many people. So these annotators are going to be people who are just in my company who are going to manually check that the model is producing the right result and sort of generate new training data for me. This is one of the best ways, especially for things like large language models, just improving your model. Most of the time, you're not going to be able to do that much better from the model selection or tuning perspective than the defaults. This is also true with XGBoost most of the time. If you just throw XGBoost at a tabular data set, it's going to give you a pretty good model for that data set. But if you give it a better data set, then that's where you're really going to see a lot of lift. So having the ability to build an annotation capacity that is quick and intuitive is very valuable.

For this annotation, I want to use a web application, because I want to build a user interface that makes the annotation process really fluid and fast. So let's take a look at the end result, and then we can walk through the code. So this is the annotator.

It's just going to take a minute to warm up because I think it's starting a new process.

Or it's erroring. There we go. Okay, so this was spinning up a Docker container, and that's why it took a little bit of time. So this is the basic annotation interface that I'm building. We could imagine this being more complicated, but for now what it's doing is pulling some random data from the data set and asking me, basically: is this about electronics, not electronics, or do I want to just skip it? And over here, I'm getting a model score out of it, and this lets me add some text to it. If I do that, it's going to automatically recalculate the model score.

For example, I could change this to make the model score update, and it's going to allow me to update this data. Whenever I update, it's going to give me a little notification saying marked electronics, and then pull up a new one. So this lets me, the annotator, get pretty quick at reading, right? I can just look at this and say, okay, that's electronics, and I'm getting the new items really quickly, and the UI is giving me the options that I need, right? If it's something I'm not sure about, I can skip it. And all of these things are referencing the API that we just built, right? The model score is using the score endpoint, this is pulling data from the same data source, and when I mark items as electronics or not electronics, I'm appending new data to that data set.

So let's take a look at the application code. This is the Shiny code, and the thing I want to highlight is how little is actually happening on the server. I'll go through it in more detail, but really all I have is this update prompt function, which is called both when a user first logs on and in a few other places, and a call to get the model score. What update prompt does is send a query to the API to select a random record, then take that string and update the text area.

The reason this works so well is that all of the logic lives in the API itself and in that Python wrapper. There's very little I need to include in the Shiny app; the Shiny app is just there to define the UI and the interactions I want.
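That wrapper pattern can be sketched roughly like this. This is an illustrative reconstruction, not the actual package from the demo (which lives in the GitHub repo): the class name, endpoint paths, and `Authorization: Key` header are assumptions, and the injectable `opener` is a design choice that keeps the client easy to test without a network.

```python
# A minimal sketch of the kind of Python wrapper the Shiny app leans on.
# Endpoint paths and the auth header format are illustrative assumptions.
import json
from urllib import parse, request


class AnnotationClient:
    """Thin client for the model API; all business logic stays server-side."""

    def __init__(self, base_url, api_key, opener=None):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key
        # Injectable opener makes the client trivial to stub out in tests.
        self._open = opener or request.urlopen

    def _get(self, path, **params):
        url = f"{self.base_url}{path}"
        if params:
            url += "?" + parse.urlencode(params)
        req = request.Request(url, headers={"Authorization": f"Key {self.api_key}"})
        with self._open(req) as resp:
            return json.loads(resp.read())

    def random_record(self):
        """Fetch one record for the annotator to review."""
        return self._get("/data/random")

    def score(self, text):
        """Ask the hosted model to score a piece of text."""
        return self._get("/score", text=text)
```

Because the wrapper owns the URL building and authentication, the Shiny app only ever calls `client.random_record()` or `client.score(text)`.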

So this is the app; I can push this little play button in VS Code to run it. I have a card with a row and two columns. The card includes a text area input as well as three buttons: electronics, not electronics, and skip. And over here I have the model score output. When the user first logs in, I call update prompt to fill the text area. Then these reactive effects, which are how you respond to a button input with a side effect, annotate the data with whatever is in the text input. They use the session user, which is the currently logged-in user on Connect, to populate the annotator field, and record true or false respectively.

Then I show the notification and update the prompt, and that handles annotating the data. The model score output is just a regular reactive, so whenever the text changes, I call the API and hit that model score endpoint, without having to do anything else. This works well and is fast enough because both the API and the app are sitting on more or less the same server: it's really Posit Connect talking to itself, so there's not a lot of latency between the app and the API, and having a reactive, dynamic model score that updates all the time works fine.

And when I actually record the annotation, I'm just wrapping the record into a one-row data frame and hitting the upload data API.
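Building that one-row data frame might look like the sketch below. The column names are illustrative assumptions, not the demo's exact schema.

```python
# Build a one-row data frame for the upload-data endpoint.
# Column names (text, is_electronics, annotator, annotated_at) are illustrative.
from datetime import datetime, timezone

import pandas as pd


def annotation_frame(text, is_electronics, annotator):
    """One labelled record, shaped like the training data the API stores."""
    return pd.DataFrame(
        {
            "text": [text],
            "is_electronics": [is_electronics],
            "annotator": [annotator],
            "annotated_at": [datetime.now(timezone.utc).isoformat()],
        }
    )


row = annotation_frame("Noise-cancelling headphones", True, "gordon")
```

Keeping the annotator field populated from `session.user` means every uploaded row is traceable to the Connect user who labelled it.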

And we end up with this tool. You can see now how this all fits together. I'll have some set of data annotators at my company — maybe members of the data team, maybe an all-hands "let's go annotate some data" session where people log in for 20 minutes and check a bunch of electronics records to improve model performance. I'll have a Quarto document that re-renders on a schedule: whenever it runs, it picks up the latest annotated data produced by these annotators and posts a new model to the API. Everyone using a code-first approach then gets access to the most up-to-date, best model you have. And because it's all hosted on Posit Connect, it's authenticated and authorized properly, so I don't need to worry about security or regulatory problems.

Moving components off Connect

So I wanted to talk about one final benefit of splitting some of your application code out of a Python package or scripts into an API, which is that it lets you move things off of Connect very easily. For example, let's say this API became very popular and I wanted to deploy it to production. For a variety of reasons, I don't want to put Posit Connect on my critical production path: licenses and how much they cost, or maybe it's a single server I don't want to overload. And as a data scientist, I probably don't want to be responsible for it; I want somebody else to handle all the problems that come with implementing and scaling a consumer-facing API.

But because I started with this internal-facing API, it's very convenient to hand it off to another developer. This is a FastAPI app, and FastAPI is a tool that can robustly serve demanding applications. I can say to some DevOps person: look, I have this FastAPI app, can you put it on some sort of Kubernetes cluster to serve lots and lots of people? Maybe you don't want all of those endpoints, but you want the score model endpoint, or some others that you're developing.

The result is that you can develop the API internally using this structure, and when it's time to promote it and send it somewhere else, you can hand it over without much extra work. You're sure it's correct, you're sure it will handle the load it needs to handle, and you've already worked through a lot of the bugs and edge cases, so you can tell the receiving team exactly what they need to do. And if we did that, all the other parts of our system would still work: we'd just need to update the endpoint in the Python package and maybe redeploy the Shiny app and the Quarto document to point at the right API. We'd be able to keep working with the system even though we took out one part of the project and moved it somewhere else.

In the same way, if you wanted to replace this Quarto document with a more sophisticated pipelining tool — say MLflow or Airflow for model training, validation, and reporting — you would still be able to post from those pipelines to this API to update the model or the data store. And if you wanted to host your Shiny app somewhere else, again, you could. Having these kinds of modular pieces lets you swap them in and out without a lot of refactoring work.

The really useful term here is an API contract: the relationship you're promising at the points where these systems plug together. We're exposing some endpoints to all these different components, and we're effectively signing a contract that those endpoints will stay stable, even if where they're hosted or what's happening under the hood changes dramatically. Those little points of contact between the different components of the system will all still be there.
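One lightweight way to make such a contract explicit is a consumer-side contract check: a small test that the response shape your Shiny app and Python package depend on hasn't changed, no matter where the API ends up hosted. The field names and types below are illustrative assumptions, not the demo's exact score-endpoint schema.

```python
# A minimal consumer-driven contract check for a hypothetical /score response.
# Field names and types here are illustrative, not the demo's actual schema.

SCORE_CONTRACT = {
    "text": str,            # the input the model scored
    "score": float,         # probability the text is about electronics
    "model_version": str,   # which retrained model produced the score
}


def satisfies_contract(payload, contract=SCORE_CONTRACT):
    """True if every promised field is present with the promised type."""
    return all(
        key in payload and isinstance(payload[key], expected)
        for key, expected in contract.items()
    )


ok = satisfies_contract(
    {"text": "USB hub", "score": 0.91, "model_version": "2023-11-29"}
)
```

Running a check like this in CI for every consumer is one way to catch a contract break before the Shiny app or the Quarto document does.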


All right, thank you very much for your time. I think I can take some questions now. Absolutely, and thank you so much, Gordon, for the great demo. We'll jump over to Q&A in just a second. It should automatically push you over to the Q&A room, but if it doesn't, I'll share the link in the chat right now. Once you're there, you can ask questions in the YouTube chat, or use the link shown on screen for anonymous questions. Thank you again so much for joining us today, and I'll see you over there in just a second.