
Michael Chow - User guides: engaging new users, delighting old ones | SciPy 2025
User guides are the piece you often hit right after clicking the "Learn" or "Get Started" button in a package's documentation. They're responsible for onboarding new users, and providing a learning path through a package. Surprisingly, while pieces of documentation like the API Reference tend to be the same, the design of user guides tends to differ across packages. In this talk, I'll discuss how to design an effective user guide for open source software. I'll explain how the guides for Polars, DuckDB, and FastAPI balance working end-to-end like a course, with being browsable like a reference.
Transcript
This transcript was generated automatically and may contain errors.
Okay, so welcome back everyone. We're excited to hear about user guides by Michael Chow.
All right, thank you. Whoa, okay, so sorry, came in hot. All right, thank you so much. Hope everybody's enjoying SciPy. I have to admit that I'm really excited to finish this talk so that I can enjoy SciPy. But I'm also really excited to give it.
Yeah, so I'll be talking about user guides and this idea of welcoming new users and delighting old ones. I also have to admit I didn't love the idea of calling some people old users. But so if you know a better word, I'm all ears. Experienced users, we got it.
My name is Michael Chow. I did a PhD in cognitive psychology. And I work on open source data tools at Posit PBC, mostly on Python tools. For example, one tool I maintain is called Great Tables, and it's all about displaying tables for publication. Aside from that, I have two beautiful cats, Bandit and Moondog. And I will say one is very sassy, one is a real goober, and I'll let you decide which is which.
Yeah, so I'm really excited about user guides. I think part of it is as a cognitive psychologist, I'm really interested in how people pick up new skills. And as a software developer, I'm really interested in sort of my responsibility in documenting how people should use a tool and being sure that people can use a tool to do the kind of range of things I hope that they can do with it. I think a lot of open source developers, we build tools because we think they give people a new or a better way to approach a problem. And I think user guides are really the bridge from our kind of like hope inside our brain that people can solve problems in a new way to actually putting our tools to use.
The case for user guides
And to motivate this talk, I just want to bring up a scenario. So I work a bit on the documentation for plotnine, which is a Python plotting library. And suppose that you saw this graphic and you wanted to reproduce it. I would say an interesting question is, how would you go about that?
On the one hand, you could crack open plotnine's API reference, and you might see something like this, which is a list of a bunch of functions you could use from plotnine. But this isn't even the whole story. If we zoom out a little bit, there are actually more functions. And if we just keep zooming, there are actually like 40 of these. These are called geoms.
If we keep going, actually what you're going to see is this. Geoms are just this little piece of the API reference. But there are a lot of other kinds of things, like facets, scales, and positions. Obviously, these went into the plot that you're trying to create. But this would be a really tough place, I think, to start. You would just be wading around for a long time, super confused.
But what if you saw a simple example instead? So what if you saw a really simple plot that used some key pieces of plotnine? For example, what if it used facet_wrap to create subplots? This helps sort of set up a sense for how plotnine puts things together. And it could mention that it's using geom_smooth and geom_point to actually put things on the plot. So geom_smooth puts that blue line in each of these subplots. And geom_point puts the points there.
And we could point out that scale_color_continuous is what makes this color bar from, say, blue to yellow. So I think this is really helpful because this is a sort of minimal example that gives you a sense for how some of the pieces of plotnine fit together. So you can navigate things like the API reference to do what you want.
I also wonder what if you saw the big pieces. So rather than this API reference, squint your eyes, I mean, it's all there. What if you saw something like this? This is, I think, like 16 pages that just summarize the big pieces of plotnine and help you get a sense for how all of it fits together.
I think these are really important jobs. And I think these are essentially what a user guide does. So I would say a user guide is concrete and sort of right sized for getting into a tool. And user guides are all over. So you might know them by a lot of names. So in FastAPI, it's the "Learn" button. In Polars, it's just called "User guide". We love that. In plotnine, it's "Guide". But all of these go to the same kind of thing.
I also think really interestingly, if you click this probably most prominent button, it usually takes you to the user guide. So user guides are really front and center in documentation. They're probably the most prominent call to action that documentation sites give. So it's really important to get it right.
And my hope in this talk is to lay out the two key pieces to focus on for getting a user guide right. So the first one I'll talk about is onboarding. So, for example, we showed what I call like a simplest whole task with plotnine. What's the sort of simplest whole plot we could show someone that's exciting but also easy to break apart so they can get started with the pieces?
And I'd say I'll go through some other examples to kind of like flesh out the onboarding story with FastAPI and a resource called R for Data Science. But jumping to the second piece, I would say it's sort of like grazing. And in this talk, I'll use a sidebar as an example for grazing. But once you get through the simplest whole task, how do you then sort of keep going through a library from beginning to end? And how do you do something that gets you through prerequisites but then lets you come back as you take on different tasks to find the sort of big pieces you need?
And again, I'll hit on a FastAPI and an R for Data Science example to try to flesh out two different focuses for sidebars and grazing: one I'll call front loading and the other I'll call backwards chaining.
plotnine: simplest whole task and sidebar hierarchy
So starting with plotnine, I really want to focus on this idea of this simplest whole task and having a good hierarchy to your user guide. So we looked already at this plot. I'm going to show the plotnine overview page, so sort of the starting page of the user guide and how it sort of carves up this simplest whole task.
So to start, it shows you the plot that you'll be making so you know sort of what you're getting into and hopefully it's like something interesting to you that actually shows off why you might use plotnine. I think this table of contents, this floating table of contents, helps show some of the structure, that there are sort of like seven big categories that the overview will emphasize. And interestingly, in plotnine, each category is sort of a grouping of functions that is prefixed with the category. So geom functions start with the word geom.
So basically, to give you a sense for the overview, what happens is each of these categories is previewed quickly and each of these is just a little bit of text. So a couple sentences followed by a little code and then the output of the code. So people can see, they saw the original plot they'll be making and then they can see it being built step by step. This is called a worked example. You're showing, you're sort of working through the process. You're not necessarily asking people to do things, but you're working it out and you're pointing their attention at things they should care about in these outputs.
So zooming out, this is the sidebar. So these are the sort of like chapters in the plotnine guide. And basically how it works is that for the pieces that were talked about in the overview, plotnine then goes on to dive deeper into each one in a subsequent page or chapter. So you can see this thing called aes, which is short for aesthetic mapping, is then covered in detail. And I'll call this a detail spiral. So basically the overview page covers in brief each of these things and then a subsequent section, basics, goes into detail on each of these pieces in depth. This lets people get a broad sense for how to use your tool on a real task before diving deeper into each piece.
I think this also helps with prerequisites because this means, you know, a lot of these pieces don't just work in isolation, they depend on each other a little bit. So by covering all of them, you get some of the prerequisites out of the way.
For the sidebar I'd say the value is sort of going from things that focus more on prerequisites, things you need to know to get started, to more grazeable, so things that you can sort of sample as you go. Once you get to about geometric objects, after this chapter, you can sort of like click around quite a bit. And so the hope is that the titles and these sections make it easy for people to see different sort of categories of things they might do and quickly key into different activities.
And just to show you by contrast, so this is without the sections. It's actually, I find it pretty hard to read without this breaking apart into sections. And I see this a lot in guides where you just have like 20 things and I find it incredibly overwhelming. So just a little bit of structure I think helps people navigate a user guide. It also puts names on activities which makes it easier to talk about.
FastAPI: task support and backwards chaining
So next I'm gonna look quickly at FastAPI which is a really interesting tool I think with a really nice approach to user guides. So the way FastAPI works is FastAPI is a tool for building, it's a web framework so you can deploy web applications like this one shown in the browser. The tricky thing for a web application is that often there's something in the middle. So usually the process is you might write some FastAPI code and then there's this middle piece called a web server which is not necessarily FastAPI and then you interact with it in the web browser.
So there's actually a lot of context in which you're interacting with different things. You might be in an IDE writing code. You might be using something like, actually I've never said this word out loud, GUnicorn? Well, I mean, one of those two is right to serve it and then you might pop open your web browser to interact with it. So FastAPI has a hard problem because you could be in three different contexts when you're using FastAPI.
The way they handle this is that they give you a simplest whole task which is essentially like doing this as fast as possible. So they just hand you the code and they say copy this into a file and then they just give you this special command, fastapi dev, which runs the web application. So essentially they try to lift you past these pieces so that you can get into the web browser as fast as possible.
And this is a real risk that I see sometimes with these tools: some tools instead will tell you about the middle thing in depth and how you can use it. There are actually, like, probably five different web servers you can use, and some things take the chance to kind of tell you about each one. But I think FastAPI does a good job of just actually telling you about none of them and getting you to the kind of end product. You'll have time later to learn about the middle part but FastAPI wants you to know what it does well for you.
So this is showing a bit of what FastAPI did there. So they gave you task support. They just handed you the code. They just handed you this special command and they got you to pop open a web browser. That's actually a ton of work. I think that FastAPI got you super far and I think if you're new to programming or new to web applications they got you through a ton of stuff really fast.
I think the interesting thing is the next piece they do is they start to work backwards. So rather than diving into the code they say, well you've already got the web browser open so why don't we just tell you about what we're giving you and they explain special FastAPI features like their automatic documentation. So I think this is really neat. They've basically lifted you to the end product and they're explaining to you what it does. Then they loop back and explain the code in detail after they've sort of gotten you to appreciate the end thing and understand what an end goal looks like.
The other interesting thing is their sidebar structure looks like this. This is a bit in the weeds but they talk a lot about concepts but rather than... plotnine basically went in order, like this is thing A, this is thing B, this is thing C. FastAPI does an interesting thing: they alternate through topics. So they start with path parameters, query parameters, request body. These are kind of like the initial prerequisites but then they start looping back to things they think you'll want to use first. So they don't do all the path stuff in depth. They then loop back to querying.
And I think the way that they maintain a sense of structure about FastAPI is they front load all the titles. So notice that path starts chapters 2 and 6. Notice query starts 3, 5, and 7. They sort of use this to flag to you hey, we're alternating through these things but these are like the three big things we care about.
The other thing is they back load subtopics. So notice like these two chapters are about these main topics and then validation. So they're sort of weaving it together and I think they're really cleverly using front loading and back loading to try to flag the structure to you.
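The front-loading and back-loading pattern can be made concrete with a few lines of Python. The chapter titles below are written from memory of the opening of FastAPI's tutorial sidebar and should be treated as illustrative; the helper names are made up for this sketch.

```python
from collections import defaultdict

# Illustrative chapter titles, in sidebar order (assumed, not copied from the slide).
titles = [
    "First Steps",
    "Path Parameters",
    "Query Parameters",
    "Request Body",
    "Query Parameters and String Validations",
    "Path Parameters and Numeric Validations",
    "Query Parameter Models",
]


def group_by_leading_word(chapter_titles):
    """Map each title's first word to the chapter numbers that start with it."""
    groups = defaultdict(list)
    for number, title in enumerate(chapter_titles, start=1):
        groups[title.split()[0]].append(number)
    return dict(groups)


def chapters_ending_with(chapter_titles, last_word):
    """Chapter numbers whose title ends with the given back-loaded subtopic."""
    return [
        number
        for number, title in enumerate(chapter_titles, start=1)
        if title.split()[-1] == last_word
    ]


front_loaded = group_by_leading_word(titles)
back_loaded = chapters_ending_with(titles, "Validations")

print(front_loaded["Path"])   # [2, 6]
print(front_loaded["Query"])  # [3, 5, 7]
print(back_loaded)            # [5, 6]
```

Scanning only the first word of each title recovers the main topics, and scanning only the last word recovers the shared subtopic, which is the cue the sidebar gives a reader skimming it.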
And without something to distinguish you could really run into the case where everything starts to kind of run together. So plotnine used sections. FastAPI used front loading. But you want something to make it clear.
R for Data Science: backwards chaining at scale
Now we're going to look at R for Data Science. Which is a bit funny as a user guide. It's a book that's like hundreds of thousands of words. But I think it's a good example of what is essentially a really large document.
Just for a little background, R for Data Science is a book about using R for Data Science. And it centers on this diagram of this data science workflow or life cycle. Where you start by importing or reading some data like a CSV in. You tidy it up. Maybe you fix the column names or you reshape it a bit as part of the tidying process. You then engage this cycle of transforming, visualizing, and modeling your data so you can understand it. And when you come to something really nice that you want to send your boss or journal you then communicate it. So this is the cycle of understanding. And the book is really structured around this life cycle.
And here's the structure of the book. So this is the introduction. And this is the chapter organization. And the thing I really want to flag here is that first it's got this whole game section designed to teach you about the life cycle. And it starts with visualize. So it actually doesn't start you from the beginning. I think what's notable is that it can't do like a single chapter overview. It needs to break this up because it's pretty big. And it starts with visualize. And I think the real reason is visualization's really satisfying. And it's also pretty close to like seeing an end product.
And then notice it keeps going backwards. So then it goes back to transform. Back to tidying. Back to import. So this is its way of handling a really big job by breaking it up and chaining backwards from the most satisfying or a very like kind of end goal piece to the beginning piece.
And then notice the later sections actually follow this as well. So the first section after this whole game onboarding is visualizing followed by transform. So it's going backwards again to try to kind of optimize for joy, I think. Another interesting thing is that it kind of peppers in these prerequisites. So these are often like coding knowledge you need for the book. But it doesn't want to just front load all the coding pieces because you would hate your life and you wouldn't be learning R for data science. It works them in right before the point that you need it.
Summary
So, yeah, we talked a little bit about onboarding. We talked about plotnine using a simplest whole task, FastAPI supporting people to the end goal, and R for Data Science backwards chaining to teach the most satisfying thing first. We also looked at grazing, where you break up your sidebar to create an information hierarchy so people can see the parts and it doesn't all blend together. You can also front load to emphasize the structure. Or like R for Data Science, if you have a really big job, you can backwards chain and start with the end thing and work backwards.
I think these are really important, powerful moves because tools are complicated and it's really hard to learn a new tool. And people need all the help they can get both to get started and to come back as they need things to handle new tasks. So I think the more we can do to bring people in but also provide a structure that makes really clear how our tool fits the way we want them to think, the better people will be at using our docs through the sort of whole life cycle of the tasks they're trying to do. So thanks a lot for listening.
Q&A
First question is from Kayla. What thoughts do you have on tools to automatically generate docstrings and documentation? Good, bad? Such a good question. Well, what if I told you I myself was the maintainer of a tool to generate API references?
I actually had this as a starting thing. So I maintain a tool to generate API references. Things like this. So like each of your classes, each of your methods. And I would say like people always ask me about the API reference which makes sense because I maintain the tool to build it. But I always think about this. The tool itself, actually I've spent more time on the user guide than the API reference. Like I think that engineers always go API reference first. Like I've never seen an engineer not do API reference first. And my story is like you should go user guide first. Like you can write docstrings but you should just pretend they don't exist until you have a user guide. And that's what I like about FastAPI. They actually had a user guide before an API reference. I just think it's a really good forcing function for thinking about the story you want to tell rather than how each individual piece works.
Peter asks, do you see the role or format of user guides changing when many people's new default way of learning about a package will be to ask an AI chatbot about the package? Yeah, it's a good question. My guess though is that there's so much low hanging fruit in terms of writing user guides and documentation in general that we're not really at the point I think of... There's so much on the table right now that you could do, I think, to improve user guides that it's probably okay to assume that your improvements to user guides will also be picked up by AI either as context or as part of training. And I think also it's a good way to just sharpen your thinking. Like a big piece of, I think a user guide and documentation is having a controlled vocabulary. So I think like having to work out like what are the sections is also a good way to ensure that you're using sort of systematic language and categories. And I suspect that AI will heartily take advantage of that because it will have like consistent words and categories to use.
So Charles asks, is the spiral about trying to keep readers in the zone of proximal development? Yeah. Yeah, I think so. Like essentially... you're like bringing down the difficulty by kind of like building their skill in the things they need. Yeah.
Lauren was wondering, is it useful to have a user guide be part of a repo to facilitate contributions from the point of view of users instead of authors? Is that common? Oh, like you said user guide like in its own repo. What a great question. I actually have a user guide. So the plotnine guide, I own the repo. So I don't know if this counts, but the reason I did this was maybe for that reason, as I said, like, this is such a big job. This is basically like as if you had like a dependency on plotnine or you were like a tool to explain plotnine, like that. I just asked plotnine's author, Hassan, to just like delegate to me the job of running the guide. And I don't know if that's the exact question, but I think relatedly like I was like, I need to be fully immersed in this task because it's a job just as important as writing the plotnine code.
Max has a question. Aside from docstrings and auto-generated API guides, how do you ensure that docs are up-to-date with the code in a continuous integration cycle? So one, Quarto's really useful. So like for the tool, Great Tables, we use Quarto and I'm biased. Quarto's developed by Posit. So take that with a grain of salt. But like we do a lot of code in our documentation here and these are really simple, like markdown that gets executed. So like our docs get regenerated each time and it's just executing the code. And we get, we also get really sad because maybe we do like a code thing and our docs don't generate and then we can't merge it. But that's like the right, I feel like that's the right level of sadness. You know, we like feel it. So it's just using this like little markdown syntax and it gets executed every time on our CI. So I think things like that are useful.
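To illustrate why executed docs act as a CI check, here is a toy stand-in written with only the Python standard library. It is not how Quarto works internally; the function names are hypothetical, and it only handles the simple case of fenced chunks marked {python}. The point is the behavior described above: every chunk in the docs gets run, and a chunk that raises stops the build.

```python
# The fence marker is built at runtime so this example does not nest literal fences.
FENCE = "`" * 3


def extract_python_chunks(markdown_text):
    """Collect the bodies of fenced chunks that open with the {python} marker."""
    chunks, current, inside = [], [], False
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if not inside and stripped == FENCE + "{python}":
            inside, current = True, []
        elif inside and stripped == FENCE:
            chunks.append("\n".join(current))
            inside = False
        elif inside:
            current.append(line)
    return chunks


def run_doc(markdown_text):
    """Execute every chunk in order in one shared namespace; errors propagate."""
    namespace = {}
    for chunk in extract_python_chunks(markdown_text):
        exec(chunk, namespace)
    return namespace


doc = "\n".join([
    "# Getting started",
    "",
    FENCE + "{python}",
    "total = sum([1, 2, 3])",
    FENCE,
    "",
    "Later chunks can reuse earlier results.",
    "",
    FENCE + "{python}",
    "doubled = total * 2",
    FENCE,
])

result = run_doc(doc)
print(result["doubled"])  # 12
```

If a code change breaks one of the chunks, run_doc raises instead of returning, which in a CI job means the docs build fails and the change cannot be merged until the docs and the code agree again.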
And as our last question from Tim, as you mentioned, quality user guides require considerable time investment. How do you balance this time or even keep track as API and best practices evolve and even your end users goals evolve? How do you identify what is the core of a library that needs to be learned and shared? It's a really good question. I mean, I'm thinking of R4DS, like R4DS is all about an activity and it's like hundreds of thousands of words. So it's a bet that that activity is worth talking about. I think it's fair like the fear that your user guide will need to change is sort of like the fear that you've targeted the wrong audience or built the wrong thing. But it's kind of a product question. I would say like you want to really make sure you've built the right thing. And I think that a user guide, I think the thing a user guide would give you is the chance to like run people through the core idea of your tool and to look them in the eyes and see if they're like they look dead inside while they do it. That's my litmus test is I'm like try doing this thing and I watch them like skip parts of the guide or like die inside and I think it's a really great opportunity. You should see people die inside as you build your tool. I think it's a soul enriching activity.

