
The Curse of Documentation (Michael Chow, Posit) | posit::conf(2025)
Speaker(s): Michael Chow
Abstract: In Greek mythology, Tantalus was doomed to stand with a lake of water below him and branches of fruit close above. When he went to drink the water it receded, and when he reached to eat the fruit it was blown beyond his grasp. What he needed was forever at arm's length. Software documentation often puts users in a similar bind. The information is there, but something doesn't quite connect. Maybe you try and fail to adapt an example to your use case. Maybe it's unclear how a bunch of functions fit together. In this talk, I'll discuss how effective user guides, like R for Data Science and the React.js guide, break the curse. I'll focus on three factors behind effective guides: strategic information, inductive learning, and task sequencing.
Transcript
This transcript was generated automatically and may contain errors.
My name is Michael Chow. I'm a software engineer at Posit PBC, and I'm really excited to talk to you today about how documentation is cursed.
And what I mean by that is I'm really interested in documentation for open source tools. And during this talk, I want you to think about some of the documentation you've looked at for different tools you've been interested in. Have you ever looked at a documentation site for a tool and felt like it's loaded up with information, but it's not really the information you need? I'd say that documentation site is cursed.
I think we have to go back 2,500 years to the Greek myth of Tantalus, who was cursed to be put under a tree where branches of fruit hung right above his head, and there was water at his feet. But when he reached up to grab the fruit, it was blown beyond his grasp. And when he went to drink the water, it receded so he couldn't reach it. So what he needed was forever at arm's length.
And I think that cursed documentation is worse than regular documentation because it tantalizes users with the promise of the answers they need. But ultimately, we often can't get the understanding that we need to solve our problems. So you might see docs, but it might be really hard to figure out how everything fits together.
And what I'll talk about today is I think that in 99% of the cases, the solution is a really good user guide. So I wanna go into what a user guide is and does, and then five steps for creating a user guide.
What a user guide is and does
All right, so to do that, I'm gonna use Great Tables as an example. And I'm glad Rich talked, so he showed off Great Tables a little bit.
It's really hard, I think, to solve this problem and to do a good user guide. And so I just wanna flag what this looks like in Great Tables. So Great Tables has two really recognizable pieces. It has this examples gallery, which is actually our most popular page on the site. And then it has this API reference, which is all the little pieces laid out.
But I would say we probably spent the most time teasing out this piece, the user guide. So to show what a user guide is and does, let's go through the examples and reference as if those were the only documentation we had.
So let's say you're looking at Great Tables for the first time, and you pop open the examples page. This is the most popular page because it shows you at a glance real-life use cases. It really highlights why you would use Great Tables and realistic things you could do with it.
So you might pick out this table as a really interesting table. This is our coffee table, which represents the sales of a fictitious coffee equipment supplier. And so this shows off a lot of nice things about Great Tables. It shows off some different elements, like making a title and filling backgrounds and styling things like bolding stuff.
So let's say you wanna take this and adapt it, and maybe you look at the code a little bit. And to dig deeper, let's say you open the API reference. So you go there, you hit this page, and basically what the API reference is, is it's a big list of functions categorized a little bit, and it has a lot of pages that just go into detail on all these tiny pieces.
But what I'm showing here is not the full story. As it turns out, API references are pretty big. So even this isn't quite it. It's actually something more like this. So you've got this neat example, and you've got 105 functions, and I think at this point you're a little bit cursed. You have a really nice example in your hand. You have 105 tiny things, but I would say what you're missing is these sort of like middle-sized chunks, the strategies to be able to break down examples into the tiny pieces, and know how to remix things for your problems.
And that's exactly what a user guide does.
So I think that takes us to user guides. How can we break this curse? If I have one really neat example in my hand, and the absolute deluge of 105 little tiny functions, how can I find what I need?
So I think the key is that a user guide can really help you understand the big structure of examples and the big pieces for your problems. So for example, if we're onboarding you to this example, we might flag that there's a key piece which is structure in Great Tables. So that's something like this title and these column spanners, these high-level labels, or the fact that you can kind of clean up column names. These are all structure activities in Great Tables.
We might also point out there's formatting. So you can take these values and you can turn them into currencies, or you can turn them into things like percentages. We also might point out that there's another activity called styling. You can color the background of a column, or you can bold a row of text.
So the point here is that we've flagged three big activities you can use as an in-between layer, between the tiny chunks and the examples, to help you understand the overall structure of Great Tables. And I'll call these strategic components. These are these kind of like middle chunks.
And I want to flag that this example, I'll say this is a really good use of induction. We're using a case study or a concrete example to teach you about general ideas.
The other thing I want to point out that's a challenge for user guides is task sequencing. So this is the table that we might onboard someone with, but I would say it's super hard to find, even figure out like which example do we start with, or how do we kind of weave through an example to break it down for people.
But I would say this is the crux of a user guide, is trying to get like a really good onboarding so someone has an end-to-end example, and then being able to sort of, instead of flag 105 functions, have say like 10 to 15 big concepts that you want people to know, so that they have this kind of digestible way to break stuff down.
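As a rough sketch of the "middle-sized chunks" idea, the calls in an example can be tagged by activity so a reader sees three groups instead of a flat list of functions. The method names below are Great Tables methods, but the mapping, the helper `group_by_activity`, and the call sequence are illustrative assumptions based on how the talk describes the coffee table, not something taken from the library or its docs.

```python
from collections import defaultdict

# Assumed mapping from method name to the activity named in the talk.
# This table is hand-written for illustration; it is not part of Great Tables.
ACTIVITY_BY_METHOD = {
    "tab_header": "structure",    # table title
    "tab_spanner": "structure",   # high-level labels over groups of columns
    "cols_label": "structure",    # cleaned-up column names
    "fmt_currency": "format",     # values shown as currencies
    "fmt_percent": "format",      # values shown as percentages
    "tab_style": "style",         # background fills, bold text
}


def group_by_activity(calls):
    """Group method names by activity, keeping the order they appear in."""
    groups = defaultdict(list)
    for name in calls:
        groups[ACTIVITY_BY_METHOD.get(name, "unmapped")].append(name)
    return dict(groups)


# Hypothetical sequence of calls an example like the coffee table might make.
example_calls = [
    "tab_header",
    "tab_spanner",
    "cols_label",
    "fmt_currency",
    "fmt_percent",
    "tab_style",
]

chunks = group_by_activity(example_calls)
for activity, names in chunks.items():
    print(f"{activity}: {', '.join(names)}")
```

The same grouping could be applied to any example in a gallery to check whether a small set of concepts really covers it.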
Five steps to breaking the curse
All right. So that's the crux of what a user guide is and does. It's those intermediate pieces and the need to get a really nice onboarding.
I think what I want to spend the rest of my time on is five steps I think are important to breaking the curse, or how I would approach designing a user guide.
So I find user guides really tricky. So working with Rich on Great Tables and Pointblank, one interesting thing was, Rich had worked a lot with these libraries, and I came in with no idea what they did. And so what we had to do is, Rich just had to hit me with example after example, to kind of like beat me over the head with examples until some idea of the pieces came out.
And so I'll say this five-step process sort of emerged from me trying to figure out how do we pull a user guide out of these libraries I had never used before. So I'll say the first step is collecting examples, followed by scoping concepts, so breaking the examples apart, and then sequencing an onboarding, figuring out how you would get a new person to understand the crux of what your library does. And then detailing all the remaining concepts or the big pieces left over, and finally drafting a workshop, which I think is kind of like the documentation form of touching grass, being sure that your documentation's meant for humans, and you can see when they have a bad time, or a good time with it.
Step 1: Collecting examples
All right. So the first is collecting examples. I think the key here is to get the "WRST" examples. That's W-R-S-T: you want examples that are whole, real world, simple, and typical. So you want a real, complete example that you might see in the wild that's not too bananas to hit people with, that someone doing this type of thing might actually do. So it's typical.
And I'm sure you've seen in docs, one challenge is sometimes it has a lot of examples, but they're little tiny pieces, and the pieces don't add up. So I think that ensuring it's whole and real world makes sure that you're tackling real problems. And simplicity is all about making sure that a newcomer can actually digest it. It's efficient for new learners.
So we settled on these three tables as reference points. And I think Rich showed this off in the last talk. So this is one he created on chemical compounds. Rich has a PhD in chemistry, so this is a table that's sort of real to him. And this was useful to see some of the additional things you could do with Great Tables.
And then this one we pulled off social media from Grant Chalmers. This is a carbon emissions table. And one neat thing is it's sort of like a blend between a table and a heat map. So it introduces some really interesting concepts. How can you create a heat map table and do things like make sure the columns are the same width and some other nice things.
And I will say we had really great inspiration. So we were lucky to have the 2024 Posit Table Contest to reference. So Rich and another person named Curtis Kephart have run a contest for the last few years to get people's neatest, WRST-iest examples. And this was really great inspiration to see what's the range of things people might do and sort of variation across real use cases.
Step 2: Scoping concepts
All right, so that's collecting examples and doing your worst. Next, I would say scoping concepts. So breaking them apart.
All right, so I showed this. And the key here I just want to emphasize is that mapping concepts is so important. As Rich just produced example after example, I feel like we had them in whiteboarding software online and we were just marking them up like crazy.
When I show you this idea of structure, format and style, this idea has caught on a bit. We've seen people use it when they're teaching workshops and it's really been helpful for structuring our docs. But I can't emphasize enough that when we started this task, we didn't know these were the concepts we'd use. We actually had to do tons of diagramming and examples to understand that we might actually use these concepts.
The other thing is Rich produced this really great domain model. So he dove into this idea of table structure and he produced a diagram of all the different parts that you might customize in a table, all the different individual pieces of table structure. And I think this is a really neat way to emphasize core pieces of strategy and sort of bubble up like a cheat sheet that people can reference and use to understand your tools.
All right. And all this adds up to for breaking the curse is filling in these missing pieces. So for us, structure, format, and style are really nice things we can kind of like hang our hats on. They're just a good size for explaining things. You can pull up any example and ask what are the structure, format, and style pieces. And we even organize our code using these concepts.
Steps 3 and 4: Sequencing, onboarding, and detailing concepts
All right. So that's the idea of getting the examples and breaking them apart into concepts. I would say once you're at this stage, you're in a good place to sequence an onboarding.
And I won't go into detail on onboardings. And I will also note that actually immediately after this, I have a PR to open on the Great Tables docs because I've cheated. As it turns out, library maintainers are not great with their docs all the time and that includes this guy.
So I need to really flesh out the Great Tables onboarding. I feel like this talk was the inspiration I needed. But I will emphasize onboarding's been discussed a lot and there are a lot of really nice examples you can find of it out there. So I'd say R for Data Science, the book R Packages, and plotnine, another guide I've worked on, all have really thorough onboardings that you can use and reference.
But in general, a lot of these really great user guides I've seen have emphasized this concept of the whole game. So teach the whole kind of end-to-end picture so people have the big pieces and the kind of like sense for the big game before you go into the individual bits.
All right, so I'm gonna kind of skip sequencing for the sake of time. But I would say the emphasis there is to ensure you do the whole game. So you take your favorite WRST-iest example and basically turn it into an onboarding.
All right, so the last things are detailing the remaining concepts and drafting a workshop. And I think detailing remaining concepts is one of my favorite topics because I think most people can get to onboarding pretty quickly. But I feel like there's this kind of like now what sense sometimes after you give people like their first onboarding piece.
And one neat technique I've used is this idea of a concept matrix. There are sort of like hidden concepts in your documentation. So what I do sometimes is I put the function names on rows and then I'll put the different parameters things could take on columns. And this is just to emphasize shared structure across the package.
So an API reference usually goes horizontal. It'll take one function and describe it in detail. But what I like about the concept matrix is that I think it flags pieces in the guide where there should be guide pages. So I would say guide pages should be vertical. They should do slices that emphasize structure shared across functions or pieces that you might not see in the docs.
All right, so the concept matrix helps you by exposing these sort of like vertical slices. And the example I'd say here is we have a page on column selection. We noticed functions across categories, structure, format, style. A lot of them can select columns of a table to work on. And this actually involves a little bit of, there's quite a few options you can do for selecting columns. So it was nice to put it in its own page where it could just flag to users, this is an activity. And then we could even flag some of these functions to show them these are different types of places you might end up doing it.
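The concept matrix described above can be sketched with the standard library's `inspect` module: functions on rows, parameters on columns, and any column with more than one mark is a candidate for its own guide page. The four functions below are trimmed-down stand-ins that borrow Great Tables names; their signatures are simplified assumptions, and `concept_matrix` and `shared_parameters` are hypothetical helpers, not part of any library.

```python
import inspect


# Stand-in functions with simplified, assumed signatures (illustration only).
def fmt_number(columns=None, rows=None, decimals=2, use_seps=True, locale=None): ...
def fmt_currency(columns=None, rows=None, currency=None, use_seps=True, locale=None): ...
def tab_spanner(label=None, columns=None): ...
def tab_style(style=None, locations=None): ...


def concept_matrix(functions):
    """Return (parameter names, {function name: [True/False per parameter]})."""
    params = []
    for fn in functions:
        for name in inspect.signature(fn).parameters:
            if name not in params:
                params.append(name)
    matrix = {
        fn.__name__: [name in inspect.signature(fn).parameters for name in params]
        for fn in functions
    }
    return params, matrix


def shared_parameters(params, matrix, min_functions=2):
    """Parameters that appear in at least `min_functions` rows (vertical slices)."""
    return [
        name
        for i, name in enumerate(params)
        if sum(row[i] for row in matrix.values()) >= min_functions
    ]


params, matrix = concept_matrix([fmt_number, fmt_currency, tab_spanner, tab_style])
shared = shared_parameters(params, matrix)

print("columns of the matrix:", ", ".join(params))
for fn_name, row in matrix.items():
    marks = " ".join("x" if hit else "." for hit in row)
    print(f"{fn_name:<14}{marks}")
print("candidate guide pages:", ", ".join(shared))
```

In this toy version, `columns` shows up across both formatting and structure functions, which is the kind of vertical slice the column selection page covers.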
Another interesting one is this formatting pattern, that there are these parameters use_seps and locale. And locale, I've actually never said that word out loud. But I've thought it a lot. So locale is shared by most of the format functions. And another interesting thing is it interacts sometimes with different parameters. So to show you how locale might work, take a number like 1,234. In English, you might separate the one with a comma. In German, you might use a period, and then you might use a comma where the English period was. And if you use no seps, you might not have a comma at all. So essentially locale determines how these numbers get formatted, and it interacts with other things.
So in this case, sometimes it's really useful to flag these dynamics. A user clicking through your reference is just gonna hit this definition over and over. So our docs would just tell you what a locale is over and over and over. But we can have one page that actually surfaces this dynamic across a bunch of different things and shows you where it comes up.
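The interaction between locale and separators can be sketched in a few lines of plain Python. This is not how Great Tables implements formatting; `format_number` and the `SEPARATORS` table are hypothetical, and only the parameter names `locale` and `use_seps` mirror the ones mentioned in the talk.

```python
# Assumed separator conventions for two locales, for illustration only.
SEPARATORS = {
    "en": {"group": ",", "decimal": "."},
    "de": {"group": ".", "decimal": ","},
}


def format_number(value, locale="en", use_seps=True, decimals=2):
    """Format a number with locale-specific grouping and decimal marks."""
    marks = SEPARATORS[locale]
    # Start from English-style text, with or without grouping separators.
    text = f"{value:,.{decimals}f}" if use_seps else f"{value:.{decimals}f}"
    # Swap both marks in a single pass so one replacement cannot clobber the other.
    table = str.maketrans({",": marks["group"], ".": marks["decimal"]})
    return text.translate(table)


print(format_number(1234.5, locale="en"))                   # 1,234.50
print(format_number(1234.5, locale="de"))                   # 1.234,50
print(format_number(1234.5, locale="de", use_seps=False))   # 1234,50
```

Even in this toy version, `use_seps` changes what `locale` has to do, which is the kind of cross-function dynamic a single guide page can explain once.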
Step 5: Drafting a workshop
All right, so I'm gonna go quickly on the very last workshop piece. This is all to say, for the user guide, we usually break it down by topic. For the workshop, we go by case study. And the case studies usually are able to link to the guide. But this has been really helpful for us because people can graze the user guide. But then the case studies we use to kind of say, what's the slice across topics you need to do the next most interesting thing?
All right, so that's the whole idea of the five steps. Thank you so much for watching. If you find any cursed docs, please send them to me. I'm an avid collector. And thanks so much for watching.
Q&A
We have time for one question. So how much documentation is too much documentation?
I don't know. I don't even think I should document an answer to that. But I don't think you can have too much. But I do think the odds of it becoming cursed grow the more documentation you have.
Okay, actually I have time for one more if you're good with that. Do GT and Pointblank use the quartodoc package? If yes, will it replace pkgdown for R in the future?
Yeah, that's a good question. I'm the maintainer of this package called quartodoc to create an API reference in Python. So Great Tables and Pointblank do use it. But pkgdown is all R packages. So I don't think it's gonna replace pkgdown anytime soon. As a Python developer, I love R, I actually absolutely love R and use it. But I don't have room in my heart to think about stepping on Hadley's territory. I feel like he's pretty set, you know?
Okay, great. Thank you and let's thank all of the speakers for today's session.

