Resources

Documenting Things: Openly for Future Us - posit::conf(2023)

Presented by Julia Stewart Lowndes

This talk shares practical tips and tangible stories about how intentional approaches to documentation and onboarding are helping big distributed teams tackle hard challenges and change organizational culture via NASA Openscapes, NOAA Fisheries Openscapes, and beyond. The goal is to provide concrete tips to help you document things effectively and to share stories of how putting a focus on documentation can help teams be efficient, productive, and less lonely. I'll give a short lightning talk (inspired by Jenny Bryan's Naming Files talk) followed by stories from NASA Openscapes, NOAA Fisheries Openscapes, and beyond.

Materials:
- Slides: https://openscapes.github.io/documenting-things
- Blog post: https://openscapes.org/blog/2023-09-27-documenting-things-posit-conf
- Website: https://openscapes.org (links to NASA Openscapes, NOAA Fisheries Openscapes, and beyond)
- Jenny Bryan's Naming Files talk: https://github.com/jennybc/how-to-name-files#how-to-name-files

Presented at Posit Conference, Sept 19-20, 2023. Learn more at posit.co/conference.

Talk Track: Getting %$!@ done: productive workflows for data science.
Session Code: TALK-1092

Jan 25, 2024
19 min

Transcript

This transcript was generated automatically and may contain errors.

Hi everybody, I'm going to be sharing today about documenting things and how doing this with intention can really help big organizations do hard things around tech and culture.

So I'm Julie Lowndes. I am a marine ecologist, and I grew up in the R and open science communities. Documenting things has been a big part of my shift from doing my own research to supporting other research teams doing research. I do that now through a program called Openscapes, and this is a huge collaborative effort with many, many people whose work I'm going to be sharing.

So let's talk about documenting things openly for future us. So first of all, things is all of the things. This is your code and analyses, your teaching resources, your onboarding and community documentation, your field work, your lab notebooks, events that you've got, your blog posts, like you name it. And future us is really a mindset and a habit around the idea that you're doing this work for yourself, for ourselves, for our teams, for our communities and really thinking about that on different time scales in the next hour, in the next week, in the next decade.

So this is really getting beyond the idea of working by yourself on your own laptop and really thinking about how to be intentional and inclusive in the work that you do so that others can benefit as well.

So documenting things does not have to be painful. That's one of the things I'd like to share today. And in fact, it's supposed to be helpful. It's supposed to be there for you so that you're not having to repeat the same mistakes or frustrations that you or others have had in the past. It does, however, take time and intention. It takes support of yourself and others, and it really means kind of slowing down briefly in order to speed up for yourself and others over the long term.

So really the purpose today is to help you document things effectively and also hear stories about how documentation can be visible and valued and help teams be efficient, productive, and less lonely. So the structure of this talk is going to be a five-minute lightning talk with practical tips, and this is inspired by Jenny Bryan's Naming Files. And Jenny Bryan has been such a role model and teacher about how to document things well and share with others, and so I'm trying to bring some of that here. And then we'll have 10-minute stories about repeatable strategies from community organizations at NASA and other places.

Practical tips for documenting things

Have a place. Have an audience in mind. Design for readability and accessibility. First, have a place. It doesn't matter where at first. Just write it down somewhere, in some kind of software you're already using. Have a place there.

Then having that place will let you write as you go. You'll be able to develop a habit of writing things in that place, and one of the ways you'll do that is by copying and pasting things you're already writing. Take it from that Slack message, that email. This will help you break down documentation that otherwise can seem like a big looming task that might take weeks to do.

Keyboard shortcuts are great for cutting and pasting text from one place to another. You'll be able to write in a modular way. Writing in small bits is less daunting, and it's easier to maintain collaboratively. You'll be able to write these little bits in a way that can be networked so that you can link back and forth to them, because you don't need a linear flow when you're writing documentation. This will also help you follow the concept of don't repeat yourself, which is something that we can borrow from coding, where you try to write a piece of code only once, as a function, and then call it wherever you need it. There's often also no true order to what you're documenting, and so you'll be moving things around as you develop.
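As a small illustration of the don't-repeat-yourself idea borrowed from coding, here is a minimal Python sketch. The function name `format_temperature` and the values are hypothetical and not from the talk. The point is only that the conversion is written once and called wherever it is needed, the same way a documentation snippet can live in one place and be linked from many.

```python
def format_temperature(celsius: float) -> str:
    """Format a temperature once, so every report uses the same wording."""
    fahrenheit = celsius * 9 / 5 + 32
    return f"{celsius:.1f} C ({fahrenheit:.1f} F)"


# Call the single definition instead of copying the formula into each report.
surface_reading = format_temperature(18.5)
deep_reading = format_temperature(4.0)
```

If the wording or the formula ever needs to change, it changes in one place, which is the same benefit you get from linking to one modular piece of documentation rather than pasting it into several pages.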

Having an audience in mind is key here. You're writing this for someone. That someone is probably you and many others, with different audiences at different entry points, so you can make your documentation engaging. It doesn't have to be dry or distant. Writing in an inclusive tone is part of this, as is having the mindset that we're in this together and that your readers are intelligent people who are here to learn from you. So you can really consider your goals for your documentation and the style that you want to have that can welcome readers.

You can avoid specific words that trivialize pretty complex things, like "simply clone the repo." There's a lot to unpack in a statement like that, so think about how to meet folks where they are. There are a lot of thoughtful resources that you can emulate, and these are just a few.

Narrate code in small chunks if you're presenting code. You can narrate your code the way you would say it out loud if you were teaching it, and you can match that tone with your purpose, especially as you're trying to highlight specific parts of your code for learners to focus on. We learned this morning that Quarto has a new code annotation feature to help with this as well.

Sharing early is a big part of writing documentation. You want this to be useful for folks as soon as possible, and you want to be able to iterate on it, receive feedback, and incorporate that feedback to improve it. This is where the idea of open comes in, and open does not need to mean public. Open can mean shared with your team, and you're able to leverage different technologies and the permissions that come with them so that you can share what you want broadly and also keep things private. I think about this as leaving breadcrumbs for yourself. In this example, this is public Quarto documentation for NOAA Fisheries where they're still able to include internal links that are accessible only to them, but we're all able to benefit from this documentation.

Lastly, design for readability and accessibility. And we're able to leverage a lot of defaults on the software side and best practices on the community side in order to support these goals.

Using section headers is one thing we can do here. These are important for screen readers to help describe the sections and flow of a document, and you're able to anchor to them directly so that you're able to share a URL specifically anchored to a certain section of your documentation. Naming things can be really key here, and you're able to embrace the idea of the slug where the name of the header becomes the URL so that you're able to give a little bit more information about what that header is.
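To make the slug idea concrete, here is a minimal Python sketch of how a header could become a URL anchor. This is an assumption-laden illustration, not the rule any particular tool follows: Quarto, GitHub, and other renderers each have their own slug rules, and the helper names `slugify` and `anchored_url` and the `example.org` address are hypothetical.

```python
import re


def slugify(header: str) -> str:
    """Turn a section header into a lowercase, hyphen-separated anchor."""
    text = header.strip().lower()
    text = re.sub(r"[^a-z0-9\s-]", "", text)  # drop punctuation and symbols
    text = re.sub(r"[\s-]+", "-", text)       # collapse spaces and hyphens
    return text.strip("-")


def anchored_url(page_url: str, header: str) -> str:
    """Build a link that points directly at one section of a page."""
    return f"{page_url}#{slugify(header)}"


# Hypothetical page address, used only for illustration.
link = anchored_url("https://example.org/docs", "Write as you go")
```

A descriptive header such as "Write as you go" yields an anchor that tells the reader what they will find before they click, which is the benefit of naming headers with care.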

Use text formatting in your documentation. This will help your readers follow along. You're able to hyperlink to the thing that is important for people to look at rather than a word like here or this that can be ambiguous and hard to see as well. You're also able to distinguish code with fonts and use markdown in this case as well.
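One way to check for ambiguous link text is a small script. The sketch below assumes plain Markdown inline links of the form `[text](url)`. The list of vague words and the function name `find_vague_links` are illustrative choices, not something from the talk.

```python
import re

# Link text that gives the reader no hint about the destination (illustrative list).
VAGUE_LINK_TEXT = {"here", "this", "click here", "link"}

# Matches [text](url) but skips images, which start with "!".
LINK_PATTERN = re.compile(r"(?<!!)\[([^\]]+)\]\(([^)\s]+)\)")


def find_vague_links(markdown: str) -> list[tuple[str, str]]:
    """Return (text, url) pairs whose link text is a vague word."""
    vague = []
    for text, url in LINK_PATTERN.findall(markdown):
        if text.strip().lower() in VAGUE_LINK_TEXT:
            vague.append((text, url))
    return vague


sample = (
    "Read the slides [here](https://openscapes.github.io/documenting-things) "
    "or visit the [Openscapes website](https://openscapes.org)."
)
flagged = find_vague_links(sample)
```

In this sample only the first link is flagged. Rewriting it so the linked words name the destination, such as "the slides", would resolve it.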

Using alternative text for images is really important. This is something that describes the details of the take home messages of the visuals. So it's different from a caption in that it will actually describe that a round hedgehog is knitting a yellow sock, a rabbit with a teal beanie is wearing one yellow sock, and it watches in anticipation, and a shelf to the left of them contains yarn in a tote labeled text, and knitting patterns in that tote are labeled code.
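A quick check for missing alternative text could look like the following Python sketch. It assumes plain Markdown image syntax, `![alt text](path)`, where the bracketed text is read as the alternative text. Some tools treat that text as a caption and take alt text from a separate attribute (Quarto, for example, has `fig-alt`), which this sketch does not inspect. The function name and file names are hypothetical.

```python
import re

# Matches ![alt text](path) and captures both parts; alt text may be empty.
IMAGE_PATTERN = re.compile(r"!\[([^\]]*)\]\(([^)\s]+)\)")


def images_missing_alt_text(markdown: str) -> list[str]:
    """Return the paths of images whose bracketed alt text is empty."""
    return [
        path
        for alt, path in IMAGE_PATTERN.findall(markdown)
        if not alt.strip()
    ]


page = (
    "![](hedgehog-draft.png)\n"
    "![A round hedgehog is knitting a yellow sock](hedgehog.png)\n"
)
missing = images_missing_alt_text(page)
```

A check like this only finds images with no description at all. Whether the description conveys the take-home message of the visual still needs a human reader.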

NASA Openscapes: documentation enabling cloud transition

So putting a focus on documentation has enabled NASA to collaborate across divisions and support users that are transitioning to Earthdata Cloud. So as a reminder, NASA collects an amazing amount of data about our Earth. They have many satellites and missions that are collecting data that are freely available for users who are studying our planet, who are studying climate change, who are studying public health, agriculture, sea ice melt, a lot of awesome and important work.

And these data have been traditionally downloaded by researchers in order to do analyses locally. And after many years of effort and planning, these data are being migrated to the cloud in AWS West 2, and this means that researchers are going to be taking their compute to the cloud in order to analyze these data. This means that this is a big shift in the way researchers have worked their entire careers, and it's also a shift for the NASA staff who support researchers in terms of how they teach users to access and work with these data.

And teaching the NASA Earthdata Cloud is happening by learning as a community as we go. The NASA Openscapes project is a project that I lead with my co-lead, Erin Robinson, and we are supporting a mentor group across the NASA Earth Science Data Centers. Those folks are creating and teaching common tutorials that researchers are using to migrate their analytical workflows to the cloud.

So this is a huge effort around connecting folks from different parts of an organization, having them be in a comfortable environment to learn together, to teach together, to write documentation together. There's a lot of emphasis that we put on building a psychologically safe space to ask questions, to have a growth mindset where we're able to learn and do together.

So first of all, we have a place that researchers can go to learn when to cloud and how to think about using the cloud with R and Python and other approaches. This is the Earthdata Cloud Cookbook. It's made with Quarto. It's hosted in a GitHub repository that we're able to contribute to with Jupyter Notebooks, R Markdown documents, and otherwise. So this is where we started: by having a place to contribute to.

Our audience was really important. We realized that it needed to be us first, and then specific researchers. And so this diagram shows that we really invested in our NASA Openscapes mentor community in order to write documentation that would let us figure out how to use the cloud. We were then able to support other staff across the NASA data centers towards a specific workshop with researchers, all the while contributing to the open science community through this documentation.

In terms of design, we're really emphasizing meeting people where they are. We don't assume experience with GitHub, with Quarto, with cloud, with Python, or with R, but we do have the expectation that everyone can learn and teach these things. And so we're able to support folks coming from where they are and point them to the skills that they might need in order to contribute or learn.

We're also able to collaborate on this documentation across workflows because we're using Quarto. We're able to combine Jupyter Notebooks and R Markdown documents, and to contribute from the home places where we're most happy working, whether that's VS Code, Jupyter, RStudio, MATLAB, or otherwise.

One other design element is that we're really reusing and complementing existing work. We found this documentation system framework to be a really helpful way for us to write documentation. It helped us think about what was learning-oriented, which we wrote as longer-form tutorials, versus what was problem-oriented and could be quick how-to guides with copy-and-paste code.

Another part of complementing existing work and learning from others has been to be a part of the broader open science and open source community. This has really helped NASA mentors learn what's available, scope out what the bounds of the contribution around documentation should be, as well as co-develop and share and get that early feedback.

Intentional onboarding and growing the community

So putting a focus on documentation has really enabled this NASA Openscapes community to grow through intentional onboarding over the last three years. The point is that we're really trying to build a sustainable community within NASA that's able to support researchers in the long term and to have this documentation so that folks can contribute.

So there have really been three places for three different audiences, all designed for onboarding more NASA staff into this mentor community so that they're able to support researchers. The first place is a project website for NASA Openscapes. The primary audience is leadership and managers, so they have an idea of all of the work that the mentors are doing, and we're really trying to make a lot of the invisible work more visible. We share all the slides and resources and workshops in one place so that there's visibility there.

In terms of onboarding mentors, the website is there to invite participation, so that the invitation and all the information that goes along with it don't live within a forwarded email but in a place online that can be updated if necessary.

The second place for onboarding mentors is actually in that Earthdata Cloud Cookbook itself. We have a contributing section that was one of the earliest places that we developed, and that's really a welcome for new mentors once they join this community. That's where they learn how to access the communication channels and tech we have on GitHub, Slack, and JupyterHub, as well as how to contribute and think about code review, pull requests, and that kind of thing.

And then we have a third place for onboarding, and that's really in our Openscapes team approach guide. This is where we write documentation at a higher level for ourselves, where we're thinking about our facilitation approaches, the purpose of doing the different things, our timelines, our checklists. And this has been the last thing that we have had the bandwidth to document, but we're dedicated to documenting this.

So what's possible from all of this is that this mentor community is growing and taking more leadership of the community because they're able to not only repeat what we've done, but iterate and go beyond. And they're not having to start from scratch about timelines or onboarding text for emails and whatnot.

But beyond this, what's neat is that putting a focus on documentation has enabled NOAA Fisheries, California Water Boards, and other groups to see this and repeat this whole process. This has all been incredible and exciting work, and what's been neat is that the NASA Openscapes mentors are part of this bigger cross-governmental group of people who are teaching each other, learning about what works and what doesn't work, and having a shared place to collaborate.

So some of the advice that we have as this broader mentor community is to really invest in both technical and social infrastructure when you're thinking about how to document things. Documenting things will ultimately save time. There'll be fewer emails from people asking where things are. People will hopefully feel stuck for a shorter amount of time. And they'll really feel less lost and like they belong.

And that's a really critical thing just in general, but specifically during this pandemic and hybrid work culture. So this does, though, require intention and time, and this investment in psychological safety and growth mindset is really important for documenting things.

So when you come back to thinking about documenting things openly for future us, it's have a place, have an audience in mind, and design for readability and accessibility. So thanks so much.

Q&A

So you mentioned several times that there are different levels of documentation, right? There's the technical stuff, there's the high-level stuff, and everything in between. Are the kinds of recommendations you gave for getting it done, getting it on the ramp, the same for all these different documentation types?

That's a good question. For the teaching resources that the NASA Openscapes mentors are putting together, that process is a bit more intentional. Well, it's not a bit more intentional. It is more intentional. So it's done in a branch, we have dry runs, we review the materials, we test things in different places, and then that is merged back into the cookbook.

When we're writing documentation on that higher level about what sort of Google Doc we should have in a one-on-one call, for example, that might be something that we kind of push to main pretty quickly, and I'm okay with that being a little messier, because it will still help future us on our team level. So yeah, there are different levels of review and quality assurance, absolutely, is the short answer.

So, as you said, documentation takes time, and it is something that gives back over a long time span. Can it be difficult to get management and other levels to carve out the time, so that you don't feel like you have to prioritize it and make space for it yourself?

Yeah. Yeah, I think it's kind of a "yes, and" approach, where the idea of showing what documentation can do is important. So if you take your messages from Slack and paste them into your Quarto book or your Google Doc, and you can show that to folks as being useful, that can help with the second step of advocating for more time to document. Having that early thing to show, that minimum viable product, can be really helpful. But yeah, I think advocating for this as part of your time and part of your job is a big part of this.