J.J. Allaire - Publishing Jupyter Notebooks with Quarto | PyData Seattle 2023

Transcript

This transcript was generated automatically and may contain errors.

I'd like to introduce Allaire here. Thanks. Thank you very much. And thanks everybody for coming.

I'm here today to talk about publishing Jupyter Notebooks with Quarto. I'm J.J. Allaire. I'm from Posit. Tracy will be in front. Just for Tracy. She's from Posit. Posit used to be called RStudio. So you may not have heard of Posit, but you have probably heard the name RStudio.

So, we're going to talk a lot about Quarto today. But before we talk about Quarto, I want to talk at a very high level about Jupyter Notebooks and what is special about them.

And for those of you who are users of Jupyter Notebooks or curious about Jupyter Notebooks, passionate about Jupyter Notebooks, I really strongly recommend you read this paper that was written by Brian Granger and Fernando Perez who actually created Jupyter Notebooks. And they talk about what's special and distinctive about Jupyter Notebooks.

And of course, as we all know, and as we all probably benefit from all the time, Jupyter Notebooks are an interactive computing environment. But more than that, they (and this is my emphasis here) help humans to think and tell stories with code and data. Narrative and writing are a fundamental part of using a Jupyter Notebook.

And it really is important in two dimensions. How many of us have found that writing about how we're going to solve a problem actually affects how we solve the problem? Or that writing about what we're going to do in the code actually affects what we do in the code? And how many of us have produced output, a visualization, a model result, various figures, where additional context is required to understand what we produced? And that's the telling stories part.

The importance of narrative in data communication

So reflecting a little more on this idea of telling stories, and telling the whole story, some of you may have read Edward Tufte's pamphlet, which is sort of a takedown of the reductive style of PowerPoint for communicating about data specifically. He has a bunch of examples in the pamphlet. One of them actually has the PowerPoint deck that was used to greenlight the ill-fated space shuttle launch. And some of the figures they presented were very much presented in isolation, with some very, very short bullets on a slide. And there's actually quite a bit of important context that was needed to understand those figures. So he's urging us to do more when we communicate about data.

Actually, a podcast that I just heard the other day was an interview with Bill James. How many of you here have heard of Bill James? So you've probably heard of the movie Moneyball. And Bill James was actually the person who kind of started the revolution that led to Moneyball. And anyway, he kind of invented sort of this idea of deep data analysis of baseball and sports. But he's actually, as things have gone on, become kind of disillusioned with where that movement has gone because it's also become quite reductive, where people are obsessed with producing a single number, wins above replacement, that fully encapsulates the value of a baseball player.

And when he used to write and do data analysis about baseball, he had quite a bit of narrative that went along with it. So this idea of narratives that describe our assumptions, constraints, and qualifications, and that go along with the data and visualizations that we present, is really fundamentally important. And that's this idea of telling stories, computational narratives, telling stories about data. And that's kind of where Jupyter fits, and also, as you'll see in this talk, where Quarto fits.

So before I get into some of the specifics, I want to acknowledge that there's quite a bit of history in building tools to help people with sophisticated technical narratives and interactive computing, and with weaving those together. That goes back to TeX, but it also goes back to notebooks, various implementations of Markdown, and Jupyter itself. And all these tools have, I think, really woven themselves together in recent years to create a very compelling environment for computational narratives and storytelling with data.
