Great Tables: Make beautiful, publication quality tables in Python | Rich Iannone & Michael Chow

Transcript

This transcript was generated automatically and may contain errors.

All right, hey everybody, I'm Michael Chow and this is Rich Iannone, and we wanted to talk today about how making beautiful, publication quality tables in Python is possible in 2024. And we're really hoping that at the end, you don't just think that they're possible, but a total delight.

So just to give you a little bit of background, we are software engineers at Posit PBC, and between the two of us, we have two PhDs, two dogs, and five cats, and we are fanatics of table display. We're just so interested in beautiful, publication-ready tables. We each have two and a half cats. That's the joke.

Yeah, what I mean by display tables and publication-ready tables is less of the table on the left, which is a Polars data frame. So this is like a raw data frame. If you're a data frame plumber, you're a data scientist and you're working on the data, this view makes sense. You know, the column names have underscores in them. You can see the types of the columns, and you can really see the nitty-gritty raw values.

We mean less of that kind of table that has its guts hanging out for analysis, and more of the table on the right, which is a table that you might send to someone if you're trying to convey something about the data. So this is the table we'll be talking about. It's a coffee table, and we'll be walking through it today.

Just to give you a sense of where we'll be going, first we want to talk about tables as data visualization, then key ingredients to creating a beautiful display table, and then last, some more advanced considerations when creating tables for display. So I'll be talking about our table goals. So basically the first part of the talk.

Tables as data visualization

And I want to start just by showing you a few beautiful tables I got from the internet. So the cool thing about this table here is that it's pretty visually stunning. It has all the parts that I like in a table. Like look at this header. Basically it just has a title and a subtitle, but what it does is it explains the purpose of the table. Before you see anything else, you sort of see what the table really is supposed to drive at.

Another cool thing I like is that there's a spanner. What that means is there's a label above a number of columns. It just gives you some sort of grouping and some structure to the table. And another cool thing is these team logos. They quickly convey the identity of each row. And formatting. That's a thing here. We see that we have percentage values, and they're nicely formatted for readability. And the really cool thing is that this row here, the second row, is highlighted, and it essentially draws attention to the main subject of the table. I won't get into what it is right now, but it's a key part of the data visualization. It gives you the focus and it drives you to the insight pretty fast.

And so I have another table from the same author, coincidentally. What really grabbed me about this table is that there's bar charts inside this table. You can see right here, quite a few of them, they're front and center. And what they do is they enable really fast comparisons between values here. If you just have the numbers, like the percentage values, it wouldn't be so highly readable, but we have both. We have the bars plus the precise values.

Another cool thing is that we have lots of rows, but what the author did was subdivide the rows into categories or little groups for a better organization. So that's great. If you want to focus on, like, the box scores or the shooting part or some advanced stats, you can focus on them one bit at a time without the entire table overwhelming you.

Another cool thing is that we have a footer here, and in that footer, the author chose to put in some footnotes, and that provides additional detail. It didn't have to be put in the text. It could just be put right in the table. So that's a really cool thing that the author did.

Okay, here's a third beautiful table I grabbed from the internet. This is sort of a big one. It's a lot of numbers, but the cool thing about it is that there's essentially a heat map in this table. We see lots of color, and the color is actually pretty good because it tells you what the values are without you having to look and parse each and every single value. You sort of get an idea just from, like, the colors presented.
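The idea behind a heat map in a table is a mapping from each value to a background color. The sketch below shows that mapping in plain Python using linear interpolation between two colors. The function names and default colors are hypothetical. Great Tables offers this kind of coloring through its data_color() method, and this sketch is not how that method is implemented.

```python
def blend(low, high, t):
    """Linearly blend two (r, g, b) colors; t=0 gives low, t=1 gives high."""
    return tuple(round(a + (b - a) * t) for a, b in zip(low, high))


def value_to_hex(value, vmin, vmax, low=(255, 255, 255), high=(0, 100, 0)):
    """Map a value in [vmin, vmax] to a hex color between `low` and `high`."""
    # Guard against a zero-width range, then clamp to [0, 1].
    t = 0.0 if vmax == vmin else (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    r, g, b = blend(low, high, t)
    return f"#{r:02x}{g:02x}{b:02x}"


# Color a column of made-up percentages from white (lowest) to dark green (highest).
values = [0.12, 0.48, 0.95]
colors = [value_to_hex(v, min(values), max(values)) for v in values]
```

Because the color carries the magnitude, a reader can scan the column for light and dark cells before reading any number.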

And there's a number of good things I like about this table. Besides the heat map, we have nicely formatted percentage values, and we noticed that if you look really closely, the percentage values have decimal alignment, which makes for easy readability of the values, which is really a nice touch. And of course, the heat map, like I said before, it helps you scan the values, and it really aids comparisons without having to read every single value.
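Decimal alignment can be approximated in plain text by giving every value the same number of decimal places and right-aligning the results. The sketch below is a minimal illustration with a hypothetical helper name; it is not how the table in the talk was produced.

```python
def align_percentages(values, decimals=1):
    """Format proportions as percentages whose decimal points line up."""
    # A fixed number of decimals means every string has the same tail length.
    formatted = [f"{v * 100:.{decimals}f}%" for v in values]
    # Right-aligning to the widest string then lines up the decimal points.
    width = max(len(s) for s in formatted)
    return [s.rjust(width) for s in formatted]


aligned = align_percentages([0.054, 0.5, 1.0])
for line in aligned:
    print(line)
```

With the decimal points in one vertical line, the magnitude of each value is visible from how far its digits extend to the left.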

So the really cool thing about these tables is they're not made in Illustrator, they're not made in some graphics program, they're made from code. And of course, the benefit from that is that we get a reproducible workflow. You go from input data to analysis to visualization, which is this, and you can bring that right to reporting. And you don't have to, you know, go to another program, dump the value, dump the data out, and then, like, you know, use some other tool. You have one continuous reproducible workflow, which is very cool.


So how did we get here to this point where we can make tables from code? Well, we had to create an API, but before we created that API, we had to look at the state of the art in terms of what's been done for tables. Surprisingly, and this is something that made me a little bit upset, there weren't many texts at all on table design. So we had to look really, really hard. You know, lots of books here. These are great for everything else. But luckily, we found one book, which was great. It's a pretty old one, but it's a great one. It's the Census Manual of Tabular Presentation.

What does it do? Well, it essentially dials the concepts on table display to 11. It provides many solid and useful recommendations. It has nearly 300 pages of, you know, just table stuff, which is really great. The most important thing is right away it formalizes the structure of a table. We don't have to go too far. It's actually in figure two of this whole book. It gets right down to just showing you what the different parts of a table are called and how they're structured and what they do.

So we took that because these are really great ideas, and we adapted it to a sort of a more modern, you know, version of a table. And we got this. So we have the different parts, like the table header, the stub, and table footer, and all the other parts. And we gave them names, which was super important.
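One way to see why naming the parts matters is to model them as a data structure. The sketch below is a hypothetical plain-Python model, not the Great Tables internals: the class name, field names, and text renderer are all assumptions made for illustration, using the header, stub, and footer named above.

```python
from dataclasses import dataclass, field


@dataclass
class DisplayTable:
    title: str                 # table header: title
    subtitle: str              # table header: subtitle
    column_labels: list[str]   # labels above the body columns
    stub: list[str]            # row labels, one per body row
    body: list[list[str]]      # already-formatted cell values
    source_notes: list[str] = field(default_factory=list)  # table footer


def render_text(table: DisplayTable) -> str:
    """Render the named parts top to bottom as plain text."""
    rows = [[""] + table.column_labels]
    rows += [[label] + cells for label, cells in zip(table.stub, table.body)]
    # Pad each column to the width of its widest cell.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = [table.title, table.subtitle, ""]
    for row in rows:
        lines.append("  ".join(c.ljust(w) for c, w in zip(row, widths)).rstrip())
    lines += [""] + table.source_notes
    return "\n".join(lines)


example = DisplayTable(
    title="Coffee sales",
    subtitle="By product",
    column_labels=["Revenue"],
    stub=["Latte", "Mocha"],
    body=[["$10"], ["$7"]],
    source_notes=["Source: example data"],
)
print(render_text(example))
```

Once each part has a name, an API can offer one operation per part, which is the approach the method names in Great Tables follow.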

So how do you make tables today? Sort of like looking at how people do it. You may take a raw data frame. This is, I believe, a Polars data frame. You can present it to other people like this, just like, here you go, the raw data frame. This is your results. Not really recommended because it doesn't look too good and also doesn't have all the data. It's kind of like not so great. It's a bit raw.

Another idea. This is what a lot of people do. I'm guilty of this. You can bring your data to Excel and then make the table there. Now, the problem is, and this is what I discussed before, is that now your workflow is not reproducible. It's a bit broken in that way. So not great. I mean, you got the table done, but, you know, at the expense of like breaking your reproducible workflow.

So our idea is you can use an API like Great Tables and work entirely in Python. It's reproducible. Probably less effort overall. And the tables look really good. So you're not losing anything by keeping it in Python. You're actually getting a lot because it's pretty great.

Okay. So Great Tables. It's our package. It's focused purely on the display of tables. That's its only concern. It's definitely not the only approach in Python. But we do promise you that it's comprehensive, actively developed. And it tries to go deep on all table related problems. So throughout the rest of this talk, we'll use Great Tables and illustrate the process and design behind making presentation quality tables. And for that, I'm going to give you Michael again, and he'll focus on the key ingredients of making a great table.

Wrapping up

Sending a polished table is a chance to make a really great first impression, and it tells people it's worth paying attention to.

We talked a bit about the reproducibility chain, which you've probably already put a lot of work into. You worked really hard to be sure that your data and your analysis and a lot of your visualizations are already in Python. The neat thing about Great Tables is now you don't have to pluck your data into Excel. You can keep going with your Python scripts and have one single sort of tool chain to go from input data to reporting.

The best way to get started is to pip install great_tables. We have a website, which is posit-dev.github.io/great-tables. We've tried to really fill it with a lot of examples. And along the way, as we've developed Great Tables, we've tried to blog and document what we think are some core problems that we've been chipping away at and trying to solve. So hopefully there are a lot of really nice nuggets there about table display and things we've learned along the way.

Thank you so much for watching this. And we hope that you have a lot of beautiful table use cases that you'll be able to throw Great Tables at. See y'all. Hope you make a lot of beautiful tables. Keep making those tables.

Featured software

Great Tables