Kara Woo | Always look on the bright side of plots

Transcript

This transcript was generated automatically and may contain errors.

Welcome to RStudio Global. Thank you so much for tuning in. I'm going to talk today about the things that we can learn about ggplot2 from our data visualization mishaps. So if you have spent much time visualizing data, you have probably ended up with some plots that did not go to plan. Maybe you had lots of overlapping blue hexagons, or text that took up the entire area of the plot, or giant hairballs. My name is Kara Woo, and I'm one of the maintainers of the Accidental Art Twitter account, and I'm also a contributor to ggplot2. So I've seen a lot of plots that have gone totally wrong.

The Accidental Art account exists to showcase these plots, the plots gone beautifully wrong, the things that didn't work out but looked cool nonetheless. And I've also seen plots that just straight-up flopped in sometimes really subtle or hard-to-diagnose ways. And so there's this whole spectrum of messed-up plots that I've seen. And I know that these can be really frustrating, but they can also be an opportunity for us to create better plots and to deepen our understanding of the tools that we use to create those plots.

So if you're like me, then when you create a plot like the ones that I've shown, you likely go to a place like Stack Overflow to find someone with a similar problem, find a promising solution, and bring that over into your code to fix the problem that you've had. And that's great, and that's totally legitimate. I'm going to argue though that when you get that working solution, it's worth taking the time to think about how that solution fits into the conceptual model of ggplot2. My feeling is that by examining these messed-up plots that our mistakes generate and how we fix them, we can deepen our understanding of ggplot2's philosophy to create better visualizations faster and more reliably.

This brings us to the crucial piece of the theme system, which is that when you're customizing theme elements, the most specific element wins.

So to get a working plot, we would need to edit axis.text.y.right like this. So instead of setting our customizations on the axis.text.y element, where they'll be overridden by the theme that we're using, we customize axis.text.y.left and axis.text.y.right. And that makes sure that our customizations get applied the way we want them to. And now everything looks good on the plot. These two axes are looking exactly the same on the left and the right.
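The "most specific element wins" rule can be sketched in code. This is a minimal illustration, not the code from the slides: it assumes the default theme_gray(), which (as far as I know) sets hjust explicitly on axis.text.y.right, and the plot, the hjust value of 0.5, and the helper resolved_hjust() are hypothetical choices made for the example. calc_element() is used to show what each element resolves to after inheritance.

```r
library(ggplot2)

# Hypothetical plot with the y axis duplicated on the right-hand side.
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  scale_y_continuous(sec.axis = dup_axis())

# Customizing only the parent element. Any property that the base theme
# sets on axis.text.y.right keeps the base theme's value.
theme_parent <- theme_gray() +
  theme(axis.text.y = element_text(hjust = 0.5))

# Customizing the most specific elements directly.
theme_specific <- theme_gray() +
  theme(
    axis.text.y.left  = element_text(hjust = 0.5),
    axis.text.y.right = element_text(hjust = 0.5)
  )

# Hypothetical helper: the hjust an element ends up with after inheritance.
resolved_hjust <- function(element, theme) {
  calc_element(element, theme)$hjust
}

# Left and right disagree under theme_parent, agree under theme_specific.
parent_left    <- resolved_hjust("axis.text.y.left", theme_parent)
parent_right   <- resolved_hjust("axis.text.y.right", theme_parent)
specific_left  <- resolved_hjust("axis.text.y.left", theme_specific)
specific_right <- resolved_hjust("axis.text.y.right", theme_specific)

# The plots themselves, one per theme.
p_parent   <- p + theme_parent
p_specific <- p + theme_specific
```

Comparing parent_right with specific_right shows the point of the rule: the customization only reaches the right-hand axis when it is set on the element that the base theme itself defines.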

Recap and closing thoughts

So let's recap the ggplot2 mishaps we've talked about. We have mapping mishaps where we accidentally map the data to a single visual element over and over and over again. We have scale snafus, which is where we have two different ways of setting the boundaries of a plot where one of them discards data, one of them does not. And this can affect the statistical summaries that are shown on the plot. And we have theme threats where we try to customize the plot at a higher level of the theme hierarchy that then gets overridden by a more specific theme customization.
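The scale snafu in this recap can be sketched as follows. This is a small illustration under assumptions, not the example from the talk: the data frame df and the limits 0 to 10 are made up, and a boxplot stands in for whatever statistical summary the plot shows. Limits set on the scale drop out-of-range values before the stat is computed, while limits set on the coordinate system only zoom the view.

```r
library(ggplot2)

# Hypothetical data: one group with a single large value.
df <- data.frame(group = "a", value = c(1, 2, 3, 4, 50))

base <- ggplot(df, aes(group, value)) + geom_boxplot()

# Scale limits: values outside the limits are turned into NA and removed
# (with a warning) before the boxplot statistics are computed.
p_scale <- base + scale_y_continuous(limits = c(0, 10))

# Coordinate limits: the statistics use all of the data, and the plot is
# only zoomed to the requested range.
p_coord <- base + coord_cartesian(ylim = c(0, 10))

# layer_data() returns the data computed for a layer; for a boxplot the
# `middle` column holds the median that gets drawn.
median_scale <- layer_data(p_scale)$middle
median_coord <- layer_data(p_coord)$middle
```

The two medians differ even though both plots show the same y range, which is why the choice between the two ways of setting boundaries matters for any summary drawn on the plot.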

We haven't gotten into all of the components of the grammar. We haven't talked about stats, geoms, or position adjustments in that much detail. But if you really want to get deep into the understanding of how ggplot2 works, and all of the parts of the grammar that we haven't covered, you should definitely check out the ggplot2 book. There's a chapter in there specifically about mastering the grammar, and it will really help you build the solid foundation for understanding all of these different components of ggplot2.

So I want to leave you with one last piece of accidental art. It's one of my favorite patterns that comes up over and over again on the Accidental Art Twitter page, this sort of weird geometric pattern that you often see in maps. So here we're taking data on states, trying to plot it on a map, but it's gotten totally messed up. So if you don't know what's causing this error, see if you can find a fix for it, and think about whether that fix tells you anything about how ggplot2 works. For the purposes of this slide, I've not included the full code on the display, but if you want to see the full code for this plot, as well as all of the other visualizations that were included in this talk, have a look at this URL. And thank you so much for tuning in. Good luck with your future ggplots, and I hope that you have enjoyed the talk.

Featured software

ggplot2

rstudio