Hadley Wickham | {purrr} 1.0: A complete and consistent set of tools for functions and vectors

Transcript

This transcript was generated automatically and may contain errors.

Hi, I'm Hadley Wickham, the Chief Scientist at Posit, and for the purpose of this video, the chief maintainer of the purrr package. So what is the purrr package? It's kind of hard to describe what purrr is, to be honest. It's hard to put into words, but I think I've got a pretty good sense of what feels like a purrr function to me. I think the best way to think of it is as a toolkit for supporting functional programming.

What is functional programming?

What is functional programming? That's also kind of hard to explain, but the chief advantage is that it gives you a bunch of tools for operating on vectors, or pairs of vectors, or triplets of vectors, and so on, where you work on each individual element of the vector in isolation. That allows you to apply a function to each element, knowing that there's no way the other elements are going to affect the computation at all. So many of the functions in purrr are alternatives to for loops.
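As a rough illustration of what "alternatives to for loops" can look like, here is a minimal sketch that computes the same result with a for loop and with `map_dbl()`, plus a `map2_dbl()` call for a pair of vectors. The data and the helper `f_to_c()` are made up for this example and are not from the talk.

```r
library(purrr)

# Hypothetical data and helper, invented for illustration only.
temps_f <- c(32, 68, 212)
f_to_c <- function(f) (f - 32) * 5 / 9

# For loop: pre-allocate the output, then fill it element by element.
temps_c_loop <- numeric(length(temps_f))
for (i in seq_along(temps_f)) {
  temps_c_loop[i] <- f_to_c(temps_f[i])
}

# map_dbl(): apply the function to each element in isolation and
# collect the results into a double vector.
temps_c_map <- map_dbl(temps_f, f_to_c)

# map2_dbl(): the same idea for a pair of vectors, walked in parallel.
widths <- c(1, 2, 3)
heights <- c(4, 5, 6)
areas <- map2_dbl(widths, heights, function(w, h) w * h)
```

Both versions produce the same vector; the map version simply leaves the bookkeeping (allocation, indexing) to purrr.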

For loops are not by any means a bad thing. You should never be embarrassed or ashamed of using a for loop. For loops are great because they're very, very concrete. It's very obvious what you're doing: take this element and do this to it, take that element and do that. And you can very easily think about stepping through that. So the functions in purrr, particularly the map functions, are kind of a step up in abstraction. And that step up is a challenge, right? But I think there are some good reasons to do it.

The first one is that it makes it easy to avoid some performance bottlenecks with loops. Now, for loops themselves are not slow, but in a for loop it's easy to accidentally repeatedly modify or repeatedly extend a vector, and that can be slow, although changes in recent versions of R mean this is much less bad than it has been in the past. So performance is a little bit useful. I think one big advantage of switching to purrr is that you can also switch to, it's not future, but the furrr package. The furrr package has exactly the same syntax as purrr, but it spreads the computation across multiple cores. Because purrr guarantees that all of the computations are independent, it's much, much easier to share that work across multiple cores.
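To make the two points above concrete, here is a small sketch, assuming the furrr package (and its dependency future) is installed. It contrasts a loop that repeatedly extends a vector with `map_dbl()`, then runs the same computation through furrr's `future_map_dbl()`. The worker count of 2 and the toy squaring function are arbitrary choices for illustration.

```r
library(purrr)
library(furrr)

n <- 1000

# The pattern the talk warns about: extending a vector on every iteration.
squares_grow <- c()
for (i in seq_len(n)) {
  squares_grow <- c(squares_grow, i^2)
}

# map_dbl() builds the whole result without any manual growing.
squares_map <- map_dbl(seq_len(n), function(i) i^2)

# furrr mirrors purrr's map functions with a future_ prefix. The plan
# decides where the work runs; here, two background R sessions.
future::plan(future::multisession, workers = 2)
squares_par <- future_map_dbl(seq_len(n), function(i) i^2)
future::plan(future::sequential)  # restore the default plan afterwards
```

For a computation this cheap, the parallel version is likely slower because of the overhead of starting workers; the sketch only shows that the calling pattern carries over.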

The other thing, particularly in this latest version of purrr, is that it gives you access to a really powerful, useful tool, and that's progress bars, which I'll talk about a little bit later. But purrr is definitely kind of an advanced programming tool. You can go a very long way with for loops, but I think if you can master the map functions of purrr, you can write code that is more succinct, more clear, and more likely to be correct the first time you write it.
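As a minimal sketch of the progress bars mentioned above: in purrr 1.0 the map functions accept a `.progress` argument. The slow helper `slow_square()` is invented for this example, and the bar is typically only drawn for longer-running calls in an interactive session, so a short run like this one may show nothing.

```r
library(purrr)

# Hypothetical slow function, standing in for real work such as a
# download or a model fit.
slow_square <- function(x) {
  Sys.sleep(0.01)
  x^2
}

# .progress = TRUE asks purrr to report progress while mapping.
results <- map_dbl(1:20, slow_square, .progress = TRUE)
```

The result is the same as without `.progress`; the argument only affects what is reported while the call runs.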

Now we've got this clear distinction between flattening, which always decreases the hierarchy: when you flatten a list, you still get a list, but the number of items might have changed. And simplification, where you're always going to get something that's an atomic vector or a factor or a date, but you're always going to have the same number of items in there.
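The distinction can be sketched with the purrr 1.0 functions `list_flatten()` and `list_simplify()`. The input lists are toy data chosen for illustration.

```r
library(purrr)

# Flattening removes one level of hierarchy. The result is still a list,
# but the number of elements can change (here, from 3 to 5).
nested <- list(1, list(2, 3), list(4, list(5)))
flat <- list_flatten(nested)

# Simplification turns a list of length-1 elements into an atomic vector
# with the same number of elements.
simple <- list_simplify(list(1, 2, 3))
```

Note that `list_flatten()` removes only a single level, so the innermost `list(5)` is still a list after one call.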