Resources

Hadley Wickham @ Posit | Giving benefit to people using what you build | Data Science Hangout

We were recently joined by Hadley Wickham, Chief Scientist at Posit PBC. Listen in to hear our chat about building tools (like the tidyverse) to make data science easier, faster, and more fun. 36:57 - While I'm bought into developing open source packages to help deliver better processes, any advice to those of us doing that development in getting their company bought in? You have to give some benefit to the people using (what you’re building) You’ve got to either remove pain or add pleasure in some way because if you can't do that and you're not someone's direct supervisor, it's hard to get people to change. The way I think about the tidyverse is, how do we give people some sort of quick wins so they can be motivated to do the things that are slower where they're gonna have to learn some new ideas or some new tools. You kind of build up some equity with that person. They build trust that you've helped them in the past and now they're willing to invest a little bit more time before they see the payoff. But in the early days, it's all about delivering payoffs as quickly as possible. And I think if you're doing, like, you know “my company's first R package” - the easy pain points are: make themes for your company corporate style guide, make a ggplot2 theme, make an R Markdown, a Quarto theme. Make a Shiny theme that people can just use to get, you know, something that's reasonably close to whatever your corporate style guide dictates. That just feels like an easy win for people because it makes them look good inside the corporation and because you've put in all the hard work, it's like three seconds for them to type the right function name to get the right theme. I think the other bit is making it easier to get access to data. Set up some wrappers around DBI connections to the most important data sources. Provide some conventions around authentication so that stuff just works so that they're not struggling with “What packages do I need to install? What's the password? 
Where's the path I need?” Just give them some, like, a list of the top ten most common data sources and people will love you by and large. *Follow-up question:* Once you identify the things that you think would be useful for people - do you have a philosophy or a way in which you approach putting things together? When you're in an environment of scarcity when you've only got so much time that you can take out of your everyday job to invest in writing a package, it's really tough to balance. Like, how do I add new stuff versus making sure the old stuff continues to work? I think, again, some of it's about building up trust. So, give people some wins so that when you inevitably break stuff, you've got some kind of cushion so people aren't going to be really angry with you right away. They're gonna be like, ok, well there’s a little bit of suffering now, but this person saved me so much time. But yeah, it's really hard. And particularly as you're starting out, like, you're going to make mistakes. That's inevitable. You’re going to do things that when you look back a year later, you're like, why on earth did I do it that way? You’ll want to rip out the whole thing and write it from scratch. And I think that if it feels horrible, you have to remember, that's great. It means you've grown immensely as a programmer. Certainly if you have my kind of mindset, you have to resist the temptation to rip things out and redo them as much as possible and just focus on making the next generation better rather than breaking what stuff people already have. 
So I don't have any great answers here, but I think you just have to think about those tensions of “how do I keep my forward velocity up while getting better as a programmer and evolving over time, but also thinking about how do you make the things you did a long time ago better?”

Sep 11, 2023
59 min

Transcript

This transcript was generated automatically and may contain errors.

Happy Thursday, everybody. And welcome back to the Data Science Hangout. And let's see if we remember how to do this today. We were on a little bit of a break for July, but so, so excited to see everybody back here, hope everyone's having a great week.

If this is your first time joining us today, I know this is a pretty big group here today. Uh, so it's so nice to meet you. Thank you for spending your Thursday with all of us. If we haven't met yet, I'm Rachel Dempsey. I lead our pro community here at Posit.

And the Data Science Hangout is our open space to chat about data science leadership, questions you're facing and getting to hear about what's going on in the world of data across different industries. And so every Thursday we feature a different, uh, data leader from the community and together we're all dedicated to making this a welcoming environment for everybody. So we love hearing from everyone in this environment, no matter your level of experience or area of work.

This is a casual conversation with us all here, but it's totally okay to just listen in too. But there's always three ways that you could jump in and ask questions or provide your own perspective too. So you can jump in by raising your hand on Zoom. You can put questions in the Zoom chat and feel free to always put a little star next to it if you wanted me to read it instead, maybe you're in a coffee shop or something, or I could call on you to ask your question and introduce yourself. And then lastly, we also have a Slido link where you can ask questions anonymously and Hannah or someone from the team will share that in the chat.

Well, I am so excited to be joined by my co-host today for the Data Science Hangout, Hadley Wickham. Hadley is chief scientist at Posit and builds tools to make data science easier, faster, and more fun. You might know Hadley's work for packages like the tidyverse, uh, including ggplot2 and dplyr, and much more. Uh, but Hadley, I'd love to have you introduce yourself, and if you could, tell us something you like to do outside of work too, and a little bit about what it means to be chief scientist here at Posit.

Sure thing. Thanks for, um, thanks for having me, Rachel, for, uh, finally inviting me. It's starting to get a little hurtful, but, uh, glad I could finally make it.

Uh, outside of work, I'd say, uh, I have a dog. That's Lola, a 13-year-old Shar-Pei. And that's pretty much what she does these days: sleep.

Uh, I write books. This is one of my books that I wrote recently, uh, a cocktail book, because I'm also really into cocktails. Uh, and because I'm me, all the data for the cocktails is stored in a YAML file. And then it uses Quarto to, uh, turn it into a book.

So it has a table of contents; the primary organization of the book is by spirit. But, uh, because I'm extra, it also has one index that lists all of the primary ingredients in the cocktails, so you can find, say, something to do with orange liqueur. And then finally it has an index by name, so if you remember the name of the cocktail and want to find it, you can. Um, so it's fixing the things that annoy me about cocktail books: you can never find the cocktail that you're looking for, because they're ordered in some arbitrary way and the indexes are so bad. So thanks to the power of Quarto and, uh, LaTeX and makeindex, it was really easy to make a super highly indexed book, if you ask us.

I have to ask you, what's your favorite from that book?

Uh, the favorite cocktail. The cocktails I'm enjoying the most at the moment are not actually in that book, because I've only discovered them recently. And I can't even remember what it's called, but it's an equal-parts cocktail: there's Rum Fire, Aperol, yellow Chartreuse and lime juice. And Rum Fire, it's this kind of crazy over-proof, uh, Jamaican rum.

Being chief scientist at Posit

I see there's a lot of fans here in the chat, but, um, but Hadley would love to have you share with everybody here. What does it actually mean to be chief scientist at Posit? What do you, what do you do on a daily basis?

Good question. Um, so, you know, these days I manage a team, the tidyverse team, who's responsible not only for the tidyverse packages, but also for many of the packages in the package development ecosystem, so testthat, and then also some lower-level packages that underpin these things, which often live in the r-lib GitHub organization. But management is not my passion. I would say I do okay. Uh, but it's not what I really enjoy and it's not where I want to spend my time.

Um, so yeah, that's kind of what I do on a daily basis. I try to still spend as much time as I possibly can writing both code and books. Um, so I recently felt like I hadn't been getting that much done. It was probably mostly in my head, but then I realized the key reason I felt that way was that I wasn't working on a book project. When I am working on a book, I always get up and that's the best thing I do for the first hour or two in the morning. Uh, and now I seem to be addicted to writing books, so that when I'm not writing a book, I feel weird and uncomfortable. Uh, so now that I've started work on the next book, which is going to be called something like Tidy design principles, uh, things are feeling much better.

So, yeah, so I do some management. I like writing books and then obviously, uh, writing code. Uh, and recently the packages I've been working on are not tidyverse packages; I've just been kind of diving in around Posit's various open source packages and seeing where I can pitch in and help out. So about a year ago, I guess, I helped out on the pins package. And then more recently I've been working on the rsconnect package, which helps you deploy your apps to Connect, and the renv package, um, which helps you create reproducible environments for your R code.

So if anybody ever feels like you're not doing enough, write a book, right?

I will say my advice for writing a book is that it's easy. You just have to write for an hour. It's simple. I should say you have to write for an hour every day for a year, and at the end of it, you have a book. Uh, and as far as I know, no one has ever successfully followed that advice. So, not very good advice.

Managing the tidyverse ecosystem

So we have a really big crew here today, so I'm going to repeat how people can ask questions and jump in too, because I know a few people joined after. It's so nice to see you all here, and it's so nice to meet you. Uh, this is an informal, casual conversation with us all here, so you can ask questions and chat with us in a variety of ways.

Um, but I see Mike Smith, you had a question in the chat. Do you want to jump in here first?

Yeah, thanks. Um, so I don't know if you can answer it directly, Hadley, but maybe you can give us a feel for it. The tidyverse is huge: it's a massive number of packages, a massive number of developers. How does Posit kind of herd the cats and make sure that all of these different development pieces come together and work together effectively?

Yeah, I'd say most of that is just that we talk a lot amongst the team, and we talk pretty frequently about, uh, you know, when we're designing APIs, why do we favor this design over that design? If we've got something that's important or complex, that's something we'll talk over in a team meeting. And having multiple people think about these problems, I think, is just the key, because it really is past the point where any one person can keep it all in their head, and it's very easy to accidentally introduce inconsistencies because you've forgotten how some function you wrote 10 years ago works.

So a lot of it is just that, you know, we're a pretty close-knit team that works closely together; that's really important. Uh, the other thing which I'm working on, which I can pop a link to in the chat, is this book called, um, Tidy design principles, where I'm now trying to write up some of these principles. That's useful for us because it encodes them in writing and we can look them up, but it's also hopefully useful for the community, so you can both see the patterns we follow and get more of a glimpse into how we think about these things.

So the goal is to have both, kind of, these are things you should do, but also, this is what we were thinking when we designed this interface. These are the trade-offs we considered. These are all of the options we considered. This is why we rejected this one or preferred this one. And then, you know, looking back now, do we regret those choices? Would we have made different choices if we knew then what we do now?

Python, R, and the data science ecosystem

One was, have you learned anything particularly noteworthy about the Python community and/or space since Posit's rebranding?

Uh, yes. I don't know how to put this without seeming mean, but I just feel like in Python it's just so hard to do data science. So many things are annoying, from the environment management problem to doing things that I feel should be simple in pandas. I don't know, I sort of continue to be amazed that people are as productive as they are in Python with the tools they have on hand. And so I think one of the things I'm kind of excited about, as Posit starts to contribute to the space, is how can we bring our magic of "it just works" into this space, so that people can just do data science in Python without having to worry about all of these extraneous things that aren't that interesting to data scientists.

Yeah, security is a difficult one, because I think in most of the tidyverse it just doesn't really apply. You know, you don't have to worry about security in ggplot2, for example. We think about it more in the packages that require authentication. So typically, when you're getting data, whether it's from an API or a database, you're going to need some security protocol, and that's going to need to store those credentials somehow. And one of the things I've been thinking about in particular recently, as I've been working more on using our pro products, is how do you make sure those credentials travel both seamlessly and securely from wherever you're doing the analysis to wherever you're deploying the analysis?

So currently I think we're at the stage where it's a lot about how we help our users learn the right conventions. Like, you should never be typing your password into the console, because if you do that, it's easy for it to end up saved in your .Rhistory file, and it's easy to accidentally commit that to Git, and now you've shared your password with a much wider range of people than you should have. So at the moment there's, I think, a lot of use of environment variables, which are a little bit safer, at least because you're not storing them literally in your code. But how can we do better at, you know, providing tools for saving those securely and then, when needed, sharing them across from your computer to Connect or to Shiny apps or whatever? So I think lots of questions there, not a lot of answers currently.
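
The environment-variable convention mentioned here can be sketched roughly as follows. This is an illustration in Python rather than R (in R the equivalent lookup would be `Sys.getenv()`), and it is not taken from any Posit tool: the helper name `get_secret` and the variable name `WAREHOUSE_PASSWORD` are hypothetical.

```python
import os


def get_secret(name: str) -> str:
    """Return a credential stored in an environment variable.

    Raises a clear error instead of returning an empty value, so a
    missing credential is caught before any connection is attempted.
    """
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(
            f"Environment variable {name!r} is not set; "
            "set it outside your script rather than hard-coding the value."
        )
    return value


if __name__ == "__main__":
    # WAREHOUSE_PASSWORD is a hypothetical name used only for illustration.
    try:
        password = get_secret("WAREHOUSE_PASSWORD")
        print("Credential loaded (value deliberately not printed).")
    except RuntimeError as err:
        print(err)
```

The password never appears in the script or in console history; only the variable name does, which is the "little bit safer" property described above.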

Posit Connect and sharing data science work

Yeah. So I would say the reason why you want Connect, its main value proposition to me, is that it helps get your data analyses out of the hands of data scientists and into the hands of decision makers. At the end of the day, it doesn't matter how awesome you are as a data scientist unless you can show your results, your analyses, to the people in your organization who actually need them. So Connect is really about publishing whatever you're doing in R in a form that other people in your organization can see, whether that's Shiny apps or R Markdown or Quarto documents, or turning those into a scheduled report so that people are getting emailed at 9am every day with the latest results in your dashboard. It's all about sharing what you're doing as a data scientist. And ideally, the goal of Connect, which we're not a hundred percent of the way to yet but are trying to get to in the long run, is that you do that just by pushing a button in RStudio or wherever you're doing the analysis. It should just be a single click or a single function call to get that up, so you can stay focused on the data analysis rather than having to worry about running a web server, securing a web server, making sure all the authentication is working, making sure your email jobs are getting sent out correctly, and getting notified when those jobs fail. All that other stuff is really necessary for a product like this to work, but you probably don't care about it as a data scientist, and thinking about it just gets in the way of doing your most important work.

Posit as a B Corp

I'm curious, Hadley, if in the B Corp landscape, are there peers that look like Posit are doing similar things or are you alone? And what's that like to either have some sort of well-trod ground to follow or is it figuring out like every day, like strategically, what does Posit as a B Corp do in this kind of situation? And I'm sort of curious if that's like liberating or if it's terrifying, like what's that like kind of to think about operationally being in that space?

I'd say we do some benchmarking against other B Corps in the places where benchmarks are typically available. Like, I know we do our employee satisfaction surveys, and we benchmark those against other B Corps. But I don't really know if there are B Corps like us. I suspect there are not. There are just not that many software B Corps, and there are not that many open source companies. And I guess, I don't know, that feels more freeing to me than worrying.

You know, certainly a lot of what we do is just the standard software business playbook. But at the same time, removing this drive to optimize short-term profitability gives us the room to, you know, do the things that we believe to be right, as much as possible. It's not always possible; what is right is a very complicated question, especially as the size of a company grows. But certainly we have more freedom to think about that and act on that. And while we certainly still make mistakes, I think everyone is, you know, trying their absolute best.

Finding starting points and measuring progress

So you've taken a lot of solutions to massive pain points that are challenging and non-obvious. I know if I tried to do something like ggplot, I would make no progress whatsoever. So I wonder, how do you find your starting points for implementation? Like, where do you start from and how do you measure your progress and make sure you're moving towards your end goal?

Good question. It feels like a lot of the packages that I work on just have this sense of inevitability about them: these are things that frustrate me, and typically have frustrated me for years, before some kind of solution crystallizes in my head. And then it's just a matter of a lot of iteration. And now I think one of the huge advantages I have is that there's a community of people who are willing to try out whatever crap I produce and give me feedback and tell me, you know, this doesn't work, or I don't understand this, or this is inconsistent with other tidyverse things. And I find that kind of interaction personally highly motivating; when people care enough to critique it, I find that to be, yeah, really motivating.

So I think it's just that, and some weird part of my brain that is collecting these little pain points and after a while realizing that maybe there's some root cause for all of these pain points that I could fix all at once. Other times it's like, let's just, you know, knock them off one by one and then hope something falls out. Yeah, I still feel like for a lot of packages there's a sort of compulsion to create them that occurs inside of me at some point in my life.

Getting buy-in for tidy data and open source packages

Do you have any recommendations for teaching non-coder data generators that it's good to use tidy structured data so we can use their data so much more easily?

Yeah, I feel like the challenge of teaching anything like this is that you always have to figure out what the benefits are to them, not what the benefits are to you, because the benefits to you are typically very obvious but are not motivating to other people. So you need to figure out some angle where, by adopting these new ideas, they can save themselves work or produce something cool. And if you can't find any way that it's going to save them work, maybe there's something you can do with the tidy data automatically to produce some kind of useful plot or motivating figure that will help them say, oh, this is kind of fun. So I think, yeah, try and make it easier or try and make it more fun. Anything else, it's a hard road to get other people to change their minds and to change their processes.

Ben asked a question, I think Ben maybe had to drop for another call, but so I'll ask it. But the question was, while I'm bought into developing open source packages to help deliver better processes, any advice to those of us doing that development in getting their company buy in?

Again, you've got to give some benefit to the people using this. You've got to either remove pain or add pleasure in some way, because if you can't do that and you're not someone's direct supervisor with a lot of direct power over them, it's hard to get people to change. So to me, the way I think about the tidyverse is, how do we give people some sort of quick wins so they can be motivated to do the things that are slower, where they're going to have to learn some new ideas or some new tools? That way you kind of build up some equity with that person. They build up some trust that you've helped them in the past, and now they're willing to invest a little bit more time before they see the payoff. But in the early days, it's all about, you know, delivering payoffs as quickly as possible.

And I think if you're doing, like, "my company's first package", the easy pain points are: make themes for your company's corporate style guide. Make a ggplot2 theme, make an R Markdown or a Quarto theme, make a Shiny theme that people can just use to get, you know, something that's reasonably close to whatever your corporate style guide dictates. That just feels like an easy win for people because it makes them look good inside the corporation. And because you've put in all this hard work, it's like three seconds for them to type the right function name and get the right theme.

And I think the other bit is making it easy to get access to data. So set up some wrappers around DBI connections to the most important data sources, and provide some conventions around authentication so that stuff just works and they're not struggling with: what package do I need to install? What's the password? Where's the path I need? Just give them, like, a list of the top ten most common data sources, and people will love you, by and large.
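
The idea of wrapping connections to the most common data sources behind one convention might look something like the sketch below. It is written in Python with the standard-library `sqlite3` module standing in for a real database driver (the speaker is describing R's DBI); the registry, the `connect` helper, the source name `sales`, and the `SALES_DB_PATH` variable are all hypothetical.

```python
import os
import sqlite3
from typing import Callable, Dict

# Registry of named data sources. The names and the SQLite stand-in are
# hypothetical; a real internal package would wrap the company's databases.
_SOURCES: Dict[str, Callable[[], sqlite3.Connection]] = {}


def register_source(name: str, factory: Callable[[], sqlite3.Connection]) -> None:
    """Make a data source available under a short, memorable name."""
    _SOURCES[name] = factory


def connect(name: str) -> sqlite3.Connection:
    """Open a connection by name, so users never handle paths or passwords."""
    try:
        factory = _SOURCES[name]
    except KeyError:
        known = ", ".join(sorted(_SOURCES)) or "(none)"
        raise ValueError(
            f"Unknown data source {name!r}. Known sources: {known}"
        ) from None
    return factory()


def _sales_db() -> sqlite3.Connection:
    # Location comes from an environment variable (hypothetical name),
    # falling back to an in-memory database so the sketch runs anywhere.
    path = os.environ.get("SALES_DB_PATH", ":memory:")
    return sqlite3.connect(path)


register_source("sales", _sales_db)

if __name__ == "__main__":
    con = connect("sales")
    print(con.execute("SELECT 1 + 1").fetchone()[0])
    con.close()
```

With a helper like this, a colleague only needs to remember `connect("sales")`; which driver to install and where the credentials live is settled once, inside the package.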

Developing packages and managing change

So when you are thinking about sort of developing maybe that first package for your company or something of that nature, once you've identified the things that you think would be useful for people, do you have like a philosophy or a way in which you approach kind of putting things together? Because I know one of the pain points is as you're developing, you realize, oh, what I did originally doesn't quite work and not to tear it down and kind of build it back up again. So I don't know if you have lessons learned from developing so many packages that we could learn from.

Yeah, I think that's a tough one. As a team, we have got vastly better at dealing with deprecations and breakages, and I was reminded of this recently because I had to rebuild the first edition of R for Data Science and it just works: five years later, all the code in that book still works. That certainly wasn't always true. Five years ago, if you tried to do that with five-year-old code in the tidyverse, it wouldn't have worked.

And, you know, that's obviously great to have, but it's also a huge amount of extra work. And we can kind of afford to spend that work now that we're a team of, you know, at least my team is six-plus full-time people who can spend our time thinking about this. When you're in an environment of scarcity, when you've only got so much time that you can take out of your everyday job to invest in writing a package, it's really tough to balance: how do I add new stuff versus making sure the old stuff continues to work?

So I don't have any great advice there. I think, again, some of it's about building up trust. So give people some wins, so that when you do inevitably break stuff, accidentally or on purpose, you've got some kind of cushion and people aren't going to get really angry with you right away. They're going to be like, OK, well, you know, there's a little bit of suffering now, but this person saved me so much time that it's not so bad.

But yeah, it's really hard. And particularly as you're starting out, you're going to make mistakes; that's inevitable. You're going to do things that, when you look back a year later, you're like, why on earth did I do it that way? And now you want to rip out the whole thing and write it from scratch. And I think that feels horrible, but you have to remember that's great, because it means you've grown immensely as a programmer. And certainly if you have my kind of mindset, you have to resist the temptation to rip things out and redo them as much as possible, and just focus on making the next generation better rather than breaking the stuff people already have. So I don't have any great answers there, but I think you've just got to think about those tensions of, how do I keep my forward velocity up while getting better as a programmer and evolving over time?

ChatGPT and AI tools

Ianson, what is your opinion on how it will evolve the data scientist's role in an organization?

I will caveat my remarks: I would say 99% of my ChatGPT use is entirely frivolous. I use it to make things rhyme. I use it to make things sound like pirates. I use it to make things sound like southern grandmas who use a lot of country folk sayings. I use it to make acrostics of people's names. I use it now with the finance person I work with: I submit requests exclusively in the form of office screenplays.

So, I don't know, I think it's incredibly fun for stuff like that. When I try to use it for programming, it's been hit or miss: sometimes it's great, and sometimes it spits out something that looks totally plausible but is utterly wrong. And it's hard for me to tell, is that just ChatGPT today, and in a year's time it's going to be much better? Or is it going to be like self-driving cars, where they're always coming a year in the future and we never quite get there?

Self-driving cars is something I think about a lot, because I have a Tesla, and whenever I park it in the garage, it thinks the random collection of tools hanging on the wall is a semi about to crash into us. And I'm like, it can't even handle this, and people say we're going to have self-driving cars. So I don't know, it's hard for me to say. I'd say overall I'm kind of a skeptic. I think it is going to be useful. In some ways it's going to be really useful when you're learning, because you can take the things that are words in your head and spit them out and say, how do I rotate the x-axis labels on a ggplot, and get something pretty reasonable. But at the same time, as you're learning, it's going to spit out stuff that looks totally plausible but is totally wrong, and you're in the worst possible situation to try and figure out what's going on. So I don't know. We'll see. I think it's super exciting. I have a lot of fun with it, but I do not use it at all professionally.

R on cloud platforms and ggplot2's future

So my question is, what's the implementation plan for R on cloud computing platforms? My team works in Azure. And, how do I say, using R there is very inconvenient at best. Previously, before we moved to Azure, the work of my team was, I would say, 70% in R, 30% in Python. After moving to Azure, it quickly changed to 70% in Python, 30% in R. So Python is just a very natural option for modeling and analysis there. So how do you plan to change this in the future?

Yeah, I mean, I don't know if change is quite the right phrase, because I don't know if we have the power to do that. But certainly what we are trying to do is make R as easy as possible to use everywhere you can possibly use it. And as more and more people move to cloud computing, thinking about how we can make it easy to get set up and configured, and make things just work, is really important. I know one example at the moment is that we're working on a partnership with Databricks to make using R inside Databricks clusters much, much easier. Certainly it seems like there are a lot of challenges getting things set up there that folks have kind of solved on the Python side and no one has put the time into solving on the R side. So I'm hoping that, as a company, we'll continue to invest in those opportunities and just make it easy to use for everyone.

Lisa, I see you had a question earlier on ggplot2, do you want to jump in?

Sure. I think I have two questions about ggplot2, but I'll go with the first one, which is: how do you feel about having so many people contributing to that package, and all the add-ons that have come from it? Are there any things that amazed you and then, on the opposite side, any major frustrations?

Yeah, I mean, I think the thing that kind of blows my mind about ggplot2 is that it's going to come up to its 18th birthday fairly soon. So I'm sort of joking that it will be time to emancipate it, and it's going to just have to survive by itself from now on. And realistically, I'm not that involved in the day-to-day management of ggplot2 anymore. That torch has been passed to Thomas Lin Pedersen. I guess four years ago, I had long been sick of maintaining ggplot2, so I hired Thomas. And now Thomas is getting sick of maintaining it. So we've got to figure out what to do next.

But yeah, maintaining something that is so popular is both a blessing and a curse. It's incredible to know that literally millions of people have used it and it's helped them make graphs and made their lives easier. But it also is immensely challenging to make any changes, because so many people rely on it now and are used to the defaults.

So there's something of both, I think. I don't know if I can quickly pull this up, but one thing I think about from time to time is just how many people it reaches. Yeah, so the most popular page on the ggplot2 website is the geom_bar documentation, which 15,000 people have read in the last month. There are not many ways you can have more of an impact on people's lives than by looking at that page about bar charts and thinking, how can we make that better?

But at the same time, those changes tend to be kind of small and incremental. So it's also fun to embark on completely new projects that no one uses, so you're free to do whatever you want. And in some ways, I think the ecosystem of packages kind of takes a combination of those strengths and those weaknesses, because ggplot2 can stay the same and stable, whereas you can continue to build on top of it with these packages that fewer people use, but that are also much easier to change and more flexible.

Focus time and managing your day

Something that's stood out in my mind, that I've learned from you on internal company calls, is how you handle focus time and the way you manage your own time. And I thought it might be helpful just to chat a little bit about that and to share how you think about it.

Yeah, one funny anecdote: I gave a talk about this idea of focus time at the company, probably a couple of years ago now. And I was like, I feel like the truest expression of my philosophy would be to say, no, I'm not going to give a talk about focus time, because that's going to interrupt my work for the day.

But I really try to create a lot of time for uninterrupted work, because I think programming is fundamentally creative work, and finding the time to process and think things through is really important. So what I do currently is, every day, as I said, the first thing I do is get up and write for a couple of hours, or an hour at least. That's the time of the day when I am most productive. And so I want to spend it on the things I think are most important, not the things that are necessarily most urgent.

And then, you know, I have a team, so I have to talk to them. And I have to talk to my colleagues, who, of course, I love talking to. But it means I can't do other work. So my compromise is I've got two days a week blocked off on my calendar, Tuesdays and Thursdays, where I try really, really hard not to do these things. Sometimes, if there's a bigger meeting with 15 people and it's impossible to find the time otherwise, I'll do it. But I'm generally pretty good about keeping those days clear so I can tackle bigger problems. And I think that's really important.

It's, of course, challenging, because when you've got a big four-hour block of time, you still have to train yourself to resist the temptations of TikTok and Mastodon. Certainly that temptation of Twitter has been totally removed from me these days, so I guess that's good. But you have to train yourself to work hard to stay focused on one thing and not give in to all the other, more fun distractions that you have around you. But giving yourself that space, I think, is really important.

Book recommendations and looking ahead

Are there any books you'd recommend to this audience here or resources you'd recommend that have been useful for you in your career?

This is one of the books that has inspired the tidyverse design book. It's called A Pattern Language. It's actually a book about architecture and how you design houses. It's a very thick book, but it's designed to be kind of skimmable. I think it's just fun and interesting to read. And this system of describing all of these components, the different levels of hierarchy and how you can join them together, I think is very much connected to programming.

I don't know, I think that's the book I've been thinking about the most lately. Some of the other ones, I guess, I've gotten rid of, because I have limited book space, and I kind of like keeping these. I also have this collection of fairly historical books. This one is from 1923. When you look back at what people were doing 100 years ago, I think it's really revealing that, in many cases, the quality of the best visualizations has not improved a huge amount in the last 100 years. But the ease of making them has: now you can whip out 15 variations in ggplot in 15 minutes, whereas before you would painstakingly draw them all out with pen and ink. So I think it's just interesting to reflect on that a little bit. What's really changed in visualization in the last 100 years?

I know we're at the end here, but maybe one inspirational question to ask you: what's something that you're most excited about as you think about the tidyverse and Posit going forward?

I am legitimately really excited about us embracing Python more. You know, I'm still using Python, but I went to a Python conference recently and talked to a bunch of people using Python. And I'm just excited about helping out all these Python-using data scientists, because it feels like there's just a huge, huge scope for this Posit "it just works" type of stuff. And I'm excited to help spread the Posit way of doing things, the Posit kind of community, how we think about people, and try to spread that further and further in the world.

Thank you so much, Hadley. Is that our new slogan? "It just works." I think it should be. We're certainly not there everywhere, but I just think that's the best experience you can deliver. You just try something out and it works. And people stop remarking on that because it just works. They don't have to think about it.

And I will say, I think some of the best packages that me and my team have created are almost invisible, because they just make problems go away and you never need to think about them. You never need to know about them, because that's just some pain you never need to experience. And then you can just listen to all the old folks cranking on about how terrible it was back in the day and how they had to trudge barefoot to school in three feet of snow, and just listen to their stories and be happy you don't live in that environment anymore.

Thank you so much again for joining us today, Hadley. And thank you all so much for joining the hangout today and making this space what it is. Pretty impressive to have over 200 people on a Zoom call and not have to worry about what's happening. And thank you all for monitoring the space, too.

If this was your first hangout, we would absolutely love to have you join us again. They happen every Thursday, same time, same place. Next week, we'll be joined by Mike Lopez, senior director of football data and analytics at the NFL. So he'll be our featured leader next week.

Thank you all so much for joining today and have a great, great rest of the day, great rest of the week. Bye, everybody. Thanks, Rachel. Thanks, everyone. Bye.