
The dessert-first approach to teaching data science | Mine Cetinkaya-Rundel | Data Science Hangout
ADD THE DATA SCIENCE HANGOUT TO YOUR CALENDAR HERE: https://pos.it/dsh - All are welcome! We'd love to see you! We were recently joined by Mine Çetinkaya-Rundel, Professor of the Practice at Duke University and Senior Developer Advocate at Posit PBC, to chat about her cake-first teaching philosophy, strategies for communicating technical results to non-technical stakeholders, and career advice on learning new tools later in life, among other things. In this Hangout, we explore Mine's unique approach to explaining difficult topics to students. She describes her method as "let them eat cake first," where she shows learners the final, compelling result (like a visualization) before teaching them the "ingredients" or technical steps required to get there. By establishing the motivation and the destination first, she finds students are more willing to invest the effort into learning the necessary coding and data cleaning logic. Resources mentioned in the video and zoom chat: Mine's Coursera 4-course series: Data Science with R Specialization → https://www.coursera.org/specializations/data-science-r TidyTuesday (Weekly Data Project) → https://github.com/rfordatascience/tidytuesday, The Test Set Podcast → https://podcasts.apple.com/us/podcast/the-test-set-by-posit/id1823736938 If you didn’t join live, one great discussion you missed from the chat was about the transition from Excel to coding languages like R and Python. Attendees agreed that Excel often acts as a "gateway drug" to data science, shared war stories about managing massive VLOOKUPs (laugh/cry emojis for everyone), and debated the undeniable and lingering utility of spreadsheets for communicating with business stakeholders. Spreadsheets will never go away and many of us are totally ok with that because we still use them every day! 
Timestamps
00:00 Introduction
04:42 "Why R initially and why still R?"
10:30 "How do you balance it all?"
15:28 "How do you stay up to date on newer techniques?"
20:13 "What is your approach for explaining difficult topics to students?"
25:58 "How do you identify appropriate datasets for beginners?"
32:21 "How frustrating is it when you see statistics being used in a misleading way?"
38:26 "How to communicate with people from different fields?"
44:23 Career advice
48:28 "Have you ever convinced an organization to abandon Excel?"
49:25 "How is statistics viewed these days in the context of things like AI and ML?"
52:38 "Was there a tipping point where you felt experienced enough to give keynotes?"
Transcript
This transcript was generated automatically and may contain errors.
Hey there, welcome to the Posit Data Science Hangout. I'm Libby Heeren, and this is a recording of our weekly community call that happens every Thursday at 12 p.m. U.S. Eastern Time. If you are not joining us live, you miss out on the amazing chat that's going on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience.
I would love to introduce you to our featured leader today, Mine Çetinkaya-Rundel. She is a professor of the practice at Duke University and a senior developer advocate at Posit PBC. Mine, I would love it if you could introduce yourself. Tell us a little bit about what you do, all the multifaceted things you do, and something you like to do for fun.
Yeah, thank you so much for having me. This is such a lovely crowd. I'm seeing some familiar faces and some new faces, so that's all great. Yeah, so as Libby said, I'm a professor at Duke University. I'm in the statistical science department. I've been here for 15 years, and I primarily teach introductory data science and data visualization with sort of a focus on statistical thinking as well as sort of programming. So I fell in love with sort of coding and programming much later in life myself and found it a struggle at the beginning to love it, to be perfectly honest. So I feel like I don't know if it's the scars from that first year of graduate school or just the mere fact that I've always enjoyed working with new learners. That has been my niche area, and I've also been working at RStudio slash Posit for many, many years now.
As part of that work, I have worked with the Shiny team, I think, when I first started, then primarily with the Tidyverse team. I've worked with Quarto and Positron as well, and oftentimes I find myself sort of in the space of there's something new and exciting happening that probably needs some learning materials, probably needs some documentation, and probably would benefit a lot from people sort of like working with it, maybe teaching their students with it and bring back some feedback from it. So I found this space where I have sort of a foot in both doors and get to navigate the exciting space of data science day-to-day from learners to builders to be an exciting place to be. And I also do lots of teaching outside of the university. I teach courses on Coursera, I like teaching workshops, so whatever conference I'm going to, I always try to see if I can manage to, you know, teach a workshop there and get to interact with folks who are learning new things, maybe not in a university course setting, but in other settings and for other reasons in their lives.
So it's possible I've crossed paths with some of you as a workshop instructor. I feel like I recognize a couple of faces in that way as well. I think we'll meet for one. I took your workshop a couple of years ago. And you asked for one other thing from me. Yeah, something you like to do for fun. What do I like to do for fun? Honestly, lately I like to do Pilates for fun. Is that too lame? That has been my thing that I've decided I'm going to try to make time for, but I also really love to spend time with my kid, particularly building Legos.
I like that it's hard. I can't think of anything else during that hour. I think that's why I enjoy it. It sounds very meditative. Yeah.
Why R?
All right. Well, let's hop into questions. My first question for Mine is, why R initially and why still R? Yeah. So my very initial intro to R was, I think, boot camp for graduate school, where we were told, hey, you probably know this. This is meant to be a reminder for you. And I don't think I had written a single line of code in my life at that point. I think I had maybe written a little bit of code. I worked for two years as an actuary prior to grad school. And we had like an in-house language. I remember it being called Ginsu, like a knife, because you used it to sort of chop data.
So yeah, I learned R because I was in a statistics PhD program. That's how it started for me. So I would say that the first few years of me using R was very much lm() and then go on, glm() and then go on. You fit a model. You get the sort of the results you need. But throughout that, I've also taken some computational statistics courses and realized that I am not ashamed to say the following. And I wear this badge very proudly. I'm pretty good at Excel. I had to get very good at Excel in my actuarial science job. And I'm still pretty good at it. And there are certain things that I use it for, not data analysis, but sort of like compiling data and whatnot. And when I realized that things I liked to do in Excel are a lot easier to do if you can actually code your way around it and document as well, I feel like that's when I decided this is something that I enjoy.
During the time that I was working, I wrote a lot of documentation because I inherited a lot of projects where you had to talk to another human. You had to make sure that human was in the office in order to get the information you need. And I was always wondering, why didn't anyone write down exactly what needs to happen? Then I realized, well, actually, if you code it, you don't have to separately write it down as well. It's sort of like in the code itself, what is happening. And that's when I really realized, oh, this is great. This is something that's worth investing time in.
I think for a lot of people also, it's finding out about Quarto and parameterized reporting where they're like, oh, it can all be one step. It can all be one step where I'm doing the analysis, I'm doing the report, and I'm also having it parameterized out to multiple things, like a report for every state or a report for every school in your district or whatever. That's the click that I see for a lot of people where they're like, oh, this is worth investing the time to figure out how to not just do it in Excel, even though I'm really good at Excel and I'm a spreadsheet queen.
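As a rough illustration of the "report for every state" idea, here is a minimal Python sketch that builds one `quarto render` command per parameter value. The file name `report.qmd`, the parameter name `state`, and the output naming scheme are hypothetical; the sketch assumes the Quarto document declares a `state` parameter and relies on the Quarto CLI's `-P name:value` and `--output` options. It only builds and prints the commands rather than running them.

```python
import shlex


def build_render_commands(qmd_path, param_name, values):
    """Return one `quarto render` argument list per parameter value."""
    commands = []
    for value in values:
        # Turn "North Carolina" into "north-carolina" for the output file name.
        slug = str(value).lower().replace(" ", "-")
        commands.append([
            "quarto", "render", qmd_path,
            "-P", f"{param_name}:{value}",      # set the document parameter
            "--output", f"report-{slug}.html",  # hypothetical naming scheme
        ])
    return commands


# Hypothetical inputs: one report per state.
commands = build_render_commands("report.qmd", "state", ["North Carolina", "Ohio"])
for cmd in commands:
    print(shlex.join(cmd))
```

On a machine with Quarto installed, each argument list could be passed to `subprocess.run(cmd, check=True)` to actually produce the reports.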
Absolutely. And my starting using R predates R Markdown even. However, I, to this day, remember the useR conference where the keynote was about R Markdown. This was in Nashville, I want to say 2011 or 2012, something like that. And I remember tuning out about minute 10 of that keynote because I was like, I'm going to use it right now. Like, I need to start using it right now. I cannot wait for this talk to be over.
And I cannot imagine like sort of being dropped into this ecosystem with all of this tooling here. I think it would be an absolute no brainer to just like dive right in and see how many problems it solves for me. Well, what's wild is that then 10 years later in 2022, I think you were giving the keynote on Quarto, which is where a lot of us heard about it for the first time. That's a full circle moment. Lots, lots of full circle moments. And it was a great delight to be able to give that keynote with Julia Lowndes as well. It was really fun working on it, but I also think Quarto is one of the projects where I feel like I had like not just the opportunity, but genuinely the privilege of being involved with it from like day one. So I would be like testing things out as they were being developed. And it was just so nice to be able to sort of like play with something that I knew I was going to use basically every single day of my life going forward.
Balancing it all
How do you balance it all? Do you ever say no? And how do you do that?
I'm not sure. I think that I don't have a great answer for how I balance it all. I think if you ask folks that are sort of like personally close to me, they will tell you that I don't balance it all. And that I'm sometimes at my computer an obscene number of hours a day. But I will say one thing I have found over the years is, where I feel like it's possible for me to be picky about projects, to pick things where I can really bridge gaps in between and look for opportunities to sort of see, I'm supposed to do project X, and I'm supposed to do project Y, is there a way to sort of like, bring at least parts of those together so that I can sort of be productive on both even when I'm not like actively working on both.
Oftentimes with teaching, the way I do this is by if I'm working on some like new tooling, for example, writing documents, testing it out, I try to see, is it feasible for me to bring it to the classroom? I will say that the answer depends a lot on features of the class. Like, I think there are certain things that may be too premature to bring to an introductory classroom compared to say a more advanced smaller classroom where I might have like more one-on-one time with students, for example, but I always try to ask myself that question and I always try to sort of be brave about it, not in a way that should in any way diminish the student experience.
As an example, I'm working quite a bit on Positron things nowadays and I decided I'm going to teach my course in Positron because I'm teaching a slightly smaller, more advanced course right now, and it was after I could sort of like convince myself, no, this will be a valuable like investment for the students to have learned this tool, whether they use that very tool or not, the skills they'll get from having been in that IDE will help them with whatever they might want to do next, and I feel like whenever I can say yes to sort of getting over that threshold, I'm happy to jump in there with them. It does require a bit of a willingness to sort of troubleshoot with students one-on-one, but I often learn a lot during that time.
And do I ever say no? I do, I do say no. I try to learn to say no. I mentioned Stephanie knows I don't always because I did not say no to hosting useR! this year, which Stephanie, I was able to wrangle into sort of being on the organization committee with me. So not always, and I try not to for things I care deeply about, and that was a good example of one of those things that I care deeply about, but I do try to be selective, particularly when I feel like there may be someone else I can recommend. So once I realized it's hard for me to say no, I decided, well, if I really want to say no and I'm going to feel bad about it, maybe I can put in a little bit of time or reach out to my network a little bit and be like, do you know of anyone who might be interested in this? Because if I could just pass on their name, that will let me off the hook guilt free, and maybe, hopefully, it might be an interesting thing for them to be asked to do.
Staying up-to-date on new techniques
Yeah, I think that maybe compared to about five years or so ago, the landscape of how I stay up-to-date is different, and I do sort of long for the days where there was one place I could go to, and I knew that #rstats was really active there, and I could learn about all the new things. So it has taken a little bit more time, I feel like, for me to find nuggets of information here and there. I never thought I was a podcast person, but I've been trying to make myself listen to podcasts more. I hadn't really used that format for things related to professional stuff, for example, but I realized that it's kind of nice to sort of hear not just names of things, which might be what I might be getting from a social media post, but also a little bit of a description of how people are using these tools. So I have been trying to listen to some podcasts. I do weekly tune into the, or it's not necessarily weekly, but whenever I can, tune into Posit's The Test Set podcast, for example. I feel like I've learned new things from there that I wanted to try out, for example, that I think is neat.
Additionally, I've been using Bluesky more, honestly, just trying to catch up with that. And whenever folks have sort of like lists of things they have done, I enjoy that. And I particularly like following folks who teach as well, and can have examples from not just like where they're using certain tooling as part of their work, but how they have integrated it into their teaching. I find those to be the most aspirational for myself. And one other thing that I do to myself, I don't know that I would recommend this, but if there are educators among us, I say, give it a try. The best way to try out a new thing that you have already decided is worth your time, but you're just like not able to carve out the time for, is to put it on your syllabus. And now you've made it public, and now you've got to live up to it, one way or another.
So I'm teaching my advanced data visualization course that is in R this semester, for the third time, I think, at Duke. And I've decided the last three weeks will be Python data visualization. I've already promised it. It's going to have to happen. So I will do Matplotlib and Seaborn, sort of to give a, I think what I have dedicated is one lecture on sort of data structures and stuff, just so we can like get a data frame in there. And then one on sort of the landscape of tooling that's most popular that you're probably going to come across, but then dedicate much of the time to plotnine because, well, I'm a ggplot2 fan, so that's where my heart is.
The dessert-first teaching approach
What is your approach for explaining difficult topics to students? Let them eat cake first.
Yeah. So, my approach generally is to first sort of show, try to communicate why I think this is an important thing to learn. And that's usually without even calling out the name. So, I'll try to give sort of an example from my class, maybe to make it a little bit more concrete. I don't know about you all, but for me, regular expressions are a difficult concept. I know in the age of ChatGPT, they may be a little bit easier, but still, for me, I've never like fully wrapped my head around exactly what I need. I've also found it in the past sort of difficult to motivate that. So, one way that I have sort of, I integrate some of this like text parsing sort of stuff into my classes. I will give a real data set to students and I will also give them a visualization saying, this is where we're starting and this is where we're going to end up. And we first break apart the sort of the visualization to say, oh, that's what's on the y-axis. That's what's on the x-axis. This is what we color the data by. So, we've decided we need these three or four columns in a data frame. That's our goal.
So, we need to get from our starting point data frame to this data frame that has these columns of information that, for example, it's voting district names, but they have been sort of concatenated in some way where you've taken away some of the letters so that you can join it with another shapefile or something like that. So, we work through the process of this is our end, like that's where we're going to get to at the end, the cake. That's the pretty picture we're going to make. We break it apart to see what are the ingredients of that cake. And then I say, you know what? It would be really tedious to go through this data set row by row and clean it up manually. Can we think of some logic we could write in order to be able to sort of transform these columns into that? So, instead of starting from the beginning and hoping that people will stick with you until the end, I tried to give away the punchline first and then take it back and say, hopefully, I have made it clear to you that it's worthwhile to stick with this lesson because you know where we're going to get to at the end.
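To make the regular-expression step concrete, here is a minimal Python sketch of the kind of cleaning logic described above: turning messy district labels into a compact key that can be joined against another table. The district labels, the key format, and the `shapes` lookup (a stand-in for a shapefile's attribute table) are all invented for illustration; the transcript does not say what the real data looks like.

```python
import re

# Invented examples of inconsistently formatted district labels.
raw_districts = [
    "Precinct 01 - DURHAM NORTH",
    "precinct 7: Chapel Hill",
    "PCT 12 - Raleigh  East",
]


def district_key(raw):
    """Reduce a messy label to a join key such as '01-DURHAM-NORTH'."""
    # Capture the district number, skip the separator, capture the name.
    match = re.search(r"(\d+)\W+(.+)", raw)
    if match is None:
        return None
    number, name = match.groups()
    # Upper-case the name and collapse runs of whitespace into one hyphen.
    name = re.sub(r"\s+", "-", name.strip().upper())
    return f"{int(number):02d}-{name}"


# Hypothetical lookup standing in for the table we want to join with.
shapes = {"01-DURHAM-NORTH": "polygon A", "07-CHAPEL-HILL": "polygon B"}

joined = {raw: shapes.get(district_key(raw)) for raw in raw_districts}
for raw, shape in joined.items():
    print(f"{raw!r} -> {district_key(raw)} -> {shape}")
```

The third label has no match in `shapes`, so its joined value is `None`, which is the sort of unmatched row a learner would then go back and investigate.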
I often think about when I first learned linear algebra, for example, which was, I don't know, second or third year of college. And then when I got why anyone thought I should learn linear algebra, which was second year in graduate school, and I told you I worked for two years in between, that's a pretty big gap in between those two things. And that's not to say I didn't have a good professor, like that is not it at all. It just was not how I was taught versus, for example, our students at Duke now, our statistics majors have an option to take a linear algebra course that's designed specifically for people working with data. So, they get the nuggets of like, where will I apply this earlier on that they report to be a lot more motivating.
So, I sort of try to think about that delay I had related to this like important foundational concepts and how long it took me to get the punchline of why anyone thought I should be learning them. And I try to reduce that time as much as possible. I find that students, when they realize this is worth investing time and brain cells in, they are more motivated to stick with it or ask questions. And if I just tell them, like, if you just stick with me, at the end, I will show you that it's worthwhile. I lose quite a few of them along the way.
Finding data sets for beginners
How do you identify appropriate data sets for beginners? Do you have any particular resources or tips for doing that?
The way I think about data sets is not so much is it in its rawest form appropriate for new learners, but more is the context something that might be interesting to them. And turns out, there is one thing years of experience is bad for and that's staying connected with the youth. What I think is interesting every day is farther from what my first and second year undergraduate students think is interesting. So every semester, at the beginning of the semester, I always do a Getting to Know You survey. Sometimes I use some of that data just to genuinely get to know my students. Sometimes I teach courses up to 300 students, so I don't end up getting to individually know them. So just reading through the narratives that they write there whenever I can make time sort of helps me stay connected with them a little bit. But one of the questions we ask on the survey is like, what sorts of data are you interested in exploring?
So they will say some things like related to criminal justice or related to public health. And sometimes I try to prompt them to be as specific as possible, not like linked to a data set, but to be as specific as possible, just so I can sort of then keep that in the back of my mind. So maybe next time there's a TidyTuesday data set, I'm like, oh, one of the students had mentioned they'd be interested in something like this. It gives me a cue to take a note of it. I also mentioned I like listening to the radio a lot. I don't drive much actually nowadays, but when I do, I always have NPR on. And if they mention a study or something like that, I will quickly go look up if I think it might be interesting to see if the data is available with it and then just download it.
And then what I do is once I have the raw data, I think about it as, what do I need to do to this data to make it semi-prepared for the audience that I have or for the topic that I want to teach? And I am a firm believer in bringing real data sets into the classroom. I don't necessarily think that every real data set has to be brought into the classroom in the rawest form that I have found it. I think a little bit of mise en place, like a little bit of prep is okay, just so that we're not always sort of spending class time to get it to the point and we can get to the point that we want to make a lot more easily. And you know, whenever I teach, there's students work on projects where they work with data starting in its rawest form because they are finding the data sets themselves. Hopefully we teach them enough to, you know, the skills enough between different data sets to sort of do the tasks that they need to do to prep their data. But I oftentimes will do like halfway prep so that we can pick things up and I can just get to the point of the sort of the topic for that day.
I do try to sort of bring current data sets in whenever possible. But then there are some sort of canonical data sets that, you know, work well for things. But I will say that I love, I love the penguins data set. But last semester when I taught data science, I told myself I don't get to use it past week two because it's so neat and they're so cute. I sometimes feel like if I'm running out of time, I'm just going to plug it in there. I was like, nope, that's it. No more penguins.
Misleading statistics in the media
How frustrating is it for you when you see statistics being used in a misleading way in the media and social media? And do you address that with your students in your intro class and how you deal with that?
Yeah, very frustrating, as you can imagine. Although sometimes, I feel like I've heard this. I don't really watch a lot of TV, but I watch lots of clips of like late night comedy TV shows, you know, the talk shows. And they often joke that like if things are sort of not the brightest in politics, it creates a lot of like material for them. So the joke is that they don't want these things to be happening, but it makes their job easier. Honestly, sometimes I feel like seeing these awful, awful visualizations that are not just awful, because someone made an honest mistake, but they were clearly designed to mislead people. While it does very much break my heart, sometimes I can't help but feel like it makes my job a lot easier to say, this is not how to do things. So to be able to bring them to the classroom, and have the students reflect on it a little bit, and then sort of talk about how would we fix it.
Another thing that we do, I do this in my intro data science course, and in my visualization course as well. If I see a visualization like that, I either track down, try to track down the data set, or to be perfectly honest, the most misleading visualization examples I've seen actually are visualizing like five, you know, data points anyway, that you can sort of like glean from the picture, and make a data frame yourself by looking at it. And I ask students to plot it themselves on the correct scale, for example, just so they can see the stark difference, so that we can actually tell them, look, like it would have been hard to make an honest mistake here. Someone really was moving these points around the plot to make the story what they want it to be.
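The replotting exercise can be sketched without any plotting library. The following Python example draws the same two made-up values as text bars twice, once with a truncated axis and once with the axis starting at zero. The numbers and labels are invented, not taken from any real chart, and in a classroom this would more likely be done with a plotting package.

```python
def bar_lengths(values, baseline, width=40):
    """Scale each value to a bar length in characters, measured from `baseline`."""
    top = max(values.values())
    return {
        label: round((value - baseline) / (top - baseline) * width)
        for label, value in values.items()
    }


def draw(values, baseline):
    """Print one text bar per value, using the given axis baseline."""
    for label, length in bar_lengths(values, baseline).items():
        print(f"{label:>8} | {'#' * length} {values[label]}")


# Made-up numbers of the kind you might read off a published chart.
rates = {"Before": 35.0, "After": 39.6}

print("Axis starting at 34 (truncated):")
draw(rates, baseline=34)
print("Axis starting at 0:")
draw(rates, baseline=0)
```

With the truncated axis the second bar is nearly six times as long as the first, while the underlying values differ by about 13 percent; starting the axis at zero makes the two bars look almost the same.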
I think one of the trickier parts of this is to sort of pick examples that are, where we can keep the conversation around the misuse of statistics, maybe perhaps as opposed to around sort of my personal opinions around whether we should be discussing that topic, or whether this is even something to be presented in that way or not. That's a personal struggle. But beyond that, in terms of the misuse of statistics, unfortunately, there are good examples of this out there, and I think I certainly do bring them to the classroom. In my introductory data science course, we do a module on data science ethics. When I first started teaching this course, this was the last module in the course, and quickly I realized that may not be sending the right message. The last topic is like, sometimes you don't even end up having time for it, because it snowed that semester or whatever. So I actually moved it to the middle of the semester, when students are looking for data sets and trying to come up with a proposal for their project, so that we are talking about data science ethics at the time that they are building a data science project themselves, so they can ask themselves the question of, is this even something I should be collecting data on? Is this even a question I should be posing in that way? And the three sort of units in that module is misrepresentation, so we talk about visualizations and other representation of data, particularly in news media, algorithmic bias, and data privacy.
Communicating with non-technical stakeholders
I think a little bit, and I'll try to sort of say some sort of bigger idea things in terms of how I try to communicate with folks like that, and then maybe some little tips as well. So on one hand, when I am talking with folks who are interested in the results that will ultimately be sort of squeezed out of the data, and the answers to the research questions they're interested in, and they are not that interested in how you got there, I sort of think about it, I don't know why I always come up with food analogies, but it's always like, well, when I go to a restaurant, and the dish is really good, sometimes I'm curious how it was made, and sometimes I'm not. Sometimes I'm just happy that someone figured out how to make this for me, and I can just enjoy it as is, and I might have questions about which wine it pairs with, but I don't need to try to make it myself. So oftentimes I'm thinking about it as, if the person is interested in genuinely sort of understanding how the analysis was carried out, then we can have a detailed conversation about that. But if they're just interested in the results, and if I'm talking to, for example, a practitioner, what I try to get out of them is, when I presented this table or this visualization to you, do these numbers make sense to you at all? Are these in the order of what you would expect to see or not? And if not, why? Can you articulate that to me?
I can then translate that conversation back to my analysis steps, and then come back with, well, I think here is the disjoint. Either you were right, I assumed something during my analysis that I shouldn't have, or no, you have this sort of preconceived notion about what this should have been because of what you know about the domain, but here is what I am finding, how I'm finding this data to be different. So how do we negotiate this gap between what you were expecting to see and the results that are coming out of this?
So I think I will basically say, I don't try to communicate about the code or the methodology sometimes unless that person is willing to invest the time in for that. The more sort of like tip I give my students, and mind you, a majority of them are sort of like intro students, but they probably will, you know, at a minimum, have an internship near term after that course where they might generate a data analysis report and someone's going to read. I find that they get so sort of hung up on how they got to a particular result, like the code that was needed to get there, because so much of what I teach ends up being about that, that they think of the number of hours that went into producing the results as something your audience should appreciate, and I just don't think that's true. Sadly, that's not true.
I often, not often, always tell my students, before you finalize your data science project, you have to add echo: false to your Quarto document and hide all of the code, and now read it again. Does it actually hang together? And what they submit, even though we do evaluate their code as well, what they submit is a write-up without the code. It's just the words and the results, and I want to make sure that they, at a minimum, once read it as such. It turns out, as much as I love Quarto, and I think I've said it enough that everyone believes me, when you actually have a document with a bunch of code in it, I think it's so hard to focus on the takeaway message that you just need to sort of like take that away and read that document one more time to see is what I'm saying hanging together to the person who can't see how these results were produced.
And oftentimes, I have them do peer review across teams that are working with the code hidden, like they do a code peer review that's separate from the content peer review, if you will, and the comments they get are very different. And I think it's really important if the goal is to communicate results that you're putting in just as much effort into the narrative around the results and not just about like the excruciating pain you had to get to the results.
Career advice
Yeah, let me see what kind of, I think so many times in my life, I have decided it's too late to do X. And then turns out it's not. I tried to learn to play the guitar when I was 16. And then I told myself, it's too late. Like my life has passed already. All my friends who know how to play the guitar already are good. Like it's too late to get into it. And I think I've regretted it to this day that I did. And so if I can have one piece of career advice, it's that it is okay to try new things. Actually, I love learning new things. I fully acknowledge I won't be as good at some of the new things I learn as I am at other things. And I have just learned to make peace with that.
And then the other thing, and I'm not sure how many people this would apply to, but I will give it anyway. As we said at the beginning, I'm a professor. For those of you who are academics, you might know that a professor's life is teaching, research, and some service component. I'm on a bunch of committees. Sometimes people talk about this as a drag. I have managed to finagle my way into committee work that allows me to code in R, generate Quarto reports, and impress people. So find the things you love within the things you might not enjoy as much, and see if you can bring them in there. That's how I try to find joy in some of the parts of my life that at face value don't seem like what I want to be spending my time on. It turns out I can always impress people with some data analysis, so I've enjoyed doing that.
Statistics in the age of AI
University of Nebraska recently announced they'll be shutting down their stats program. How do you feel about how statistics is viewed these days in the context of things like AI and ML?
I am aware of this, and I am so heartbroken and appalled by it, to be perfectly honest. I have colleagues and dear friends who work and teach there. For those of you who may not be familiar with it, I highly recommend reading some of the pieces by Susan Vanderplas and Heike Hofmann, who did an analysis of how these decisions were reached and how many gaps there are in the reasoning that the university has said led to this closure. So I cannot believe it, but I will say I personally find it hard to draw a line between how stats is viewed and valued in the face of AI and ML and what happened there. I think bigger problems existed in terms of how that particular decision was made.
I will say that it's my feeling that statisticians tend to be, on average, pretty humble, and sometimes we're not the loudest in a room where these conversations are happening. Again, this is not a comment about what happened at UNL, but in general, in this space of AI and ML, I think there is incredible value in knowing, understanding, and appreciating statistical modeling when you're operating in these circles, and I don't know if we have managed to shout it from the rooftops as loudly as we could. I do think that there is room for statisticians in these domains. One thing I will say is that when I look at some of these big companies that have big AI and ML teams, among their leadership are often folks with stats PhDs, and they're there for a reason. That's not to say you need a stats PhD to be working on these problems, of course, but it's still heartening to me to see that those folks are there, because when it comes to really building a roadmap for where to take projects, that statistician's insight clearly is still valuable, and I believe it will continue to be so. At the same time, we as statisticians need to be nimble ourselves and need to rethink our workflows, processes, and techniques in light of these developments. So just saying, oh, this is all hype, I don't think is all that productive either.
From learner to keynote speaker
In your intro, you talked about learning R, teaching R, and then giving keynote talks. Was there a tipping point in your R journey where you felt experienced enough to give keynotes? What was that transition like for you?
Yeah, I mean, the teaching part, in a way, came easily to me, because I had to do it, in the sense that I had to be a teaching assistant to get through my graduate program. That was a program requirement, and I am so glad that it was. I don't know that I would have found this to be something I enjoy doing, that I can be good at, and that I can improve at, had I not been given that opportunity. So, for someone who is maybe not in the academic circle, I would say, if that's the sort of thing you want to try your hand at, oftentimes folks reach out and say, hey, can I TA this workshop you're giving? I'll be there anyway. Can I help out in some way? That could be a way to get your feet wet, if you will. Plus, there are a lot of online mediums where you can build an audience and try to teach them yourself, so if you want to give that a try, I think that's an option, too. But for me, working as a teaching assistant first is where I realized I really enjoy this. I found myself putting in more time than necessary because I got so much joy out of it, and that was, for me, the tipping point for wanting to dedicate more of my time to teaching.
For keynotes, I don't know. I couldn't believe it the first time someone asked me to give a keynote, to be perfectly honest. But I think not knowing how to say no sometimes helps, because people ask you things and you say yes, and once you've done it once, the next thing becomes easier to say yes to. If something sounds exciting, I'd say go for it. And what I have found is that when someone approaches me with something where I feel like, I don't even know if I'm a good fit, why did you come to me, they're often very willing to have that conversation and share some insights as to what I might expect from that audience. Getting that feedback from them is usually helpful for feeling like, okay, I'm ready to do this.
Well, nobody who knows you, Mine, would ever assume that you weren't ready for a keynote talk or teaching. So sometimes it's about believing the people around you and what they see in you. I think Catherine Girton gave that advice when she was here on the Hangout, and I love it. Believe other people and their belief in you.
Well, we're at the top of the hour, so we have to say goodbye. Mine, thank you so much for joining us. I hope you had fun. And if everybody enjoyed this conversation about learning and teaching, Alyssa Dillman is joining us next week. She is another data educator and a huge, not workshop giver, what is it called, hackathon giver. If you would love to talk about data hackathons, she's your girl. So come join us next week with Alyssa Dillman. We'll see you on the Discord server, and I will see you maybe Tuesday for Data Science Lab. Isabella and I will be joined by Kylie Ainsley. She will be talking about R projects as R package structures. Hope you have a fantastic week and weekend. We'll see you next time. Bye!

