Erin Bugbee - To Explore or To Exploit: Decoding Human Decision Making with R and Python
Every day, we face decisions, such as when to purchase a flight ticket to Seattle for posit::conf(2024) when prices change dynamically over time. As a decision scientist, I aim to understand these choices and the cognitive processes underlying them. In my talk, I'll delve into how I leverage both R and Python to decode human decision-making. I'll focus on optimal stopping problems, a common predicament we all encounter, in which a decision-maker must determine the right moment to stop exploring options and make a choice based on their accumulated knowledge. Attendees will be introduced to the field of decision science and learn how R and Python can assist in advancing the study of the human mind. Talk by Erin Bugbee Slides: https://erinbugbee.quarto.pub/2024positconf-decoding-decisions/ GitHub Repo: https://github.com/erinbugbee/2024positconf-decoding-decisions
image: thumbnail.jpg
Transcript#
This transcript was generated automatically and may contain errors.
Hi everyone, my name is Erin Bugbee and I am a PhD student at Carnegie Mellon. Thank you so much for coming to my talk, To Explore or To Exploit, Decoding Human Decision Making with R and Python.
Let's imagine that you're shopping for a flight to Seattle, as many of us have likely done recently for PositConf. There are many factors that you must consider when you're choosing a flight. Perhaps you have a preference for an airline or a time of day, but you're almost certainly trying to minimize the price that you pay. Now, this is a difficult choice, because you want to collect some information about the available options out there, but if you spend too much time searching, it's likely that the price will increase. So the question is: when do you have enough information to make the decision and purchase that flight ticket?
This is an example of what is known as an optimal stopping problem, in which a decision maker has to decide when to stop collecting information while exploring the available options, and when to use the information they've collected to make a purchase or a decision. When making these types of decisions, there's a trade-off between two different actions, exploration and exploitation, and the goal is to find an optimal balance between gathering information through search, which is called exploration, and using the knowledge you've already gained, which is exploitation. The tricky thing, though, is that humans do not decide optimally. In the book Thinking, Fast and Slow, cognitive psychologist and Nobel Prize winner Daniel Kahneman describes how people use mental shortcuts called heuristics when making decisions, which can lead them to systematically deviate from what is rational; this is called a cognitive bias.
So in the field that I study, cognitive decision science, we aim to study a variety of things. How should people make decisions if they're completely rational agents and have lots of information about the world? How do they actually make decisions, given what we know from psychology research about how our minds work? How can this process be modeled to both predict and explain behavior? And ultimately, can what we know from the first three questions be used to help people make better decisions?
About the researcher
So a little bit about me. I am Erin Bugbee. I'm a PhD candidate in my fifth and final year at Carnegie Mellon within the Department of Social and Decision Sciences, and I study cognitive decision science. I received my bachelor's degree in statistics and behavioral decision sciences at Brown University. This is where I learned about R and data science and Python. And I've interned in a variety of roles in data science and machine learning. This summer, I was a machine learning engineer at Apple, but I've also spent three summers in Seattle at Amazon and Microsoft, and finally, I spent a summer at Disney. I'm also on the job market for 2025.
Running human experiments
Okay, so today in my talk, I'll be sharing some information on my research and what I've learned about decoding human decision making in three main parts. The first is running human experiments. Second is simulating behavior in Python. And then finally, analyzing data in R.
So let's talk about running human experiments. The first step to running human experiments is to design the experiment. What do we want to learn about human behavior? So I choose a decision-making task, and in particular, I'm really interested in those optimal stopping tasks where people balance exploration and exploitation. Then I have to decide what factors do I want to manipulate to determine their effect on human behavior. And for this particular case, I'll be going through one experiment that I actually ran where I varied the feedback that people receive after they make their choices, as well as what do they know about the environment. In the case of choosing flight tickets, what do they know about the prices of the tickets out there? Do they know this distribution?
Then once I've designed the experiment, I develop it. Typically, this consists of two parts: a survey, typically in Qualtrics, where I can collect some demographic information and provide instructions to the participants, together with some sort of decision-making task. So to make that clear, let's all decide together when to stop searching through a sequence of options. You will all now become my participants in a study. This is what my participants saw. They saw a box with a value in it, and they were told to maximize the value of the box they choose in a sequence of length 10. So you want to choose the box with the maximum value. You can either pass on the box or select it. If you pass it, you cannot come back to it later. And if you reach the last box in the sequence, you have to select it. They do this many times.
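The task just described can be sketched as a small simulation. To be clear, this is a hypothetical reconstruction, not the actual experiment code (which was built in JavaScript for the browser); the function names and the uniform value distribution are assumptions for illustration.

```python
import random

def play_sequence(values, policy):
    """Run one optimal-stopping trial: boxes are revealed one at a time,
    and policy(value, boxes_left) returns True to select, False to pass.
    The last box must be selected. Returns (chosen value, was it the max)."""
    n = len(values)
    for i, v in enumerate(values):
        boxes_left = n - i - 1
        if boxes_left == 0 or policy(v, boxes_left):
            return v, v == max(values)

# Example: a naive policy that selects the first value above 90.
random.seed(1)
values = [random.randint(1, 100) for _ in range(10)]
chosen, was_max = play_sequence(values, lambda v, left: v > 90)
```

Note that passing is irreversible, exactly as in the demo: once the loop moves past a box, that value can never be returned.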
So now I'd love you to raise your hands if you'd like to select this box. No? Okay. How about this one? Okay, I have a couple of takers. This box? Still no. How about this one? Okay, I see a lot of hands here. What about this one? No. Okay, nice. So you can only raise your hand once. So for those of you, yeah, great job. This one? For anyone who hasn't raised your hand yet, you're getting close to the end. Okay. So for those of you who waited, this is what you get. Awesome. So many of you stopped at the box with value 72. As you see, that was not the maximum in the sequence; you stopped too early. For those of you who stopped on the box with value 80, congrats, you made the optimal choice here. So that gives you a little bit of insight into what this task is. People do this many times, and then they receive some feedback. The feedback I just gave you about your choices would be detailed feedback. In another condition, people received only outcome feedback, where they were just told whether they were correct or not. And there was also a third condition where people received no feedback at all and just had to go with it.
So now I have the task designed, and I need to collect some data, which means I need human participants. There are a variety of online services for doing so. One of them is Amazon Mechanical Turk, where you can post your task and have people complete it, and you pay them in return for their completion. It's also increasingly being used for data labeling in machine learning, so that's another exciting use. And from there, I have my human data.
Simulating behavior in Python
At the same time, I'm often simulating behavior using Python. There are a variety of reasons that I do this. The first is to make predictions, to form hypotheses about what I think people will do in my particular experiment. The second is to explain their choices: if I have a model that represents the human mind, then I have explainable reasons for why people made the choices that they did. So again, I build a decision-making environment. I take the task that I've designed for human participants and translate it into Python code. Then, with this Python code, I create agents, which are Python-based decision makers. I typically use two different types. The first is an optimal agent, which decides according to the optimal process. There's a mathematically optimal policy that you can calculate, and you can create decision makers that decide in this way. This serves as a baseline for what a rational chooser would do. I also have cognitive agents, which decide according to what's called a cognitive algorithm. You can think of it as a machine learning algorithm that's grounded in psychology research, in what we know about the human mind. In particular, I use an instance-based learning model; I'll explain a little more about what that is, but it is a cognitive model. For this, I use the PyIBL Python library, and I'll be going more into that. This serves as a prediction of human behavior. It is suboptimal, like many of you, actually all of you, were in the demo. This allows us to predict what people will do in these types of decision-making contexts.
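The optimal agent mentioned above can be computed by backward induction when the value distribution is known. Here is a minimal sketch, assuming the goal is to maximize the expected value of the selected box (the experiment scored whether you picked the maximum, which yields a similar falling-threshold structure); the function name and the uniform 1-100 distribution are illustrative assumptions, not the talk's actual code.

```python
def optimal_thresholds(value_probs, horizon):
    """Backward induction for the expected-value-maximizing stopping rule.

    value_probs: dict mapping each possible box value to its probability.
    Returns a list V where V[t] is the expected reward when t boxes remain
    after the current one. The optimal policy is: select the current box
    if its value is at least V[boxes_left], otherwise pass.
    """
    ev = sum(v * p for v, p in value_probs.items())
    V = [ev]  # t = 0: forced to select the final box, worth E[value]
    for t in range(1, horizon):
        continuation = V[t - 1]  # expected value of passing and playing on
        V.append(sum(max(v, continuation) * p for v, p in value_probs.items()))
    return V

# Uniform values 1..100, sequences of length 10 (as in the demo).
uniform = {v: 1 / 100 for v in range(1, 101)}
V = optimal_thresholds(uniform, horizon=10)
# Early in the sequence the acceptance threshold is high; it falls toward
# E[value] = 50.5 as the forced final box approaches.
```

The falling thresholds capture why stopping too early is costly: with many boxes still to come, a merely decent value should be passed, because the chance of seeing something better remains high.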
So a little bit more about PyIBL. This is a Python implementation of a cognitive theory of how people make decisions from experience. It's produced in my lab at Carnegie Mellon, the Dynamic Decision-Making Laboratory. It allows us to create a cognitive model and to simulate cognitive agents. We can have one agent for each human decision maker, and then we can predict what choices they'll make. So, a little bit about the code, since I know this is PositConf and you might want to see it. We start by importing the PyIBL library and creating an agent. We tell the agent: you're going to take an action, and you're going to pay attention to the value of the box and the distance from the end of the sequence, because that's what we believe people are paying attention to. We also use a similarity function, which is based on psychology research about how people judge different options to be similar or dissimilar. From here, we populate the agent with some experience. This constitutes prior information that humans may have from reading the instructions; it's what the agent has to go off of as a starting point. The agent then sees a box and has to make either a select action or a pass action, considering the value and the distance from the end of the sequence. It makes a choice, and then it receives feedback based on the condition that agent is in.
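PyIBL's internals aren't shown in the talk, so here is a minimal, illustrative sketch of the instance-based learning mechanism it implements: stored instances of past experience are weighted by an ACT-R-style activation that favors recent and frequent use, and their outcomes are blended into a single expected value for each option. All names and parameter values here are assumptions for illustration, not PyIBL's actual API.

```python
import math

def activation(use_times, now, decay=0.5):
    """ACT-R base-level activation: recent and frequent uses raise activation."""
    return math.log(sum((now - t) ** (-decay) for t in use_times))

def blended_value(instances, now, temperature=0.25):
    """Blend stored outcomes, weighted by retrieval probability (a softmax
    over activations) -- the core computation behind an IBL agent's choice.
    instances: list of (outcome, list of past use times)."""
    acts = [activation(times, now) for _, times in instances]
    weights = [math.exp(a / temperature) for a in acts]
    total = sum(weights)
    return sum(outcome * w / total
               for (outcome, _), w in zip(instances, weights))

# Two remembered outcomes for "select": a recent success and an old failure.
# The recent instance dominates, so the blended value ends up close to 1.
v = blended_value([(1.0, [9]), (0.0, [1])], now=10)
```

An agent choosing between "select" and "pass" would compute a blended value for each action and take the higher one; because recency drives activation, this is exactly the kind of model that can learn from the trial-by-trial feedback conditions in the experiment.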
So, why do I use Python for this? Well, there are a variety of reasons. PyIBL is a Python library, but there are also many other modeling libraries that Python is particularly well suited for. In addition, it's very general purpose and flexible. When I'm translating that human experiment into Python code, it needs to be quick and easy for me to do that, and I find Python really great for that.
Analyzing data in R
Now, the final step, I've collected human data. I've simulated both optimal and cognitive agents. Now, I need to see what did they do? And so, I use R for that. I take the results of the survey, the decision making task, and I take the results of both the optimal and the cognitive agent. I bring those together, those CSVs, into a reproducible Quarto document.
Why R? There are many libraries for exploring and visualizing data that make R, as you all know, really great for this type of step. The tidyverse and ggplot2 make it quick and easy for me to see the data and to produce really great graphics for my papers and presentations. R also excels at statistical analysis. In this type of academic work, it's not enough to just look at the graphs and say that we see a difference in behavior; we have to actually make statistical claims about it.
Results
So, you might be wondering now: are you like the other decision makers in my experiment? I'll show you some results. The first figure I have here looks at search length: how many boxes did these agents and people search through on average? We have, in black, the optimal decision makers, and in red, the human decision makers. We see that consistently, people stopped searching too early, and we even saw that in this room. The instance-based learning model, that cognitive model that I mentioned, accurately predicts human behavior in almost all conditions. The exception is the no-feedback condition, which is a little tougher to predict, because humans then have to rely on past experiences that those agents do not have. But generally, the agent does a really good job of predicting the differences that we see across conditions.
We also see that people learn from feedback. Now, I could have had you go through many problems, as my participants did, if I had more time, and you would have learned to do a little bit better. And we see that's the case: when people are receiving feedback, whether outcome or detailed, and particularly when receiving detailed feedback, their accuracy improves over time. So this is really promising. You can get better at making decisions.
And then finally, having knowledge of the distribution of options actually hinders necessary exploration. I'll make this a little bit more clear. So, when I had you start and do that problem, you didn't know anything about the distribution of options. You just had to kind of figure it out from the values that you were seeing within the problem. However, when we gave some participants knowledge of the distribution of values, we showed them the distribution, we had them answer questions about it, and we see that those people that knew where the values came from actually explored less than those who did not know. So, when you don't know, you explore more. And since people stopped too early, this actually brings them closer to the optimal search length and improves their performance.
Tools and workflow
I find this very fascinating. So, in conclusion, there are lots of tools that I use in this workflow for my research. Starting with the human experiment: Qualtrics, plus that task I showed you, which I built in JavaScript, HTML, and CSS. Simulating the agents requires Python and the PyIBL library; I use VS Code, but now I'm very excited for Positron. And finally, for analyzing the data in R: R itself, Quarto for creating reproducible documents, and RStudio.
So, now, to talk about exploration and the use of tools: clearly, I use a lot of tools here. I really try to avoid the trap where, when you have a hammer, everything looks like a nail. I try to use the best tool for the job, and I'm seeing that that is a theme in this year's PositConf. And I actually believe that bringing R and Python together in harmony is the best way to decode the human mind and other complex questions that we may have.
Now, I'd like to give you some advice for your time here at PositConf. If this is your first Conf, like it was mine last year, when I was an Opportunity Scholar, everything's really new. There are lots of new people, new technologies, and topics to learn. You're really in a pure exploration phase. If you've attended many times before, you might be more inclined to exploit the knowledge that you've gained in the past. You've likely made connections with people you've met, you might know which sessions are of interest, and you might lean towards exploitation, using the knowledge you have gathered. So, I'd like to encourage you to keep exploring. Be careful of over-exploiting that knowledge, and instead keep challenging yourself to meet new people, learn new things, and see a new place. And we're in Seattle, so try a new coffee shop, or maybe take a hike and explore some nature if you have the chance. Well, thank you so much. Here's a link to the GitHub repository with a paper that I published on this work, as well as the slides. Please feel free to contact me or say hi if you see me around, and, finally, here's my personal website. Thank you so much for your time and for listening.
Q&A
Thank you. So, we have some questions. How would you extend this work to more complex systems, beyond the choice of a box with a single number and two options? That's a great question. The trouble with this type of research is that, clearly, in the world, there's more than just one value. When we're choosing a flight, there are all those different factors that I mentioned. What makes it really complicated, then, is determining which of these factors is the most important and which of them are actually influencing your choice. And also, what is our prior knowledge about how flight prices work, and how does that play into our decisions? So, I've simplified here so that these insights are more generalizable, but, of course, my next steps will be to go to some more real-world data and to work with others to see these sorts of decisions in the wild, and seeing which insights might translate from here to those types of contexts would be really great. I'm also adding in more factors and trying to extend the knowledge gained here to how many different factors might influence people's stopping behavior. Thank you for the question.