Bayesian MMM and silly side projects | Ryan Timpe | Data Science Hangout
ADD THE DATA SCIENCE HANGOUT TO YOUR CALENDAR HERE: https://pos.it/dsh - All are welcome! We'd love to see you!

We were recently joined by Ryan Timpe, a Lead Data Scientist, to chat about Marketing Mix Modeling (MMM), silly data science side projects like brickr (see the link below to his 2020 posit::conf() talk), the benefits of working out loud, and using LLMs (Large Language Models) for data science coding (something he gave a whole talk about at posit::conf(2025)).

In this Hangout, we explore a bit of what Ryan does, including Marketing Mix Modeling (MMM) and helping business partners better use their data. He shared that his work utilizes Bayesian methods to manage the complexity and high seasonality of his data, especially the Q4 holiday-buying spikes seen in most retail sales data. Using Bayesian priors helps keep the models within constraints and prevents overfitting. The foundation of the internal MMM package that Ryan developed uses tidymodels, and you can check out the link to his 2023 posit::conf() talk to hear more about his model pipeline! The theme here is that Ryan gives lots of great talks - go check them out!

Resources mentioned in the video and zoom chat:
Ryan Timpe's brickr GitHub Repository → https://github.com/ryantimpe/brickr
Ryan Timpe's 2020 talk on learning R with silly projects → https://youtu.be/oOG-aXP_ICI
Ryan Timpe's 2023 talk on model pipelines → https://youtu.be/R7XNqcCZnLg
Ryan Timpe's 2025 talk on learning from LLM pitfalls → https://youtu.be/vJrIahZWCw4

If you didn’t join live, one great discussion you missed from the zoom chat was about the importance of working out loud and publishing (even imperfect) data science projects. Attendees emphasized that overcoming their inner critic and sharing work helps with communication and professional development (but can be super challenging still).
One participant shared that they got their current role by creating a fake MMM project with fake data, but that it showed their interviewer they could take initiative!

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.posit.co
Hangout: https://pos.it/dsh
LinkedIn: https://www.linkedin.com/company/posit-software
Bluesky: https://bsky.app/profile/posit.co

Thanks for hanging out with us!

Timestamps
00:00 Introduction of Ryan
04:41 "What kind of MMM do you guys do?"
07:01 "What was your favorite silly project that you did?"
08:06 "What is brickr?"
11:11 "Do you think brickr helped you get the job you have now?"
14:23 "How do you scope your side projects?"
17:42 "What do you recommend for setting up a Python environment?"
21:16 "When do you recommend not using LLMs, and why?"
24:19 "How do you shut up your own inner critic?"
28:00 "What kind of data do you use for your LLMs and models?"
30:32 "Do you use open source MMM package like Meridian or Robyn?"
35:13 "How do you communicate complex modeling nuances to non-technical stakeholders?"
38:06 "Does your love for LEGO help you build code better?"
41:00 "What drives priority for what your team works on next?"
43:21 "What resources do you recommend for learning marketing analytics?"
44:46 "What tools do you use for Bayesian analysis?"
45:50 "Was your career move to the LEGO Group intentional or accidental?"
47:49 "What is your end product delivery?"
49:21 "What tool do you use to implement a Bayesian MCMC model?"
49:52 "How do you make model-based reporting faster and more automated?"
51:31 "Do you have a piece of career advice that's meaningful to you?"
Transcript
This transcript was generated automatically and may contain errors.
Hey there, welcome to the Posit Data Science Hangout. I'm Libby Herron and this is a recording of our weekly community call that happens every Thursday at 12 p.m. U.S. Eastern Time. If you are not joining us live, you miss out on the amazing chat that's going on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience.
I would love to go ahead and introduce our featured leader for today, Ryan Timpe. He's a data scientist at the Lego Group. And you might know him from his hilarious conference talk in 2020, where he talked about learning R with funny side projects. That was my first introduction to Ryan. Ryan, thank you so much for being with us. And it would be great if you could introduce yourself. Tell us a little bit about what you do and what you like to do for fun.
Awesome. Yeah, thanks for having me. Yeah, so I lead data science at the Lego Group. I've been there for six and a half years. So I started right before COVID, and it feels like it went by super quickly. Everything I do is around marketing effectiveness: measuring how well all of our marketing and TV shows and everything that we do to try to get you to buy Lego sets works, and how successful that is. And it's a ton of models in the backend. And then it's a ton of working with the business to interpret those models and then act on them.
Before that, I had like 10 years in a tech consulting firm doing a lot of macroeconomic forecasting and market sizing and survey analysis. And this was kind of before data science was a real thing. So eventually I just changed my job title to data scientist and became real somehow. Besides that, I live in Connecticut, work in Boston, and have a fluffy dog who's looking at me because we're not walking right now. And yeah, free time. Again, it was those silly data science side projects that I did a lot, especially a few years ago, to help me learn a lot of things.
More recently, just a lot of still doing that a little bit, a lot of Lego building, of course, because childhood me was obsessed with it and still I am. And then when I hit 30 a few years ago, I joined an adult gymnastics class and that takes up all my nights and it beats me up. But it's really fun and trying to do things I should have started doing as a really young kid, but now trying to do as an adult. So my body's always in pain, but I have so much fun doing it.
I was going to say, have you taken out like an extra insurance policy, accident policy? That's probably a good idea. As a former insurance agent, I'm like insurance, insurance, insurance.

No, but I'm very slow to progress because adults have fear receptors and I know that my entire livelihood is my brain. So falling on my head terrifies me. So yeah, I'm very slow, but I love it. So basically five nights a week, I'm at gymnastics.
Community and Q&A intro
Yeah, we're not made of rubber anymore once we get past a certain age, right? Well, I would love to remind everybody that you can ask questions in Slido and we already have so many questions, Ryan. I think that you've set some kind of record for the most Slido questions asked before we've even hit the 10 minute mark because there's like more than five in there right now.
Oh yeah, fine. Okay, I'm unmuted. Hey, I'm so happy I joined today. I did not know you also work at Lego, so this is fantastic. I'm a huge Lego person as well.

Let me just say right now, officially, I'm talking about my job as much as I can, and officially I'm not representing the Lego Group today. We're very, very conscious about brand identity. So I'm here as a data scientist, but yeah, I definitely can talk about that.
I do work in MMM as well, so this is really fantastic. What I was wondering is what kind of MMM do you guys do? Is it more Bayesian or frequentist, and how do you guys go about it? Like what's the usual workflow?

And we should say what MMM is for people in the chat.
Marketing Mix Modeling explained
Yeah, so MMM is Marketing Mix Modeling. It's an econometric equation that tries to relate every type of marketing input to how much of sales or brand health or any of your KPIs you can allocate or attribute to each marketing lever. We definitely are more on the Bayesian side, so that's actually in 2022 I gave a talk about tidymodels and that. Tidymodels is the foundation of all of our MMM work, but we do Bayesian for a very specific reason: especially in the Lego Group, we're a very seasonal company.
If you look at our sales trends, they spike every Q4, and all of our marketing and every single one of our inputs basically has that same pattern. So you have overfitting chaos anytime you run even the simplest model. So we found that we're using Bayesian as the first layer to kind of add a lot of business context into the model. When you actually get to the MMM part, we have these Bayesian priors to help us really keep everything within constraints, and we do a lot of research before we get to the modeling process to actually look at these patterns and have them inform those priors.
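To make the point about priors and seasonality concrete, here is a minimal sketch in Python. It is not Ryan's internal package, which is not shown in this Hangout. It assumes two hypothetical channels (`tv`, `digital`) that both follow the same Q4 spike, and it swaps in the closed-form posterior mode under independent Gaussian priors instead of MCMC. The data, prior means, and `prior_strength` value are all made up for illustration.

```python
import random


def solve_2x2(a, b, c, d, e, f):
    """Solve [[a, b], [c, d]] @ [x, y] = [e, f] by Cramer's rule."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det


def fit_two_channels(x1, x2, y, prior_mean=(0.0, 0.0), prior_strength=0.0):
    """Posterior mode for y ~ b1*x1 + b2*x2 with independent Gaussian priors.

    prior_strength is (noise variance / prior variance);
    a value of 0 reduces to ordinary least squares.
    """
    s11 = sum(u * u for u in x1)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s22 = sum(v * v for v in x2)
    t1 = sum(u * w for u, w in zip(x1, y))
    t2 = sum(v * w for v, w in zip(x2, y))
    m1, m2 = prior_mean
    return solve_2x2(
        s11 + prior_strength, s12,
        s12, s22 + prior_strength,
        t1 + prior_strength * m1, t2 + prior_strength * m2,
    )


# Simulated weekly data (assumption): both channels share the same Q4 spike,
# so they are almost perfectly correlated with each other.
rng = random.Random(42)
season = [4.0 if week % 52 >= 39 else 1.0 for week in range(104)]
tv = [s + rng.gauss(0, 0.05) for s in season]
digital = [s + rng.gauss(0, 0.05) for s in season]
sales = [2.0 * a + 1.0 * b + rng.gauss(0, 0.5) for a, b in zip(tv, digital)]

print("no prior:  ", fit_two_channels(tv, digital, sales))
print("with prior:", fit_two_channels(tv, digital, sales,
                                      prior_mean=(1.5, 1.5), prior_strength=50.0))
```

With nearly identical inputs, the no-prior fit can split the effect between the two channels almost arbitrarily, while the prior pulls both coefficients toward the range the (hypothetical) pre-modeling research suggested.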
Yeah, that's more than I expected. I was expecting a Lego question or like a toy question at first, so yeah.

If anyone has any questions about any of those terms, put them in the chat and everybody in the chat is going to help you understand them.
Favorite silly side project: brickr
So Zach had asked, what was your favorite silly project that you did?

I think if I look back at everything I've done, it's brickr, the one where I turn images into Lego mosaics, and I did that way before I joined the Lego Group. That was my favorite just because that one started as just one morning. I just wanted to see if it was possible, and then it evolved over time and I was sharing on Twitter a lot and I got a lot of community feedback, and that was the one that felt closest to my identity.
I kept on iterating on it, learning more things. It eventually turned into 3D models based off of Tyler Morgan-Wall's work, and then miraculously the Lego Group was hiring data scientists in my state and that was an amazing interview topic of conversation. So I have a very, very warm spot in my heart for the brickr side project.
So it started out like this: the Lego Group had an online tool to make mosaics of Lego bricks from images. You'd upload an image and make a black and white mosaic from a predefined selection of bricks. It's actually a pretty simple math problem, so I wanted to see if I could do it in R. I hadn't used dplyr a lot by then, but I was trying to learn it more. I wanted to see if this was possible with the tidyverse, and it definitely was. And then because it's all digital, you can have any constraints you want, or you don't have all the constraints on the bricks, so you can use colors, you can use different size bricks, and you're basically literally turning an image of your face into Lego bricks.
And then over time that blew up to a lot more: to 3D models, to instructions. It's mostly the square bricks, but you could say, I want a 3D house, you draw an outline of it with some rules, and then brickr could create it as a 3D model in R that you could spin around and play with. So purely for fun, but it was a lot of fun to build and it was a lot of fun to use. And this was kind of the days when data science Twitter was huge and you could just share stuff in the process and get a lot of input and everyone's cheering you on. I definitely miss those days, but it was a really fun time to develop things, so it's definitely my favorite side project.
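The "pretty simple math problem" behind a mosaic can be sketched as two steps: shrink the image to a grid of studs, then snap each cell to the closest available brick color. The Python below is only an illustration of that idea, not brickr's code (brickr is an R package). The four-color grayscale palette and its RGB values are hypothetical, not official LEGO colors.

```python
# Hypothetical palette: brick color name -> (r, g, b)
PALETTE = {
    "black": (0, 0, 0),
    "dark_gray": (85, 85, 85),
    "light_gray": (170, 170, 170),
    "white": (255, 255, 255),
}


def nearest_color(pixel, palette):
    """Return the palette name with the smallest squared RGB distance to pixel."""
    return min(
        palette,
        key=lambda name: sum((p - c) ** 2 for p, c in zip(pixel, palette[name])),
    )


def downsample(image, block):
    """Average each block x block square of an image given as rows of (r, g, b)."""
    rows, cols = len(image) // block, len(image[0]) // block
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            cells = [
                image[r * block + i][c * block + j]
                for i in range(block)
                for j in range(block)
            ]
            out_row.append(tuple(sum(ch) / len(cells) for ch in zip(*cells)))
        out.append(out_row)
    return out


def to_mosaic(image, palette, block):
    """Map an image to a grid of brick color names, one name per block."""
    return [
        [nearest_color(px, palette) for px in row]
        for row in downsample(image, block)
    ]


# A tiny 2 x 4 "image": dark on the left half, bright on the right half.
dark, bright = (10, 10, 10), (250, 250, 250)
image = [[dark, dark, bright, bright], [dark, dark, bright, bright]]
print(to_mosaic(image, PALETTE, block=2))
```

Because the palette is just a dictionary, relaxing the constraints Ryan mentions (more colors, different brick sizes) amounts to changing `PALETTE` or `block`.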
Working out loud
So Lego wasn't looking for my brickr mosaic thing, I think, but it was just a very good topic of conversation in my interviews. Actually, my first interview was fine. It was at like 8 a.m. after being unemployed for three months and I wasn't really in the mindset, and it was a fine interview but it wasn't amazing. And then in between that first interview and the ambiguous next steps of a second interview, I think some people who were already at the Lego Group heard that I was interviewing, and actually one of them was very active on Twitter at the time and she told my boss at the time, "Oh, have you seen this?" And that got me my second interview, I think.
And I was shy about it at first. I should not have been shy about it, but I just kind of assumed that a professional person wouldn't want to see my silly toy project, but I was wrong. And yeah, now when I'm interviewing people I love seeing these side projects. So yeah, I made a miscalculation but the universe kind of helped me recover. It definitely helped get me that job, I'm 100% sure of that.
Yeah, work out loud, even the silly stuff, especially the silly stuff. It's a better conversation starter. I have gotten two interviews because I put that I played roller derby on my resume. It's at the bottom of my resume, like "I play roller derby," that's the fun thing about me. And twice I had interviews where I got the job where they were like, we literally only just wanted to talk about roller derby, like we saw that you played it and we thought you're the fun person we put in our stack.
Scoping side projects
I think you will see by my lack of CRAN and my abundance of incomplete repos that I don't scope things out very well. I basically have a random idea while walking the dog or watching TV or seeing what someone else did, get super excited, and obsess over it for a few days or a few weeks depending on how big the thing is. I definitely do not plan this stuff out well. And then I just build it and build it until I see whether I can do it, and then if it's a cool thing to share I put it out, show my friends or show the internet or show my co-workers.
Sharing imperfect work publicly
How do you shut up your own inner critic and just make it happen?
Yeah, I don't think I actually have shut it up. I'm almost always second guessing stuff I put out there, and back when I was on Twitter a lot I deleted so many things, not even just data science. I'm like, oh, this is a good idea, and then, no, never mind, this doesn't need to be said. But then yeah, sometimes I just get really excited, and my family aren't nerds, they can't share the excitement. I just need to find the correct audience, even when I know it's not perfect, because data scientists love imperfection, they love incomplete things.
LLMs for data science coding
When do you recommend not using LLMs, and why? When is it a trap?
See, I feel like I'm still writing a lot of manual code. Especially if I'm writing stuff in R, I'm still doing 99% of it myself. The Python stuff, though, I could be staring at a blank page for a very long time, or I could describe exactly to the LLM what I want and then try to fix it. I don't know, I still like having fun writing code, so I write more than I probably should. I'm still learning how to use LLMs best for my work, and I think we all are, and especially as the LLMs get better I'm sure I'll start shifting to them more and more.
We all are, absolutely, and I think for me the trap comes when I go into it thinking this will save me time, because it never does. And what I should have done from the beginning was use the LLM to help me build the foundational understanding that I need to build the thing myself, or the foundational understanding that I need to check its work when I ask it to build functions or bits and pieces for me.
Bayesian MMM tools and pipeline
What tools for Bayesian analysis do you use? Do you use Stan and brms?
Yeah, when I joined, Robyn wasn't out there yet because it was 2019. We have external firms and agencies that run MMM for a lot of our markets, but they're very expensive and eat up a sizable piece of the marketing budget, so we had to run some MMM models internally. My team and I chose to develop our own, and I'm on the second iteration of that right now. And again, I use tidymodels as the foundation of all that. It was really what I needed at the time to have all the control over the inputs. I built a lot of wrappers around that, so we're using an internally developed package in R right now.
I had to write a bunch of custom steps. If you've seen tidymodels, there's a lot of process: it's basically setting up your model and running it independently, and setting up your recipe and running it independently, and there are step functions. I had to write a lot of custom step functions to fit the data transformations that we needed to do for MMM. But the biggest challenge we have with MMM, besides the priors and running the models and getting all your hyperparameters, is that at any time we have thousands of data series that we could put into models. A lot of them are highly correlated, and a lot of them are different ways to measure the same thing.
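For readers who haven't written a custom preprocessing step, the pattern described above (learn what you need from training data, then apply the same transformation to any data) can be sketched as follows. This is a Python analogy to the prep/bake split used by recipes steps in tidymodels, not the team's R code. The choice of a geometric adstock transformation, the `AdstockStep` class name, and the `decay` value are all assumptions made for illustration; the Hangout does not say which transformations the internal package uses.

```python
class AdstockStep:
    """Hypothetical step: geometric adstock, then scale by the training maximum."""

    def __init__(self, decay):
        self.decay = decay   # share of last period's effect carried forward
        self.scale = None    # learned in prep(), reused in bake()

    def _adstock(self, spend):
        carried, out = 0.0, []
        for x in spend:
            carried = x + self.decay * carried
            out.append(carried)
        return out

    def prep(self, training_spend):
        """Learn the scaling constant from training data only."""
        self.scale = max(self._adstock(training_spend)) or 1.0
        return self

    def bake(self, spend):
        """Apply the transformation using the constant learned in prep()."""
        if self.scale is None:
            raise RuntimeError("call prep() on training data before bake()")
        return [value / self.scale for value in self._adstock(spend)]


step = AdstockStep(decay=0.5).prep([100, 0, 0])
print(step.bake([100, 0, 0]))  # carry-over effect on the training series
print(step.bake([200, 0]))     # new data reuses the training scale
```

Keeping the learned constant inside the step means new data is transformed exactly the way the training data was, which is the main reason to write transformations as steps instead of ad hoc code.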
Communicating model results to stakeholders
Yeah, so I have a wide variety of stakeholders with a wide variety of interest in the details and in how much information they want to know about the back end. I have some stakeholders who I've only worked with for a few months and some who I've been working with for five years, and with those I'm more comfortable sharing more and more details without overwhelming them. I don't think any of them have asked the Bayesian versus frequentist question, because that's a very, very niche question for people who are in statistics every day.
Everyone's also always very excited to hear about how we're using their data in our models and trying to get as much information out of this as they can. I think MMM is tough because, since we're trying to cover every single data point, we can't really drill into every super specific question they have at all times. So it's about training them on the best way to use the model results, and then, once we deliver our results, how they can integrate that with their multi-touch attribution, their last-touch attribution, or their other attribution models.
Domain passion and career advice
Does your love for Lego help you build code better, like scoping, sectioning, loops, modularization?
I'm not the strongest data scientist, I'm definitely not the strongest ML person, and as you know from my Python talks, I'm not good at Python. But I'm really good at just being passionate about the data, knowing the tools to turn business questions into data science tools, and then turning those back into the business.
All right everybody, I have had such a good time. Ryan, thank you so much for joining us. I hope you had a good time.

Yeah, no, this was way more fun than I expected. I was stressed, but now I'm good.

Yeah, oh, we made Ryan feel not stressed! Well, we're gonna get Ryan on the Discord server. If you are on the Discord server, make your way to the Data Science Hangout channel and tag me so I can make your name yellow and give you a little badge that says DSH.
All right, next week Kevin Dalton will be joining us, senior data scientist at Great American Insurance Group, and then we have Thanksgiving, which is off. I will not see you next week. Rachel will be here with you, Isabella will be here with you, and I'll see you the week after Thanksgiving, hopefully. Thank you so much for being wonderful. If you want to save the chat, there's three dots in the top that will let you do that. It'll save it as a txt file. And of course, if there are any resources that we can go save over in the Discord, everybody go do that, okay? Because it will be the place where we can continue conversations and we won't just have an ethereal chat that goes away every single time. All right everybody, I will see you in a couple of weeks, but come hang out with Rachel next week with Kevin Dalton. Okay, bye everybody.