Resources

Bayesian MMM and silly side projects | Ryan Timpe | Data Science Hangout

ADD THE DATA SCIENCE HANGOUT TO YOUR CALENDAR HERE: https://pos.it/dsh - All are welcome! We'd love to see you!

We were recently joined by Ryan Timpe, a Lead Data Scientist, to chat about Marketing Mix Modeling (MMM), silly data science side projects like brickr (see the link below to his 2020 posit::conf() talk), the benefits of working out loud, and using LLMs (Large Language Models) for data science coding (something he gave a whole talk about at posit::conf(2025)).

In this Hangout, we explore a bit of what Ryan does, including Marketing Mix Modeling (MMM) and helping business partners better use their data. He shared that his work utilizes Bayesian methods to manage the complexity and high seasonality of his data, especially the Q4 holiday-buying spikes seen in most retail sales data. Using Bayesian priors helps keep the models within constraints and prevents overfitting. The internal MMM package that Ryan developed is built on tidymodels, and you can check out the link to his 2023 posit::conf() talk to hear more about his model pipeline! The theme here is that Ryan gives lots of great talks - go check them out!

Resources mentioned in the video and Zoom chat:
Ryan Timpe's brickr GitHub Repository → https://github.com/ryantimpe/brickr
Ryan Timpe's 2020 talk on learning R with silly projects → https://youtu.be/oOG-aXP_ICI
Ryan Timpe's 2023 talk on model pipelines → https://youtu.be/R7XNqcCZnLg
Ryan Timpe's 2025 talk on learning from LLM pitfalls → https://youtu.be/vJrIahZWCw4

If you didn’t join live, one great discussion you missed from the Zoom chat was about the importance of working out loud and publishing (even imperfect) data science projects. Attendees emphasized that overcoming their inner critic and sharing work helps with communication and professional development (but can still be super challenging). One participant shared that they got their current role by creating a fake MMM project with fake data, which showed their interviewer they could take initiative!

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.posit.co
Hangout: https://pos.it/dsh
LinkedIn: https://www.linkedin.com/company/posit-software
Bluesky: https://bsky.app/profile/posit.co

Thanks for hanging out with us!

Timestamps
00:00 Introduction of Ryan
04:41 "What kind of MMM do you guys do?"
07:01 "What was your favorite silly project that you did?"
08:06 "What is brickr?"
11:11 "Do you think brickr helped you get the job you have now?"
14:23 "How do you scope your side projects?"
17:42 "What do you recommend for setting up a Python environment?"
21:16 "When do you recommend not using LLMs, and why?"
24:19 "How do you shut up your own inner critic?"
28:00 "What kind of data do you use for your LLMs and models?"
30:32 "Do you use open source MMM package like Meridian or Robyn?"
35:13 "How do you communicate complex modeling nuances to non-technical stakeholders?"
38:06 "Does your love for LEGO help you build code better?"
41:00 "What drives priority for what your team works on next?"
43:21 "What resources do you recommend for learning marketing analytics?"
44:46 "What tools do you use for Bayesian analysis?"
45:50 "Was your career move to the LEGO Group intentional or accidental?"
47:49 "What is your end product delivery?"
49:21 "What tool do you use to implement a Bayesian MCMC model?"
49:52 "How do you make model-based reporting faster and more automated?"
51:31 "Do you have a piece of career advice that's meaningful to you?"

Dec 12, 2025
55 min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Hey there, welcome to the Posit Data Science Hangout. I'm Libby Heeren and this is a recording of our weekly community call that happens every Thursday at 12 p.m. U.S. Eastern Time. If you are not joining us live, you miss out on the amazing chat that's going on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience.

I would love to go ahead and introduce our featured leader for today, Ryan Timpe. He's a data scientist at the Lego Group. And you might know him from his hilarious conference talk in 2020, where he talked about learning R with funny side projects. That was my first introduction to Ryan. Ryan, thank you so much for being with us. And it would be great if you could introduce yourself. Tell us a little bit about what you do and what you like to do for fun.

Awesome. Yeah, thanks for having me. Yeah, so I lead data sciences at the Lego Group. I've been there for six and a half years. So right before COVID and it feels like it went by super quickly. Everything I do is around marketing effectiveness and measuring how well all of our marketing and TV shows and everything that we do to try to get you to buy Lego sets, how that works and how successful that is. And it's a ton of models in the backend. And then it's a ton of working with the business to interpret those models and then action on them.

Before that, I had like 10 years at a tech consulting firm doing a lot of macroeconomic forecasting and market sizing and survey analysis. And this was kind of before data science was a real thing. So eventually I just changed my job title to data scientist and became real somehow. Besides that, I live in Connecticut, work in Boston, have a fluffy dog who's looking at me while we're not walking right now. And yeah, free time. Again, it was those silly data science side projects that I did a lot, especially a few years ago to help me learn a lot of things.

More recently, just a lot of still doing that a little bit, a lot of Lego building, of course, because childhood me was obsessed with it and still I am. And then when I hit 30 a few years ago, I joined an adult gymnastics class and that takes up all my nights and it beats me up. But it's really fun and trying to do things I should have started doing as a really young kid, but now trying to do as an adult. So my body's always in pain, but I have so much fun doing it.

I was going to say, have you taken out like an extra insurance policy, accident policy? That's probably a good idea. As a former insurance agent, I'm like insurance, insurance, insurance.

No, but I'm very slow to progress because adults have fear receptors and I know that my entire livelihood is my brain. So falling on my head terrifies me. So yeah, I'm very slow, but I love it. So basically five nights a week, I'm at gymnastics.

Community and Q&A intro

Yeah, we're not made of rubber anymore once we get past a certain age, right? Well, I would love to remind everybody that you can ask questions in Slido and we already have so many questions, Ryan. I think that you've set some kind of record for the most Slido questions asked before we've even hit the 10 minute mark because there's like more than five in there right now.

Oh yeah, fine. Okay, I'm unmuted. Hey, I'm so happy I joined today. I did not know you also work at Lego, so this is fantastic. I'm a huge Lego person as well.

Let me just say right now, officially, I'm talking about my job as much as I can and officially I'm not representing the Lego Group today. We're very, very conscious about like brand identity. So I'm here as a data scientist, but yeah, definitely can talk about that.

I do work in MMM as well, so this is really fantastic. What I was wondering is what kind of MMM do you guys do? Is it more Bayesian or frequentist, and how do you guys go about it? Like what's the usual workflow?

And we should say like what MMM is for people in the chat.

Marketing Mix Modeling explained

Yeah, so MMM is Marketing Mix Modeling. It's an econometric equation to try to relate every type of marketing input to how much of sales or brand health or any of your KPIs you can allocate or attribute to each marketing lever. We definitely are more on the Bayesian side. In 2023 I actually gave a talk about tidymodels and that. tidymodels is the foundation of all of our MMM work, but we do Bayesian, and for a very specific reason: especially in the Lego Group, we're a very seasonal company.

If you look at our sales trends, they spike every Q4, and all of our marketing and every single one of our inputs basically has that same pattern. So you have overfitting chaos anytime you run even the simplest model. So we found that we're using Bayesian as the first layer to kind of add a lot of business context into the model. When you actually get to the MMM part, we have these Bayesian priors that encode what we expect and help us keep everything within constraints, and we do a lot of research before we get to the modeling process to look at these patterns and help inform those priors.
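
To make the role of the priors concrete, here is a minimal sketch in Python (standard library only). It is not the internal model described above: it swaps MCMC for the closed-form posterior mean of a two-coefficient linear regression with Normal priors and a known noise level, and every number (the seasonal pattern, the coefficients 0.5 and 0.3, the prior mean of 0.4) is a made-up assumption for illustration. Two simulated inputs share the same Q4 spike, so the data alone can barely tell them apart.

```python
import random


def solve_2x2(a, b):
    """Solve the 2x2 linear system a @ x = b with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [
        (b[0] * a[1][1] - a[0][1] * b[1]) / det,
        (a[0][0] * b[1] - a[1][0] * b[0]) / det,
    ]


def posterior_mean(x1, x2, y, prior_mean, prior_sd, noise_sd):
    """Posterior mean of two regression coefficients (no intercept).

    Assumed model: y = b1*x1 + b2*x2 + noise, noise ~ Normal(0, noise_sd^2),
    with independent priors b_j ~ Normal(prior_mean[j], prior_sd^2).
    """
    s11 = sum(a * a for a in x1)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s22 = sum(b * b for b in x2)
    x1y = sum(a * t for a, t in zip(x1, y))
    x2y = sum(b * t for b, t in zip(x2, y))
    noise_precision = 1.0 / noise_sd ** 2
    prior_precision = 1.0 / prior_sd ** 2
    lhs = [
        [s11 * noise_precision + prior_precision, s12 * noise_precision],
        [s12 * noise_precision, s22 * noise_precision + prior_precision],
    ]
    rhs = [
        x1y * noise_precision + prior_mean[0] * prior_precision,
        x2y * noise_precision + prior_mean[1] * prior_precision,
    ]
    return solve_2x2(lhs, rhs)


# Simulated data: two marketing inputs that both follow the same Q4 spike.
random.seed(7)
season = [4.0 if week % 52 >= 39 else 1.0 for week in range(104)]
tv = [s + random.gauss(0, 0.05) for s in season]
digital = [s + random.gauss(0, 0.05) for s in season]
sales = [0.5 * t + 0.3 * d + random.gauss(0, 0.5) for t, d in zip(tv, digital)]

# A very wide prior behaves almost like ordinary least squares.
weak = posterior_mean(tv, digital, sales, [0.4, 0.4], 1000.0, 0.5)
# A tight prior encodes "each channel contributes roughly 0.4".
informative = posterior_mean(tv, digital, sales, [0.4, 0.4], 0.1, 0.5)

print("weak prior:       ", [round(b, 2) for b in weak])
print("informative prior:", [round(b, 2) for b in informative])
```

With nearly collinear inputs, the weak-prior estimate is free to split the combined effect between the two channels almost arbitrarily, while the informative prior keeps each coefficient close to the range the business expects. The combined effect stays well identified in both cases.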

Yeah, that's more than I expected. I was expecting a Lego question or like a toy question at first, so yeah. If anyone has any questions about any of those terms, put them in the chat and everybody in the chat is going to help you understand them.

Favorite silly side project: brickr

So Zach had asked, what was your favorite silly project that you did?

I think if I look back at everything I've done, it's brickr, the one where I turn images into Lego mosaics, and I did that way before I joined the Lego Group. That was my favorite just because that one was just one morning. I just wanted to see if it was possible, and then it evolved over time. I was sharing on Twitter a lot and I got a lot of community feedback, and that was the one that felt closest to my identity.

I kept on iterating on it, learning more things. It eventually turned into 3D models based off of Tyler Morgan-Wall's work, and then miraculously the Lego Group was hiring data scientists in my state, and that was an amazing interview topic of conversation. So I have a very, very warm spot in my heart for the brickr side project.

So it started out like this: the Lego Group had an online tool to make mosaics of Lego bricks from images. You'd upload an image and make a black and white mosaic from a predefined selection of bricks. It's actually a pretty simple math problem, so I wanted to see if I could do it in R. I hadn't used dplyr a lot by then, but I was trying to learn it more. I wanted to see if this was possible with the tidyverse, and it definitely was. And then because it's all digital, you can have any constraints you want, or you don't have all the constraints on the bricks, so you can make colors, you can make different size bricks, and you're basically literally turning an image of your face into Lego bricks.
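
The "pretty simple math problem" can be sketched roughly as follows. This is an illustrative Python version, not brickr's actual R implementation: it averages the image over square blocks (one block per brick) and snaps each block to the nearest color in a palette by squared RGB distance. The palette names and RGB values are made up for the example and are not official brick colors.

```python
# Hypothetical grayscale palette; names and RGB values are illustrative only.
PALETTE = {
    "black": (0, 0, 0),
    "dark_gray": (85, 85, 85),
    "light_gray": (170, 170, 170),
    "white": (255, 255, 255),
}


def average_block(pixels, top, left, size):
    """Average the RGB values over a size x size block of the image."""
    totals = [0, 0, 0]
    for row in range(top, top + size):
        for col in range(left, left + size):
            for channel in range(3):
                totals[channel] += pixels[row][col][channel]
    return tuple(total / (size * size) for total in totals)


def nearest_color(rgb, palette):
    """Return the palette color name with the smallest squared RGB distance."""
    return min(
        palette,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, palette[name])),
    )


def to_mosaic(pixels, block_size, palette):
    """Turn an image (rows of RGB tuples) into a grid of brick color names."""
    rows = len(pixels) // block_size
    cols = len(pixels[0]) // block_size
    return [
        [
            nearest_color(
                average_block(pixels, r * block_size, c * block_size, block_size),
                palette,
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# A tiny 4x4 "image": dark on the left half, bright on the right half.
dark, bright = (10, 10, 10), (240, 240, 240)
image = [[dark, dark, bright, bright] for _ in range(4)]

mosaic = to_mosaic(image, block_size=2, palette=PALETTE)
for row in mosaic:
    print(row)
```

Swapping in a larger palette or a different block size changes the look of the mosaic without changing the logic, which is the "any constraints you want" flexibility described above.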

And then over time that blew up into a lot more: 3D models, instructions. It's mostly the square bricks, but you could say, "I want a 3D house," draw an outline of it with some rules, and then brickr could create it as a 3D model in R that you could spin around and play with. So it was purely for fun, but it was a lot of fun to build and a lot of fun to use. This was kind of the days when data science Twitter was huge, and you could just share stuff in the process and get a lot of input, with everyone cheering you on. I definitely miss those days. It was a really fun time to develop things, so it's definitely my favorite side project.

Working out loud

I talk all the time about working out loud and the value of it. If Lego had been hiring and you hadn't done that, if you hadn't put the time in to make this project, do you think you would have gotten the job? Do you think it helped put you over the edge?

So Lego wasn't looking for my brickr mosaic thing, but it was just a very good topic of conversation in my interviews. Actually, my first interview was fine. It was like 8 a.m. after being unemployed for three months, and I wasn't really in the mindset. It was a fine interview, but it wasn't amazing. And then, in between that first interview and the ambiguous next steps of a second interview, I think some people who were already at the Lego Group heard that I was interviewing. One of them was very active on Twitter at the time, and she told my boss at the time, "Oh, have you seen this?" And that got me my second interview, I think.

So I think it recovered my awkward first interview. My boss knowing about this project I did on the side, which really showed off my skills better than I could in a first-round interview, helped pique their interest a lot more. So in the second-round interview I talked a lot more about the development of this, and I was a lot more comfortable because I was in control of that conversation. I could talk very clearly about what my goal was, and it showed my passion for the brand. So yeah, I think it definitely helped.

And I was shy about it at first. I should not have been shy about it, but I just kind of assumed that a professional person wouldn't want to see my silly toy project. But I was wrong. And now when I'm interviewing people, I love seeing these side projects. So yeah, I made a miscalculation, but the universe kind of helped me recover. It definitely helped get me that job. I'm 100% sure of that.

Yeah, work out loud, even the silly stuff. Especially the silly stuff; it's a better conversation starter. I have gotten two interviews because I put that I played roller derby on my resume. It's at the bottom of my resume: "I play roller derby." That's the fun thing about me. And twice I had interviews where I got the job, where they were like, "We literally just wanted to talk about roller derby. We saw that you played it, and we thought you were the fun person in our stack."

Scoping side projects

How do you scope your projects, or do you even have a concept of scope when you go in? How do you know when you're done, or what do you define as done? And do you know this before you start, or are you just kind of like, "This is cool," and then eventually you wrap a project around it?

I think you will see by my lack of CRAN and my abundance of incomplete repos that I don't scope things out very well. I basically have a random idea while walking the dog or watching TV or seeing what someone else did, get super excited, and obsess over it for a few days or a few weeks, depending on how big the thing is. I definitely do not plan this stuff out well. And then I just build it and build it until I see whether I can do it, and then if it's a cool thing to share, I put it out and show my friends or show the internet or show my co-workers.

A lot of times I get bored before I have a complete thing, but I've still learned something from it. But yeah, in general I'm not a very good planner. I love having a product manager at work to help do all that stuff. I like proof-of-concepting things and moving really quickly and seeing what's possible. But once it gets to productionization, where you have to worry about all the details and make it perfect to make sure it works all over the place, I struggle there.

Sharing imperfect work publicly

How do you shut up your own inner critic and just make it happen?

Yeah, I don't think I actually have shut it up. I'm almost always second-guessing stuff I put out there. Back when I was on Twitter a lot, I deleted so many things, not even just data science. I'm like, "Oh, this is a good idea," and then, "No, never mind, this doesn't need to be said." But then sometimes I just get really excited, and my family, they're not nerds, they can't show their excitement. I just need to find the correct audience. Even when I know it's not perfect, data scientists love imperfection; they love incomplete things.

A lot of my side projects were inspired by other people putting things out there, even putting incomplete things out there, and me seeing what I could do with it and spin it off. So when it comes to sharing incomplete stuff, I don't think about it at all. That's just the natural thing you do. You're creating things, and I know I'm never going to perfect it, so it's either hide it away or just put out what you have, and it might inspire someone else, or someone might learn from it.

I don't think I ever made an active decision to shut myself up or to not be afraid of releasing stuff. I think it just came naturally, seeing everyone else around me doing it. And there is no risk to putting silly ideas out there if they're just fun side products of you learning.

LLMs for data science coding

When do you recommend not using LLMs, and why? When is it a trap?

Yeah, and again, I'm not an expert in LLMs. I am using them, and I'm trying to figure out how to use them, just like I think everyone else is. I still want to have fun doing data science, and I don't want to offload all that fun to an LLM. So I'm still a little stubborn: if I know how to do something, or I know I could figure out how to do something, I like trying to figure it out myself, and then maybe using the LLM to help me with small, discrete, get-unstuck things.

They're great at speed. They're great when I'm feeling lazy. Sometimes when you're doing tons of data processing and you know you have this 40-line pipeline you're going to have to write in dplyr, it's like, "I know I can do this, let's just have the LLM do it." But if I'm trying to expand my knowledge and really get close to the data, or trying to learn something new, I'm not relying on the LLM that much.

So I feel like I'm still writing a lot of manual code. Especially if I'm writing stuff in R, I'm still doing 99% of it myself. The Python stuff, though: I could be staring at a blank page for a very long time, or I could describe exactly to the LLM what I want and then try to fix it. I don't know, I still like having fun writing code, so I write more than I probably should. I'm still learning how to use LLMs best for my work, and I think we all are. And especially as the LLMs get better, I'm sure I'll start shifting to them more and more.

We all are, absolutely. And I think for me the trap comes when I go into it thinking, "This will save me time," because it never does. What I should have done from the beginning was use the LLM to help me build the foundational understanding that I need to build the thing myself, or the foundational understanding that I need to check its work when I ask it to build functions or bits and pieces for me.

Bayesian MMM tools and pipeline

What tools for Bayesian analysis do you use? Do you use Stan and brms?

Yep. So tidymodels works very well with rstanarm. Before that I used Stan, and I enjoyed the extra low-level flexibility it gave me when it came to defining the models and forcing some coefficients to behave, more than rstanarm lets you. But I kind of like how rstanarm has philosophical restrictions on what transformations or distributions you can use. Their thought is that if they don't support it, then you're kind of cheating with some of your distributions. Like with your truncated normal, you could force everything to be positive or negative if you want, but then you're losing confidence in the model results. So rstanarm for sure, through tidymodels, and I've slowly learned to love it.

Does Lego use an open source MMM package like Meridian or Robyn, or do you build from scratch or use an external vendor?

Yeah, when I joined, Robyn wasn't out there yet because it was 2019. We have external firms and agencies that run MMM for a lot of our markets, but they're very expensive and eat up pieces of the marketing budget, so we had to run some MMM models internally. My team and I chose to develop our own, and I'm on the second iteration of that right now. Again, I use tidymodels as the foundation of all that. It was really what I needed at the time to have all the control over the inputs, and I built a lot of wrappers around it. So we're using an internally developed package in R right now.

And I love it. It works very well for us, so I like maintaining it. There is always discussion about what the next generation is going to be, but we have so much in our pipeline right now that I think my package is safe for a few more years. It just works very, very well. We can control everything we need to control.

So yeah, I'm very proud of it. And I think I talked about this a little bit in the 2023 talk, which was about the development process of that MMM.

I had to write a bunch of custom steps. If you've seen tidymodels, there's a lot of process: you're basically setting up your model and your recipe independently, and there are step functions. I had to write a lot of custom step functions to fit the data transformations that we needed to do for MMM. But the biggest challenge we have with MMM, besides the priors and running the models and getting all your hyperparameters, is that at any time we have thousands of data points, or data series, that we could put into models. A lot of them are highly correlated. A lot of them are different ways to measure the same thing.

So if you have marketing, you can see your investment or you can see your impressions, which are one-for-one correlated. You can't use both. So we need a lot of ways for the data scientists and the business experts to know which ones to turn on and which ones to turn off. So I built a bunch of wrappers that make it very easy to set your transformations, toggle your features, and say what transformations they should have, because you need some S-curves if you're expecting diminishing returns and decay, and sometimes you just need some C-curves.
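
As a rough illustration of what such wrappers might do, here is a hedged Python sketch (the internal package itself is in R and is not public, so nothing below is its actual API). It assumes two common MMM transformations: a geometric adstock for decay and a Hill curve for diminishing returns. The feature names, the config layout, and all parameter values are hypothetical.

```python
def geometric_adstock(spend, decay):
    """Carry a share `decay` of each period's accumulated effect into the next."""
    carried = 0.0
    adstocked = []
    for value in spend:
        carried = value + decay * carried
        adstocked.append(carried)
    return adstocked


def hill_saturation(x, half_saturation, shape):
    """Hill curve with values in [0, 1).

    S-shaped when shape > 1; concave (a "C-curve") when shape <= 1.
    Returns 0.5 when x equals half_saturation.
    """
    if x <= 0:
        return 0.0
    return x ** shape / (x ** shape + half_saturation ** shape)


def transform_features(data, config):
    """Apply adstock then saturation to every feature that is toggled on."""
    transformed = {}
    for name, settings in config.items():
        if not settings["enabled"]:
            continue  # Skip features that were switched off
        adstocked = geometric_adstock(data[name], settings["decay"])
        transformed[name] = [
            hill_saturation(x, settings["half_saturation"], settings["shape"])
            for x in adstocked
        ]
    return transformed


# Hypothetical inputs: spend and impressions measure the same campaign,
# so only one of them is switched on.
data = {
    "tv_spend": [100.0, 0.0, 0.0],
    "tv_impressions": [2000.0, 0.0, 0.0],
}
config = {
    "tv_spend": {
        "enabled": True, "decay": 0.5, "half_saturation": 100.0, "shape": 2.0,
    },
    "tv_impressions": {
        "enabled": False, "decay": 0.5, "half_saturation": 2000.0, "shape": 2.0,
    },
}

features = transform_features(data, config)
print(features)
```

Keeping the toggles and transformation settings in one config object means a business expert can review which series are in the model without reading the transformation code.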

Communicating model results to stakeholders

How do you communicate the nuances of MMM, like priors, Bayesian versus frequentist, whatever, to stakeholders who may not be technical? Or do you even communicate any of those nuances? Do you stay higher level than that?

Yeah, so I have a wide variety of stakeholders with a wide variety of interest in the details and how much information they want to know about the back end. I have some stakeholders who I've only worked with for a few months, and some who I've been working with for five years and who I'm more comfortable sharing more and more details with, without overwhelming them. I don't think any of them have asked the Bayesian versus frequentist question, because that's a very, very niche question for people who are in statistics every day.

But in general we have to explain to them that every kind of variable has a lot of different transformations, and that we need to apply a bunch of assumptions about carryover impacts and diminishing returns and the size of the impact. Everyone seems to get those very high-level terms. So we basically have to have a bunch of inner mappings, in my brain or on slides, to link these very complicated data science transformations to sentences that might resonate with the stakeholders.

Everyone's also always very excited to hear about how we're using their data in our models, and they try to get as much information about this as they can. I think MMM is tough because, since we're trying to cover every single data point, we can't really drill into every super specific question they have at all times. So it's about training them on the best way to use the model results, and then, once we deliver our results, how they can integrate them with their multi-touch attribution, their last-touch attribution, or their other attribution models.

In general I love talking to stakeholders and sharing as much as I can with them, and it's really cool to see them get excited. I've learned that when I first started, they'd ask technical questions, I'd go all in on the very technical details, and I'd lose them right away. So now I have methodology slides, with different levels of detail, that I can give them. I try the easy one first, and if they're all on board, then I can have another conversation with them going another layer down.

But usually what I find is that the high-level ones are good enough for most people. They want to feel like they understand the process. But yeah, math and statistics are complicated, so it's very easy to lose them if you tell them too much information.

Domain passion and career advice

Does your love for Lego help you build code better, like scoping, sectioning, loops, modularization?

I think there's probably a real thing there. Again, understanding how pieces fit together, understanding how to link different groups and split things up correctly, I think I could see that. Slightly unrelated, I think my love for Lego helps me do my job better in general, just being extra excited about the data that we're using. So if you don't care about the brand...

The reverse was my old job. I did a lot of market sizing and forecasting for tech companies. I was doing some similar work, not MMM, but a lot of modeling and R code, and I did not care about the domain at all. It was about networking pieces and CPUs and just shipments of computer components. I wasn't passionate about the domain, so I think I was not passionate about the work. And then once I got the job with my current company, I was doing data science, which I love doing, and I was working with data that I was actually excited about.

I mean, if you know our portfolio, we do Star Wars products, we have Jurassic Park products. I get to work with data every day that is all Marvel, pop culture, and really fun data. So having that enjoyment with what I'm building really motivates me to enjoy my job more and do better at it.

Again, I know some of it's luck, but for me, just being passionate about the domain I'm in has helped me so much with my career, because I'm excited to do the analysis, I'm excited to do the work, and I just really like talking about my job and what I'm producing now. So I love talking about it, I love showing it, and I think that passion has shown, and my bosses appreciate the passion. So I think that has helped me more than anything.

I'm not the strongest data scientist. I'm definitely not the strongest ML person, and as you know from my Python talks, I'm not good at Python. But I'm really good at just being passionate about the data, knowing the tools to turn business questions into data science tools, and then turning those back into the business. That's what works really well for me, and I would not be excited to do that if I was in the tech industry that I left. I just do so well at a company that I really enjoy and I'm passionate about. So it's not concrete advice, but try to find that passion.

I mean, do a temp check on what you enjoy and what you don't enjoy, I guess, is the follow-on that I would have. Knowing what you don't like to do can sometimes be just as valuable as knowing what you do like to do. So one thing that I did was plot all of my jobs on an x-y axis of happiness and time. Over time, here are all of my jobs, and then the y-axis was happiness, with smiley face icons. I went through and plotted them, and then I asked myself what made the happy jobs happy and what made the sad ones sad. I recommend everybody go do that.

All right, everybody, I have had such a good time. Ryan, thank you so much for joining us. I hope you had a good time.

Yeah, no, this was way more fun than I expected. I was stressed, but now I'm good.

Yeah, oh, we made Ryan feel not stressed. Well, we're going to get Ryan on the Discord server. If you are on the Discord server, make your way to the Data Science Hangout channel and tag me so I can make your name yellow and give you a little badge that says DSH.

All right, next week Kevin Dalton will be joining us, senior data scientist at Great American Insurance Group, and then we have Thanksgiving, which is off. I will not see you next week. Rachel will be here with you, Isabella will be here with you, and I'll see you the week after Thanksgiving, hopefully. Thank you so much for being wonderful. If you want to save the chat, there's three dots in the top that will let you do that. It'll save it as a txt file. And of course, if there are any resources that we can go save over in the Discord, everybody go do that, because it will be the place where we can continue conversations, and we won't just have an ephemeral chat that goes away every single time. All right, everybody, I will see you in a couple of weeks, but come hang out with Rachel next week with Kevin Dalton. Okay, bye everybody.