Dara Straussman @ Stripe | Partnership-model when forecasting | Data Science Hangout
We were joined by Dara Straussman, Data Science Manager at Stripe. Dara's team focuses on forecasting, machine learning, and analytics in the Finance and Operations spaces. We got a glimpse into Stripe's financial modeling methodology after a question from Alan at 12:59.

Question: How do you get buy-in from business stakeholders who historically would have been setting targets (like salespeople and their leaders) that the forecasting you're doing seems like a realistic enough picture of the world?

Answer: It's an ongoing evolution for how we do this interaction, because it's very important to get the input from stakeholders. There might be a big sales deal coming down the pipeline that no statistical model will know about. There might be some product launch plans that we're just not going to be able to learn from history. So we find that interaction very important. When we started doing what we call the statistical forecasting and the adjustments, it was all done in spreadsheets by finance folks. We really did have to prove our value and show how a statistical model would actually help. So our current system:

1. We have the statistical model, which is all done by data science.
2. We do backtesting and reports to build confidence in each level of the hierarchy, so that people can understand where to expect more accuracy and where to expect less.
3. We have a whole process with the finance and strategy team, where they make adjustments to the forecast and are pretty clear about where they are making adjustments and how it's affecting the total.
4. Then we track both the statistical and the adjusted forecast.

What we've learned over time is that making adjustments in certain ways or certain places is very important and very effective. We encouraged the finance and strategy team to focus their efforts there, whereas in other places the statistical model is quite accurate and it's best to spend effort elsewhere.
So that's the baseline of how we think about it, again, ever evolving. We also have an embedded model, so we sit with finance and strategy. We work really closely with them and really understand their problems. It's very much a partnership-type model rather than a "throw a model over the wall" type model.

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.posit.co
LinkedIn: https://www.linkedin.com/company/posit-software
Twitter: https://twitter.com/posit_pbc

To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We'd love to see you!)
Transcript
This transcript was generated automatically and may contain errors.
Welcome to the Data Science Hangout. Hope everybody's having a great week. I'm Rachel. If we haven't met yet, let us know if it's your first time joining in the chat so we can all welcome you in.
This is our open space to chat about data science leadership, questions you're facing, and getting to hear about what's going on in the world of data across different industries. Every week we feature a different data science leader as my co-host here to help lead our discussion and answer questions from you all, and together we're all dedicated to making this a welcoming environment for everybody.
Thank you so much, Dara, for joining us as the co-host today. Dara is a data science manager at Stripe, and I'd love to have you introduce yourself and maybe share a little bit about your role and also something you like to do outside of work.
My name's Dara. I'm a data science manager at Stripe currently. I've been at Stripe for about five years now. My current role is leading a team around finance and operations data science. So we are very into forecasting. We do a lot of work on hierarchical forecasting, financial forecasting at Stripe, spend a lot of effort there.
We think a lot about metrics and overall company metrics and analytics and how we can understand our business performance. And then we work in the operations space as well. So we are working on some various machine learning projects there as well as capacity planning and forecasting on that side too.
Hierarchical forecasting at Stripe
So, at Stripe, we're interested in forecasting, as many businesses are, in understanding: how much payment volume are we going to have next year? How many active users will we have? What will our uptime be? There are a number of forecasting problems that come into play that are a business advantage. If we are good at them, we can set the budget, we can set our targets, we can direct our efforts appropriately.
As a side note, I really like forecasting as a data science area, because it's not one that you always learn about in a machine learning course or spend a lot of time thinking about in your more formal education. I didn't, at least. But now, on the business side, it's an extremely practical business problem to be able to solve, and at the same time it's very interesting statistically.
So we do what's called hierarchical forecasting. This means that we want to have a forecast for Stripe overall. We want to know, you know, what's our total payment volume going to be for the whole company. But we also want to know what is it going to be for each region? What is it going to be for each business type? What's it going to be for each cohort? There's all these dimensions that flow down.
So you can kind of think of this as a hierarchy, right? Where all of the dimensions then need to add up to the total, right? We have to have a coherent hierarchy. That's how we talk about it.
And so we use a method where we can actually choose any forecasting method we want for each node of the hierarchy. We use a lot of classical forecasting methods like ARIMA and exponential smoothing, and we also experiment with other types of methods. We use those to make estimates, and then we apply a sort of meta-model on top of them, hierarchical reconciliation, to tweak each forecast as little as possible so that they all still add up to the total forecast. Then we apply some additional adjustments on top of that, and that becomes the budget for the company.
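The reconciliation idea described above can be sketched in a few lines. This is a minimal, illustrative example of least-squares ("OLS") reconciliation on a toy two-region hierarchy, not Stripe's actual pipeline: project incoherent base forecasts onto the coherent subspace, which tweaks each forecast as little as possible (in a least-squares sense) so the children sum to the total again. All numbers and the hierarchy structure are made up.

```python
import numpy as np

# Toy hierarchy: total = region_A + region_B.
# Summing matrix S maps bottom-level series to every node:
# rows = [total, region_A, region_B], columns = [region_A, region_B].
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Independent base forecasts for each node. Note they are
# incoherent: 42 + 55 != 100.
y_hat = np.array([100.0, 42.0, 55.0])

# OLS reconciliation: orthogonal projection onto the coherent
# subspace, i.e. the smallest least-squares tweak that makes the
# hierarchy add up.
P = S @ np.linalg.inv(S.T @ S) @ S.T
y_tilde = P @ y_hat

print(y_tilde)  # reconciled forecasts; total now equals the sum of regions
```

In practice, weighted variants (e.g. MinT-style reconciliation, which weights nodes by forecast-error covariance) are common refinements of this same projection idea.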
Mark had a follow-up question on that: you went into a little bit of detail there, but curious whether you're able to expand on the target-setting methodology, and whether you dip into any non-financial targets or different levels of setting.
Yeah, we do. We're actually building a platform to do general hierarchical forecasting for any metric at Stripe. And so we've expanded this. We do some sales planning right now, which is a little bit financial. We do it for our active users and we're looking at things like uptime as well, like how much should we expect, you know, Stripe to be live. And we also do a lot in ops too, so capacity planning for our support volume, support tickets.
We've found that the methods are pretty similar across all of these types of problems. There's a lot of nuance in terms of making sure that you capture the data correctly and actually solve the business problem, right? You can't just sort of throw a general forecasting method at something. But we have found in general that the hierarchical format works really well in a business setting and that you often care about some more detailed dimensions and then you also care about adding up to the whole.
Exploring GPT and NLP in support
Yes, we are playing with it. We actually were playing with it before it was cool, back when it was just GPT-3, before some of the hype this fall. We are experimenting with it in our support flow, to be able to help agents answer questions more accurately and much more quickly for our users. We're still early days, but we're pretty enthusiastic.
We've done a lot of NLP work generally on the ops side. We classify our issues using BERT, which is a previous-generation natural language processing model. That allows us to know what kinds of questions are getting asked and what topics our support issues are about, which we then use to direct a lot of internal efforts, like where we should be making updates or bug fixes in our product based on what the support tickets are about.
Career path: from computational immunology to fintech
I did my PhD in computational immunology. It actually started out as a PhD in pure immunology; I was a wet lab biologist, you know, pipetting and doing experiments. Pretty quickly in my PhD, I took an R class and a Python class, and then I took more CS courses and did more statistics and machine learning.
I realized pretty quickly that that was a direction that would enable me to be a better biologist. And also just an area of a lot of interest for me. So by the time I finished my PhD, I was doing fully computational work.
When I finished up, I thought, what do I want to do next? I really enjoyed academia, had a great time there, thought very seriously about academic postdocs. And then also thought, you know, I really liked this data side of what I've been doing. I really liked the statistics and the ML and all the programming that I've been doing. And so I looked at data science positions as well. So I applied to both.
So that's what I did. I went first to a small health insurance startup called Collective Health. I was there for two years. A great thing about going to a startup is you learn a lot of data engineering. So I spent a lot of time building pipelines and really bolstering how you actually write code in an environment with other people. In your PhD, it's very much just you and your notebook, and this was now actually having to write in a more production environment.
So learned a lot there and I did health insurance thinking that it would be somewhat related to my PhD work, which was on HIV and the immune system. It turns out not really that much overlap between those two subject areas. But I really loved the data science side of things and I really loved having impact in business and being able to see the results of my work more directly.
So after a couple of years, I decided to move over to Stripe. Stripe was very appealing because it was big enough to have invested in data infrastructure. There was a ton of data available. I was going to be able to do more statistical and ML type work, but it was still small enough that there were a lot of unsolved problems, which is actually, you know, five years later, still true with Stripe today.
And so that's what made me decide that I was more interested in the data problems rather than focused on a particular domain. It has turned out that financial payments are a complex system, just like the immune system is a complex system. So even though I do not use my domain knowledge from my PhD that much, I still use a lot of the skills that I learned. I still use a lot of how to present, how to do research, how to think about problems, how to break down complex problems, all the time.
If you think about it the right way, you know, skills are pretty transferable.
Partnership model for forecasting buy-in
I'm really interested in the part of the forecasting where if you have to reconcile like SME input from people who like historically would have been setting targets like salespeople and their leaders, what's the process for buy-in from them that the forecasting you're doing seems like a realistic enough picture of the world that they buy it and that they'll go sort of work along it. Or if their input gets incorporated earlier in the process somewhere, so they're brought along kind of from the beginning in a way, just curious how that kind of gets negotiated out.
Super interesting question. And it's an ongoing sort of evolution for how we do this interaction because it's very important to get the input from stakeholders, right? There might be a big sales deal coming down the pipeline that no statistical model will know about, right? There might be some big product launch plans that we're just not going to be able to learn from history. And so we find that interaction very important.
And when we started doing this, when we started doing what we call the statistical forecasting and the adjustments, it was all done in spreadsheets by finance folks. And we really did have to prove our value and show how a statistical model actually helps.
So our current system is we have the statistical model, that's all done by data science. We do backtesting and reports to be able to build confidence in each level of the hierarchy so that people can understand where do they expect accuracy, where do they expect less accuracy. And then we have a whole process with the finance and strategy team where they make adjustments to the forecast and are pretty clear about where they are doing adjustments and how it's affecting the total.
And then we track both the statistical versus the adjusted forecast. And what we've learned over time is that making adjustments in certain ways or certain places is very important and very effective. And we sort of encouraged the finance and strategy team to focus their efforts there, whereas in other places, the statistical model is quite accurate and it's best to spend effort elsewhere.
We also have an embedded model, right? So we sit with finance and strategy. We work really closely with them. We really understand their problems. It's very much a partnership type model rather than a, you know, throw a model over the wall type model.
There's a whole literature about adjustments and the best way to do that. And we applied some of those things, but there's so much more that I think we could do to really maximize our effort.
Centralized vs. embedded data science structure
So, a centralized data science team. And to be clear, when I say "sit with," we used to sit with them when we were fully in office, and now it's sitting with them on Zoom. We have a hybrid embedded model: data scientists have a partner team that they work really closely with. So I'm on the business side with the finance and operations teams; others might work with product or engineering teams.
And we really act like members of that team: get to know their problems, attend their meetings and their offsites, understand what's going on with them, and do joint planning together. So we like to think it's the best of both worlds, making sure we are solving real business problems, while the centralized data science team keeps us pushing the envelope technologically and innovatively as well.
Building stakeholder trust and handling forecast misses
As much as you can bring in confidence intervals or backtesting statistics and show "here's what we expect," those of course are only as good as your input data. And that can be challenging when you're first starting out. But finance folks are very quantitative, right? They're very data oriented. They want to know how; at least our finance folks do.
So that helps, and in our experience, it did take some time to build trust. It was making sure that we had all the very detailed tracking in place. So it wasn't saying, "Hey, we're going to replace your model." It was saying, "Let's run them in parallel. Let's track the statistical model, the finance model, maybe some combination of them, an adjusted model, and let's look for six months at how those go. Let's just be objective about it if we can." That ultimately was very helpful for us in building trust and confidence, but it is a relationship that just takes some time to build.
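The "run them in parallel and be objective" tracking described above amounts to scoring each candidate forecast against actuals with a shared error metric. Here is a hypothetical sketch using MAPE (mean absolute percentage error); all series names and numbers are invented for illustration.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error across the tracked periods."""
    errors = [abs(a - f) / abs(a) for a, f in zip(actuals, forecasts)]
    return 100 * sum(errors) / len(errors)

# Six months of actuals plus the forecasts each method made for them,
# recorded in parallel rather than replacing one with the other.
actuals     = [102, 108, 110, 121, 118, 125]
statistical = [100, 106, 112, 118, 120, 124]  # statistical model
finance     = [ 95, 104, 108, 125, 112, 130]  # spreadsheet-based model

for name, fc in [("statistical", statistical), ("finance", finance)]:
    print(f"{name}: MAPE = {mape(actuals, fc):.2f}%")
```

Tracking a table like this per hierarchy node is also what supports the earlier point about directing adjustment effort to where the statistical model is weakest.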
Sharing results and tooling
Ah, I would love it if we had Shiny; we don't have it set up at Stripe. So we have our own internally built dashboarding tool. It's SQL based and allows us to build pretty nice dashboards on our data, so we use a lot of that.
We'll sometimes make bespoke plots in R or Python and share those. We do a lot of PowerPoint or Google Slides presentations as well. And sometimes when we're working directly with finance and strategy, there certainly are queries and spreadsheets; we try to speak their language where we can. So while we might operate more in SQL land, they might be operating in spreadsheet land, and we try to interface between those things.
IC vs. management and the coding question
These days, not as much. When I first transitioned, managing a small team, I still did about 50% coding. Now I manage usually between eight and ten folks, so I do more pure management work.
I think the key question to ask yourself, if you're thinking about IC versus management, is how motivated you are by enabling other folks to do work versus how motivated you are by solving problems and shipping work yourself. Most people would be motivated by both of those things, but it's really about figuring out for yourself where you want to prioritize spending your time and how helpful each of those roles might be in achieving that goal.
Model sophistication and forecasting methods
For our forecasts, I think the way that we do the hierarchical reconciliation is very important, and that has certainly increased in sophistication over time. The underlying models themselves, we actually don't use anything super fancy most of the time. We're not doing some of the things that are winning the M5 forecasting competition these days. It's bread-and-butter ARIMA and exponential smoothing for the most part; we'll play with other models too.
And we have found that, for our particular business problem, that works really well, because we have a decent amount of history. It's very important for us to capture seasonality, and we're doing monthly forecasts, so we are able to capture what we need for those models.
The choice of forecasting model is so dependent on the problem that you have and there are so many things to consider about your input data. I think if we had a lot of external regressors that we were trying to model, a lot of sort of parallel processes happening, then probably different kinds of models, potentially more of the deep learning forecasting models that have come into play recently, might be really helpful to us as well. But for the most part, a lot of bread and butter forecasting models work well for us.
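To make the "bread and butter" family concrete, here is a hand-rolled sketch of simple exponential smoothing, the most basic member of that family. In practice you would use a library (e.g. statsmodels in Python or the forecast package in R), which also handles trend and seasonality; the series and smoothing parameter here are illustrative.

```python
def simple_exp_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}
    The final level serves as the flat forecast for future periods.
    """
    level = series[0]
    levels = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels

# Illustrative monthly volume series; alpha controls how quickly the
# forecast reacts to recent observations.
monthly_volume = [120, 125, 123, 130, 128, 135]
levels = simple_exp_smoothing(monthly_volume, alpha=0.5)
forecast_next_month = levels[-1]
print(round(forecast_next_month, 2))
```

Holt-Winters extends this same recursion with trend and seasonal components, which is what makes it suitable for the seasonal monthly data described above.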
Handling COVID and macroeconomic factors
Most forecasting models look at historical data to predict the future, so they assume the past looks like the future. And of course we understand that can change. For example, when COVID came along, it disrupted many models; nobody could have forecasted the effect that COVID would have on businesses.
We do include external regressors. Of course, we didn't predict COVID; when I look back now, forecasting was so easy pre-COVID, and post-COVID it's a whole different ball game. So now we include regressors that try to account for the past COVID changes in our data. What we don't want is to continue forecasting more COVID in the future.
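One common way to encode a one-off shock like this is a step ("level shift") dummy regressor: estimate the shift from history, then decide explicitly whether the future carries it, so the model doesn't keep "forecasting more COVID." This is a deliberately stripped-down illustration with made-up numbers, not Stripe's model.

```python
# Illustrative monthly volume before and after a shock.
pre_shock  = [100, 102, 101,  99, 103]
post_shock = [ 88,  90,  87,  91]

# Fit a trivial level-shift model: y = base + shift * dummy,
# where dummy is 0 before the shock and 1 after it.
base  = sum(pre_shock) / len(pre_shock)           # pre-shock level
shift = sum(post_shock) / len(post_shock) - base  # estimated level shift

def forecast(shock_persists: bool) -> float:
    """Flat forecast with the shock encoded as an explicit 0/1 dummy."""
    dummy = 1.0 if shock_persists else 0.0
    return base + shift * dummy

print(forecast(shock_persists=True))   # assume the shift continues
print(forecast(shock_persists=False))  # assume the shift unwinds
```

The key design point is that the future value of the dummy is a modeling decision, not something learned from history, which is exactly where stakeholder judgment and scenario planning come in.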
We also do factor in macroeconomic variables on top of our statistical forecast. We do sometimes do scenario based planning as well, you know, baseline and here's sort of the upside or the downside scenario to try to give the business some framework for how to think about possibilities.
With the hierarchical forecast, the general pattern is that we are not that accurate at the lowest levels of the hierarchy generally. For forecasting one cohort, one region, one business type, that gets to be a pretty small set of data. We don't have as much history, or there may be erratic patterns in it. And so we don't have as much accuracy down there, but then when you add up to the total, you have a much more robust data set and we tend to be more accurate at the higher levels of the forecast.
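The pattern above, noisy at the bottom, accurate at the top, falls out of error cancellation when series are summed. A toy demonstration, with hand-picked per-series misses for determinism:

```python
# Eight bottom-level series, each with true value 100, and the miss
# (forecast minus actual) for each. Idiosyncratic errors roughly cancel.
true_bottom = [100] * 8
errors      = [9, -7, 5, -6, 8, -9, 4, -3]
forecasts   = [t + e for t, e in zip(true_bottom, errors)]

# Average percentage miss at the bottom level of the hierarchy.
bottom_mape = 100 * sum(abs(e) / t for e, t in zip(errors, true_bottom)) / len(errors)

# Percentage miss of the aggregated (total) forecast.
total_true     = sum(true_bottom)
total_forecast = sum(forecasts)
total_ape      = 100 * abs(total_forecast - total_true) / total_true

print(f"bottom-level MAPE: {bottom_mape:.3f}%")  # 6.375%
print(f"total-level error: {total_ape:.3f}%")    # 0.125%
```

Correlated errors (a shock that hits every region at once) do not cancel this way, which is why aggregate accuracy is a tendency rather than a guarantee.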
We do a lot of tracking, we create a lot of dashboards. We try to be really clear with where we're doing well and where we're not. We do re-forecast throughout the year to be able to sort of update our best guess on where we think we're going to land given more data has come in throughout the year.
And then we just sit side by side with finance and strategy and do the analysis and try to understand, you know, when the forecast was off, what are the segments in which it was off? Can we identify the root cause for that? And how can we improve our forecast model for the next iteration? So no magic bullet certainly, but a lot of continual efforts to try to improve over time.
Managing a remote and distributed team
I have folks in Dublin, Ireland; in New York and DC; I'm in Minnesota; and folks in SF and Seattle. So it's very distributed, and it has been for a while.
I talked a bit about the early-stage technical feedback meetings, which have actually been surprisingly useful for making sure we have that weekly touch point with the team. You can do Zoom happy hours and those kinds of things, and those are nice, but it's nice to have a little bit of structure and a topic, to know what we're trying to do and to be able to have a somewhat technical conversation in a very low-stakes environment.
We don't expect you to have a polished presentation, but much more of a brainstorm type atmosphere. So having that touch point has been really nice.
Snacks: people love snacks. There are a number of services you can use to send snacks to people, and I think there's a certain nice feeling that people get by sharing the same snacks on Zoom.
Focusing on onboarding, that's another big one. When folks start remotely, that's a very different experience than it used to be, when they were in the office among everybody in a starting class, seeing everyone. So I focus really closely on new folks. I meet with them every day when they're starting, I set up meetings with the relevant people they need to meet, and I make sure they have the resources they need. For the first month or so, I focus really closely on new folks and on making sure they get integrated, because it doesn't happen naturally in a remote world; you really have to work at that.
Onboarding practices
We do have spin-up buddies. People get paired up with an existing data scientist or data analyst that they can meet with and ask any question at any point. There's a Stripe-wide onboarding curriculum that teaches people about payments and the Stripe product, and then we have a data-science-specific curriculum as well: how to set up your environment, how we use Git at Stripe, some of the more technical classes that we run.
We have a spin-up project. That's the other thing we do very intentionally: we plan a well-scoped project that we give to new folks, one that allows them some room for exploration. It should be important to the business, but not a business-critical, need-it-next-week type of project, so that they can ship something that matters without feeling that time pressure immediately. They need some time to spin up on our systems and meet folks.
R and Python usage
All of our forecasting used to be in R; recently Python has stepped up a bit and we've been doing a lot more in Python. Part of the reason is that Stripe's infrastructure is somewhat more Python based, so doing it in Python allows us to integrate with the rest of Stripe's systems much more easily.
We do support R here. We have R notebooks and the ability to schedule jobs in R, but I'd say it's more like 20 to 30% of data science usage, and Python is a bit more. R is close to my heart; it's the programming language I started out in in grad school, and I could live in the tidyverse all day. But we use it a bit more for analytical, exploratory work, and Python a bit more for production jobs.
Career advice: choose the wild card
Something that sticks in my mind is to think about choosing the wild card when you're presented with multiple choices. So for me, when I was choosing my PhD advisor, I had two rotations that I had planned ahead of time that I knew I was very interested in the research that would go really well. And then the third one was this brand new professor, I would be the first grad student. It was a little bit outside the kind of research that I had done before. So I thought, you know, I'll just do this wild card rotation.
And of course, that was the lab that I ended up joining. I was the first grad student and ended up having just this fantastic experience with my advisor and with the research. And that also has held true as I've made other career decisions along the way, you know, choosing to go to Stripe rather than stay in something in biology, choosing to go to data science rather than do an academic postdoc.
You know, sometimes that wild card choice can feel a little scary, but also kind of enticing. And for me, just trying to not be afraid to choose that option among choices has worked out well.
Recommended books and frameworks
One of my favorites is The Manager's Path by Camille Fournier. She was the CTO at Rent the Runway. She takes you through how to think about your career as a manager, from managing an intern to managing a few folks, to managing a larger team, all the way up to being an executive. I think that's very useful both for ICs and managers for understanding how to think about your career at various levels. It's certainly one that I revisit from time to time as I've moved through my career.
The High Output Management book by Andy Grove is a classic. It's one that we reference a lot at Stripe as well. These are more management focused, but I think good resources for thinking about your career.
One framework that I have been using in career conversations recently is kind of interesting. It's called the SCARF framework. It's a neurobiology thing about how you respond to social situations: status, certainty, autonomy, relatedness, and fairness. I ask people to talk about what resonates with them in the SCARF framework and what's important to them in their work. It's funny how different people really value different things when it comes to their work. So I've found it a very helpful framework for learning about folks, and for folks to self-identify what's important to them.