
Forecasting AI Demand at Microsoft | Sajay Suresh | Data Science Hangout
To join future data science hangouts, add it to your calendar here: https://pos.it/dsh - All are welcome! We'd love to see you!

We were recently joined by Sajay Suresh, Senior Director of Data and Applied Science at Microsoft, to chat about data center supply chain planning, forecasting AI demand, and navigating data science careers. In this Hangout, we explored how the emergence of technologies like LLMs changed projections for data center demand. Sajay discussed how forecasting something with little historical data, like AI demand, requires drawing analogies from the past, such as comparing the training/inferencing model to the iPhone and its App Store. A major complexity in current supply chain planning is the lack of fungibility: modern GPUs require specific infrastructure like liquid cooling, meaning data centers designed for GPUs cannot easily be repurposed for traditional compute/storage workloads, which increases investment risk if demand comes in lower than planned.

Resources mentioned in the video and zoom chat:
LLM Workflow Demo with Joe Cheng → https://pages.posit.co/05-28WorkflowDemo.html
Posit::conf 2025 Virtual Registration → https://posit.co/blog/posit-conf-2025-virtual-experience-registration/
Sajay Suresh on LinkedIn → https://www.linkedin.com/in/sajay-suresh-12687631/
Find mentors on ADPList → https://adplist.org/
Officeverse R packages for Office documents → https://ardata-fr.github.io/officeverse/
Microsoft team meetup video on capacity planning → https://www.youtube.com/live/07j22d4B_hA?feature=shared
Seattle Data And AI Security community → https://www.linkedin.com/posts/seattle-data-and-ai-security_microsoft-fabric-tour-seattle-data-ai-security-6891902675280633856-xLw3?utm_source=share&utm_medium=member_desktop
Quarto Gallery → https://quarto.org/docs/gallery/
Quarto Guide → https://quarto.org/docs/guide/

If you didn't join live, one great discussion you missed from the zoom chat was about communities and meetups recommended for networking and learning in data science. Participants shared groups like R-Ladies, Data Book Club, local tech meetups, and specific conference recommendations like Shiny Conf and DataConf.ai NYC. What's your favorite data community?

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.posit.co
Hangout: https://pos.it/dsh
LinkedIn: https://www.linkedin.com/company/posit-software
Bluesky: https://bsky.app/profile/posit.co

Thanks for hanging out with us! Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
Hey there, welcome to the Posit Data Science Hangout. I'm Libby Heeren, and this is a recording of our weekly community call that happens every Thursday at 12 p.m. U.S. Eastern Time. If you are not joining us live, you're missing out on the amazing chat that goes on. So find the link in the description where you can add our call to your calendar, and come hang out with the most supportive, friendly, and funny data community you'll ever experience.
I would love to introduce our featured leader today, Sajay Suresh, Senior Director of Data and Applied Science at Microsoft. Sajay, how are you today? Can you tell us a little bit about you, what you do, and what you like to do for fun?
Absolutely. Thank you, Libby, and thank you, Libby, Rachel, and Posit for having me on this forum. Can I say one thing before I get into it? I love the community vibe over here. It's different from most talks or conversations I've attended. I feel like people are a lot more comfortable over here, so congratulations on building an amazing community.
Yeah, so a little bit about me. I'm at Microsoft. I run an Applied Science function at Microsoft. My team essentially runs cloud infrastructure planning. So we are a bunch of data scientists and software engineers who create forecasting models for cloud and AI and figure out how to get the data center infrastructure for that in place. So I'm hoping it's touched your lives in some way or the other. If you have ever used a Microsoft product, be it Xbox or Windows or Azure, AI, whatever you have used, hopefully my team has had some role to play in it.
Background and career journey
You know, my career started around 15 years ago, right? It was at a company called Mu Sigma, which was trying something called data science at the time. Data science wasn't really obvious as an industry. It might be surprising for a lot of folks over here, especially those of you earlier in your career, but it wasn't clear back then whether data science was going to be a niche industry or a really big industry that every company needs. So it started at Mu Sigma 15 years ago. I spent two to three years with them, where I myself got introduced to the idea of data science and how decisions are made using data.
After that I moved to BCG, and BCG was interesting because, as most of you may know, the Boston Consulting Group is a big management consulting firm, but data science was interesting for them, right? It's not the most obvious place for a data scientist to be 10 years ago, but that is exactly the reason I joined them: they were trying to build out their data science practice. What they had was data science as a small service, but the idea was to bring in folks who were well versed in data science and consulting specifically to build out that practice. So it was an amazing experience at BCG to not just, you know, work with Fortune 500 companies trying to solve strategy problems, but also have a startup within a large company and try to build it out.
So now, as a lot of you folks may be aware, it's called BCG X. I think it rebranded itself from BCG GAMMA to BCG X, and it is a separate entity within BCG, separate from the business management consulting. And that's when I consulted for Microsoft and saw that there was an interesting role in tech, which was around the data center supply chain. So seven years ago, I switched over to Microsoft.
As data scientists, we were a two-member team trying to figure out how to even plan for data center supply chains. Just to give you folks a sense of what a data center supply chain is: imagine, say, I'm in Seattle, and I want a new data center in Seattle, because that's what I need if I want to deliver cloud and AI. It would take me three to four years to get that data center live. So my team is essentially trying to predict what the world will look like three to four years down the line, so that we can make the investments today and Microsoft can capitalize on opportunities for their customers at that point in time. So when you see AI being big today, you can bet I had to catch it three years ago. Otherwise, we wouldn't stand a chance of serving it. And as with most things in forecasting, I kind of caught it and kind of didn't catch it. That's what happens in forecasting.
Evolving as a data scientist at Microsoft
The one thing I will say: it's tougher to be an early-stage data scientist at Microsoft now than it was when I joined. And that is not just specific to Microsoft. I just think there's been increased maturity in the expectations of a data scientist over the last few years. That means I think it's tougher to get into Microsoft, but that's true of the whole industry.
The one tip I would give myself if I joined Microsoft again: move from consulting to product very quickly within a tech company. When you're in a tech company, we love software. We love systems. We love data products. We like consultants, don't get me wrong. But what tech companies are used to are systems set up to run processes; they like to bring in consultants from the outside. So for me, and for my team, I'd say it took us a couple of years to figure out that we may be a talented set of data scientists, but for us to consistently add value to Microsoft, we needed software and systems, and we needed to make it a data product that people can rely on and can plug into other processes.
The one tip I would give myself if I joined Microsoft again was move from consulting to product very quickly within a tech company.
Tech stack and deployment
You know, finally, when I started off coding, it was SAS. Maybe a lot of you folks don't use SAS right now, but that was the prevalent technology. And then I spent a lot of time in R, right? I love R. I'm a big fan. So I try to find creative ways to do things in R, even though they can be done better in Python. I'm like, nah, you know, I can do this better. But I know, in my heart of hearts, there are some processes that Python is just better at. So I'd say R and Python are my primary go-tos.
We have a lot of statisticians, and they're big on R, so we are an R shop too. Python is, of course, something you just have to have as a data science shop, right, with all the packages available there. So Python's huge. C# is another engine that we use pretty heavily for the optimization part of our algorithms. It does exceedingly well when you really know what kind of algorithm you want and you want to write custom algorithms. I think Python is really good when you want to use off-the-shelf algorithms. But if you have a custom algorithm that you know is right for your business and is different from those traditional packages, think about a C# engine because of how much flexibility it gives you in setting up the algorithm.
One of the biggest selling points for me for Posit, just using Posit Connect on my team, is that my data scientists have now become developers, because they can deploy code and I don't need a separate dev team to go productionalize it. One of the problems we had maybe four or five years ago, when we had separate dev teams, was that a data scientist, including me at the time, would develop the models. We would then send them to the dev team to go to production, which would take another two to three months, and by that time the model we wanted had changed. So that's where I think Posit Connect is a big enabler: it converts my data scientists into developers and saves me a lot of dev time and adds value that way.
Forecasting AI demand and data center supply chains
Great question. I'll tell you, I think data centers were always talked about before AI, but they're talked about way more right now as the enabler in the world of AI, because of a couple of things. One, more power requirements, absolutely. But also, you know, if you think about large training sites for these LLMs, they need to be contiguous. They need to be in one place so that all the GPUs can learn together, run together, train the model, and spin down together.
So to take you back in time, around 2021, we did have something called OpenAI in our systems already, right, training their models. But we had no idea how big it would be. It was a very closely guarded project within Microsoft; even my team didn't know what it was other than that it was a business project. When we saw AI and what potential it had, I think one of the first applications was the image application they had, right, DALL·E, which came out and was publicly available. And I saw that and I'm like, wow, okay, this is a game changer, right?
And it was really difficult. Now imagine three years ago. Today, at least, with AI agents, you can see the applications. Two years ago, when ChatGPT had just come into this world, the optimist in you was like, oh, the world has changed. The pessimist in you was like, oh, you cracked natural language processing. Okay, yes, people have been working on it, you just cracked it. Okay, fine. But that's where we were, trying to forecast AI demand.
And we didn't have much data to forecast with, right? As data scientists, we love history: you use models, you learn from history. We don't have history for AI. We spent a lot of time researching the right way to forecast it. And you know, funnily, the way we ended up forecasting it at that time was through historical analogy. What do you do when you have a new disease, for instance, that you don't have data on? You look for similar diseases in the past and try to figure out how those diseases spread, similar to comparing the Spanish flu and COVID. The closest analogy we could come up with at the time, and this is now two to three years ago, right? In hindsight it may look like, oh, what's the big deal, because today it's obvious. It was iPhone apps and the App Store.
So if you think of the iPhone: breathtaking, amazing new technology back in 2010. But the real economy was not the iPhone; it was the App Store, which added value to a lot of people's lives and created that whole economy around it. And that's how we thought about training and inferencing. We said, think of the iPhone as your training model, setting up the baseline for you. But your real value add for customers across the board is going to come from inferencing, and there are going to be applications like the apps, and places like the App Store, which will host applications that add a lot of value to people.
So if you think of the iPhone: breathtaking, amazing new technology back in 2010. But the real economy was not the iPhone; it was the App Store, which added value to a lot of people's lives and created that whole economy around it.
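The analogy-based approach described above — borrowing the shape of a known adoption curve and rescaling it to a new market — can be sketched in a few lines. This is a toy illustration, not Microsoft's actual model: the logistic shape, the midpoint and rate parameters, and the 1000-unit demand ceiling are all made-up assumptions for the example.

```python
import math

# Toy analog-based forecast: borrow the *shape* of a past adoption curve
# (a logistic S-curve, loosely "App Store"-like) and rescale it to the new
# market's assumed ceiling. All numbers here are hypothetical.
def logistic(t, ceiling, midpoint, rate):
    return ceiling / (1 + math.exp(-rate * (t - midpoint)))

# Shape parameters "learned" from the historical analog (assumed values).
analog_midpoint, analog_rate = 4.0, 1.2  # years to inflection, steepness

# Rescale to the new technology's assumed total addressable market.
new_ceiling = 1000.0  # e.g. units of AI inference demand (an assumption)
forecast = [logistic(t, new_ceiling, analog_midpoint, analog_rate)
            for t in range(0, 9)]

for year, demand in enumerate(forecast):
    print(f"year {year}: {demand:7.1f}")  # demand halfway to ceiling at year 4
```

The point is that only the shape transfers from the analog; the ceiling and timing for the new market are judgment calls, which is where the forecasting uncertainty lives.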
One of the biggest complications we need to deal with is lack of fungibility. If you think of supply chains, when you're forecasting a supply chain, fungibility is amazing, right? We used to have that fungibility pre-GPUs, where a data center is a data center: it can host most workloads, and compute and storage can host most things. But GPUs are much more power intensive, and the future generation of GPUs needs a different form of cooling in data centers, which is called liquid cooling. That means you suddenly don't have that fungibility. So if I plan a data center for GPUs only, that's all that can go there. And if GPU demand comes in lower, I'm done. I'm sitting on an investment which cannot be monetized.
So that is the biggest complexity we are dealing with today. It is, of course, the uncertainty and demand volatility of AI, which will inherently exist in a new technology, but also the lack of fungibility that it leads to for most cloud players. This is the same story for Amazon, Microsoft, Google, or any cloud player.
Take any opportunity you get to aggregate demand up higher. If you are told to forecast the S&P 500 and you want a small cone of uncertainty, do not forecast each of those 500 stocks individually. Forecast the S&P 500 as an aggregate; you'll have much less volatility that way. So demand aggregation is a huge construct, and very helpful in our thinking.
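The S&P 500 point can be demonstrated with a quick simulation. This is a minimal sketch with made-up numbers, assuming each series' noise is independent — which is exactly what makes the aggregate so much smoother than its parts:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulate 500 independent "stocks": each a noisy series around its own mean.
n_series, n_periods = 500, 260
means = rng.uniform(50, 150, size=n_series)
noise = rng.normal(0, 20, size=(n_series, n_periods))
series = means[:, None] + noise  # shape (500, 260)

# Volatility relative to level (coefficient of variation), per series...
cv_individual = (series.std(axis=1) / series.mean(axis=1)).mean()

# ...versus for the aggregate index (sum across all 500 series per period).
aggregate = series.sum(axis=0)
cv_aggregate = aggregate.std() / aggregate.mean()

print(f"mean individual CV: {cv_individual:.3f}")
print(f"aggregate CV:       {cv_aggregate:.3f}")  # far smaller: noise cancels
```

With independent noise, the aggregate's relative volatility shrinks roughly like 1/√N, which is why forecasting the index gives a much tighter cone of uncertainty than forecasting each component.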
Differentiating as a data scientist in the age of AI
And maybe, Sam, I'm going to take the opportunity to just fast forward. Imagine a data scientist in 12 months. For a moment, let's work back from, I like to think about, maybe, what's the north star state, right? In 12 months from now, I'd expect most data scientists to be able to build AI agent apps. It shouldn't be new to them in 12 months. That is a skill set, because we have all seen it, right, Sam? I think in the hackathon, we saw some of that together.
What AI agents can do is fundamentally changing the world and it will change the world. The technology is there right now. Tough tasks have become easy. The only truly tough task is what you thought impossible earlier. That's where the world is heading, right, with AI agents. So data scientists 12 months down the line, I'd expect them to know AI agents.
One of the things, I know it sounds cliche, but one of the things I look for even in a data scientist today, if I were to hire, would be their ability to learn it all rather than know it all. What I know today is irrelevant in six months. I need to know something new. So what I'm looking for is someone who is able to show through their profile and the conversation that they have switched multiple subject matter areas, have learned new things and constantly evolved. My favorite question is your biggest failure. I want to know your biggest failure and what you took away from that. Because that tells me whether this person will be able to learn with me.
Staying current and the value of community
I think I have learned more in communities like this about what I don't know and what I need to learn than I've ever learned elsewhere. You could always try reading everything out there, but I'm telling you it's impossible. What you'd rather do is be part of communities like these, where there are folks who are going to go read, and you aggregate intelligence. And then figure out, oh, what does the industry really need? Oh, this is interesting. This is something I'm interested in. So I would suggest thinking in terms of just being part of more and more communities.
And in every community, look for a community that's smarter than you. If that community is discussing everything you already know, you can contribute to it, but you're not going to learn. So look for communities where half the things discussed on the call are things you don't know about and want to go learn.
Macroeconomic modeling and handling shocks
So, for instance, when I'm doing three to 10-year forecasting of a market size, trying to figure out how big this market is going to be in three to 10 years, I don't care about short-term shocks. Tariff shocks may come, they are cycles, they'll go away, there could be a recession that comes in three to 10 years, it'll go away. But I'm trying to understand in three to 10 years what is the market potential of this, so I'm not worried about short-term shocks.
Now let me give you a different example: COVID. So, four years ago, around 2020, right? The lockdowns had just hit us. No one had any idea what COVID was going to be like, but we were having conversations about an economic depression, because economic activity had come to a standstill with the lockdowns. And we were in charge of thinking about, hey, what will cloud demand look like?
And we did new fundamental research to figure this out, right? We said, okay, let's break the question down. One: what do pandemics do to technology adoption curves? The only pandemic we knew to go back to was the Spanish flu, and we studied electricity production. We saw that while electricity production was hit in the short term during the Spanish flu, in the long term it actually accelerated. So pandemics cause a shock in the short term, yes, but in the long term they actually accelerate technology. Then we tried to figure out why. What made electricity production accelerate? And there's an interesting research paper which talks about how, right after a recession, companies are much more willing to spend money to grow with technology. Because in a recession, sadly, they would have had to let a lot of people go, and they will be smaller companies. So when they grow back, they can grow with the latest technology.
So, putting that together, we came up with a hypothesis based on a macroeconomic study that, hey, you know, COVID for cloud may actually be a digital acceleration play that comes through in two to three years. And it turned out to be true. It could have been false, but it turned out to be true, thankfully. I hope that gives you a sense of how we use macroeconomic data; it's not a cookie-cutter approach, right? It's: what is my business goal? How can I use macroeconomic data and existing research? And actually, I'm going to stress that even more. Data is one aspect, but there is a lot of existing research out there which you should do a lit review of before you go off on your own project, because a lot of these questions have already been analyzed and you can leverage information from them.
Optimization models and stakeholder communication
You know what I've learned about optimization models in my experience? Technically, the models are generally fine. The kinds of constraints we specify are generally fine. We may make mistakes, and we learn from them by running simulations and figuring out the right constraints to put in. I don't think the challenge in optimization models is technical in nature. I think the challenge is making the business connect. Some of the things that we may think are hard constraints aren't really hard constraints, or could be hard constraints but not super valuable.
So the question I always ask myself in the optimization model space is: how can I get the minimal number of business constraints, and get them in very cleanly, without any guesswork around them? I love to keep my constraints to a minimum. I want more solutions, because that gives me a range of possibilities, especially in an MVP, to go run against my business stakeholders. So if there's any advice on optimization algorithms I'd give, it's this: spend more time identifying, from your set of business constraints (not the technical constraints you're going to place on your model), which of them are truly essential and which are merely good to have, because that helps you find a much more reasonable, feasible solution.
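To make the "essential vs. good-to-have" point concrete, here's a toy capacity-planning example. Everything is hypothetical (the rack values, power and space limits, and the brute-force search standing in for a real solver), but it shows the mechanism: a non-essential constraint shrinks the feasible region, and dropping it lets the optimizer find a better plan.

```python
# Toy capacity plan: pick GPU racks (g) and CPU racks (c) to maximize value,
# brute-forcing small integer counts instead of using a real solver.
def best_plan(extra_constraints=()):
    best = (0, (0, 0))
    for g in range(0, 41):
        for c in range(0, 41):
            # Essential business constraints: power budget and floor space.
            if 4 * g + 2 * c > 100 or g + c > 40:
                continue
            # Optional "good to have" constraints, passed in as predicates.
            if any(not ok(g, c) for ok in extra_constraints):
                continue
            value = 5 * g + 3 * c
            if value > best[0]:
                best = (value, (g, c))
    return best

# Essential constraints only: the search has room to find the best trade-off.
print(best_plan())                                          # (140, (10, 30))
# A "good to have" cap (GPU racks <= 5) shrinks the feasible region and value.
print(best_plan(extra_constraints=[lambda g, c: g <= 5]))   # (130, (5, 35))
```

Every extra constraint can only reduce (or leave unchanged) the attainable objective, which is why auditing which business rules are truly hard pays off before you encode them all.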
My business is absolutely stakeholder facing, and that is the reason we succeed, I feel. Just to give you a sense: once my team signs off on a data center plan, it's billions of dollars and it gets executed on. It's amazing impact, but great responsibility, and business communication is absolutely fundamental to us succeeding. And the tip I'd give you is, more than the type of visualization, you know, the standard visualization charts are great, right? Time series trends always win. Just go with time series trends, because it's easy to visualize history and see the forecast, and everything is much easier from a cognitive standpoint.
But for a moment, be the stakeholder. Be the decision maker and think: if I am the decision maker, what data points do I need to see to make the decision? Don't make the decision for people, right? Human decision making works well when they make it with you. So be the enabler of that. This is a question my manager actually asked me, because I had produced a demand forecast for cloud with uncertainty bands and said, hey, here's my forecast. And my manager asked, "So Sajay, what would you do?" He's like, no, no, no, I'm asking a simple question: what would you do if you were the head of infrastructure with this data point? And that made me fundamentally change how I think about it.
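The kind of chart described here — history, a forecast, and uncertainty bands on a single time series — is straightforward to sketch. This example uses matplotlib with made-up demand numbers and an assumed band width that widens with the forecast horizon:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen so the script runs anywhere
import matplotlib.pyplot as plt

# History plus a forecast with uncertainty bands (toy numbers).
history = np.array([100, 108, 118, 125, 140, 152])
forecast = np.array([160, 172, 185, 200])
t_hist = np.arange(len(history))
t_fcst = np.arange(len(history), len(history) + len(forecast))

# Widening bands: uncertainty grows the further out you forecast.
spread = 8 * np.sqrt(np.arange(1, len(forecast) + 1))
lower, upper = forecast - spread, forecast + spread

fig, ax = plt.subplots()
ax.plot(t_hist, history, label="history")
ax.plot(t_fcst, forecast, "--", label="forecast")
ax.fill_between(t_fcst, lower, upper, alpha=0.3, label="uncertainty band")
ax.set_xlabel("quarter")
ax.set_ylabel("demand")
ax.legend()
fig.savefig("forecast.png")
```

Showing the band, not just the point forecast, is what lets the stakeholder make the decision with you rather than take a single number on faith.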
Career growth and operating at multiple levels
No, that's a great point. I think "strategic" is the key word. What I've learned, especially through my one-on-ones: early in my career, my one-on-ones used to be very much, hey, here is what I did last week, this is what I plan to do next week, what do you think? It used to get very tactical about that problem statement and narrow down my manager's focus. What I've learned over time is to ask a different question: hey, what are the top three biggest things that are worrying you? What is keeping you up? And then you'll hear much more strategic problems. You hear things like, I think we get lost in the data, or, we are not analyzing this big business area because we don't have bandwidth, or, here is a trend that is coming but we are not catching it. You hear the strategic problems people have, and you'll be surprised how open people are to having that conversation if you untie it from a deliverable due this week or next week.
Talk about the bigger picture. Say, hey, what are the bigger things you're worrying about? Tell me the top three things your leader is worrying about, and you will get a human sense of the things that should worry you. So you want to get out of the tactical problems and get the high-level picture, so that you focus on the right things in your analysis.
She said something very interesting. She said the way she thinks about success is: can I operate one level higher and one level lower in my role, seamlessly? If I am able to do that without getting stressed, without having to stretch, that tells me I'm doing really well in that role, because I can go help my team members and help my boss as and when needed. It means I'm ready to grow if I can help my boss, and I can help my team members if I have to go one level lower and do the detailed work with them. That was one of the best pieces of advice I've ever received on how to think about success in a role.
She said something very interesting. She said the way she thinks about success is: can I operate one level higher and one level lower in my role, seamlessly? If I am able to do that without getting stressed, without having to stretch, that tells me I'm doing really well in that role, because I can go help my team members and help my boss as and when needed.
Well, we have two minutes left, so I am not going to go any further. Sajay, thank you so much for hanging out with us; you have been a fantastic guest. Can I give a call-out to ADPList? If it would help any of you to have a one-on-one chat, from a mentoring perspective or just to connect, I'm on ADPList talking with folks, so feel free to schedule time with me over there.
And next week I would like everybody to come hang out with Pallas Horwitz, analytics consultant and professional development coach. If you would like to have more conversations about professional development, in the management space or in the data science space, Pallas is a great resource for that. So hang out with us next week to meet her and ask her all kinds of questions. Thank you, Sajay. Thank you, everyone. I appreciate you spending your time over here.

