Michael Lopez @ NFL | Being forward-thinking & anticipating questions | Data Science Hangout
We were recently joined by Michael Lopez, Sr. Director of Football Data & Analytics at the National Football League, to discuss using data and analytics to enhance the game of football.
Timestamps:
4:29 - The process of using RFID tag data
7:31 - What kind of decision making is your work driving?
9:00 - Do you analyze players' health stats?
11:39 - How transferable are sports analytics skills across leagues?
14:19 - Big Data Bowl and other community initiatives
19:20 - Structured vs unstructured data?
21:51 - Transition from academia to NFL
25:23 - Communication skills that have transferred over
29:22 - What data is captured from the RFID tags?
34:32 - How could someone get into the NFL space as a data scientist focused on health?
37:18 - How are you getting people to buy in to analytics?
40:40 - How do you prioritize what your team is working on?
43:04 - Languages and tools used in analytics at the NFL
45:06 - Translator role in football analytics
49:09 - How does the NFL go about achieving the level of analytics that you do?
50:59 - Does gait analysis have an impact on their performance?
52:59 - How do you evangelize new data points and explore things previously unexplored?
55:48 - Fantasy football
56:34 - What are you excited about in the year ahead?
Diving into a specific topic on communication -- 27:38
“Sometimes we are answering the questions that we know they're going to ask before they've asked them. And I think the best presentations we give are the ones where they're asking a question and then just click Next, and it's on the next slide. Or they ask a question and I say, yep, it's in the Appendix. Click a couple slides forward and say, here it is. Because you have to anticipate the questions they're going to ask. Sometimes, we're collecting data before they've asked us to collect data because we hear something in the media or we hear something that they're asking in the replay room or their coaches are asking, and we're like, we're going to go build this.
They didn't tell us to go collect data on quarterback slides, but we now know that quarterbacks are sliding more than ever before. We know quarterbacks are the most important position in football. So last year, Andrew Patton basically led an NFL data collection process on quarterback slides. Because we knew that when we got to the offseason, somebody was going to say, ‘hey, how are our quarterbacks sliding?’ Actually, we have that because we went out and collected it. So I would say that it's a combination of great data viz and then also being purposeful and forward-thinking with the questions that people are going to be wanting to ask.”
Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
Website: https://www.posit.co
LinkedIn: https://www.linkedin.com/company/posit-software
Twitter: https://twitter.com/posit_pbc
To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We'd love to see you!) Come hang out with us!
Transcript
This transcript was generated automatically and may contain errors.
Well, happy Thursday, everybody. Welcome to the Data Science Hangout. If this is your first time joining us here today, it's so nice to meet you. Thanks for spending your Thursday with us. I'm actually calling in from the Posit office. Nobody's here with me today, but I have this nice view of the water. But if we haven't met yet, I'm Rachel Dempsey. I host the Data Science Hangout and lead customer marketing at Posit. This is our open space to chat about data science leadership and questions you're facing, and to hear about what's going on in the world of data across many different industries.
And so we're here in the same space, same time every Thursday. So if you're watching this recording on YouTube later, you can add it to your calendar with the details below. But together, we're all dedicated to making this a welcoming community for everybody. And so we love hearing from everyone no matter your years of experience, titles, industry or even languages that you work in. So every week I'm joined by a different leader here from our community who joins us to share their experience and answer questions from you all.
So to let you know how to ask questions, first of all, it's totally okay to just listen in if you want to. But there are also three ways you can jump in to ask questions. So you can raise your hand on Zoom, and I'll be here looking out. You can put questions in the Zoom chat. And feel free to just put a little star next to it if you want me to read it, or else I can call on you to ask the question live. And then lastly, we have a Slido link, which Tyler will share here (right when I said that, thanks, Tyler), where you can ask questions anonymously too.
But with all that, thank you so much for joining us here today. And I'm so excited to be joined by my co-host, Mike Lopez. Mike is Senior Director of Data and Analytics at the NFL, the National Football League. Mike, I'd love to have you introduce yourself, maybe tell us something you like to do outside of work too, and share a little bit about your role.
Thanks, Rachel. And thanks to everyone at Posit for organizing these. It is an exciting time to be in football data. And we just have a lot of questions that we have to try and use sort of naive or advanced models for, because there is so much about the world of football that we are just starting to learn. My background was, I was a high school teacher, turned PhD student, turned assistant professor at a college. And at that point, the influx of football data became apparent. The league started getting Next Gen Stats, where each player wears a pair of RFID chips in their shoulder pads. And you sort of went from an NFL data history where every game had 160 plays, so 160 rows of data, which everyone was sort of analyzing locally on their laptops, to now a world where we have this tracking data, both RFID-based and now even video-based. And it's just hard. There's just a lot that we are learning and trying to figure out.
So my job is to organize a group of really intelligent young data scientists that are helping us and helping the league make better decisions.
Awesome. Thanks, Mike.
Our preseason starts tonight. And I sort of now realize that from now until the end of March, most of my spare time is going to be keeping an eye on what happens in everything related to the NFL. So my spare time was over the summer. I have three young kids, and so hanging out with them is really how I spend my spare time. But tonight, that is now done with an eye on the NFL game that's on the television.
So watching football is what you do in your spare time.
Yeah. It's a nice side gig.
Using RFID tag data
Well, Mike, I know there'll be a lot of questions coming in from everybody shortly here. But just hearing that initially about how you're using data from the RFID tags on the players, I'd love to just understand a little bit more about what that process looks like and how you ingest that data, what you do with that data. Just kind of walk me through a little bit of that process.
So the league has, I would say, two data warehouses, or sort of systems of storing data. In the early 2000s, that's when the league initially started building a lot of the SQL server that the clubs still use and that we would still use. That would be your traditional game stats. So if you play fantasy football, the yards caught by a wide receiver or the quarterback's yards per game or whatever that would be, that would be your traditional game stats.
2015 was when the league first (it might've been 14 or 15, I wasn't with the league at that point) started collecting the player tracking data. They had seen it catch on in other sports, and it was driven by the media group out in Los Angeles. And so if the base game stats are stored in New York, the media data is stored and managed by Zebra and the Next Gen Stats team out in LA. We ingest it. There are a couple of ways that the league gets it. One is using an API. The second is there is a SQL server that now is dedicated to organizing and managing that data.
One of the unique things about the tracking data is that it really sort of changes your data pipeline and how you organize and manage your code because you used to be able to just sort of load every NFL game, play-by-play, whatever it has, and analyze it, right? Fit a model, make a graph. Now you have to be a little bit more articulate about how you're doing that. I don't know if articulate is the right word, but you sort of have to be forward thinking. A lot of times what we do is just sort of fit it on a play, right? If you want to figure out whether the wide receiver was open, don't try to do that on every play that wide receiver has ever run. Just try to do it on one play: identify whatever metric you're creating or looking at for that wide receiver's openness, and do it on that play.
That takes a lot of time because you're probably the first person to think about your way of ranking that openness or whatever. We're doing a lot of things in the league to increase the awareness of football data and to give people the code so that they can build their new work off of it, but it's still new. And then once you've done it on that one play, then try to do it again, right? Try to write a function to do it on the game. And then obviously extend it to all players and all games, and all that stuff takes time. So traditionally, you would be an analyst in the year 2012 or 13 and you'd download CSVs and you'd analyze all the plays at once. Now you have to be a little bit more cognizant about how to do that just because of the size of the tracking data.
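The one-play-first workflow described above can be sketched in Python. This is only an illustration, not the league's code: the row fields (`game_id`, `play_id`, `frame`, `player`, `role`, `x`, `y`), the sample coordinates, and the "openness" definition (distance from the receiver to the nearest defender on the play's last frame) are all assumptions made for the example.

```python
import math
from collections import defaultdict


def row(game_id, play_id, frame, player, role, x, y):
    """Build one hypothetical tracking row (field names are assumptions)."""
    return {"game_id": game_id, "play_id": play_id, "frame": frame,
            "player": player, "role": role, "x": x, "y": y}


# Made-up sample data: two plays in game 1, one play in game 2.
ROWS = [
    row(1, 1, 1, "WR1", "receiver", 10.0, 10.0),
    row(1, 1, 1, "CB1", "defender", 12.0, 10.0),
    row(1, 1, 2, "WR1", "receiver", 20.0, 10.0),
    row(1, 1, 2, "CB1", "defender", 17.0, 14.0),
    row(1, 2, 1, "WR1", "receiver", 30.0, 20.0),
    row(1, 2, 1, "CB1", "defender", 30.0, 23.0),
    row(1, 2, 1, "S1", "defender", 40.0, 20.0),
    row(2, 1, 1, "WR2", "receiver", 50.0, 25.0),
    row(2, 1, 1, "CB2", "defender", 51.0, 25.0),
]


def openness_on_play(play_rows, receiver):
    """Step 1: get the metric right on a single play.

    Assumed metric: distance to the nearest defender on the last frame.
    """
    last_frame = max(r["frame"] for r in play_rows)
    at_end = [r for r in play_rows if r["frame"] == last_frame]
    target = next(r for r in at_end if r["player"] == receiver)
    defenders = [r for r in at_end if r["role"] == "defender"]
    return min(math.dist((target["x"], target["y"]), (d["x"], d["y"]))
               for d in defenders)


def openness_by_play(rows, receiver):
    """Step 2: reuse the single-play function across every play."""
    plays = defaultdict(list)
    for r in rows:
        plays[(r["game_id"], r["play_id"])].append(r)
    return {key: openness_on_play(play_rows, receiver)
            for key, play_rows in plays.items()
            if any(r["player"] == receiver for r in play_rows)}


# Start with one play, then scale out to everything that was loaded.
single_play = [r for r in ROWS if (r["game_id"], r["play_id"]) == (1, 1)]
print(openness_on_play(single_play, "WR1"))
print(openness_by_play(ROWS, "WR1"))
```

Keeping the single-play function separate means the metric can be checked by eye against film of one play before it is applied to a full game or season.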
Decision-making at the league level
Thank you. That's really helpful. I see a ton of questions starting to come in here. Devin, I see you asked one in the chat. Do you want to jump in? But the question was, what kind of decision-making is your work driving at the league level, versus the Moneyball, team-versus-team decision-making? And Devin said, thanks for powering my fantasy leagues as well.
So we are primarily interested in what happens on the field. And if it happens on the field, we're hoping to measure it. And in that sense, that's how close are our games? How exciting are they? What are the trends in how teams are calling plays and how players are playing and how coaches are coaching? We do a lot with our officiating group trying to help them make better decisions, both the sort of on-field decisions and also the training. We work with player health and safety on making sure that the rules of the game are sort of the best match for keeping the players safe, and making sure that their equipment is as top-notch as it can be. And then also going back to the rules, whenever we talk about a rule of the game, it's driven by, does it make the game more competitive? Does it make the game safer? Can our officials officiate it? Does it keep the game exciting? Things like that.
So we try to organize a lot of our sort of support for the league around a lot of those questions. The teams make the decisions to go forward on fourth down. They make the decisions to challenge a play, to start a quarterback or to draft a certain player. But a lot of what we're trying to help with is that there are new tools to help with those decisions. There's new metrics that we can help create. There's new data sets that we can provide them. And ultimately what they do with that is obviously up to them.
Player health and safety data
I know you just touched upon the health data, but Geraldine, I saw you had a question about that. Do you want to ask a follow-up there?
Yeah, because when I asked that question, you were mostly talking about tactics and plays, et cetera. So I was wondering, do you guys also analyze players' health stats around things like whether or not they get concussions from impacts or dehydration, or is there like a separate team that deals with that?
So as part of the NFL collective bargaining agreement with the Players Association, the league doesn't analyze that data themselves, but there are two third parties that do, two companies that we work with a lot. IQVIA and Biocore are their names. And they're both really good from an epidemiological side and an engineering side at looking at what are the trends in injuries? What are the rates of concussions or injuries or missed-time injuries or leg strains? How have those changed over time as athletes have gotten faster and the teams are passing more and things like that. So those are collaborations that we're a part of. The specific raw injury data is not something we touch on a lot, only in the sense that when it comes time for changing the punt play or the kickoff play or the rules around how to protect the quarterback, it's never just a health and safety question. It's a health and safety and an officiating question and a sort of competitive question about if this is going to make our games closer or more blowouts or less exciting or less traditional, things like that.
So a lot of the way that we try to work with those types of questions is how does this impact the entirety of the game? And the other aspect that I think is really neat about where football is now is that it's not just the game data, it's also the practice data. And so most clubs have performance scientists that work with trainers on managing player load and sort of optimizing player performance from a, I think both a performance and an injury side. And that's not stuff that we're allowed to help with, but we do know that a lot of clubs are going down that route.
Transferability of sports analytics skills
Greg, I see you had a question in the chat too. Do you want to jump in here?
Yeah, sure. Thanks Michael for being here today. I actually got a chance to see you speak at the University of Denver back before the pandemic, which seems like a lifetime ago, at the sports analytics conference. So I was just wondering if you have any thoughts on how transferable analytics skills, sports analytics specifically, are across leagues, right? If you build up experience with baseball or basketball, how does that transfer to football or other sports? I know some of them are more emerging than others.
Yeah. I mean, I honestly, I don't think you can be a great sports data scientist without really having a sense of what's going on in other sports. So I would say it's really twofold. One is the skills themselves are transferable. If you can code tracking data in football, you can code it in baseball, you can code it in basketball, etc. It might take you a little bit of time to get up to speed, but the specific functions that you're writing or the specific ways that you're transforming the tracking data into metrics that you can communicate with a coach, that you can do in any sport, right? Once you can do it in one. The hard part is being able to do it in one.
So I think that's one area. And then the second one, which is really maybe just as important is what are the other sports doing? What are they looking at? One recent thing that's caught on to a lot of folks in football is that baseball teams no longer draft a player and then send the player out there and the player performs or doesn't perform. They're now doing things in training to maximize the player's launch angle or their exit velocity or their swing path because they think that they can take a player and help them improve using data. And I don't think football has always been that way. I think maybe the corollary would be you would have a quarterback and you throw the quarterback out there and they'd either complete the pass or miss it and maybe you could get a new offensive coordinator. But maybe now you could be thinking about where is the quarterback holding the ball? What is their path to throw it? Is there something that we can do that's sort of maybe the corollary of a swing in baseball?
So that's one example that certainly isn't the only example, but I think it's sort of a combination of both the hard skills, hands-on keyboard stuff that if you can code, you can code. And if you know sports, I think that's going to certainly transfer. But there's also an awareness of what are other leagues doing? What are the metrics that they've come up with? I think it certainly helps in data science to have that background and that skill set because there's just so much out there that the NFL can learn from other sports. And I do think specifically given some of the competitions and the work that football has done in the last couple years, other sports are now learning stuff from football too.
Big Data Bowl and community initiatives
Great, I appreciate that. Thank you very much.
Mike, we talk a lot about community in the Data Science Hangout and I think the Big Data Bowl is an amazing community initiative that stands out to me. And I was wondering, do you see this initiative happening in other sports or other industries now? And do you have any tips for people who are trying to do something similar? And it might be helpful to just explain what the Big Data Bowl is too.
So one of my weaknesses is I hold grudges. And so I used to always get a little bit irritated with some well-known sports conferences and organizations that would maybe run hackathons or share data in ways that, as an academic and as somebody who supports reproducible research, maybe didn't have those same values. And so when I started at the league, I was pretty steadfast that I thought the league could do a better job of sharing data. And one of the ways that we've done that is through an event called the Big Data Bowl. And it has a couple purposes. One is let's innovate in football. And we can be talented all we want as a league office and whatnot, but there's only so much that our group can do and the Next Gen Stats group can do. There's just a lot of other things. And the idea being that there's just a lot we can do. And if you share data, people will analyze it. So the quote from Field of Dreams is, if you build it, they will come. If you share the data, they will analyze it, especially in sports data.
So that's one of the main reasons we did it. We also have the motivation where we have this complex data set that no one's ever looked at. At least this was our initial philosophy in 2018. And if that's how we feel, that's also how the 32 NFL teams feel. And when you work at a league office, you're in some ways employed by the 32 teams. And so we're doing it for them. We're doing it to give them a pipeline of analytic staffers that they can hire to go answer their own questions. And one of the really neat things is, and I've sort of hinted at this, is that this is an opportunity for anyone to come in and analyze. And it really levels the playing field in the sense that before, when you would hire in sports data, you might look at GPA, or you might look at where the person went to college, or you might look at maybe where they interned last summer. Now you're looking at their code, right? You're looking at what they built. You're looking at what is their level of thoughtfulness in the questions and the way they communicate their results, which ultimately that's how you're evaluated when you're on a team, right?
If you're on a data science team and your job is to build interpretable metrics and to communicate those and to do a good job of validating your models and things like that, that's what the Big Data Bowl is all about. One of the neat things is that now people are copying football, and that was never a place where football was before. There was the Big Data Derby in horse racing. There was the Big Data Cup, which Stathletes put on in hockey. I think there's now a fantasy football one. And it's pretty humbling to know that it largely started with the NFL sharing data and being open about it.
So we're certainly open to more leagues doing it. In terms of advice, the number one thing that we found is that the crowdsourced metrics are always better than anything we've ever come up with ourselves. And the more thoughtful and sort of, I guess, sort of thinking forward about what people will analyze, I think companies will get the most out of it. Like we've had competitions that have immediately turned into metrics that are shared on air. And our 2020 competition was on rushing yards. And within maybe 11 months from the contest ending, you watch the NFC title game, which has maybe 30 million viewers. They're watching a metric that came from our competition. And that's sort of a really exciting aspect of that competition. And so the better questions you ask, the better answers you'll get. But I think overall, we've certainly found that the crowdsourced approach has really helped.
Structured vs unstructured data
Yu, I see you've had your hand raised for a bit here. Do you want to jump in?
Sure. Hi, Michael. Thank you for being here. I have one question. So what type of data are you using for the analysis? Is it tabular, structured data, or unstructured data like images or recordings from the game directly? If the data are unstructured, how do you process them before you can use them for analysis and modeling?
Yeah, good question. I would say with each passing year, we're getting to more unstructured data. But I think the league has done a pretty good job. At a league office, we have a large number of data engineers that help ensure our SQL tables are clean and easy to read from. I would say most of it's pretty structured. And the other reality is with player tracking data, at least in the NFL, because it's RFID-based, it's very accurate. And so it's a tabular format. In fact, if you go to our Big Data Bowl site, you can download it as a CSV and it's a big data set, but it comes out clean. That's not necessarily the case with maybe some of the newer data sets we'd be getting into.
If you talk about like video tracking, which would be like taking from an image and then trying to look at player location. One of the ways that football is going is that we have RFID tags on all 32 NFL teams and all the players and all the games. However, those players come from a pool of college players. The NFL can't make college players wear those same RFID tags. And so when you want college data, you either have your traditional stats or you've got to use video based data. And so a lot of that video based data on college players is stuff that is a little messier. And so that would be where you would have to extract it from imaging and things like that.
There are companies out there that do that. And it's sort of, I don't know if it's an arms race, but I do know that NFL teams are really interested in those skills because of the way that before you would have to have a dozen scouts go out and watch videos of the teams in the northeast and the southeast and all over, you'd send scouts. And teams still do that. But now if you can get some of that insight by coming up with an algorithm that can take the data from the video and turn it into a nice usable format, that can certainly save you some time and maybe get you some insight you weren't getting otherwise.
Transition from academia to the NFL
Lisa, I see you had a question a bit earlier on Mike's transition from academia to the NFL. Do you want to jump in?
Sure. Yeah. I'm Lisa. I was just curious about how you made the transition from academia to working for the NFL and how you think that academic experience was helpful. And then maybe again, this is probably selfish, but like how did you sort of market that academic skill set?
I think the most helpful part about being in academia is that you are evaluated. And I was at a small liberal arts college. You're evaluated on your teaching. So you have to be a good teacher, right? I was a tenure track assistant professor and evals would come in. And when you weren't doing a good job communicating to your students, that's going to ding you or whatever. So I think that's the most helpful skill set is that my job was to take statistical methods and distill them down to students who were learning it.
I'd been teaching R, or some software, for maybe a decade. But when you do that in 2020 or whatever, I couldn't be teaching the same stuff I was teaching in 2010 because software was entirely different. So you also have to sort of keep up to date. And that's kind of a lot of what we do now. Like our job is to communicate statistical results or methods to folks that aren't statistics experts. Instead of students, now it's executives. And one of the things that we are constantly working on is what do we show? How much is too much? When do we lose them? How many plots is too many plots? Do they need to know the ins and outs of our models? And whereas when you're teaching, you really need to communicate all the ins and outs, certainly, we don't do that now.
But I think the experience that was most helpful was teaching and communicating results. We don't do as much writing. I don't have as many research papers maybe as I otherwise would have. But now, for better or for worse, I spend a lot of time making presentations. And I think a lot of that's the same approach where you're trying to take what you've learned and distill it into a template that others can understand. When I started at the league office in 2018, I was maybe doing one PowerPoint deck every couple months. And now, for better or for worse, I have to do that a lot because I have a really talented group of data scientists that are coming up with so many cool things. And I am now helping them curate material to share with executives.
And like I mentioned at the start, there's a lot that we can do. And trying to figure out how to communicate that goes back to that experience in academia where you're writing a journal article or you're writing a summary of your work. So that's where it's been most valuable. And I also think one of the cool aspects about where we are now is that there's just still a lot to learn. And that's no different than being in academia where you're an expert in something and you're trying to build the next tool or you're trying to build the next model or the next paper. I often like to refer to our work as, we have a laundry list and sometimes our job is to just start at the top. What's the most important thing we can do? And that's not much different when you're in academia and trying to do the same thing.
Communication skills and anticipating questions
Can we stay on this topic for just a bit here? When you were talking about the communication skills that have transferred over, I was curious if you have certain tips for us for communicating to executives or what you've learned in many of those presentations you've given.
Yeah. I mean, we've tried and failed a lot. I mean, there's things that I've pushed for or we felt strongly about that other people were like, yeah. And then there's other things where maybe I didn't feel as passionate about. And then suddenly maybe the presentation got shared a lot more widely and I read about it the next day on ESPN.com. So it's hard to really know, like I haven't kept a spreadsheet of what's worked or what hasn't in terms of communicating results.
I think we have to have good data visualization. If we don't, we just can't do our job. And my team would probably complain about how I do this, but I probably complain about things that, if this was a student paper, I probably wouldn't, right? Because I have to make sure that the work that is being presented is as clear and concise as it possibly can be. It can't be confusing. It has to have the right annotations. The stuff that has come out in ggplot in the last couple of years and also in gt, with the sort of visualizing in tables, has been really helpful on that front because for a while, we would have to make tables in an old school format, right? Or I'd have to add annotations in PowerPoint because, you know, you couldn't code that. So that's probably the most important thing: your viz has to be top notch.
I also think, and this just comes partly from experience, partly because I played football in college and my dad was a football coach and I grew up watching way too much film, the closer you are to the subject matter, the better you are at your job. It doesn't mean that you need it for all roles. You can be a great coder and sort of learn that skillset along the way. And I mean, football is in some sense a complicated sport, but in another, anybody can learn it if you try to learn it, right? So it's not like it's too far away for somebody to learn.
But a lot of what we do is we just answer their questions. And I think the sort of second part of communicating that result is sometimes we are answering the questions that we know they're going to ask before they've asked them. And I think the best presentations we give are the ones where they're asking a question and then I just click next and it's on the next slide, right? Or they ask a question and I say, yep, it's in the appendix, click a couple of slides forward and say, here it is, because you have to anticipate the questions they're going to ask. Sometimes we're collecting data before they've asked us to collect data because we can tell, like we hear something in the media or we hear something that they're asking in the replay room or their coaches are asking. And we're like, all right, we're going to go build this. They didn't tell us to go collect data on quarterback slides, but we now know that quarterbacks are sliding more than ever before. We know quarterbacks are the most important position in football. So last year, Andrew Patton basically led an NFL data collection process on quarterback slides because we knew that when we got to the offseason, somebody was going to say, hey, how are our quarterbacks sliding? And it's like, actually we have that because we went out and collected it. So I would say that it's a combination of great data viz and then also being purposeful and sort of forward thinking with the questions that people are going to be wanting to ask.
RFID data details and officiating technology
Sanjay, I see you had your hand raised a little bit ago. Do you want to jump in here?
Yeah, hi. I was curious when you said RFID tags are used to track players. And so what all data do you collect apart from just tracking the player, the speed, direction, what all is captured? Just was curious to know.
Yeah, so it captures the XY coordinate data, two-dimensional. We don't get their height. The tags are like the size of your thumbnail, and they fit into the player's shoulder pads. And some players don't even know that they're wearing them. They turn on whenever they go on the field. You could watch players on the sideline if you wanted to. I wouldn't recommend it. But then the league takes that data and shares it. From the XY data, we get their speed and acceleration. Because it's two chips, we also get their orientation. That's all curated and sent out to the teams from the league office.
So you could watch the NFL, and you'd see things like fastest players or fastest acceleration. You could get change of direction from it without too much effort. So that's the type of thing that you would get from this data.
And do you have an RFID on the ball also?
Yes. Yeah, good question. There's an RFID on the ball. With players, I would say it's accurate usually to within a couple of inches. The ball has a larger margin of error, maybe closer to five or six inches. And then with all of it, with both the 22 players and the ball, there are also RFID tags in things on the sideline, like the pylons and the chains. I don't think there's much use in those, but the league has been collecting it. And then it's all at 10 frames per second, right? So for a given player, if a play is seven seconds long, you're going to get 70 observations of that player on that play. And you could animate it to follow them around. You also get their data before the snap and after the snap. It's just a matter of what you find most useful.
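To make the description above concrete, here is a minimal Python sketch of how speed, acceleration, and orientation could be derived from two-dimensional tag positions sampled at 10 frames per second. It is not the league's pipeline. The function names, the finite-difference approach, and the assumption that the two chips sit in the left and right shoulder pads are illustrative guesses, and the distance unit is simply whatever unit the coordinates use.

```python
import math

FRAME_RATE_HZ = 10  # stated in the talk: 10 frames per second


def expected_frames(play_seconds):
    """Number of observations per player for a play of the given length."""
    return round(play_seconds * FRAME_RATE_HZ)


def speeds(xy):
    """Speed between consecutive frames, in coordinate units per second."""
    return [math.dist(a, b) * FRAME_RATE_HZ for a, b in zip(xy, xy[1:])]


def accelerations(speed_series):
    """Change in speed between consecutive speed samples, per second."""
    return [(b - a) * FRAME_RATE_HZ
            for a, b in zip(speed_series, speed_series[1:])]


def orientation_deg(left_tag, right_tag):
    """Facing direction in degrees, counterclockwise from the +x axis.

    Assumption (hypothetical): one tag per shoulder, so the player faces
    perpendicular to the line running from the left tag to the right tag.
    """
    dx = right_tag[0] - left_tag[0]
    dy = right_tag[1] - left_tag[1]
    shoulder_angle = math.degrees(math.atan2(dy, dx))
    return (shoulder_angle + 90.0) % 360.0


# Made-up example: a seven-second play, player moving in a straight line.
track = [(0.5 * i, 0.0) for i in range(expected_frames(7))]
v = speeds(track)
a = accelerations(v)
print(len(track), len(v), len(a))
```

Differencing consecutive frames is the simplest possible choice. With position error of a couple of inches per frame, a real pipeline would likely smooth the positions first, since differencing amplifies noise.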
Aaron, I think you had a question related to that about the ball too. Do you want to jump in here? I think the question was kind of about officiating. Aaron said, I've always wondered why a tracker can't be placed on the ball to determine when it crosses the plane for a first down or touchdown. So is that something that's being looked at?
Yeah, and I think Geraldine's question is similar. It is being looked at. There are a couple of differences that I think matter a lot. One is NFL stadiums are enormous and there's a lot of player overlap on almost every play. And so the cameras have to be further away, unlike tennis where they're right on the field. And unlike maybe a VAR decision in soccer where there are two or three players, maybe right in a tight space, in football there are arguably a lot more. So the technology that you can see in other sports is a little bit more complex to apply to football.
In my now six years, there have been several attempts, some gaining more traction than others, to do video-based spotting. It is a challenge. And at its current margin of error, the chip in the middle of the ball can't replace on-field spotting for a couple of reasons. One, I mentioned the margin of error. If you're saying the ball is here, but then you look on the film and it's actually here, it's really difficult to justify that to a viewer at home while you're making a first down decision or a decision on the goal line. And I think the second part there too is the ball spotting process at the NFL is based on the tip of the football. Without totally changing the construct of the chip in the football, we don't know that, because the chip is in the middle of the football. It's been tried to put multiple chips in, but then you're sort of messing around with the mechanics of how that feels to the quarterback. And we don't want to do that.
So, you know, there are video-based attempts, and my intuition is that within a couple of years, there will be use cases that you'll see on air where the technology is helping the process of spotting the football. But it's not as easy as maybe it appears in the other sports. And I also think the other aspect for us is that pace of play is really important, in the sense that tennis can have a break or soccer can have a break to look at this decision and maybe the ball isn't in play. For us, the second the ball is put down, the ball's in play. Other than the couple of stoppages for replay or in between downs, in between like a fourth down and a punt play or whatever, there's no example of saying, pause, hold on, we're all going to wait, right? We just don't have that time. Coaches want the ball to move. We want the game to be three hours or so. And that continued focus means, sure, you know, there might be a tool out there that's going to get us the exact right spot. But if it takes three minutes, you know, that's not a trade-off I think the league would be making.
Getting into NFL data science
Jared asked, how does someone get into the NFL space as a data scientist to help improve player safety and performance?
So I think there's a couple of ways. I mentioned IQVIA and Biocore; they are two companies we work with a lot, and they're specifically targeting things like the epidemiology of injuries, return to play, injury trends based on how teams are playing, the equipment, you know, safest helmets, safest cleats, safest shoulder pads, or the characteristics within a play, like the orientation of player limbs, that could lead to injury. So I think there are things that are specific to the injury side that are part of it.
Then there's also the maybe more performance side, which is that all the teams want those same questions answered. And, you know, just like you see the Next Gen Stats player tracking data in games, they get Catapult data in practice. There are other companies besides Catapult that do it, but Catapult is probably the biggest one. So you have the same data on your own players in practice. Now that's not data that the league office would get, but if you're on a team, you could identify, like, listen, this player has had two or three really hard days in a row, and maybe that comes with an increased injury risk of a hamstring strain or something. I'm not necessarily making that up, but that's hypothetical, right? So those would be the things that you would get into: thinking about what happens within a practice, and for the player that put forth a certain load, that traveled a certain distance, that ran at his high speed for, you know, 10 minutes, which is maybe a lot, or three minutes, which maybe isn't a lot, how are they going to perform the next day? Because ultimately the job of teams is to win games and you win games when your players are healthy. So the coaches understand the value of this too.
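As a rough illustration of the "two or three really hard days in a row" idea, the Python sketch below flags players whose most recent practice days all exceed a load threshold. Everything in it is hypothetical: the load numbers, the threshold, the player labels, and the function names are invented for the example, and, as the speaker stresses, the link between consecutive hard days and injury risk is itself only a hypothetical.

```python
def current_hard_streak(daily_loads, hard_threshold):
    """Count consecutive hard days ending at the most recent day."""
    streak = 0
    for load in reversed(daily_loads):
        if load < hard_threshold:
            break
        streak += 1
    return streak


def players_to_review(loads_by_player, hard_threshold, min_streak=2):
    """Players whose latest days form a hard streak of at least min_streak."""
    return sorted(
        player
        for player, loads in loads_by_player.items()
        if current_hard_streak(loads, hard_threshold) >= min_streak
    )


# Made-up practice loads (arbitrary units), oldest day first.
practice_loads = {
    "Player A": [300, 520, 540, 560],  # three hard days in a row
    "Player B": [510, 200, 530],       # only the latest day was hard
    "Player C": [100, 120],            # no hard days
}
flagged = players_to_review(practice_loads, hard_threshold=500)
print(flagged)
```

A flag like this would only be a prompt for a conversation with coaches and medical staff, not a prediction. A real workload model would need validated thresholds and actual injury data.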
Buying into analytics and prioritizing work
So my question kind of revolved around, I guess, communication of what you're finding and just kind of having a personal feel and read on, you know, football personnel and staff. I know a lot of the old guard guys, and folks in general, might be a little bit resistant to analytics and just kind of the philosophy. So I was wondering, and I know you mentioned you're doing more PowerPoint presentations and talking to people, how are you getting people to buy into analytics as a whole? How can you take the modeling, the math, and turn it into actual advice, from a perspective that is digestible for football coaches or personnel, whoever might be working with you, in a way that they can use it, accept it, and, you know, feel better about it? Because around the football space, it seems like the word analytics has a little bit of a stink on it, if you know what I mean.
Yeah, I haven't had a lot of pushback. I mean, I've been here six years, and there's not been a single meeting where somebody was like, I don't want to see your results, or I don't care about X, Y, or Z. I mentioned earlier that there are certainly times where I've thought the league maybe could have done this, or a certain team could have said, hey, we think we should prioritize this more, and maybe that hasn't happened. I think a lot of it comes back to, we're not trying to answer different questions than they are. We're trying to answer the same question they are, and we're just trying to put numbers to it.
And, you know, there are lots of examples where we just get a sense of the questions that they're after, what they're interested in, and we say, here's what we're thinking. And I think we do an appropriate job of pushing where we need to push, and then also saying sometimes, like, listen... I mean, there are a couple of examples of random projections we've been working on in the last week where we've just said, honestly, we're giving you a projection for something for some time, but I wouldn't even really trust it. Like, I personally don't trust our own model, right, because there are too many uncertainties, too much noise in the system. So we try to balance when we push.
And one piece of advice I got from somebody, I think in hockey a couple of years ago: I was asking that person why their team didn't do something. And the point of this analyst that I talked to was that they look at their role as having a certain number of times that they could really push. And sometimes you do it, and sometimes you say, like, listen, I think my team probably could increase its chances of winning by this, but there's something else that comes with it, it moves the needle only this much, and it's not worth it. So I do think our team also tries to be a little bit purposeful with how we balance when we are saying, like, hey, this is something we should change, or this is something we should do, because, you know, we only get so many cracks at it. And, you know, there are times where I think, hey, we can really improve our decision making here. And there are other times where I would say, like, hey, listen, I can give you a number, but, you know, go with your gut, it's probably not going to be that far off.
I don't have a great answer there. You know, a lot of it is, honestly, like, we have a meeting in a week or so with some executives. And my last slide is: here are six things we can do this fall. Tell me where you would prefer we spend our time. So maybe that's part of it, just getting a read on it. You know, we're not totally obtuse to what we hear in the media in terms of what people are talking about. It's weird to be in a public-facing industry. But if clubs leak it to some media organization, there's a nonzero chance it makes its way back to our group.
You know, like last year, if you're familiar with football, one of the things that started early in the season is teams would do what's called a push quarterback sneak, where you would have the quarterback line up. Traditionally, they would sneak forward. Now you put somebody behind him and like two or three people are pushing. So, same as the quarterback slide process, we were like, let's collect this data, because we saw it on Pro Football Talk in week two or three. And it's like, if we don't collect it, they're going to get to the end of the year and complain about it. And it's going to be like, oh, crap, we should have measured this. So it's like, okay, now we've got to measure it. So it's just kind of being in tune with what's out there, what we're seeing on the field. I know, Rachel, you asked what I like to do for fun. You know, I watch football for fun. But I also watch it knowing that whenever something happens, sometimes there's going to be a question we need to answer Monday morning: how often does this happen? Is this getting worse?
Tools and roles in NFL analytics
Sure. Yeah, I think I asked a question about whether most of this work is done in Python and R, and whether there are any particular packages, or if that code is mostly internal, or if there's any other kind of software typically used in NFL sports analytics.
Yeah, a great question. I would say we're more on the R than the Python side, but it's kind of a decent mix. I've asked this of NFL team staffers. And one other good point is that teams are hiring more and more. So there are more and more folks with Python backgrounds going to teams. When I started, I want to say that there were maybe 70 or so data scientists, data scientist being a catch-all term here for folks that were using football data to try and help the team on the field. Maybe 70 or so, and I think on the most recent list we're somewhere around 130 or 140. So within the last five years, teams have basically doubled their infrastructure on the football data side.
And of that group, you know, our team probably knows the majority of those folks. I'd probably put it at about 70/30 R versus Python. And that's probably where our team sits internally. I think right now, most postings are like, you have to know R or Python. You know, a good number of managers are probably more comfortable in R because of where football data has been, but that doesn't necessarily mean that it's any better. And then I also think there's a growing number of roles for people that know neither, but know what football data is.
This would be your communicator-type role. So folks that maybe are really up to date with current football metrics and maybe have coaching experience or playing experience, and they want to be the translator. They want to be the person that is going and saying, hey, you know, I can't fit that neural net model, but I know you can, and I know what it's going to give me as a result, and this is what I think is going to help this coach or this assistant GM or whatever. So I would even say there's a good amount of room in football analytics for non-coders, right? People who have some ability to know what's going on in the background, know a little bit about modeling and know what the strengths and weaknesses are, while maybe not being able to do it on player tracking data. But you really have to know football, right, to get into that role. And not every team has them, but, you know, if you stay up to date on the Big Data Bowl and on Next Gen Stats and on public-facing research, you can provide a lot of value to your team, right? In terms of identifying trends that are meaningful and maybe trends that you would want to ignore.
So, for example, say you're an offensive coordinator for a team and you want to build a scouting report, you want to build a dashboard that says, here's what the opposing defense is going to do each week. There are lots of data scientists that could do that. A good number of them, especially the ones that have spent the last four or five years in football tracking data, might not know the names that your team is using for what that opposing defense is running, right? They might be able to separate man from zone coverage, but they might not be able to separate Cover 2 from Cover 4 from Cover 6, or they might not be able to tell the strong-side linebacker from the weak-side linebacker. And so now you need somebody to say, okay, this is what the coach is looking for. This is what this term is. This is what this split is. And then your job is to code it all, right? And to work with that communicator to help with that.
Communicating analytics to media and executives
George, I see you had a question in the chat a few minutes ago. Do you want to jump in?
Oh, yeah. Thanks. So I was just wondering, and this is something I always think about whenever I'm watching a sporting event: how do I get my organization to report at the level of what I see the NFL or ESPN or any in-game commentator reporting? The kind of analytics that come out of their mouths, I'm just in awe of. And it's historical. Sometimes it's like, well, the last quarterback who did this was in 1978, and he had like blonde hair and went to this college. And I'm like, I mean, first of all, I don't even think that's relevant. But the fact that they could pull that out of nowhere in 10 seconds is pretty impressive. I've tried doing that in my field, in HR, in the past. And that's always sort of the bar that I have. But I'm just curious, how does one go about even trying to achieve that level of analytics?
It's almost like a research team. And the NFL does have a research team that, before every game, provides the media with an enormous, you know, basically a folder, right, of all the things that are applicable to these teams. And this is automatically generated using various tools. It's almost like the scouting reports that you would see of a baseball player or a football player. And so when a wide receiver catches the ball, the commentator is able to go and say, oh, this is interesting, he had, you know, the highest red zone target rate of any tight end last year, right? And a lot of it's not because they've done it. It's because somebody's handed that to them. We don't do a lot with it. That's certainly more on the media side. But I do know that it's the research and media team's job to prep basically a portfolio for them to pull from. And I do think one of the things that they're balancing is how to weigh a lot of the new metrics that nobody's ever heard of against some of the old things, like, you know, the blonde hair, right, that you could pull from that maybe wouldn't make as much sense.
Scheduling research and building trust
Good question. We've, and I can share a little bit of this, but we, because they shared it this past off season. But one area was sort of stuff I was just mentioning is that we did a lot of research on what I would call scheduling inequities and basically how things that the league sets impact game outcomes. And there were a good number of surprising ones where things that I think most people would perceive as having a big impact on a team's chance of winning the game maybe didn't matter as much. And so, you know, the scheduling team mentioned that when they made their schedule this year, they took some of our advice to heart. And a lot of that honestly was just trust, right? I think we had to