
Data 911: how Posit can support decision-makers in times of environmental crisis (Marcus Beck)
Speaker(s): Marcus Beck

Abstract: Over 200 million gallons of mining wastewater were released into Tampa Bay in March of 2021. Concerns about the environmental impacts prompted a multi-agency response to monitor water quality changes in the bay, producing thousands of sample points in need of synthesis and communication to a concerned public. This talk will describe how the Tampa Bay Estuary Program leveraged Posit products to create a data synthesis workflow and Shiny dashboard to inform decision-makers on how, where, and when water quality was affected by this pollution. Our experience navigating this event in real time will be shared with the broader community as a successful example of how Posit products can address environmental crises.

posit::conf(2025)

Subscribe to posit::conf updates: https://posit.co/about/subscription-management/
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
All right, well, good morning, everybody. My name is Marcus. I am the senior scientist at the Tampa Bay Estuary Program. And I'm going to be talking about what arguably was the most stressful six months of my life. So I really felt that keynote this morning. This is really about data science under stress and how do you make decisions when there's a sense of urgency around those decisions.
So we're going to do the retrospective approach here. If you go back to July of 2021, I live in Tampa Bay. That's where I work. If you were to walk around the shores of Tampa Bay, you'd notice these really conspicuous dumpsters with a sign out in front of them that said dead fish only. It's really gross. There were dead fish all along the shoreline, right? So there was like a massive fish kill that had happened in Tampa Bay. So many fish that the powers that be didn't really know what to do about it. So the city was like, we'll just put out these dumpsters. The public can clean it up. It was out of control. I actually looked in the dumpster. Zero of ten. Would not recommend. It was disgusting.
But it was just a gnarly situation. And so a couple months earlier, what was kind of leading up to this event, there's this facility on the southeast shore of Tampa Bay called Piney Point. It is a legacy fertilizer processing plant. Long story short, there's a convoluted history as to why this facility exists, why it led to this emergency discharge. But there was really a leak in the holding tanks for the wastewater that was sitting on site for years. And to prevent, you know, issues with public safety and destruction of public property, the best decision at that time was to release all of that water into the bay to prevent catastrophic failure of the stacks. This was a couple months prior to the fish kill. Nobody wanted this to happen. It was not ideal.
And this water was nasty. You know, very acidic, excess nutrients, all that bad stuff. Over a ten-day period when they were releasing this water, that part of the bay received an amount of pollution that it normally would have received in a year. So we expected all sorts of negative outcomes to occur. And to further underscore the horrific nature of this event, my boy Stephen King was tweeting about this. Back when Twitter was a thing, I used to follow him. And he's quite witty. But I think he has a vacation home in Florida. So he often would tweet about Florida-based things. And needless to say, this was receiving national attention, given that, you know, there was an expected negative outcome from this event.
The data challenge
So I work in environmental science. I'm the only data scientist on my team. And we knew that this event was going to have a negative effect on the bay. And everybody that was involved, the public, environmental managers, policy makers, really wanted to know, obviously, how is this going to affect the environmental resources in Tampa Bay? From the environmental management perspective, they wanted to know where are the conditions going to be the worst? So where do they focus their efforts for, say, cleaning up dead fish? And we also just had a general need to understand what can we do about it? Like, what can we learn from the situation so that if it were to happen again, how do we sort of mitigate it or even prevent it? And this is a time of crisis, right? So these are questions that were important, but they needed the answers right now.
So an alternative title that I was thinking for this talk was good enough crisis management. You know, one thing I learned about data science coming to this conference is that we're really concerned with efficiency and refactoring everything to make it as good as possible as the best product we can make. That's not what this talk is about. This is about doing scrappy data science when you want to get information out there as fast as possible.
So the program where I work facilitates a lot of the decisions around how Tampa Bay is managed. We facilitated this response-based monitoring effort: dozens of agencies collecting data at hundreds of sites, looking at dozens of parameters, and, as an environmental scientist, tens of thousands of measurements is a lot for me. So a lot of good data was being collected, but of course we all know data is not information. So that needed to be distilled.
So I'll cut to the chase here. We produced a dashboard. This is, you know, the end product here. This is how we distilled it. We were quite happy with this product. It was useful for the public as well as decision makers and policy makers. But it took a lot to go from that data set to the dashboard.
I appreciated the last talk kind of showing a similar example. This is like the scenario that keeps data scientists up at night. You just have tons of data files coming in. This was the situation we were in. We were the only organization at the time that was synthesizing this information. And I said, please, just give it to me. I'll deal with it. And if you look at that scroll bar on the right there, it goes down and down.
The synthesis workflow
So this is the situation we were in. Quickly my dumpster was on fire, and I needed to distill this information as rapidly as possible. So this is a common, like, workflow that I think many data scientists are familiar with. But this is kind of how I set this up. This is a gross overview of what this actually is. But we said, give us your data, put it on Google Drive, we'll deal with it. So it was great. We were getting the information or rather getting the data. But, of course, all that data needed to be synthesized into a format that was able to be basically pushed up to the dashboard. So I was in the trenches this entire time basically synthesizing this information. Then it was pushed up to GitHub. It was run through a suite of tests to verify the accuracy of that data. And then ultimately pushed to the dashboard.
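The pull-synthesize-push loop described above might be sketched in R roughly like this. To be clear, this is an illustrative assumption, not the program's actual code: the folder name, file layout, and the `standardize_provider()` helper are all hypothetical stand-ins for the hand-built synthesis steps.

```r
# Illustrative sketch: pull raw provider files from a shared Google Drive
# folder, standardize each one, and write the combined result that gets
# committed to GitHub for the dashboard. Names are hypothetical.
library(googledrive)
library(dplyr)
library(purrr)
library(readr)

raw_files <- drive_ls(path = "piney-point-raw")  # list provider uploads

combined <- raw_files$name |>
  map(function(fl) {
    local_path <- file.path(tempdir(), fl)
    drive_download(fl, path = local_path, overwrite = TRUE)
    read_csv(local_path, show_col_types = FALSE)
  }) |>
  # standardize_provider() would hold the hand-coded station renames,
  # unit conversions, etc. described in the talk (hypothetical helper)
  map(standardize_provider) |>
  bind_rows()

write_csv(combined, "data/combined.csv")  # pushed to GitHub, then the dashboard
```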
So going back to this idea of efficiency that we think about as data scientists, I wrote a ton of code just getting the raw data from Google Drive into a format that could be pushed up to GitHub and then the dashboard. This is code I'm not proud of. It's not pretty. It's not efficient. Sure, it could be improved. This is, you know, me hand coding a lot of the renames for the stations. But this code worked. It did what I needed it to do. I did not spend a lot of time refactoring it because I simply didn't have the time to do that.
Testing as guardrails
So I'm embarrassed of this code, but it worked. Where I wasn't going to sacrifice attention was the accuracy of the information coming out of that data synthesis workflow. And so, as sort of an insurance policy that I set up for myself, when the data came out of that workflow, it was pushed to GitHub and I set up a series of tests using testthat. This isn't your conventional testing that you would do for, like, a package, but these are things I wrote to verify that the data coming out of that workflow wasn't garbage. Are the names correct? Are the measurements in range? Are there duplicate values? Things like that. Was the data that I was churning out following the schema that I developed for the dashboard? These are my guardrails. This is the most important piece of this talk, I think: this is where I did not sacrifice attention, because I knew the data coming out of this workflow had to be accurate and had to be something that would be useful on the dashboard.
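Checks like these are straightforward to write with testthat against a data frame rather than a package. A minimal sketch, assuming a combined data frame `combined` with a schema along the lines described; the column names, ranges, and key columns here are illustrative, not the actual schema:

```r
library(testthat)

# Hypothetical schema: expected columns for the dashboard
expected_cols <- c("station", "date", "param", "value", "units")

test_that("data matches the dashboard schema", {
  expect_named(combined, expected_cols)
})

test_that("measurements are in plausible ranges", {
  chl <- combined$value[combined$param == "chla"]
  expect_true(all(chl >= 0 & chl <= 500))  # ug/L, illustrative bounds
})

test_that("no duplicate station/date/parameter records", {
  expect_false(any(duplicated(combined[, c("station", "date", "param")])))
})
```

On a failing check, the GitHub push stops before bad data ever reaches the dashboard, which is exactly the guardrail role described here.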
So I'm going to look at this a little bit more carefully. This is just by the numbers over basically the six-month period from the discharge release to when conditions improved. There were 641 commits to GitHub, and among those commits, about 8,500 tests. The good news is a large majority of those tests passed. So my workflow, even though it was ugly, was churning out data that I was happy with. However, a small minority of those tests failed. And this is small in the grand scheme of things, but these tests that failed saved my ass in a lot of instances.
And I'm going to show you what that looked like. If these tests were not in place, the obvious thing that would happen is we would break the dashboard. This is a flexdashboard, so we would get a Pandoc error if we pushed data that was not compatible with the dashboard. Obviously, we don't want to do this. We don't want to break the dashboard. It looks bad, and it's not serving the users. So the tests caught these instances where we would simply break the dashboard. But I think more importantly, they caught more insidious issues that wouldn't necessarily break the dashboard, but would lead to us showing inaccurate information that could lead to wrong decisions about how to respond to this event.
And so this is an example where I'm showing basically just a couple days of data for chlorophyll a, which is sort of a proxy for water quality. The points are sized by the concentration, so very generally speaking, the bigger the point, the worse the water quality. These small points here are actually instances where the units of those points mismatched what my schema was meant to be. When the data was provided to me, it was in different units, and because the units were different, the data basically shows up as NA values on the dashboard. The location is still shown, but there's no value associated with it. So this provides a false indication of water quality: because the points are small, it looks like water quality is better than it actually is. The tests were able to identify that. If they weren't there, I would have pushed these points to the dashboard and given a false sense of water quality that, of course, we didn't want to give.
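A units check is the kind of test that catches this failure mode before the data reaches the dashboard, instead of letting mismatched records silently turn into NA points. A sketch, with the same illustrative column names and an assumed canonical unit:

```r
library(testthat)

# Guard against unit mismatches: a provider reporting chlorophyll in mg/L
# instead of the expected ug/L would show up as implausibly small values
# and NA points on the dashboard. Column names are illustrative.
test_that("all chlorophyll records use the schema's units", {
  chl <- combined[combined$param == "chla", ]
  expect_true(all(chl$units == "ug/L"))
})
```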
Champion allies and user feedback
So what was kind of interesting when I was preparing this talk, we went through the speaker coaching, and one of the questions I got from one of the other speakers was when you're developing a front-facing dashboard, obviously you want to work with your users, get their feedback on how to, you know, suit their needs, basically meet them where they're at. How did we do this in a time of crisis?
So obviously, you know, the naive developer envisions this ideal scenario where, hey, I made this awesome thing for you. That's great. The user says, I love this, but can you please change this one part? And then you say, yeah, sure, no problem, I'll fix it for you, and everyone's happy. I realized the first keynote had a similar slide. I made this before him, so it was my idea first. But this is an ideal scenario. We know it doesn't work like that. It's more like this, where, you know, step one of whatever, you get some requests that make sense, some that are about incorporating data, some are just not helpful, and others are just downright mean.
So we wanted to avoid this. You know, I was busy developing that synthesis workflow, creating the tests. I didn't have time to filter all of the requests and address the user needs. Oh, and my heart was broken in this example. So what we did, and this is not something that is ideal when you're developing a product, but in a time of crisis, I think this is really important. I had what I'm calling champion allies, right, that served as a buffer between the users and me. And essentially, this was my boss and the assistant director. They were also in the trenches, talking to the media, talking to the public, so they had their finger on the pulse of what was actually happening. But what they could do for me is, you know, while I was in my cave developing the app, they could filter, you know, obviously the nonsense requests. They also, I've worked with them for a long time, so they know what I can accomplish. So a request could have been valid, but given the urgency of this, wasn't worth the amount of time. And they also, most importantly, know the science. So they know, you know, what water quality is all about. They know what type of information needs to be displayed for it, and they can then filter the most appropriate requests down to me. So I was insulated, and I could focus on what I needed to focus the most on and have my champion allies buffer me from all of this extra noise while I was creating these things.
The dashboard and its impact
This is just a short GIF of the dashboard, just to show you what it looked like. A lot of stuff on there. I'm going to show you some of the water quality results. You can click through multiple parameters. Here we're looking again at chlorophyll a. What's cool is you can toggle that and quickly see which sites are out of the normal range for that time of year and location, and you can zoom in and get more detailed information about individual sites. So this is a time series, again showing that context of what is normal. So you can see that things are kind of out of whack.
And we also, if we hadn't had this workflow in place, we wouldn't be able to create infographics like this. And this is so important to my program, where we're kind of in the business of storytelling in a way, where we interact with the public and we want to tell a narrative about environmental health and environmental quality. And had we not had this workflow to support this narrative or this infographic, we wouldn't be able to tell this story. So this is something the public can see and look at and, you know, quickly understand what happened, but it is data supported, and it's supported by data that we were able to synthesize and make sense of using this workflow.
We know it was reaching its end users. So again, back when Twitter was a thing, we were getting some attention. Charlie Justice, he was the chair of our policy board, county commissioner at the time. He was giving us unsolicited advertising for the dashboard. And it was also really cool, the media was picking up on this. We weren't actively reaching out to the media, but because we were doing open science and pushing graphics on GitHub, they were finding them and using them and developing a narrative. So this is a newscast where this is one of my graphics here that I created. It's pretty funny. If you go to that link, you'll see a video of me being interviewed where I really needed a haircut. But again, this is what I love to see happen when you're doing data science. You want people to use this information. To me, this is the best example of how this can be done or how this was done.
Fortunately, you know, this was a couple years ago. Piney Point is currently being closed. Obviously the work I did was not the sole reason why this place is finally being closed down, but I do like to think that some of the work we did did help. It did help the conversation around what it means to manage these facilities responsibly and shed a light on how they can really have negative impacts when they're not managed responsibly.
Key takeaways
So I think this is, you know, pretty straightforward. Again, this was about how we do data science when there's a tremendous sense of urgency. Obviously you can't focus on everything: triage, figure out what's important, prioritize your time. But when you do take these shortcuts, use the guardrails, because they will save you. Like I said, the testing that I did was just instrumental to making sure we didn't break things or push inaccurate information that could lead to wrong decisions. And finally, get those champion allies, the experts that can help you with what you're doing. I'm the only data scientist on the team, but they were so instrumental in helping me do the work that I needed to do when everyone was sort of running around in chaos at the time.
So that's it. Those are my champions. Shout out to them. I want to thank posit::conf for giving me the time to talk about this. I appreciate your time, and I'll take questions if there's time.
Q&A
Thank you so much, Marcus. One question here. Do you have suggestions on how to test for data quality issues that you don't necessarily expect to occur?
That's a very good question, because the things I coded for were things that I expected to occur. And I'll say, you know, I know some of the information that was getting out there was not perfect. A good example: what's common with water quality data is you have replicate measurements or lab blanks that are meant to be QC information, which typically doesn't go on a dashboard. I was pushing some of that information up there. It wasn't wrong information. It was just QC data. And what was cool is, because this was a high-profile dashboard, the data provider actually called me and was like, hey, you know, you're pushing QC data to the dashboard, just FYI. So I don't have a good answer to that. We know it wasn't perfect, but it met our needs at the time.
Great. Could you expand on the testthat capabilities that you used for the dashboard? Maybe briefly?
Yeah. So I'm not a database manager or database engineer. I kind of developed my ad hoc schema of what I wanted the data to look like. So it was in that unified format. And I basically developed the tests around that. And it also looked for really egregious things like negative values for dissolved oxygen. It can't go negative, for example. So there's some obvious cases, but it was also supported by decades of monitoring data that we have for the region that gives context of what is normal, what is abnormal. And we sort of, we try to incorporate some of that knowledge into the testing as well. So it was informed by my experience as well as just what was feasible or physically impossible.
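The two kinds of checks described in this answer, hard physical limits and soft limits drawn from historical monitoring, might be sketched like this. The `baseline` table of historical percentile bounds by parameter and month is a hypothetical stand-in for the decades of regional monitoring data mentioned:

```r
library(testthat)

# Hard physical limits: dissolved oxygen cannot be negative
test_that("dissolved oxygen is non-negative", {
  do_vals <- combined$value[combined$param == "do"]
  expect_true(all(do_vals >= 0))
})

# Soft limits from historical context: flag values outside the 1st-99th
# percentile for that parameter and month (hypothetical baseline table
# with columns param, month, p01, p99)
test_that("values fall within historical context", {
  chk <- merge(combined, baseline, by = c("param", "month"))
  expect_true(all(chk$value >= chk$p01 & chk$value <= chk$p99))
})
```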
Amazing. Together with your champion allies, could you give a quick suggestion on anything you've learned about managing user requests, considering that you've probably got a lot of them if the dashboard was public?
Yeah. When we developed this dashboard, we kind of flipped the process on its head. It was a situation where this information needed to get out there ASAP. So I just pushed something out there. I think it was out there within four days of the release starting. So we had information out there because there was a concerned public, decision makers. So we didn't have the time to incorporate user requests like you normally would or the intentional design process where you kind of meet their needs and where they're at.
So I'll say we kind of just went at it in sort of an ad hoc way and did the best we could. And I think the proof of concept that it was being picked up by the media, it was something I tweeted about. I had analytics running on it too. I know people were using it. And I would go to talks and see screen grabs of the dashboard. So I had a good sense that it did what we wanted it to do. We just had to be really intentional with the types of requests we were getting from users.
