Resources

Nick Pylypiw @ Cape Fear Collective | Building up trust with stakeholders | Data Science Hangout

We were joined by Nick Pylypiw, Chief Data Officer at Cape Fear Collective. Nick shared that one thing he’s been successful at in his career is building up trust with stakeholders. How do you build that trust?

First off, people like data and analytics if it confirms their preconceived biases and beliefs.

A lot of people are fans of analytics and data science if it's confirming what they already believe. One of the things working in pricing and marketing analytics at Lowe's, is if I went into a room with a bunch of people who are marketing lumber and I said, “Hey, the promotion that you did last week did really well.” They may think, yeah, we really like this guy. If I said, “Hey, but it missed these markets and these markets” …if it counters their beliefs then that's where they start poking holes and wanting to find fault in it.

Find some quick wins.

What are some small projects we can work on together that are low stakes projects that get people to buy into it and see, “Ok, now I can see why there's value in this way of thinking.” When that inevitable cognitive dissonance comes, then you've built up some trust and some capital with that partner.

There has to be a boots on the ground connection.

For the same reasons we talked about building trust and capital in a Fortune 500 company, if you think about the state in that same way, if you're from the beach town and you try to go up into the mountains and tell them what they don't know about their public health crises, you're going to get run out of there. You still have to build these connections organically. The same way we talk about building capital in a big company - especially with this line of work, you'd be surprised at how much buy-in you can get if you show up and you actually know how to pronounce the town name and the county name correctly. Do you go into Beaufort, North Carolina and you say “Buford”. That's South Carolina by the way. Spelled exactly the same, Beaufort in South Carolina and Beaufort in North Carolina. If you pronounce it the other way, you've lost them before even the first slides hit the screen because it's just some outsider coming in here who knows nothing about us.

Learn about their area (of work)

One of the things that our data science team has done is when you're learning about a new area, one of the first tasks that we start doing is putting together a deck of the five worst areas of poverty in this area. I want to know the five areas where there's great employment parity or great unemployment rates, all these different things. Through that exploration, you understand, oh, there's this town here and there's this county and you learn about that area. So now when you are in a conversation, you can intelligently talk about that region. It doesn't sound like you're just some person looking at a chart and saying, you need to do this and this.

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu
Follow Us Here:
Website: https://www.posit.co
LinkedIn: https://www.linkedin.com/company/posit-software
Twitter: https://twitter.com/posit_pbc
To join future data science hangouts, add to your calendar here: pos.it/dsh (All are welcome! We'd love to see you!)

Mar 23, 2023
1h 0min

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Welcome to the Data Science Hangout. Hope everybody's having a great week. If we haven't had the chance to meet at a hangout before, I'm Rachel. It's so nice to meet you. Let us know if it's your first time joining if you want to say hello in the chat so we can all welcome you in as well. This is our open space to chat about data science leadership, questions you're facing and getting to hear about what's going on in the world of data across different industries. And so if you haven't been here before, every week we feature a different data science leader as my co-host here to help lead our discussion and answer questions from you all.

So we are all dedicated to making this a welcoming environment for everybody and love to hear from everyone no matter your level of experience or background or area of work. It is also absolutely okay to just listen in if you want to. But there's also three ways you can ask questions or provide your perspective too. So one, you can always jump in by raising your hand on zoom and I'll be watching out for that. Two, you can put questions into the zoom chat and feel free to put a little star next to it if you want me to read it. And I can call on you to introduce yourself and add some context otherwise. And then third, we have a Slido link where you can ask questions anonymously.

And so we do share the recordings of each session to the Posit YouTube. So you can always go back and rewatch or share with a friend. But with all of that, welcome, everybody. Thank you so much for joining us. And thank you so much, Nick, for joining us as the co-host today. Nick Pylypiw is Chief Data Officer at Cape Fear Collective. And Nick, I'd love to have you introduce yourself and maybe start off by sharing a little bit about your role and also something you like to do outside of work.

Sure. So my name is Nick Pylypiw. I'm the Chief Data Officer at Cape Fear Collective. We are not the Cape Fear movie. People are always like, is it Cape Fear like the movie? It is based on the Cape Fear region, the kind of southeastern North Carolina region, kind of Wilmington, if you're familiar with that area. We are a nonprofit based out of Wilmington that's really focused on the state of North Carolina now. And we're really looking to use data to move the needle on these kind of social impact issues, social determinants of health.

My background is more in private sector data science consulting. So I worked for a couple of consultants and was doing work with Southwest Airlines and Procter & Gamble and Lowe's and Baltimore Ravens and Cincinnati Reds and a bunch of different organizations and really looked at this as an opportunity to kind of make an impact in the state that I'm from and that I love. Some things that I do outside data is just kind of building stuff in my yard. I'm building a porch now, a screened in porch in the back of my house. I'm into music and surfing and just kind of hanging out here. And I was still North Carolina.

Journey into data science

Well, thank you so much, Nick, for joining us. And it's really impressive to hear your journey and all the different types of organizations that you've worked in. I know from your bio, you were once a teacher as well. And I'm curious to kind of hear a little bit about that journey into data science.

Yeah, so I did get my undergraduate. I got an undergraduate in math and math education. I was a dual major and taught high school math for a couple of years, geometry and AP statistics, and found myself really every day in class with these AP statistics kids kind of talking to them about here's why statistics is so cool and here's all the awesome things you can do with this and the applications and realized, you know, of course, they're like, yeah, all right, whatever, Mr. P. And really realized that I was talking myself into going back to grad school. So came home one day and told my office, I think I want to go back to school and get my master's in data science. So went back, left the classroom, went back to the NC State Institute for Advanced Analytics and got my master's degree there and then kind of went into the data science world.

And I haven't really looked back since. I still miss teaching sometimes, but I get to do a lot of teaching in my leadership roles in my organizations as well. So I've got a lot of junior devs. I've got six data scientists on my team at Cape Fear Collective. I've had a number of interns and volunteers over the years, so I still get to kind of scratch that itch for teaching a little bit just outside the classroom.

Teaching and mentoring junior data scientists

I was just going to ask you, what does that teaching look like now within your organization, like helping new users learn?

Yeah, code review, just kind of scaffolding, you know, a lot of the things. I like to say that I can teach anybody to code. What I can't teach people to do is be curious and kind of think about the kinds of things they should be asking on a project. So a lot of times on a project, especially with a junior dev, I'm showing them my thought process. I'm thinking out loud and kind of like, you know, a lot of times, you know, they'll say, hey, I've got this error and I can't figure this out. It's like, all right, let's screen share. It's like, all right, first of all, what's the error say? Well, I didn't look at it yet. All right, well, that's the first thing to do is look at the error, right?

Like, because these things sometimes tell you exactly what's wrong. Sometimes they don't. And some of the errors are less intuitive, especially, you know, depending on programming language. But we are an R shop. So we do a lot of kind of R code review. And a lot of it's just, you know, what are you expecting it to do? Is this what you expected to happen when you wrote this line of code? It's like, well, no, you know, so I think a lot of it is just kind of that modeling my thought process out loud for them to kind of see, here's how someone who's been doing this for a while thinks about these things and approaches these solutions.

Building trust with stakeholders

But Nick, something I wanted to ask you is, I know you've worked at a few different Fortune 500 companies, and you said you've helped them transform the way they think about their customer strategy. And I would love to learn a little bit about how you did that. I think a lot of people here in the audience might be in a position where they're trying to like get their company to be more data driven and trying to push new tools forward.

So as I'm sure you all have had someone else tell you, or maybe you haven't, but you know, a lot of people are fans of analytics and data science, if it's confirming what they already believe. And one of the things, you know, working in pricing and marketing analytics at Lowe's, if I went into a room with a bunch of people who are marketing lumber, and I said, hey, the promotion that you did last week did really well. They're all like, yeah, all right, we really like this. We really like this guy, right? And if I was like, hey, you know, but it missed these markets and these markets, then there is a reason why. Like, no, you didn't analyze it right. You didn't do this and you didn't do that. So, you know, people like data and analytics if it confirms their preconceived biases and beliefs. But if it counters that, then that's where they start poking holes and wanting to, you know, kind of find fault in it.

So one thing that I've been very successful in my career is kind of starting to build that trust. Find some quick wins. What are some small projects we can work on together that are low stakes projects that kind of gets people to buy into and say like, okay, now I can see why there's value in this way of thinking. So that when that inevitable, yeah, exactly. Don't start with forecasting. That's good advice. When that inevitable cognitive dissonance comes, then you've built up some trust and some capital with that partner.

Corporate vs. nonprofit data science

Patrick, I see you have your hand raised. Do you want to jump in?

Thanks, Rachel. And yeah, good morning, everybody. Happy to chat with you again. Nick, you and I spoke a couple of years ago. You may not remember, but you were helpful on a couple of things for me. And so I appreciate that and glad to see you again. I'm wondering about the comparison and contrast, you know, similarities as well, of working in those big corporate settings versus where you are now in nonprofit and, you know, social services and that kind of thing. Personally, I've been in academic and nonprofit settings my whole career. And so it seems appealing to me that, you know, people really have their things together in a corporate setting and everything is well-funded and runs smoothly. And I assume you can dispel that myth for me, but I'm just kind of wondering about your thoughts on that.

Yeah, absolutely. And I think you nailed part of it and Marlene nailed a part of it here too, because there's certainly this, it is a myth that these large companies really have everything all put together because the reality is I saw some things in every, all these Fortune 500 companies I've worked in, I saw things that would make all of you cringe. And I'm probably not going to say any of those because I don't want to get myself in trouble, but things that it's just like, I can't believe you run your stack that way. That is not the way a $80 billion company should be running that. This file should not reside on one person's laptop, right? That should be in a shared platform locked down, et cetera.

One of the biggest differences though, for me, is in large companies, people can hide, you know, like there's kind of room. Your errors and your kind of weak links on your teams don't kind of expose themselves as readily as on small teams. On some of the small teams I've been on, my previous company was 35 people and currently at Cape Fear Collective, we're a team of 12, and there's no room for somebody to just kind of not be pulling their weight.

Another big difference is I think a lot of the Fortune 500 companies, whether or not they're actually implementing it is another thing, but they at least know that they should be. They've heard enough times through all the articles and through the conferences and through their leadership, like you need to be doing things, data, KPIs, measurements, dashboards, et cetera. They know they need to be working on it. There's some pressure to kind of do that. Whereas in the social space, a lot of the stuff's very new and it's a dramatic cultural shift. A lot of the nonprofits that we work with are used to being able to just, and I'm going to kind of say this kind of tough, you know, for a minute, they're used to kind of just patting themselves on the back and saying like we did good work and that's good enough. And the thing is like for the funders and the philanthropic organizations and grants that are funding them, it's not good enough anymore. You have to be able to show us that you're moving a needle, which means you have to set some KPIs, measure them, collect data, and show that you're making impact against that or else you're not going to qualify for that next round of funding. We can't just do feel good funding anymore.

Getting your first data science role

I see a few anonymous questions coming in on Slido and one was, can you talk a little bit about getting your first role in data science? Any tips for jumping from data analytics to data science?

Sure. And this whole analyst versus data science thing is such a weird, ambiguous thing anyways. If you ask, you know, there's people that I know who call themselves analysts or titles analysts and they're programming in Python and R every day. And it's like, that's a data scientist to me. And then there's other people who are like, well, I'm a, I'm an analyst. And it's like, well, are you a business analyst or are you a, you know, like there's, so titles kind of all get kind of, get kind of wonky anyways.

But for me, you know, getting my first role, I really wanted to be somewhere where I had other folks who could challenge me that I could learn from. I think a mistake that I see in a lot of young devs and data scientists is they go right to a startup or right to a small company. There's nothing wrong with either of those. I've done both, but you're on an island. You don't have the support that you need. But a lot of these larger companies have support structures in place. There's going to be senior data scientists, senior analysts who can help you and know the data sets. So find one of those large companies with a solid structure, latch onto a couple senior devs and be a sponge, right? Just absorb everything that you can.

And then the other thing that I can provide, that I always tell people who are trying to break into the space, is nobody likes somebody who pretends like they know everything, because none of us do, right? Like there's power in saying, I don't know. And if you apply for a job on my team and say that you know how to do natural language processing and modeling, and I put you on a project that requires that skill set, and it's obvious that you don't, I'll know. You're only fooling somebody until it's time for you to use that skill. So be honest about what you know and how good you are at it. If you say you're a 10 in R, I don't even consider myself a 10 in R. So you better be really good if you say that.

And Marlene, that point about like, every time I think I'm an expert, I think there's a graph that shows that kind of like, the more you learn about something, and then you hit a point where you're just like, like, there've been times in my career where I'm like, I'm an expert Excel user. And then I'll meet somebody that I work with. I'm like, wow, that I, I'm like a two. And then I'll be like, okay, I've learned now I'm like a seven or eight, I'll meet someone else. I'm like, four, you know, so definitely, there's always someone out there who's better at it than you.

Another anonymous question was any advice for someone working for a large corporation that's very resistant to new tools to progress in the data science world?

Yeah. Brush up your resume? No. I think it's a, it's a hard thing moving. Someone else made the point earlier that, you know, big ships turn really slowly or something like that, you know, and it's, it's true. And yeah, I think other than just kind of keep, keep beating the drum and keep trying to prove value. Unfortunately, a lot of these, these companies just sometimes, it's going to take way more than you to get it. So what I do, or what I recommend to people is when you're interviewing for that job, make sure the questions that you ask are kind of hinting at what that data culture is. One of the questions that I think is a really good question to ask is what is a model or a process that your team has built that is currently being used for the company to make decisions on?

And you'll be really surprised to hear the wide variety of answers that question gets, because it's really obvious to hear someone's kind of like, like fumbling, like, oh, we built this thing, but it never got implemented because of whatever, or we've been working on this thing and for a couple of years and it's never quite gotten the traction that it needs. And there's going to be other teams that are like, oh, we've got this, this thing that's helping with our, you know, search engine optimization. And we've got this other model that's out there that helps determine, you know, where we build new locations. And what you hear there is this is an organization that is putting their money where their mouth is. They're not only hiring people to build these models, but they're making business decisions based off that.

The community data platform

Just to give us a little bit more context to you on some of the work that your team does today, I thought it might be helpful to hear a little bit about the community data platform or maybe like an example of a problem your team is solving.

Sure. So one of the problems that we hear a lot, or one of the kind of issues that we hear a lot is there's, there's not data for that. You know, you have, so our organization's really working to build data capacity across these nonprofits. There's 1200 nonprofits in New Hanover County and there's no shortage of passion there. They all think they're doing good work. And they are, but there's no real data capacity to understand, you know, what's the value in generating a hypothesis and measuring data, collecting data.

So what we keep, what we always hear is, let's say, well, hey, you've been doing this thing for 30 years. How's it going? How's it working? And they're like, well, you know, people really like our food boxes or whatever. It's like, okay, that's cool. Is it actually like moving the needle on food insecurity or, or, or hunger or health issues in your community? They're like, well, there's not really data for that. I'm like, let's see about that.

So one of the first things we did is we pulled in Census Bureau data, Bureau of Labor Statistics, CDC, FHFA, and all these publicly available data sources that for us as data scientists are really easy to grab because we know how to use an API to go do this. We know how to scrape a website. We know how to search Google specific terms that are going to lead us more successful results. But for the average nonprofit community leader, they don't know how to use an API. They don't use Python. They don't know any of this stuff, right? So for the first time ever in our community, we really have this holistic 360 degree view of what a community looks like.

And the way we're using that is to do things like you've got a nonprofit who wants to do something. We're going to find a metric that works for you and help you build a mission around that. We are going to decrease children in concentrated poverty in this neighborhood by 5% over the next 10 years. And we need $200,000 to do that or whatever, right? Like that is a statement that is impossible to make without understanding how many people currently live in that situation, what the current projection is for that neighborhood, and kind of like what a reasonable decrease is for that metric.
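As a rough illustration of why a statement like that depends on knowing the baseline, the arithmetic behind such a target can be sketched as follows. The 5%, 10 years, and $200,000 come from the example above; the baseline of 1,200 children is a made-up placeholder, not a figure from the conversation, and the function name is hypothetical.

```python
def reduction_target(baseline_count, pct_decrease, years, budget):
    """Translate a percentage goal into concrete counts and costs.

    baseline_count is the number of people currently in the situation;
    without it, none of the other figures can be computed.
    """
    total_reduction = baseline_count * pct_decrease / 100
    return {
        "target_count": baseline_count - total_reduction,
        "people_reached": total_reduction,
        "per_year": total_reduction / years,
        "cost_per_person": budget / total_reduction,
    }


# Hypothetical baseline: 1,200 children in concentrated poverty today.
plan = reduction_target(baseline_count=1200, pct_decrease=5, years=10, budget=200_000)
for name, value in plan.items():
    print(f"{name}: {value:,.2f}")
```

Changing the placeholder baseline changes every downstream number, which is the point Nick is making: the mission statement is only as credible as the measurement behind it.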

So a lot of our work is really kind of around that, but there's some other larger projects. One that I'll talk about is this awesome video that we just put out. It kind of walks through this, but the Michael Jordan Foundation just put these two clinics in Wilmington. Michael Jordan's from Wilmington, and he wanted to put them in areas where they're going to make the most impact for low-income, marginalized populations. The hospital system said, that's great. We know a lot about chronic disease and we know a lot about health conditions. We know very little about the kind of social determinants of health that you've been studying. So help us figure out from a data perspective, specifically where these clinics are going to make the most impact. So that's kind of our flagship project and one that we're really proud of.

Measuring outcomes in the nonprofit sector

Morris, I see you had asked a question in the chat a bit earlier. Do you want to jump in next?

Yeah, thanks. No, I think you just answered it. I'm working in a nonprofit and I'm a social worker. Transformation is always really the big question. How do we measure that? That's what we want. We're pretty good at tracking outputs, but the outcomes, it's a little bit more difficult. But these are helpful. Thank you.

Yeah, and that's actually a question that we struggle with a lot too. So I'm glad you brought that up because a lot of times it's not fair to say, hey, food bank, it's on you, Food Bank of New Hanover County, to turn this area into a non-food desert by yourself, right? And not only that, but the lag in data is such that the USDA only classifies these food deserts every five years. So it might be five, 10 years before we even know if you've moved the needle on this kind of macro metric. So what we try to do is we try to tie these kinds of micro data metrics to those leading indicators. So we might not know the Census Bureau data for another two years because of the lag, but what we do have is this real-time emergency department data or real-time police data that we have for our region. And if we can find the correlation between those things, then what we can do is we can say, we can see a signal here at a right now level that we know is tied to this kind of macro level. And then we can kind of hypothesize that you're moving the needle at this greater level that you haven't been able to measure before.
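One simple way to check whether a real-time local signal tracks a lagged macro metric is a correlation over the periods where both are known. The sketch below is only an illustration of that idea, not the team's actual method: the numbers are invented, the variable names are hypothetical, and a Pearson correlation is just one of several measures that could be used.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


# Invented historical values for the same five periods.
ed_visits = [120, 135, 150, 160, 180]            # real-time local indicator
macro_rate = [14.0, 14.8, 15.9, 16.1, 17.5]      # lagged macro metric (percent)

r = pearson(ed_visits, macro_rate)
print(f"correlation: {r:.3f}")
```

A strong historical correlation only supports the hypothesis that movement in the local signal reflects movement in the macro metric; it does not prove that a given program caused the change.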

Another question that was asked anonymously a bit earlier was, how do nonprofits think about the substantial costs of a data team? Are these getting more common with the trend of needing to prove moving the needle?

Yeah, I mean, the reality is it's unattainable for the average nonprofit to have a data scientist on staff. But that's okay because that's one of the things that we're hoping to do because part of our business model is we don't need athletic charity missions of Wilmington, North Carolina. It doesn't need a full-time data scientist. All they really need is 40 hours with somebody on my team to kind of help kickstart the way they think about data. So, instead of saying, hey, every nonprofit in North Carolina needs to go hire a data scientist, it's how do we kind of become that knowledge hub that kind of through contracts, through pro bono contracts. Over the last three years, we've done $2.8 million in pro bono data science consulting. So, what that looks like is somebody on my team sitting right next to a nonprofit looking at their Salesforce or looking at their Excel or downloading their data and saying, hey, did you know that 15% of your clients are from this neighborhood, but that actually the majority of the need in our county is over here? And they're like, hmm, never really thought about that.
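The kind of comparison described here, where clients come from versus where the need is, can be sketched in a few lines. Everything in this example is hypothetical: the neighborhood names and counts are placeholders, and in practice the client counts would come from the nonprofit's own records and the need estimates from public data.

```python
def shares(counts):
    """Convert raw counts per area into each area's share of the total."""
    total = sum(counts.values())
    return {area: count / total for area, count in counts.items()}


def service_gaps(clients_by_area, need_by_area):
    """Share of need minus share of clients; positive means underserved."""
    client_share = shares(clients_by_area)
    need_share = shares(need_by_area)
    return {
        area: need_share[area] - client_share.get(area, 0.0)
        for area in need_share
    }


# Placeholder counts, for illustration only.
clients_by_area = {"Northside": 15, "Downtown": 60, "Southside": 25}
need_by_area = {"Northside": 550, "Downtown": 250, "Southside": 200}

gaps = service_gaps(clients_by_area, need_by_area)
most_underserved = max(gaps, key=gaps.get)
print(most_underserved, round(gaps[most_underserved], 2))
```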

Working with multiple data sets

Yeah. I mean, if I wish I had a better answer for it, but the reality is I just started grabbing stuff. You know, I like to say I've never met a data set in this space that I didn't like. And that's still true, even though we've met some garbage data sets, we still bring it all in because we think that the kind of the triangulation and the whole combination of all these different data sets really adds to the story.

And I'll give you a kind of an example of this. We're talking to a hospital in New Hanover County, and we had a hospital who was looking at doing some outreach for COVID testing. And they're looking at kind of where do we need to put the Spanish flyers, right? We've got these English flyers, we've got TV ads, we've got signs, where do we need to put flyers that are written in Spanish? And, you know, we told them, well, you need to put them here. And they're like, well, based on the Census Bureau data, only like, you know, I'm making the numbers up because I can't remember off the top of my head, but only like 5% of the people there are Hispanic or English as a second language or whatever.

And it's like, okay, that might be true. But for reasons that we don't have to discuss in this call, a lot of times non-English speakers are de-incentivized to fill out the Census Bureau. And actually there's a fear factor there and everything else. So the Census Bureau historically undercounts them, number one. And then number two, based on your data, you have a large portion of Spanish speaking patients who live in this area. So there's a mismatch between the Census Bureau data and the hospital data. Who's right, who's wrong? Don't know, but it's like, if you only had one of those data sets, you might make a decision that's not actually the right one. So by having them both, you have a more complete version of the story.
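A minimal sketch of that triangulation idea: put the two sources side by side and flag the areas where they disagree by more than some threshold, so a person can look closer before deciding. The tract labels, shares, and the 10-point threshold are all invented for illustration.

```python
def flag_disagreements(source_a, source_b, threshold=0.10):
    """Return areas where two sources differ by at least `threshold`.

    Values are shares between 0 and 1. The result maps each flagged area
    to (source_b - source_a), so the sign shows which source is higher.
    """
    flagged = {}
    for area in sorted(source_a.keys() & source_b.keys()):
        diff = source_b[area] - source_a[area]
        if abs(diff) >= threshold:
            flagged[area] = round(diff, 3)
    return flagged


# Invented shares of Spanish-speaking residents or patients per area.
census_share = {"Tract A": 0.05, "Tract B": 0.12, "Tract C": 0.30}
hospital_share = {"Tract A": 0.28, "Tract B": 0.14, "Tract C": 0.27}

flagged = flag_disagreements(census_share, hospital_share)
print(flagged)
```

The flag does not say which source is right. It only marks the places where relying on a single data set could lead to a different decision.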

Regional focus and building community trust

Yeah, it was really natural for us. We had some really early success in the first year where people in New Hanover County and that Cape Fear region were like, we love this, we love it. We started seeing like, all of a sudden people in Raleigh are asking us and people in Charlotte are asking us. And we're like, are we going to go statewide? Are we going to go regional or are we going to be like Eastern Seaboard? When does this stop? And it kind of auto corrected itself because we kind of got to the mountains of North Carolina and people up there all of a sudden were just like, I don't care where you're from in Eastern North Carolina. You don't know anything about us up here, right? So there still has to be this kind of boots on the ground connection to the region or else for the same reasons we talked about building trust and capital in a Fortune 500 company.

If you think about the state in that same way, if you're from the beach town and you try to go up into the mountains and tell them what they don't know about their public health crises, you're going to get run out of there with a stick, right? So I mean, you still have to build these connections organically. So that's kind of limited our business plan in a way that I think makes sense, not in a bad way like, oh, we can't do that. Now the other thing is we've made it very clear that we want to work kind of by, with, and through and not around. So areas that already have, if there's an area in North Carolina that has an organization that does what we do, we definitely don't want to go in there and say like we're going to replace you, right? It's how do we work together? How do we start sharing some tools on some of this work?

Because we understand that you're here and there's plenty of work for all of us anyways. So we don't really have our sights set on any sort of, there's so much work in Eastern North Carolina. Here's a stat for you. If you carved North Carolina in half down highway I-95 and you just looked at the East part of it and made that a state, it would be the poorest state in the country. So no Raleigh, no Charlotte, no Winston-Salem, no Greensboro. If you just looked at the East part of the state, it would be the poorest state in the country. So, I mean, that's the level of challenge that we're working with.

Tech stack and infrastructure

Yeah, so we're using, like I said, we are an R shop. So we're using R Shiny and Shiny Apps. The community data platform that we built is a MySQL-based product that is in AWS. So we're using S3. So a lot of the stuff that we're using is cloud-based. And we just kind of made that decision early on because it was just so much easier to scale it. And cloud storage is so cheap now that it just doesn't really make any sense not to. So definitely excited to see about kind of the changes that come with Posit. I'm not just saying that because this is a Posit call. I'm an R guy. I love R. I've used Python. I've used SAS. I actually started my career in SAS and quickly switched to something else. But yeah, that's what we're using.

Skills beyond the technical

Hi, Nick. Hi, everyone. Love your talk so far. I'm always curious. You seem very articulate. And obviously, you know your stuff with analytics. But I'm always curious, is there kind of a skill set or talent that you think most people wouldn't associate with analysts? Kind of like a preconceived notion about we're just nerds that sit in cubicles and don't know how to interact with people, right? What's a talent or skill set that you think people should probably know, like it's probably handy to have, that maybe it'd serve you well, that we wouldn't think of as analysts and scientists?

Yeah. You know, I think it's going to sound cliche, but just like listening and being able to have a conversation. Like I can't hire robots. There's no position on my team for robots. And these days of kind of like we keep the nerds over there. We kind of like give them their assignment under the door and don't touch them, leave them alone, let them do their thing. Like those days are over. Like there's no space for somebody who does that in most companies.

In general, like listening and being able to go back and take feedback and not be offended. Because a lot of times, you know, there's this, I was actually just laughing at this meme I saw on LinkedIn, where it was like, you know, how people see a data scientist. It's a bunch of people kind of like sneering over at this alien who's typing in a foreign language. And it was like, how data scientists see everyone else. And it's like, they're all like cavemen, like, and they're all kind of, you know, waving these clubs around. And there's some truth in that. And it's awful, but it's like, there's been so many times where I've got out of a meeting. And somebody said, like, oh, well, they don't really know what they want. I'm going to build them something anyways. It's like, okay, that's a route that you can go.

That's a possibility. But like, where does that get you? That gets you, now you spent a whole sprint building something that, you know, they just said they don't want. And now you're going to be frustrated and, you know, double down and say now they really don't know what they want. But it's really just because you didn't listen. So taking those extra moments to kind of really listen to what they're saying. And there's a difference between, like, listening to the question they're really asking and, like, finding the question you want them to be asking in what they're saying, right? And that's a subtle distinction that I think needs to be made.

Yeah, I definitely had to coach some people on that. You know, I think a lot of times it's just, I definitely have the management style of just kind of modeling and scaffolding that behavior. Like, it's, I feel like it's easier for people to learn it if they just kind of see you interacting with clients in a respectful way and kind of listening to them and asking those probing questions. Because you can kind of just say like, hey, make sure you're listening to the client and make sure you're kind of really asking those probing questions. But like, what does that look like, you know?

Building organic regional relationships

Sure. Hey, Nick, thanks for doing this. So when you were talking about the difference between like, you know, East Carolina and the coastal Carolinas and like, how do you organically build those relationships? Because I feel like maybe there would be, maybe not resentment, but some sort of like, kind of like standoffishness of like, okay, this is a coastal guy. He doesn't get it. He doesn't know what's going on. So how do you build that trust in that relationship?

Yeah, and it's even worse when they find out I live in Raleigh, because I actually grew up in Eastern North Carolina on the coast, but now I'm in Raleigh in the capital. So then I'm talking to some of these small towns and it's like, oh, just some guy from the capital, from the city telling us how to do stuff. If we ignore him long enough, he'll go away and start bugging somebody else, right?

The same way that we talked about, you know, building capital in a big company, you know, especially with this line of work, you'd be surprised at how much buy-in you can get if you show up and you actually know how to pronounce the town name and the county name correctly. If you go into, you know, Beaufort, North Carolina, and you say "Buford," that's South Carolina, by the way. Spelled exactly the same, but it's "BYOO-fert" in South Carolina, it's "BOH-fert" in North Carolina. If you go into Beaufort and call it "Buford," you've lost them before even the first slides hit the screen, right? Because it's just some outsider coming in here who knows nothing about us.

So just being able to go in there and say, and one of the things that our data science team has done is kind of, you know, when you're learning about a new area, one of the first tasks that we started doing is put together a deck of just kind of, I want to know the five worst areas of poverty in this area. I want to know the five areas where there's employment, you know, great employment parity or great employment rates, all these different things. And through that exploration, you kind of be like, oh, you kind of understand, oh, there's this town here, and there's this other town here, and this is this county, and you kind of learn about that area. So now when you are in a conversation, you can kind of intelligently talk about that region. It doesn't sound like you're just some person looking at a chart and saying, like, you need to do this and this. So that's really helped us to really just learn about the areas that we're working in.

Collaborating with public health agencies and universities

Coulter, do you want to jump in next? Yeah, sure. Thanks. Just a quick background. I actually work on a data science team in my local public health agency, and we quite often help out nonprofits. We get tons of requests, hey, we need some data, but, you know, we legally can't get it. Can you analyze this? We want some spatial analysis. But I was just thinking how cool it would be if we had a local nonprofit with the skill that your organization has. So I was wondering if you've ever had a chance to collab with local public health agencies or academic institutions and kind of what that relationship was like.

Yeah, and I'm really encouraged to hear that some of the local governments are starting to get that skill set in-house because I don't think that's very common. So that's awesome to hear that. We work closely with a lot of county governments, city governments, universities, definitely UNCW, NC State, UNC, some folks at Duke and UNC Charlotte and some others. But to be honest, at first, especially the county, the local governments are always all in. They're like, yes, please help us, help us figure this out. There's been some kind of resistance at first from some of the universities, who are just kind of like, who's this plucky little nonprofit who's coming in here and acting like they can do this data science work? And then when they kind of see like, oh, actually, they've got some people on staff who have some credentials and some experience and know what they're doing, then their tune kind of changes a little bit.

But, you know, a lot of times, you know, one project we worked on was, there's a county in North Carolina that should be in the MSA that Wilmington's in, but because there's a river between the two, there's a bridge, it's actually in the Myrtle Beach MSA. So like we did some work with the Brunswick County government because their thing was, look, 60 percent of our people work in that county right there, you know, but we're in this other MSA in South Carolina who doesn't care about us. We don't get any support from the funding there. So we're kind of in this weird, ambiguous limbo space where like, you know, how can you help us make the case to the Census Bureau that we should be in this MSA? So we compiled all this data and basically found all these commuting patterns and, you know, residency patterns and everything else that basically showed, you know, this belongs in this other county. TBD, whether the Census Bureau is actually going to listen to us because we haven't got that result yet back, but that's kind of an example of one of those projects that we did.
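For readers who want to picture the commuting-pattern piece of that analysis, here is a minimal sketch in Python. It is not the team's actual code (they work in R), and the county labels and worker counts are made-up placeholders rather than real figures. It only shows the basic idea: take origin-destination counts of where residents live and work, then compute what share of a home county's workers commute to each destination.

```python
from collections import defaultdict

# Hypothetical origin-destination counts: (home county, work county) -> workers.
# Labels and numbers are illustrative placeholders, not real data.
flows = {
    ("home", "neighbor_a"): 600,
    ("home", "home"): 300,
    ("home", "neighbor_b"): 100,
    ("neighbor_a", "neighbor_a"): 900,
}


def commute_shares(flows, home):
    """Return the share of `home` residents working in each destination county."""
    totals = defaultdict(int)
    for (origin, dest), workers in flows.items():
        if origin == home:
            totals[dest] += workers
    all_workers = sum(totals.values())
    if all_workers == 0:
        return {}
    return {dest: count / all_workers for dest, count in totals.items()}


shares = commute_shares(flows, "home")
top_destination = max(shares, key=shares.get)
```

With shares like these in hand, the argument becomes a simple comparison: if most of a county's workers commute into one neighboring area, that is evidence about which metro area it is economically tied to.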

Interacting with clients and setting expectations

Thanks for the opportunity. You know, I think I'm getting the sense that I may be in the minority and that data science is not my primary work. And so I am trying to sort of know enough to be dangerous about data science just to interact effectively with you guys. And so I'm curious when you sit down with a client, like a university, how, like what expectations do you set or what deliverable do you talk about?

Yeah, a lot of times it's just knowledge sharing and data sharing at first. It's, hey, we'd love to work with your organization because a lot of times universities are the path to opening up greater data access. There's financial institutions, for example, who are way more willing to give UNC Chapel Hill a big pile of people's credit card transactions than they are my team of nonprofit data scientists, six folks sitting on the coast of North Carolina, right? So a lot of times that conversation is, we've got a gap to fill: projects that are maybe too small for you, because a lot of these universities don't also want to sit next to, you know, Cause for Paws and Canines for Veterans and help them, not because they don't want to, but because they have bigger projects they're working on, Blue Cross Blue Shield and other large projects. So we're filling a need that you can't fill, but we need resources that you have.

So we also have a couple of projects where we'll enlist kind of the research arm of these facilities because love universities, love research, love academia. Things move very slowly in the university world and too slowly for a lot of the organizations that we're working with. So yeah, that's kind of where we see that lane is. When it comes down to doing a longitudinal study, we'll kick it over to you, UNCW, but right now we just need to kind of do some basic analysis.

So a lot of times, you know, all these counties have to fill out these community health assessments every year or every three years, I believe it is. They're mandated by law to do these community health assessments. The universities are totally capable of doing them, but they kind of bid for them. And what you end up with is like UNC's done these and NC State's done these and ECU's done these. And now it's really hard for these counties to compare how they're doing against these other counties, because this other county has a completely different format, completely different metrics they focused on. So like one of the things we've pushed on is how can we help you align on a common set of metrics so that when you're doing it, or you're doing it, or you're doing it at Duke or whoever, we all are at least looking at, you're still going to have room for your own flavor and your own research and academics, but can we all at least agree that like these 10 metrics will be in every single report, measured the same way, talked about the same way, so that it allows us some level of knowledge sharing and comparison. Like that's something that we can, a recommendation we can make as an outside entity that none of them are going to make to each other.

Tidy Tuesday and open data

Yeah, so as a user, I've done Tidy Tuesday a couple times. I think I did one on slave migration, one on tennis stars, maybe one on a couple other random things. I think it's just such a cool project to just have everyone rally around a data set and kind of code for a couple hours on a Tuesday. I'm trying to think about how to wrangle our data platform into something that would be not too overwhelming for that because we've got this big SQL database that has, I think it's got 170, 180 tables in it. Obviously, we don't want to just kind of say like, here, play around in this big database, because people will just get lost. So can we create like a CSV where we've kind of joined three of these tables together and say like, all right, you've got food insecurity, social vulnerability index from the CDC, and poverty at a census tract level. Go. What can you do, right? So yeah, we're definitely going to put a submission in. It's just kind of how we provide something that's like the right scale for an activity of that size.
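To make the "join three tables into one CSV" idea concrete, here is a minimal sketch in Python using only the standard library. It is an assumption-laden illustration, not the platform's real schema: the table names, column names, tract IDs, and values are all hypothetical, and the actual platform is a MySQL database queried from R. The sketch keeps only the tracts present in all three tables (an inner join) and writes one row per tract.

```python
import csv
import io

# Hypothetical tract-level extracts keyed by census tract ID.
# IDs and values are made up for illustration only.
food_insecurity = {"37129010100": 0.18, "37129010200": 0.12}
social_vulnerability = {"37129010100": 0.74, "37129010200": 0.41}
poverty = {"37129010100": 0.22, "37129010300": 0.09}


def join_on_tract(**tables):
    """Inner-join dicts keyed by tract ID into a list of row dicts."""
    shared = set.intersection(*(set(table) for table in tables.values()))
    return [
        {"tract": tract, **{name: table[tract] for name, table in tables.items()}}
        for tract in sorted(shared)
    ]


def to_csv(rows):
    """Serialize row dicts to CSV text with a header row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = join_on_tract(
    food_insecurity=food_insecurity,
    svi=social_vulnerability,
    poverty_rate=poverty,
)
csv_text = to_csv(rows)
```

An inner join is used here so that every row in the shared file is complete. A real submission might prefer to keep tracts with missing values and document the gaps instead.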

Teaching as a way of knowing

Yeah, I mean, if I'm being honest, it wasn't really a choice. I think people who kind of go into teaching have a certain personality where it's just, I was going to be teaching wherever I was going to be at anyways. And also, I think I learned that way. Like some people learn best through teaching. And when I have to explain something three different ways, it just cements my understanding of it that much better in my head. And so I think that's the thing that I really enjoy about teaching is like, if you know something, you can always tell when somebody really knows something if they can explain it a bunch of different ways. If you ask somebody to explain something and then they say it, and you're like, I'm still not getting it. And they just say the same thing over again. It's like, maybe you don't get it either. Both of us, I think, don't get it. But it's like, if they can kind of use different analogies, then it's like, all right, yeah, I got it. So I think that is just part of how my brain works. And I just enjoy interacting with people that way, too.

Balancing descriptive and advanced analytics

Sure, thanks. So we get pushbacks sometimes for our projects not being data sciency enough. And so I'm curious, some of the stuff you've described, obviously, is more advanced. Some of it sounds fairly descriptive. I'm curious what those proportions are and how much that influences your decision to take on a project as a team.

Yeah, is that pushback from staff? It's both. It's both. If people are like, you know, that's not a predictive model, man. Why are you spending your time on that? I was like, well, it's what was needed, right? Right, I know. I think those conversations are easier for me when it's somebody pushing back and saying, like, this model is not complicated enough. I'm like, I can build you a support vector machine to do what this logistic regression does if you want me to. But it's going to cost you another five weeks of work. And the R squared is going to be just the same or whatever.

So I think that conversation is easier. The thing that I think I really struggled with when I was building this team, so I was the first data scientist. They brought me on in April of 2020. And they're basically like, hey, we want to do this cool data thing, this nonprofit data thing. I'm like, cool, what kind of data sets do you have? They're like, we don't have anything. I'm like, I'm in. Let's do it. And I'm like, who else is on my team? They're like, it's just you. I'm like, sweet, I got it. So as I started building the team out, I had to be really honest with people I was bringing on board that like, if you're the kind of person who wants to do like NLP, bleeding edge, TensorFlow, this kind of stuff, this is not the role for you. We're going to be doing a lot of logistic regression, k-means, k-modes, summary, mean, equals, whatever type stuff.

And so I just approached it by just being really honest with them up front. I said, but what I can promise you is there's going to be a lot of impact that we're going to make. Because it's not a cop out to me to say there's so much impact that we can make with just descriptive statistics and summary stuff right now in this sector that we're just scratching the surface. By the time we get into some of that advanced stuff, and we are doing some of it, we've got some pretty big projects where we've done some forecasting, we've done some more kind of modeling stuff. But I think that's how we've approached it is just been super honest about this isn't that role. It's not going to be a Google engineer role.

Open data and GitHub

You had mentioned a handful of these projects where you were able to really take advantage of these open data packages that seem to unlock some of the questions that the nonprofits were asking. I'm just curious how much of those projects are shared if you and your team maintain a GitHub. I'd really like to start connecting some of those dots myself, just sort of intellectual curiosity.

Yeah, we do have a GitHub, of course, but it is private to our organization right now. Not because we want to hoard the information, but really just because we're just not at a state where it's ready to share. It's kind of messy in there. We've got some cleaning up to do. We've been going so fast for a couple of years that a lot of stuff needs to be organized. And we really need to figure out too which pieces we're able to kind of elevate to the public. We do have a desire to be, I just put the link to our community data platform in the chat. But that's kind of a public interface into the back end of the SQL database. So go and click around in there, you can see some of the sources of the data that we're working with.

But the thing is, there's really a couple partitions that we have in our community data platform. There's what you can see through this link, which is Census Bureau, CDC, all that kind of stuff. But then there's some private partitions that, for example, Wilmington Police Department arrests. We don't want that on the public GitHub or the public user interface because we have a very specific agreement with them that we're not going to do that. New Hanover County Schools, discipline reports and progress for children in public schools in the county. That's in a separate locked-down partition of the community data platform that we can use for some of the metrics we're creating and some of the analyses we're doing, but we're not going to expose an endpoint for it to everyone else. So I think absolutely, in the end state, we would love a world in which people are kind of pushing and pulling from our GitHub and saying, like, we want to add this new data set. And we're like, OK, here's the conventions for the data set. If you put it in this format, then we can add it to the community data platform. But we're just not there yet.

Starting a data science consultancy

Thank you. So one more Slido question. And it was, what advice do you have for someone who's interested in starting their own data science consulting company?

So I've known some people who've gone that route, and it can be very rewarding. I will say, almost all of them will also tell you that they were able to do it for a couple of years and then they got burnt out on it. And the main thing is acquisition, customer acquisition, takes a lot of time. So I mean, just kind of pitching projects.