
Data Science Hangout | Tori Oblad, WaFd Bank | Getting Executives to Support Data Science

We want to help data science leaders become better. The Data Science Hangout is a weekly, free-to-join open conversation for current and aspiring data science leaders. An accomplished leader in the space will join us each week and answer whatever questions the audience may have.

We were recently joined by Tori Oblad, Enterprise Data & Analytics Officer at WaFd Bank. Here are a few snippets from our conversation:

1:14 - Start of session
3:00 - How to build an internal data science community
11:40 - Showing the art of the possible
14:00 - How do you get others to lead topics and foster engagement?
26:17 - Writing starter scripts for new users
35:55 - When to use R or Python versus BI
36:38 - Building toy models in Excel to explain it to people / to build relationships with business
38:33 - Avoiding vendor lock-in, being technology agnostic
43:35 - How to build confidence with IT and compliance
49:15 - Working with business users and creating business value
53:21 - Getting business and executive support
1:22:30 - What data scientists should focus on when communicating with stakeholders: value

► Subscribe to Our Channel Here: https://bit.ly/2TzgcOu

Follow Us Here:
Website: https://www.rstudio.com
LinkedIn: https://www.linkedin.com/company/rstudio-pbc
Twitter: https://twitter.com/rstudio


Transcript

This transcript was generated automatically and may contain errors.

Welcome everyone to the Data Science Hangout. Welcome back to all the familiar faces. I think most everybody's been on here at least once, but also welcome to those joining for the first time.

Thank you, Rob, for covering for me last week while I was out on vacation. These discussions on data science leadership in the enterprise range from very human-oriented topics to a little bit of technical talk, but really focusing on questions that are most important to you all. So there's no agenda. Please join in live or put any questions that you have in the chat.

With that said, I'd like to just jump in here and introduce Tori Oblad, Enterprise Data and Analytics Officer at WaFd Bank. Tori is extremely passionate about training and building community, so I'm super excited to talk a bit more about that today, too.

Tori, would you be able to kick things off by introducing yourself and maybe sharing a bit about the work you do on your team?

Sure. So for background, I have degrees in econ and math. I've been doing data science professionally for about 20 years in health care and in banking. I'm currently working at WaFd Bank where I'm actually starting the data science program there.

What's exciting in data science right now

So one that I'd love to ask you is what are you really excited about right now with data science?

This isn't totally new, but when I first started, I was in SAS, and that's not very accessible. So over the years, it's been fun to see the accessibility, not just of tools, but really of data science being able to be shared with people. I'm now having conversations with people where they're starting to want to learn, and you couldn't take SAS to your house and learn it. But right now, currently at work, one of the things I'm instituting is having R as a standard. So anybody in the company that actually wants to do it can start learning.

Building an internal data science community

That's something I'm super interested in because a lot of people will ask me about how they start their own internal user group or build their data science community. And I'm curious, what is starting to work, or what has worked well for you, or what are you doing to try and start to build this?

What I've found the most useful, and it's really time expensive, is starting with a few key people and areas that want to learn something, and working with them one-on-one on something where I can fix, or help them figure out how to fix, their problem. So it has to be something that affects them. Otherwise, it goes away. We forget what we've learned.

Starting that community where it's one-on-one, hands-on training. What are the issues? How do you think you should go about it? When you can find those key people that are excited, they will actually start learning. And they become these internal voices of, wow, this is really exciting. And it's more credible to hear that you can do it from somebody who is not a data scientist or a super programmer, which is scarier.

I've found it is really helpful to have residents, whether they are Excel users or whatever type of more beginner analyst, help build a community internally.

What I've heard from some people lately is how do we actually find those people that would want to learn or the super Excel power users that could maybe benefit from using a tool like R or Python? Like you said, they may not have that data science title. How do you go about finding all of those people?

One thing I've found very helpful, just for data science in general to be successful, is to partner with the business. Generally, your business users are not programmers. In a lot of that communication, where we have an objective that we're trying to meet through those discussions, you'll generally find small bits of, how are you going about that? Maybe we can tackle that in a different way. It turns into a lot of small side projects, but it's those relationships that you're making for the bigger data science issues that you find.

At my company, what I found to be very helpful is leading little learning cohorts. I really focus on EDA because RStudio is awesome for exploratory data analysis. I can create these little cohorts, put some business analysts in there who work in, let's say, inventory management or transportation, and pair them up with people who work in what you'd say are more technical data analysis roles. If you can get them learning together on a cadence, they form the relationships, they understand each other's problems, and they learn R at the same time. Then any Excel user who discovers R is unlikely to say, wait, I'm going to only use Excel from now on.

Frank, if you feel comfortable doing so, it'd be great to introduce yourself as well to the group.

Yeah, I'd love to. I'm really glad to be here. This is my first session. I work at Target. I live out in Minnesota. I came to work at Target three years ago to build, let's say, a hybrid business intelligence team. I built teams of pretty good Python and R users. Anyone who didn't know that, I taught them R first because I think R is a great introduction. Then if they want to do more machine learning, NLP stuff, they get into Python. For the past three years, I've been building teams that really sit in the middle of the business and are able to do data analysis at a pretty rapid rate.

I lead a team now that I call decision intelligence. It's a little bit of a branding strategy on my part, but I need my stakeholders to be thinking about the decisions. When they say, hey, what do you do? What is your team doing? I say, well, we start with the decisions you're making and the actions you're taking. Then we can find the analytics to support that. My team mostly supports transportation within the supply chain of Target. I'd say my passion in Target is teaching people, especially heavy Excel users, how to get to the next step of what I call higher-powered data analytics.

Frank, you brought up a really good point. Something I found very useful when you start having a community, even if it's just a data scientist and you don't know who is interested in learning R or even what data science is, is having more of those lunch and learns where it's, hey, invite your friends. It has to be organic or it doesn't really work, and I think people forget how powerful their networks are. You invite this friend who has another friend.

I'm new to this position. I'm still building that internal community. From where I came from, it was interesting because we had a few data scientist groups throughout the bank. Over just a few weeks where people would come and say, hey, this is the cool thing I'm learning, we started getting in compliance humans, which do not speak computer. It was exciting where the understanding of even what is possible out there really started to spread organically with people you wouldn't even expect.

Don't forget your friends and your friends' friends. It is a very powerful and good way to create that organic growth of information and knowledge.

For the lunch and learns that you do, is that something that's weekly or monthly?

We started out, I think, quarterly, but then people were really interested and they wanted to actually show, oh, I'm actually working on this thing and it relates to this other person's thing. We started doing it not quite every other week. That's a lot. I think we're doing it about every three weeks.

That consistency probably really helps too.

It's interesting too. It started out more data science heavy, and a lot of data scientists really love to talk about the nerdy geeky stuff. What I thought was useful is that it turned into not just algorithms, but more context, where people are getting a good practice audience for who you really need to be talking to in order to get buy-in. Your executives, they're not going to totally care about your algorithms, unfortunately. It's about understanding how to speak to that broader audience. It became fun for people though because it was, this is this neat thing that I'm doing, why I'm doing it, and how it works. Let me explain it in layman's terms.

Then what was also nice is that people came from other parts of the bank that weren't in data science. The woman who was running ads had no background in it. She explained how her job worked and how optimization on Google works. She started working with one of the data scientists to make it better, which I don't think would have happened without that internal community, which is awesome.

Showing the art of the possible

Yeah, sure. Hi, everybody. My name is Sep. I find this a very interesting forum just to see how people in different industries are impacting their organizations. For me, being in healthcare and healthcare tangent organizations, which stereotypically are not very technology-forward sometimes, I find showing a lot of the art of the possible to be a fun exercise. I don't necessarily like to just do flashy, but just to say, hey, look, there are efficiency gains that could be made if we do things this way.

I think too, the flashy, even the simple, what generally we think is really simple, most people don't realize how easy it is to get a data set and play with it. The horrible manual processes that people go through, none of us would do that anymore, but it's everywhere. They go into Excel and manually change things every single month. Just revealing that that is not something you have to do ever again is mind blowing.

The funny example that I always give is because my name is Sep: if you put Sep in Excel, it renders as September. I find it annoying and it's another reason I just hate Excel.

Getting others to lead topics and fostering engagement

Eric, I see you had a question around fostering that internal community. Do you want to ask that one live?

Sure. Great to be here. For those who don't know, my name is Eric. I'm in the life sciences industry and quite a passionate R user in my various endeavors. I lead an internal R forum at my company and it's very well attended. Everybody seems to enjoy it. I try my best to let others feel comfortable sharing the cool things they're doing with R because we know R is becoming more prevalent in our company and across life sciences in general. I have a hard time getting others to lead topics. It tends to fall on me or on one of what I'll call the honest power users, if you will. It's a bit loaded, but I don't know if you had any advice on fostering that engagement, on making everybody feel comfortable to share their wins with R and the cool things they're doing, or even just their questions, at a forum like that.

It's hard. Yes. It's really hard. I found out the hard way. I've had to spend a lot of time one-on-one with the person and just be like, we're planning to present this. Would you be part of that? If you need any help prepping, I would love to do that, but it'd be so great to debut your work because X, Y, Z. It's relevant in this way, which people haven't seen or they could relate to you. But it's time expensive. Training and getting people on board anyway is super time expensive unless somebody's found a magic pill. I've been looking for a long time. I have not found it yet.

The only thing that I've found helpful is that what I've done so far hasn't been restricted to a specific topic at all. You're not required to do R or something amazing. It's more of, we just want to share something interesting with each other that might benefit. That's helpful because it's less stress.

Something that's helped me with the Boston user group is if I can't find a speaker sometimes, I'll make it a series of lightning talks and so have very short presentations that people will give. People seem to feel a lot more comfortable with that, especially if it's their first one. I think that's a much easier ask to someone than saying, lead this 45-minute session.

Hi, my name is Edgar. I work with Daimler Trucks. I'm happy to be here and join the forum too. I was going to share, Eric, what we've done in our company is kind of phrase it as, what gap have you mastered lately? Then that gets people talking from an I learned standpoint versus I don't know how to do this standpoint. That's one approach. Then the other approach is we just rotate who facilitates the meeting and that tends to take different interests.

Maybe I can chime in on how to get people to show their stuff for Eric. I'm Rick. I also work in the life sciences. I work a lot with researchers and do a lot of training in R. 90% of my work is doing training and teaching and mentoring in all variety of different capacities. One thing that I've done a lot is to get them to be more comfortable discussing and showing their work, because one of the hurdles for new starters in any topic to get over is the feeling that it's not perfect and it's not the right way. They're a little bit embarrassed to show their stuff.

To help them get over that, I get them to share their screen or I get them to explain what they're doing. I've done this in person, and now it's easier because I do things all online. If I give them an exercise I would just ask them how they solved the exercise. Maybe something that a lot of pedagogical people disagree with, I don't give them really simple questions that have a really nice, easy, clear answer. I give them an example that has the answer and I say, okay, do it. It's amazing how many times people just don't do exactly what's in the example. They're trying all kinds of weird and wacky things.

In the workshops I would usually do with researchers, they would have time to do their own analysis, and at the end of the workshop I would be looking at what they're doing. If I saw somebody that had a really cool example or really nice progress, I would say, hey, do you want to take five minutes and show everybody the stuff you did in the class? Then I would help them, we would be like co-presenters, and we would present it together. I was kind of their co-pilot in doing that. That seemed to help them a lot and then they would be a little bit less self-conscious about it.

Oh, that's very nice, Rick. Lots of good nuggets there. Thank you, Rick. I feel like that kind of leads into one of the questions that Hugh asked as well around like helping new R users power through the initial learning curve. Hugh, do you want to provide any more context on that question?

Helping new R users through the learning curve

Sure. My name's Hugh Welch. This is my third time on these calls and my first time interacting. I'm a management analyst with the Veterans Health Administration. I've been using R for the last couple of years to build informatics products, and we've been getting a lot of interest in those and people around different facilities wanting to kind of replicate some of that. Like, that's cool. How do I get into R? And I kind of feel like, you know, okay, well, here's two books, a couple DataCamp courses, vignettes on all the tidyverse packages, read through these, and then we can start talking about what to do. It's been kind of difficult. Some of these people have some programming background, maybe they've done some Visual Basic or they're familiar with SQL or some other stuff, but a lot of them are just power Excel users, and getting them through that mental shift into there's a different way of thinking about this has been an area that's been pretty difficult.

I've noticed remote is harder for me to get people into it, because I used to just go to somebody's desk and sit down next to them and talk through even getting R on your computer, setting it up and doing simple things initially, because then the person is set up, they know these simple things they can get into. Without that, what I've now started doing is a lot of screen sharing. With screen sharing you can at least interact. Even if you can't type on each other's computers, you can have the person watch you for a while and then have them do something, so that after you hang up, they can at least refer back to the things that you went over. But if I don't do that, if I'm just like, hey, go watch this tutorial, it doesn't go anywhere. I've noticed there has to be some initial, I can do this on my own when somebody's not holding my hand. And then people are comfortable going to different resources. And it has to be absolutely applicable to their daily life or they're not going to do it.

Definitely. Finding that practical application is sometimes challenging, especially with new people who want to do something really complicated. Let's start with this very simple problem and do some EDA on it, do some basic aggregation, pull in some files. We're not going to build a model in the first 10 minutes that tells you what you need to know.

I was thinking actually about putting together maybe some company R Markdowns with some very simple examples applicable to broad areas. We're pulling in data. Well, it's generally Excel. Make it simple for them to swap out what data. And we want to graph it. Make it very simple, just so they have little snippets of code that are super simple if you've played in R. But when you're new, you're like, what is GG? What are my inputs? And it's overwhelming, so people just stop. So just making it, I'm going to make you feel powerful and make something beautiful from a very simple script that you can run on your own.

Yeah. And I just finished a master's in analytics, and some of the professors had great primers on where to start because the backgrounds vary. So establishing the difference between R and RStudio and RStudio libraries and how powerful RStudio is, I mean, you immediately fall in love with it. But the cool thing is, you know, the primer includes things to the effect of create a folder, put this Excel file in it. You'll read it with this script, but then go open it and check out what you did because you're going to edit it in RStudio. And that not only builds kind of a mental model of how things work, but also the trust in your new interface. So that really worked well for us.

Following off what Tori was saying, I feel as though I've had good success writing starter scripts for people just taking a bit. So we are lucky at Target to have both an awesome package called Bullseye Connect written by this guy, Alex Meyer, that connects to all our databases. So you install the package and then you connect it to any database basically in the company. So that's one. And then the second thing that's been huge is we have hosted RStudio and hosted Jupyter Notebooks. So all you do is go to a link and then say, like, start a new instance of RStudio. Just like if you were doing it on Kaggle, but we have it for Target so you're on our VPN and you can access the data.

So that has been huge and it makes it very easy for me to ask someone who's interested in R, hey, what data do you use most often? OK, let's find your data source and then basically just build a script that's, what, 20, 30 lines of code and go through, read in the data, maybe do a tiny bit of cleaning, aggregate it, build a cool visualization with Plotly and immediately they're like, oh, this is cool. And then you can change one or two fields and say, like, you have an awesome facet plot here, change one field, and now you have something completely different and you're seeing your data in a way that you've never seen it before. You can get people to that really fast and some people will stick with it.
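As a rough illustration of the starter-script shape described here (read, clean a little, aggregate, show something), the following is a minimal sketch only. It is written in Python with the standard library rather than the R and Plotly setup mentioned in the conversation, and the sample data, column names, and helper functions are all hypothetical stand-ins, not anything from Target's internal tooling.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data standing in for a learner's own export.
SAMPLE_CSV = """region,month,shipments
North,Jan,120
North,Feb,
South,Jan,80
South,Feb,95
"""


def read_rows(text):
    """Read CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows, value_field):
    """Drop rows with a blank value and convert the rest to integers."""
    cleaned = []
    for row in rows:
        raw = row[value_field].strip()
        if raw:
            cleaned.append({**row, value_field: int(raw)})
    return cleaned


def aggregate(rows, group_field, value_field):
    """Sum value_field for each distinct value of group_field."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_field]] += row[value_field]
    return dict(totals)


def text_bar_chart(totals, scale=10):
    """Render one line per group, with one '#' per `scale` units."""
    return "\n".join(
        f"{name:<6} {'#' * (total // scale)} {total}"
        for name, total in sorted(totals.items())
    )


rows = clean(read_rows(SAMPLE_CSV), "shipments")

# First view: totals by region.
by_region = aggregate(rows, "region", "shipments")
print(text_bar_chart(by_region))

# Change one field ("region" to "month") to get a different view of the same data.
by_month = aggregate(rows, "month", "shipments")
print(text_bar_chart(by_month))
```

The point of keeping each step in its own small function is that a new user only has to change the grouping field to see the same data a different way.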

Don't discount the people that don't use it, though, because at least they know what's possible and those people can become very good evangelizers. Like, knowing the possible, yeah, but they're like, oh, wait, I know if you talk to so-and-so you can actually solve this problem. That is a big cohort because a lot of people are not going to keep going. It's just how it is. It's still not a waste of your time, so don't get too discouraged.

No, you're good. I was just going to say thank you to Frank. I really like that idea of doing some of the grunt work for them to make it really easy to grab some stuff and get some immediate results so that they can kind of engage quickly and then, if they're interested, figure out how it actually works so they can do it themselves and break it apart and make it better. And they will. There will be plenty of those people that run with it and think this is amazing and, yeah, then they start doing it every day. Some of that setup stuff scares a lot of people away because they see, oh, there's code and connections and I don't know what this means. I'm just going to, you know, I'm going to move on.

What was that called?

Bullseye Connect.

That's actually the first package I wrote here because I was like, people get stuck on connecting to the data and you can't do anything without that connection. Others don't have that internal thing. That, I think, is absolutely key. If they can't get to the data, they're kind of stuck and they're not going to get out of Excel because they don't have access to anything else.

Building the mental model and pseudocode approach

If I can jump in again, I completely agree with the starter script package or the idea of having the starter script. And I think one thing that I add on top of that is helping them build the mental model, because a lot of people that never did anything programmatic or command line tools or any typing in the terminal just don't have it. Somebody mentioned this idea of the mental model. I think that's really important. Not just in terms of the algorithms, but also like how this thing fits together, right? So they don't appreciate the self-referential functions that change the input, that you can't run them twice, right? Like if I do a log10 of a variable, I can't run that function again because I'm going to get error messages.
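The log10 point can be made concrete with a small sketch. This is an illustration in Python with made-up numbers, not the speaker's own training material: a step that overwrites its own input cannot safely be run twice, while writing the result to a new name can be rerun freely.

```python
import math

# Hypothetical measurements, chosen so the effect is easy to see.
raw_values = [100.0, 10.0, 1.0]

# Self-referential step: the result overwrites the input it was computed from.
values = list(raw_values)
values = [math.log10(v) for v in values]  # first run works

# Running the very same line again fails, because one value is now 0
# and log10 is undefined there.
try:
    values = [math.log10(v) for v in values]
    second_run_error = None
except ValueError as exc:
    second_run_error = exc

# Safer habit: keep the input untouched and store the result under a new name.
# This line gives the same answer no matter how many times it is run.
log_values = [math.log10(v) for v in raw_values]
```

A spreadsheet formula recalculates from its source cells every time, which is why this behavior surprises people coming from Excel.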

So if you have a solution in Excel that they've already done, and this comes back to the other idea, right? Like how do you make it relevant for them? That script is going to be based on something that's really relevant in the project that they're working on or some idea. And then I say, okay, so let's write basically just like pseudocode. Like what do we need to do? We need to calculate this and we need to calculate this and we need to calculate this and we need to make this plot. So then we build like the pseudocode piece by piece by piece. And then they can basically fill in the blanks of all the commands and they can see all the different steps. And then in doing that, I also tell them if you can't get one part, don't worry about it because the analysis part here is separate from the visualization part. And it's separate from this part down here. So if you get stuck on that part, just do what you can and just leave it and then go to the visualization.
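A minimal sketch of that pseudocode-first approach, in Python with invented numbers: the plan is written as plain comments, each blank is then filled in independently, and the plotting step is left open without blocking the analysis steps. The variable names and data are hypothetical.

```python
# Hypothetical stand-in for numbers a learner already has in Excel.
monthly_sales = {"Jan": 120, "Feb": 95, "Mar": 140}

# Plan, written first as plain-language comments:
#   1. calculate the total
#   2. calculate the monthly average
#   3. find the best month
#   4. make the plot

# 1. calculate the total
total = sum(monthly_sales.values())

# 2. calculate the monthly average
average = total / len(monthly_sales)

# 3. find the best month
best_month = max(monthly_sales, key=monthly_sales.get)

# 4. make the plot (left blank on purpose: steps 1 to 3 do not depend on it)
plot = None  # TODO: fill in once the analysis steps work

print(total, round(average, 1), best_month)
```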

Compliance, explainability, and vendor lock-in

David, would you be able to introduce yourself and your role and ask that question?

Sure, I'll be happy to. My name is David Dreyer. I'm a technical expert at Syngenta Crop Protection. We work in a pretty regulated environment in the pesticide industry. As a result, we have to comply with what are called good regulatory standards. We have to understand the data from cradle to decision essentially. A lot of these guideline studies rely on black box methods for data analysis and system statistics. There's a lot of opportunity here to potentially use some of the open source software to improve essentially the analysis, transparency, etc. of our studies. I'm just curious. Obviously, you work in a very different industry, but still compliance is common. I'm interested in how you manage those relationships.

In my experience, if I cannot explain my models, compliance isn't going to go for it. If it's for marketing, you have a lot more leeway. But within modeling, and particularly in banking, if you do not have good explainability, none of my compliance people or auditors, at least, are okay with it.

Working in healthcare, you're right, explainability, the ability to backtrace whatever you've done and show your work is a requirement. There's regulations out there that you have to be able to show somebody why the algorithm picked whatever it picked for them. Most of what we do is, we're not building black box models that I can't tell you why this neural network spit out this answer. What I'm more trying to do is tell you how many patients are going to be here tomorrow, and where are they going to be, and what's their likely cancellation rate, so you can maybe overbook or things like that.

That has been pretty good, actually, because with Excel, what everybody has been doing for years, it is not easy to tell when people make a change or why they did the change. And so showing change control and how you can revert to the past and see exactly what the change was, why, when, it makes people go, oh yeah, why haven't we always done that? They're like, we actually need to understand what's happening when. I was like, oh, well, we've got to change out of Excel, guys, and let me show you why. And they're like, oh, yes, this is exactly what we need. And it made compliance super duper happy.

And your relationship with compliance?

So I have a lot of talks regularly with compliance people. I've built some good relationships. So we have open conversations where we're actually teaching each other some things so we can have more of a common language, because if you don't have that relationship, compliance and audit can be horrible. But if you can find people that you start having more of a let's understand each other's vernacular and where we're coming from, it goes a lot easier. It is a lot of work. The way a lawyer type thinks is very different from a data scientist. Very different. But the constant communication and camaraderie is useful in the long term.

It also builds the confidence in what you're doing. If they feel like they can see transparency and ask you questions without feeling stupid. Yes. And I've noticed that in a room with more than three people, they will not speak. I think most people are scared of looking stupid. The title data scientist is scary to most people. Unless you can destroy that bridge, that wall, people aren't going to talk to you.

I was just going to say, in balancing accuracy versus explainability, depending on the methodology you choose, sometimes we may have to take a hit in accuracy so that we can produce something that shows this is why it came up, it followed this path. We use a tree model over something else because I can give you a visual to say this is why the computer did this in this way, versus something where I can't give them a handout or a PowerPoint to explain the work and you just have to trust me.
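To show what "it followed this path" can look like, here is a minimal sketch in Python of a tiny hand-built decision tree that returns its answer together with the rules it followed. The tree, field names, thresholds, and labels are invented for illustration and are not any speaker's actual model; a real project would fit the tree from data.

```python
# Hypothetical hand-built tree; every field, threshold, and label is invented.
TREE = {
    "feature": "income",
    "threshold": 50000,
    "below": {
        "feature": "late_payments",
        "threshold": 2,
        "below": {"label": "approve"},
        "at_or_above": {"label": "review"},
    },
    "at_or_above": {"label": "approve"},
}


def predict_with_path(tree, record):
    """Walk the tree and return (label, list of plain-language rules followed)."""
    path = []
    node = tree
    while "label" not in node:
        feature, threshold = node["feature"], node["threshold"]
        value = record[feature]
        if value < threshold:
            path.append(f"{feature} = {value} is below {threshold}")
            node = node["below"]
        else:
            path.append(f"{feature} = {value} is at or above {threshold}")
            node = node["at_or_above"]
    return node["label"], path


label, path = predict_with_path(TREE, {"income": 42000, "late_payments": 3})
print(label)
for step in path:
    print(" -", step)
```

The list of rules is the kind of thing that can go straight into a handout, which is the trade being described: a simpler model in exchange for an answer that can be traced step by step.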

Our environment has been changing such that there is more room and more opportunity for these kinds of things, but the industry and in particular the folks who drive the quality system and change control at that broad organizational level, it's pretty hard to bring in new things and have confidence. Now we're pivoting back and becoming a divisionally oriented function that is probably going to have to figure out some of these things again. That's an interesting place to be thinking about where we have enjoyed being really nimble and flexible in terms of architecture and infrastructure and how we approach things. We need to start wrapping some process around it again. It's just on my mind a lot about how do we retain a lot of that flexibility and I think the power that comes with it while being able to demonstrate that our environment does what we think it does, that we could demonstrate that if need be.

Tying data science projects to business value

Hi, I'm Colin. I have a question. I'm just going to jump in here. I'm curious if you could speak to how you think about the different data analytics projects that you tackle and how it ties to business value. I say that because I find often I can go off on sidetracks of what I think is pretty interesting and kind of go down rabbit holes, but then I have to step back and be like is this actually creating any business value? I'd just like to get your opinion on that. How do you think about the project that you're tackling and how it relates to creating business value?

It's really easy to go down those rabbit holes, but what I've found works best for me and the team members that I've had is to make sure that you are constantly working with the business so that you're not going too far down a rabbit hole where they're like, I don't care about that. That doesn't affect me at all. If I can get the business partners feeling like it's their project and they're right beside you helping through it, that decreases the rabbit holes. If you don't have that consistent relationship with the business person where you are truly working together, and it's just, okay, data scientists, go do your thing and then come back when you're done, that's when I think the rabbit hole problem gets bad.

As the data scientist, I think you just can't go for that. I require, if I'm doing work with you or for you, that you be my partner. Otherwise it's not going to succeed.

Sorry, maybe I can jump in. Sorry to interrupt you. What you said before about talking with the people in compliance, I think was really relevant here also because I think oftentimes it's a communications issue. One thing that I found is that people will ask how to do something and I will show them and maybe I'll spend half an hour or even more doing literally word for word what they ask for and then I'll give them this beautiful result. I'll be so proud of myself that we came up with this beautiful solution and they'll be like, no, that's not what it was. I'll say, that's literally word for word what you asked for. They'll be like, yeah, but no, what I meant was this. Then I realized that I didn't really get what they really wanted. What is the minimal thing, the easiest, smallest thing you can do that can have the greatest impact on your work and on your life and make things easier? Then focus on that, the 80-20 rule basically.

I think that's a common theme that keeps coming up too is communication with the business. Understanding how to communicate so that it's relevant to them. I find that, at least for me, the objective is to make them feel ownership. If I can get that ownership where it's their baby and they care about the success or failure, that's the key. They're like, cool, it's your idea, then we're good.

Tori, if I could follow up on that. What does that look like tactically for you?

Working with the business?

Yeah, like communicating, getting them on board.

For this current position, I actually created a whole vision which then I had to go evangelize. I had to get top level support where it was the executives, all the executives were like, yes, we agree. I said, okay, if these are the projects you need, who is responsible? Who is my point person that has the flexibility to take the time to be with me? You have to tell them. This is important, not just to me, but to you as the boss and the executive leadership.

Then it's been a lot of one-on-one conversations. I try to make it not just about the data science part. I think a lot of the communication to build that relationship and get the ownership is really understanding what the real underlying business problems are. A lot of people think problem A is their problem, but it's not. Only through a lot of conversation and really understanding and feeling their pain do you understand what the real problem is. That is what you've got to suss out. It makes the relationship good. People feel heard if you actually are understanding the real problem.

One last thing before I need to drop off. The way I operationalized this was during the engagement process, when we were doing elicitation. I would ask the leaders that were asking for the work for a co-pilot. This person was going to share and learn. They share business, we share tech. Eventually, they take over the analytics so it can keep surviving. If they didn't commit upfront to a co-pilot to do the switch when the project was almost done, that told me that they were really not that upset. They were just kicking tires. We never really engaged with them.

Something that I've also found that helped build the relationships and helped the business take ownership is when you're presenting, have them do it. That really makes people feel ownership and proud. They care, particularly if they're presenting. Then they feel cool, which is great.

Documenting the problem discovery process

Tori, I see Rick has a follow-up question to what you just shared. Rick, do you want to jump in and share that?

Sure. I put it in the chat. The question was, when you mentioned this thing about getting them to come up with the real problem, I'm also doing that. The question was, do you document that somehow? Do you have a protocol where you state what the problem is and then you take the steps? Then we can see the progression of how we went from the original problem to the so-called real problem, which is interesting in itself to understand how that process worked. You can show people how that worked. Also, the other one is it covers yourself because you can show whoever, like, well, that is on paper. That's what we said we were going to do and we did it so that people don't come back afterwards and say, hmm, it wasn't. Do you have some kind of strategy for doing that?

Because where I am now is so new, I've not been as good at documenting that whole process flow. Previously, yes. For each project or working with somebody, we'd start out with saying, this is the objective and where we're going. We'd basically have minutes and notes that would just tack on. Not that fancy. Just meeting notes, the notes from the meeting.

Yes, so by the point that you're actually producing some output, yes. But to get to that output, there's so much iteration beforehand that a lot of the findings are documented, but not as strictly as final outputs. Like, here are the iterations, this is what we've tried, this is why we didn't use this variable. I found that's a lot messier. At least for me, I wish I could come up with something stricter, but I think I'm not as clear cut as I probably should be in that part.

Why now — moving from SAS and Excel to R and Python

I know there's one question from earlier that actually kind of touches on a few of the earlier points around building the community and training, but touches upon, like, why now? Like, why is your team starting to make this change, whether it's from SAS or Excel, and moving over to R and Python? What has been the key driver there?

Actually, from the other side of the table, the last few years I've been teaching R for students, and I think the main challenge is how to get students on board. I also teach Tableau, and I think in comparison, I think R is more user-friendly, and then Tableau is like a different Excel. For me, still most students prefer to use Tableau because they thought maybe it's more popular. I tell them the truth is that if you learn R or Python, there are more jobs available.

I would say maybe in the future, you could organize more data analytics competitions for students, or meetups, or this kind of recognition. I mean, it's the next generation of students that could benefit from these skills. Otherwise, the graduates only have Excel or SPSS. A few years ago when I started teaching R, I had maybe fewer than five students who picked up the R programming language, and at the end of the semester, one of them told me that he was working for a banking company, and he said he was using sentiment analysis to analyze customer complaints for the first time ever.

Tori, is that something that WaFd Bank does for internships or maybe data competitions?

We are in our absolute infancy, so no. Previously, one of the things that I found useful, not where I am now, but at my prior bank, was that we did a lot with the local university, which was awesome. It started building a relationship between the data scientists at the bank and the university, and it actually helped drive what the university was teaching, and we found some hires there. That was awesome.

The issue with banking is so many of the universities are like, can we get an actual data set? Real data versus the stuff we have, they're not comparable. Real data is so much messier and harder than the lovely Iris data set. People come out of college, and they don't actually know how to really deal with problems in the data set.

Advice for aspiring data science leaders

I don't think I see any other questions in the chat, but one other question that I'd