Data Science Hangout | Mike Miller, Engine | Adjusting for Stakeholder Tendencies
We were recently joined by Mike Miller, Vice President and Data Science Team Leader at Engine. Toward the end of the discussion we dove deep into surveys (whether internal or to clients): responder fatigue is a huge concern and something we deal with on a daily basis.

- Keep it short, keep it focused. Make sure you're asking people what you want to get out of it. It's easy to go down a rabbit hole with data you'd like to see. Every time you add a question, make sure it's focused on your main objective.
- Five questions, maybe 10 max. More than that, you should be paying your respondents.
- Double-barreled questions are bad; check for these. (Questions that bundle two or more separate issues or topics but allow only one answer.)
- Write your question well. Think about the people answering it. Make sure the question is asking what you want people to respond to and that there's no way for others to interpret it differently.

Resources mentioned:
- Bryan's meetup talk on survey design: https://lnkd.in/dCK_Ga8x
- Mentorship program Maisie shared: https://lnkd.in/gwPyH-Mr

Where to find more?
► Subscribe to our channel here: https://bit.ly/2TzgcOu
► Data Science Hangout site: rstudio.com/data-science-hangout
► Add the Data Science Hangout to your calendar: rstd.io/datasciencehangout

Follow us here:
Website: https://www.rstudio.com
LinkedIn: https://www.linkedin.com/company/rstudio
Twitter: https://twitter.com/rstudio
Transcript
This transcript was generated automatically and may contain errors.
Welcome back everyone to the Data Science Hangout. I hope you're having a great week and thank you to Rob who covered for me while I was out in Park City last week snowboarding. And Matthias, I heard there was a great session last week. If you're joining for the first time, welcome. It's great to meet you too. I'm Rachel, I'm the host of the Data Science Hangout.
This is an open space for the whole data science community to connect and chat about data science leadership, questions you're facing, and what's going on in the world of data science. If you ever want to go back and re-watch or share a session with someone who's missed it, we do have a Data Science Hangout site now off of rstudio.com. We really want this to be a space where everybody can participate and we can hear from everyone. So there's three ways to ask questions. You can jump in live by just raising your hand on Zoom. You can put questions in the Zoom chat and feel free to just put a little star at the end of your question if you want me to read it out loud if you're like in a coffee shop or something. And then we also have a Slido link that I'll share in just a moment here too, where you can ask questions anonymously as well.
And we really love this dialogue that happens live during the Hangouts too, so I love hearing people jump in live. If you do find yourself super inspired and have spoken a few times, maybe just consider holding back a bit for others to jump in too. We want to try and make this as inclusive as possible, and making room for others is an important aspect of that. But with all of that, I'm so excited to be joined by my co-host for today, Mike Miller, VP and Data Science Team Leader at Engine. Mike, I'd love to turn it over to you to just introduce yourself and maybe tell us a little bit about the work that you do.
Yeah, sounds good. Yeah, happy to be here. Thanks for inviting me and for the opportunity to chat with you all. So like Rachel said, I'm Mike Miller. I'm the VP of our Data Science Group at Engine Insights. So Engine Group is kind of our, I guess, our parent company. So we have three really kind of main arms: EMX, which is our digital, like, ad server, programmatic arm; then we have an agency group, which does kind of typical marketing campaigns and creative development and things like that; and then our Insights Group, which I'm a part of, is kind of our market research and primary research arm.
You know, it's kind of built on, you know, survey research and things like that. So within my department, our Data Science Group, there's about eight of us that kind of handle the data science and, you know, some data engineering capabilities. Our main role is advanced analytics consulting and methodology of primary research. So a lot of proposal work comes through, and we help, you know, design a methodology: what's going to solve the business need, what techniques do we need to, you know, run, do we need regressions or, you know, conjoint work, choice-based modeling, things like that. So what's the right methodology and kind of strategy to solve the business need.
You know, a lot of our stuff is kind of, you know, data stewardship, you know, what's the proper use of data, what's, you know, weighting and, you know, when to weight, when not to weight, you know, what are the bounds of weighting efficiencies and things like that. You know, obviously with statistics, it's kind of the easiest way to lie. So, you know, maintaining quality research and making sure the data is actually saying what it appears to say, because it's very easy for, you know, our internal clients and even external clients to say, hey, you know, the data seems to suggest that. Well, it might seem to suggest that, but, you know, it's not necessarily saying that. It's not the question asked. It's not the spirit behind the question.
I guess I can talk a little bit about, you know, how we use R and RStudio. I think the main, you know, tools that we use within our department are, you know, Excel, SPSS, R, and Python. We do a lot of work with Shiny for, like, online simulators and deliverables and things like that, you know, interactive deliverables that we can give to our clients. You know, it kind of lives on and allows the research to have more shelf life, you know, allows our clients to... basically, we teach them to fish so they're kind of more self-sufficient. So, it's been a great tool for us.
Data stewardship and client expectations
Yeah, I mean, I think it's just, you know, leveraging the data in the proper way. You know, a lot of times, you know, we try to use census values. So, going into a survey, we'll use kind of census targets to make sure the data is representative, you know, whether it's representative of the census, representative of, you know, a client target market or whatever it is. So, we try to balance that stuff on the front end so we don't have to weight very often.
But a lot of times clients will try to just weight to a story, you know, hey, you know, maybe, you know, rural is too low and we think we're, you know, getting bad results. So, there's a lot of kind of situations there where clients seem to want to tell a story that maybe the data doesn't suggest. So, obviously, there's ways to tell that story, to manipulate the data a little bit. So, you know, it's our kind of responsibility to say, hey, like, you can weight it to certain bounds, or, you know, what's the universe you're trying to weight to? You can't just weight to be weighting.
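The guardrails described here (weight toward census-style targets, and only within agreed bounds) can be sketched with a tiny rake. This is a hedged illustration, not Engine's actual process: the respondents, the target shares, and the `cap` bound are all invented.

```python
# Raking (iterative proportional fitting) sketch: scale respondent weights
# until the weighted margins match target shares, trimming any weight that
# drifts past a cap -- the "bounds of weighting" idea. All numbers invented.
from collections import defaultdict

respondents = [  # (gender, region) for seven made-up respondents
    ("f", "urban"), ("f", "urban"), ("f", "rural"),
    ("m", "urban"), ("m", "urban"), ("m", "urban"), ("m", "rural"),
]
targets = {  # hypothetical census-style margins to weight toward
    0: {"f": 0.5, "m": 0.5},          # dimension 0: gender
    1: {"urban": 0.8, "rural": 0.2},  # dimension 1: region
}

def rake(rows, margins, iters=50, cap=3.0):
    w = [1.0] * len(rows)
    for _ in range(iters):
        for dim, shares in margins.items():
            totals = defaultdict(float)
            for row, wi in zip(rows, w):
                totals[row[dim]] += wi
            grand = sum(totals.values())
            # rescale so this dimension's weighted margin hits its target
            w = [wi * shares[row[dim]] * grand / totals[row[dim]]
                 for row, wi in zip(rows, w)]
        w = [min(wi, cap) for wi in w]  # don't weight past the agreed bounds
    return w

weights = rake(respondents, targets)
```

After raking, the weighted gender split lands at 50/50 and the urban share at 80/20 even though the raw sample was 3/4 and 5/2; the cap keeps any one respondent from quietly dominating the story.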
Even if your data doesn't tell you the story you wanted, that's still helpful. You know, a lot of times with primary research, it's a difficult conversation. You spent, you know, 50, 60, $100,000 on research and didn't get the answer you wanted. But a lot of times that $100,000 you spent on a research study may have saved you millions of dollars before you launched a product. So, while it didn't give you the answer you wanted, in the long term you're still probably saving money. A lot of times clients don't want that answer. They think they know what the story should be.
A lot of times that $100,000 you spent on a research study may have saved you millions of dollars before you launched a product.
You know, correlation, you know, isn't causality. So, while there's a correlation, it doesn't mean that, you know, A is driving B. It's a correlation. You know, just those kinds of things. A lot of our clients are not statistical people, you know, they're marketers, they're, you know, product developers or things like that. So, you know, they're not strong in statistics. So, it's really kind of teaching them and making sure they understand what the results are telling them and encouraging them not to, you know, step outside of those assumptions.
Clients and industries served
So, I mean, like I said, we do, you know, full custom. So, a lot of our studies... you know, I came from MRSI, which was then BFIRC, which is now part of Engine. So, I kind of cut my teeth on CPG. So, we did a lot of, you know, cereals and, you know, snacks and chips and different things, in general, you know, product optimization. A lot of it, you know, sometimes it's cost-cutting. Can we reduce, you know, the size of a given item without people really knowing?
We do a lot of insurance work, you know, home, auto, health insurance. So, just managing, you know, developing products, you know, hey, if we want a deductible at $2,000, what's kind of the take-up rate versus, you know, $5,000? So, some of that product optimization. We do a lot of, like, customer satisfaction, so, like, brand health trackers, just kind of keeping the pulse on the brand. A lot of CX, so customer experience work, you know, if you go into a store, was it easy to find what you're looking for? Is customer service good?
You know, the gamut of what we touch is wide, which, you know, I think is a big part of why I enjoy my job: it's the same techniques, the same kind of solutions, but, you know, a ton of different industries. You kind of learn enough to be dangerous in many different things. So, the variety is very appealing to me.
Hiring challenges
I am. We actually just... so we had two open positions probably since, well, September, maybe. And we just filled one. We have one open, but we're kind of just passively recruiting for that right now. I'm not really sure exactly the skill set we want, because like I said, we play a little bit in the traditional research, you know, statistical market research, you know, maybe more data science. And then we also have some, you know, data engineering capabilities.
Yeah, I mean, we struggled for probably a month or two to even find good candidates. We found really good candidates that, you know, we felt were in range. We got to the offer stage, and they had another offer that was completely out of our range. We got some resumes that kind of passed the initial screener through HR, and then they'd get on a call with, like, a technical member of our team and couldn't even, you know, answer the simplest question. So, you know, some resumes, like, yeah, these look really good, but they can't talk the talk.
Yeah, I mean, not really. I mean, I think our job post has largely stayed the same. You know, I think we just started to add some, like, qualifying questions that our recruiter could kind of ask. We kind of armed her with, like, here's a question, here's a general answer; as long as they're not, like, "I have no idea what you're talking about," that would be a red flag. So, you know, that has worked well. I mean, we've still gotten on calls where people are like, yeah, I really know Python, I really know R, and, you know, our kind of technical person on our team would ask the interview questions, and they're like, oh, I don't even know what a regression is. Or "I would just Google it" was, you know, one guy's answer to many questions.
Shiny simulators and version control
Yeah. Yeah. So, typically, we'll do it right through Shiny Server. So, we have a Shiny Server Pro account through RStudio, and we basically have a server locally that we host it on, and we provide credentials to, obviously, our internal teams and then our clients. You know, maybe three, four people get log-in credentials. They go to a website, you know, we provide the link, and we publish, basically, the entire R, you know, package and the app and everything to that server.
You know, I think, you know, one of the main attractions to an online deliverable is version control. You know, we used to deliver everything with Excel, and you make edits, or you, you know, add things to it, and it's, hey, it's version one, version two, version three, and you start sending emails and different things, and it's just, it's difficult to share within the client's organizations. It's always on a link, and it's the same link. We can publish, you know, edits to it, and it's still the same link. So, it's been really good for version control.
Frankly, we don't get a lot of client data because a lot of our contacts don't have access to it or can't get it. You know, we would love, you know, we have those conversations and, you know, they're on board, like, yep, yep, you know, but they either, A, talk to finance or talk to IT or whatever, like, nope, we're not giving you that data. A lot of times, it's a mess. So, they're like, yeah, we would love to get that, we'll talk with our team, and they can't get it or it's missing so much data that it's not helpful.
Career progression and mentorship
Yeah. So, yeah, I started at MRSI as an intern. And, you know, so MRSI was a Cincinnati-based, you know, market research firm, fairly small, mom and pop, you know, CPG kind of thing. Started as an intern, was hired on full-time as an analyst, very much in the same role as what kind of our department is now. You know, I'm certainly not a coder, which is kind of interesting given that I'm here in a, you know, RStudio, you know, data science hangout. You know, I have enough R skills to be dangerous. I know how to read and kind of manipulate things, not really draft it on my own.
You know, frankly, I probably got here faster than I expected, you know, I think due to some departures in the company. You know, I kind of fell into the role probably, I guess, two years ago, 18 months ago. But obviously, you know, the executive team and everything felt comfortable with me taking kind of the lead of the team. You know, our team is pretty tight-knit. So I think there's a lot of that, you know, they didn't want to bring in, you know, an outsider, somebody that doesn't really know, you know, the corporate culture and, you know, the team, and kind of upset the applecart.
Yeah, so they started, like, kind of a Spark, is what they call it. You know, they have a Spark talent group, and that probably started two, three years ago, you know, based on, you know, employee surveys and things like that: people kind of got stuck or didn't, you know, feel training was there, you know, the importance of your career development and things like that. So they kind of set up Spark training. So they have, like, a monthly, you know, training topic that people can sign up for and attend.
So within that, I was, you know, picked for the high-potential group, and, you know, HR met with me, like, what are you looking for? How do you want to grow your career? What do you want out of Engine? And then they kind of aligned you to, you know, a given, you know, executive member that kind of, you know, fit those needs and could, you know, fulfill that. So that's how I kind of got grouped with Rich Catrone, who was my original mentor.
So I've been with him, you know, ever since I kind of took the role as vice president. But yeah, it's just continued exposure to the EMX group, the agency group, you know, kind of what's his vision of Engine and how I fit into that and how to drive the team forward and what he wants out of us. So, you know, it's just, it's been great, you know, to have him as a friend, as a mentor, obviously as a CEO.
Building comfort with executives and clients
Yeah, I mean, so a lot of our projects are set up so that every project has a primary and secondary analyst. So you're always with somebody. But it's really just kind of, you know, learned over the years. You know, maybe I'm a silent listener for a while and just kind of hear the flow, hear how things work, hear the presentation style. You know, a lot of times it's learning the, you know, the tendencies or kind of preferences of our internal salespeople, learning the preferences of given clients, what they want, what they expect, how they... are they technical at all?
You know, you're kind of a silent listener, or somebody else is doing it. Then you get, oh, here's a slide or two for you. Then, okay, you're running the meeting on your own. Then you start to kind of be that person with, you know, a shadow behind you, kind of, hey, do this and maybe I'll present it. Or, you know, you just start to kind of, you know, learn and adapt and kind of learn a style and just get that comfort level. I mean, there were many times, you know, early in my career that, you know, I traveled to a given client, I was there by myself or I was with somebody, and they'd say, hey, you're going to present this workshop, you know, all-day thing. And like, oh shit, like, that's super scary.
You know, I don't know how I'm going to do with that, but you just kind of get thrown into it. And, you know, it's not necessarily that you're going to sink or swim, but your manager, that person who kind of puts you in that position, knows you're going to succeed, you know, and is there to kind of fill in the gap if you get off track or don't know how to answer a question. You know, it's kind of those tendencies. If you don't know the answer, it's just, "I don't know the answer. I'll look it up. I'll get back to you." You don't make it up. So it's just learning those soft skills, learning those presentation skills.
You know, certainly being in uncomfortable positions. And then, you know, as a manager, you kind of learn that, like, you're going to grow with challenges. So, you know, you kind of put people in positions where they're going to succeed. It might be uncomfortable for them, but, you know, I have the confidence in my team, in my junior people, that they're going to do the right thing, or I wouldn't put them in that position. So, you know, it's just, it's the comfort. Once you do it once, it's like, oh, okay, that's not so bad.
Machine learning and survey methodology
Yeah, I mean, it kind of depends. I mean, we definitely do some machine learning on a fairly regular basis with, like, online behavior, third-party data, and merging that with our survey data. So we do, you know, kind of an ensemble approach. There's, like, five different algorithms (random forests, SVM, XGBoost, things like that), and we kind of pick the best model for what we're doing.
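That "fit several candidates, keep the best" step can be sketched with plain k-fold cross-validation. This is a hedged toy version: the two stand-in models below (majority-class and 1-nearest-neighbour) substitute for the random forest / SVM / XGBoost ensemble, and the data is invented.

```python
# Pick-the-best-model sketch: score each candidate with k-fold
# cross-validation and keep the winner. Toy models and toy data only.
import random

def majority_fit(X, y):
    # baseline: always predict the most common training label
    mode = max(set(y), key=y.count)
    return lambda X_new: [mode] * len(X_new)

def one_nn_fit(X, y):
    # 1-nearest-neighbour by squared Euclidean distance
    def predict(X_new):
        out = []
        for x in X_new:
            i = min(range(len(X)),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(X[j], x)))
            out.append(y[i])
        return out
    return predict

def cv_accuracy(fit, X, y, k=5, seed=0):
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        train = [i for i in idx if i not in fold]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = model([X[i] for i in fold])
        correct += sum(p == y[i] for p, i in zip(preds, fold))
    return correct / len(X)

# invented data: label is 1 exactly when the first feature is positive
X = [(x, x % 3) for x in range(-10, 10)]
y = [1 if x[0] > 0 else 0 for x in X]
candidates = {"majority": majority_fit, "1nn": one_nn_fit}
best = max(candidates, key=lambda name: cv_accuracy(candidates[name], X, y))
```

In practice the candidates would be real learners and the score a metric suited to the business question, but the selection loop looks the same.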
You know, things just go in cycles. Most of our data from survey research is probably 2,000 completes or under. And then a lot of our clients just want to cut and slice data. I think we're trying to engage with our EMX team, you know, in terms of programmatic, trying to get our hands on, you know, different agency clients' data and just onboard those alternative data sources. But, you know, a lot of our bread-and-butter stuff is segmentation, you know, through K-means or even, like, Sawtooth.
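For the segmentation piece, here's a minimal K-means sketch. The two-dimensional "survey scores" are invented for illustration; real work would run on actual survey variables with a proper tool (scikit-learn, Sawtooth, etc.).

```python
# Minimal k-means for respondent segmentation: alternately assign points
# to their nearest centre and move each centre to its group's mean.
import random

def kmeans(points, k=2, iters=20, seed=1):
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        # assign each respondent to the nearest segment centre
        groups = [[] for _ in range(k)]
        for p in points:
            i = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            groups[i].append(p)
        # move each centre to the mean of its segment
        for i, g in enumerate(groups):
            if g:
                centers[i] = tuple(sum(xs) / len(g) for xs in zip(*g))
    return centers, groups

# two obvious invented segments: low scorers and high scorers
points = [(1.0, 1.2), (0.8, 1.1), (1.1, 0.9),
          (8.0, 8.2), (8.3, 7.9), (7.8, 8.1)]
centers, groups = kmeans(points)
```

On well-separated data like this, the centres converge to roughly the low-score and high-score group means within a few iterations.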
I think a lot of our clients, they want to know what's going to happen in the future rather than what has happened in the past, which is kind of typical market research, you know, is kind of, you know, backwards looking. You know, a lot of clients now are like, how can I use the data from trackers or other stuff to predict what's going to happen or, you know, the early indicators of things.
One thing I found was to change the whole survey technique. You can actually do a lot: 2,000 in the survey world is a lot of responses. Are you using a Likert scale? Are you using something else? Yeah, get rid of the Likert scale, go to a zero to 10 scale or a one to 10 scale. And then, with binary outcomes, you will find logistic regression or elastic net will do amazing things for you. A simple way to think about it is NPS, right? You ask the NPS question, but then you ask 10 follow-ups, or nine, or eight, or whatever your favorite number is, scaled one to 10. Now you'd be able to say, hey, for every increase in this area that you improve, NPS is going to go up or down by Y.
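A hedged sketch of that recoding idea: make the outcome binary (promoter = NPS score of 9 or 10) and regress it on 1-to-10 attribute ratings with a tiny hand-rolled logistic regression. The respondents and ratings below are invented; in practice you'd reach for scikit-learn, statsmodels, or a glmnet-style elastic net as suggested above.

```python
# Binary-outcome driver analysis: promoter (NPS >= 9) vs attribute ratings,
# fit by stochastic gradient ascent on the logistic log-likelihood.
import math

# (service_rating, price_rating, nps_score) per respondent -- all made up
rows = [
    (9, 8, 10), (10, 7, 9), (8, 9, 9), (9, 9, 10),
    (3, 4, 2), (2, 6, 3), (4, 3, 1), (5, 5, 6), (6, 4, 5), (7, 8, 8),
]
X = [(1.0, r[0], r[1]) for r in rows]       # intercept + two drivers
y = [1 if r[2] >= 9 else 0 for r in rows]   # 1 = promoter

def fit_logistic(X, y, lr=0.05, steps=5000):
    w = [0.0] * len(X[0])
    for _ in range(steps):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi))
            p = 1.0 / (1.0 + math.exp(-z))
            # gradient step on the log-likelihood for one respondent
            w = [wj + lr * (yi - p) * xj for wj, xj in zip(w, xi)]
    return w

w = fit_logistic(X, y)
# a positive coefficient means the attribute pushes promoter odds up
```

In this toy data the promoters all rated service highly, so the service coefficient comes out positive, which is the "improve this area and NPS moves by Y" reading, stated in log-odds rather than raw points.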
Survey fatigue and best practices
Yeah. No, I mean, obviously being in this industry, I'm the worst survey taker. Like, I hate taking surveys and, like, I just pick them apart. And yeah, I mean, that's a huge problem. You know, clients particularly, like, you know, hey, let's do this 30-minute survey. I'm like, nobody's going to take it. Like, you're going to get 15 minutes in and people bail. So, you know, respondent engagement, respondent fatigue, like, those are huge concerns and things that we deal with on a daily basis. You know, if you get garbage data, you're going to have garbage analysis. So, you know, make sure you design it right. Make sure you build a questionnaire that asks the questions that you want, and, you know, cut out the fluff, cut to the point, keep it focused.
Yeah. That's, that was my idea: like, five questions, maybe 10 max. One of them should be a binary. More than that, you better be paying your respondents. And when you pay your respondents, you can get a whole different story. But you're right. I mean, survey fatigue, huge issue. Like, somebody wanted to do one with, like, 20 questions. It's like, you're wasting your time. You're not going to get anybody.
No, I mean, I think, you know, respondent fatigue is the biggest one. You know, I think double-barreled questions are bad, like, just in general. Make sure your question is asking what you want people to respond to. Make sure there's no way for other people to interpret it differently. So, you know, a good question is, you know, a question written well, so you're getting a response and people understand what you're asking. Um, and just keep it short, you know, keep it focused, make sure you're asking people what you want to get out of it. You know, it's very easy to start going down the rabbit hole of, oh, I'd love to have this data, I'd love to have that data. You know, every time you add a question that's not focused on your main objective, it's just fluff. So garbage in, garbage out. You know, I think that's it: keep it focused.
Just keep it short, you know, keep it focused, make sure you're asking people what you want to get out of it. You know, it's very easy to start going down the rabbit hole of, oh, I'd love to have this data, I'd love to have that data. You know, every time you add a question that's not focused on your main objective, it's just fluff.
Yeah. So, um, basically every project that we run has an SPSS file. So a lot of our, like, data tabulations... we run things through Dimensions, which is our main, like, survey platform that we, you know, script in, and it basically serves a link, um, for people to complete. Um, so that's all run through Dimensions, and then that kind of exports into SPSS. So when our data tabulation team creates Excel tabs and everything like that, it basically creates an SPSS file. Um, so, you know, if there's, you know, correlations or regressions or things like that, it's very easy to just hop in there and run it. SPSS, you know, in my opinion, has really good kind of diagnostics.
LinkedIn is great. Um, you know, it's an easy way to connect and chat, and, you know, you can follow up with any questions or anything like that. Perfect. Well, thank you so much, Mike, for chatting with us and sharing all your experience. Really appreciate it. Thank you all for joining today, too.