
Gordon Shotwell | Technical Debt is a Social Problem | Posit (2020)
Technical debt is a big problem for the R community. Even though R has excellent support for testing, documentation and packaging code, it has the reputation that it is not suitable for production applications because data scientists don’t pay enough attention to technical debt within their codebases. Most people think of technical debt as an engineering problem: we choose to make our current work cheaper at the expense of needing to do more work down the road. But when you look closely at the root causes of technical debt, they are almost always about interpersonal relationships. Developers have trouble empathizing with other users of their code and so don’t spend the time to make that code easy for future developers to use and understand. In this talk I argue that we should think about technical debt as a social problem because it gives us insight into why it’s so hard to pay back. I then provide a practical roadmap of how to introduce best practices into your data science team.
Transcript
This transcript was generated automatically and may contain errors.
Hi. Thank you so much for coming. So the title of this talk is Technical Debt is a Social Problem. And the reason I wanted to give it is that I think, unfortunately, the R community has a reputation, especially in industry, for working in a language that has a high amount of technical debt.
I think this is actually a fairly major problem in terms of adopting R in a lot of contexts where it hasn't been adopted before. And as a community, we need to take this seriously and work on it. And I've often found myself in the position where I was kind of coming into a team or an organization, and I knew sort of what to do to pay back technical debt on that team. But I didn't actually have any sort of decision-making power over what we worked on. So I had to sort of learn to be a little bit more strategic in terms of building these more robust solutions.
And the main insight that I had through this process was that it's not a technical problem, really. It's a social problem.
What technical debt actually is
Technical debt, for those of you who are not familiar, is basically all of the little shortcuts that we take developing things, especially under time pressure, that make the products less stable or less easy to maintain over the long term. The idea is that eventually you're going to pay that back. That's the idea. It's rarely the practice. And when you look at the actual artifacts that we call technical debt, they're almost always social artifacts. And I think they fall into two main categories. They're a failure of consideration or a failure of communication.
So communication is really just what you'd think it is. It's how we communicate technically with other people. So the things that fall into this category are, like, documentation, which is how we communicate use and purpose. Tests, which is how we communicate correctness, like what the software is meant to do precisely. And then also things like code style and project organization. How easy it is to find the thing that you're looking for in a code project and how easy it is to understand that thing, to actually just physically read it.
So those are all quite obviously, I think, things that happen between other people. We can't judge whether documentation is good or bad without knowing whether another human being who's relevant to us understands it, right? There's no, like, perfect communication that we can transmit to the world. What matters is whether somebody understands us.
Consideration is all the things that have to do with how well your software handles things that are not problems that you have, right? Whenever we're developing software, it does a really good job of handling the problem that's in front of us, the problem that we're thinking about. But we can judge software as being kind of good or bad, robust or not, based on how it sort of handles a broader spectrum of people, right? So these are things like the code isn't robust, right? Someone throws some arguments at it that you didn't really think about, that weren't something that you would think to throw at a function, and it all falls down.
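To make the robustness point concrete, here is a minimal sketch in base R. The function `rescale01()` is hypothetical and not from the talk; it only illustrates one way a function can anticipate arguments its author would never pass: a non-numeric input, an empty or all-missing vector, and a constant vector.

```r
# Hypothetical helper (not from the talk): rescale a numeric vector to [0, 1].
rescale01 <- function(x) {
  # Fail early, with a readable message, on a type the author never used.
  if (!is.numeric(x)) {
    stop("`x` must be numeric, not ", class(x)[1], call. = FALSE)
  }
  # Empty or all-missing input: nothing to rescale, return it unchanged.
  if (length(x) == 0 || all(is.na(x))) {
    return(x)
  }
  rng <- range(x, na.rm = TRUE)
  # A constant vector would divide by zero; map non-missing values to 0.
  if (rng[1] == rng[2]) {
    return(x - rng[1])
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

rescale01(c(2, 4, 6))
rescale01(c(5, 5))
```

None of these branches matter for the problem in front of the author; they exist for the other people who will call the function later.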
It can't be updated easily. Maybe there's a lot of repetition and things need to be updated in many places, right? It doesn't solve the problems of future people, or even just the problems of other people today. And then it has things like fragile dependency structures, or maybe it works on some scales, but it doesn't work at all on larger scales. So all of these are basically about how good our imagination is when we're developing these products, right? How well we can think about all the different things that somebody might conceivably throw at that, and how skilled we are at putting ourselves in other people's shoes to understand how they're going to relate to that problem, what they're going to try to solve with it, and how they're going to try to solve it.
So again, these are really social problems, right? It's not so much that there's a perfect dependency structure, it's not that there's a perfect kind of robustness. It has to do with how we're considering the needs of other people in our organization and the people who will join it in the future.
Why paying back technical debt is so hard
So this also gives us some insight into why it's so difficult to pay technical debt back, right? Because when you realize that actually this is a problem between two people, you sort of bring in this other insight, which is that people are really kind of fickle and biased, right? So some of the biases that I think you encounter, so one of them is status quo bias.
This is, of course, the 1970s UK rock band, Status Quo. I'm sure it's familiar to you. So this is just that when you're working with a system for long enough, right, you've paid the price of learning that system, and you've kind of gotten used to the pain you feel when you use it, right? So it's sort of all those things kind of disappear for you. And so you have this sort of bias that you kind of like the way it is, right, and you're not going to really be that interested in changing it. You're going to sort of overestimate the cost of change because you've already learned the existing thing, and nobody's learned the new things, or you're going to have to do some work to learn the new thing.
The second bias is the IKEA effect, and this is one of my favorites. I love this picture because you can really tell that she likes that chair. So when you build something from IKEA, basically you overestimate how good it is. So you put something together yourself, and you kind of invest a little bit of your ego into that thing, and since your ego is your favorite thing, suddenly that thing becomes way better than everything else. I think this is a big part of why programmers tend to like building their own databases, right? Because even if it's worse objectively than the database you could buy, you built it, so there's some pride that goes along with that.
So similarly, whenever you're trying to replace something, usually you're replacing something that somebody in your organization who's still there has built, right? So they overvalue it. And lastly, it's the parenting effect. This is my daughter's first experience with spinach. When you watch something grow up, you forgive its messes, right? You forgive the problems because you knew... You sort of see how it's sort of progressed over time, and you're like, well, it's so much better than it used to be, right? This system can do so much more than it used to be. It's so much easier to use than it used to be. And you sort of forget about that actually right now it's got some problems.
So all of these things together, I'm sure we could think of other biases, but all of these things together mean that it's never a fair contest, right? Whenever you're trying to pay back technical debt, it's not that people are going to rationally evaluate, like, which one is objectively a little bit better, right? When you're replacing these things, you have to do much, much better in order to get over these kinds of humps that you find within an organization.
Building delightful products
And the goal, I think, for doing that is to try to build delightful products. So something where... This is sort of similar to how we relate with external users of products, right? We want to build something that actually people get excited about when they use, that they really sort of... It's a joyful experience when they use this thing. And that can mean a lot of different things. For me, like, some of the things I really focus on are that it sets up in a single step, so like install.packages, right? You only have to call one thing, and the setup is complete. There's a clear, obvious first problem. There's a way to... Kind of a clear way to get from the obvious first problem to something quite complicated to be a master of that system. And then it has sort of help resources, and it doesn't break, right? It doesn't sort of just randomly stop working.
So one of the products I really love that fulfills all these things is R Markdown because especially it sort of gives you something that, like, right away, here's the easy thing that you can do right now. But down the road, you know, you can just really take over all publishing, you know? Like, you can publish all the websites, all the books, right? Using the same framework, and there's a smooth path to that. So building delightful products. That's the goal. And if you do that, then I think you can kind of overcome those biases.
How do you build delightful products? Well, I have a three-step plan for you. So the first one is to find the right beachhead, which is to pick your first battle in this process of paying back technical debt and pick that correctly. Second is to separate users and maintainers and to treat these internal products more like you treat external-facing products. And lastly, empathize with the debtor.
Finding the right beachhead
So a beachhead, for those of you who are not familiar with the military things, is like when you're invading a country, you, like, have, like, a little staging area that you have to take first, right? So it's not about winning the war, but it's about, like, putting yourself in a position where you can, you know, deploy your resources and, you know, get supply lines and all that stuff. So finding the right beachhead is very much about, like, picking the right project.
And so this is a picture of a piece of the code base that I was working on at Socure when I was first starting. Each of the dots is an R script, and each of the lines is a sourcing relationship between one script and another. Socure at the time was a very small data science team and has grown a lot, so this was a way of sharing code that works pretty well if you're, like, four or five people. It works less well if you're, like, 25 or 30 people.
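A picture like the one described, with scripts as dots and sourcing relationships as lines, can be approximated from an edge list. The sketch below is an assumption about how such a list might be built, not the method used in the talk: the function name `source_edges()` is hypothetical, and it only detects literal calls such as `source("helpers.R")`, at most one per line, not paths assembled at run time.

```r
# Hypothetical sketch: list which R scripts in a directory source which others.
source_edges <- function(dir) {
  scripts <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  # Matches source("file") or source('file') and captures the quoted path.
  pattern <- "source\\([[:space:]]*[\"']([^\"']+)[\"']"
  edges <- lapply(scripts, function(path) {
    code <- readLines(path, warn = FALSE)
    hits <- regmatches(code, regexec(pattern, code))
    # Keep only lines with a match; element 2 is the captured path.
    targets <- vapply(Filter(length, hits), function(m) m[2], character(1))
    data.frame(
      from = rep(basename(path), length(targets)),
      to = basename(targets),
      stringsAsFactors = FALSE
    )
  })
  # One row per sourcing relationship; NULL if the directory has no scripts.
  do.call(rbind, edges)
}
```

Each row of the result is one line in the picture, which makes it easier to spot small, weakly connected clusters that could be replaced first.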
And there's a sort of natural tendency for myself to, like, look at this and say, I just need to replace this with a package, right? I need to take over all of this code and put it into one thing. And that's going to fail, right? Because there's no way that I'm going to be able to really understand what each of these things does, and there's no way that actually I even know that a lot of these things need to be around at all, right? So you have to be strategic about what you're picking. Which of these am I going to use to give me kind of something which I know I'm going to be able to accomplish, and it's also going to build trust in the organization.
So I think it's really important to focus on small, contained projects, and something that you can do really, really well at, right? You're trying to knock it out of the park, and it's easier to do that with a baseball than like a bag of Jell-O or, you know, an elephant or something, right? So you want to have something that's small, because mostly what's going to happen is people are going to judge it based on the delta, not so much the overall value. So it's better to do really, really well in an area of limited scope than just a little bit well in an area of big scope.
It's good if it's an orphan or greenfield project, like there's nobody in the organization who's super attached to how it is now. And if there's external pressure coming from outside of your team, that's also really great, because nobody likes to respond to that. So if you can take that on, it's really good. So in my case, what we picked there was the database access functions, right? They were small, I could understand them well, and we were responding to some external pressure from our infrastructure team to sort of change how we were storing and distributing credentials.
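As an illustration of how small such a beachhead can be, here is a sketch of a single credential accessor. The talk does not say how the credentials were actually stored or distributed; reading them from environment variables is an assumption, and the function name `db_credentials()` and the `MYDB` prefix are hypothetical.

```r
# Hypothetical accessor: read database credentials from environment
# variables so that no individual script carries a hard-coded password.
db_credentials <- function(prefix = "MYDB") {
  vars <- paste0(prefix, c("_USER", "_PASSWORD"))
  values <- Sys.getenv(vars, unset = NA)
  # Tell the user exactly which variables are missing instead of
  # failing later with an opaque connection error.
  missing <- vars[is.na(values)]
  if (length(missing) > 0) {
    stop("Set these environment variables first: ",
         paste(missing, collapse = ", "), call. = FALSE)
  }
  list(user = values[[1]], password = values[[2]])
}
```

Every script that needs the database would call this one function, so a later change in how credentials are distributed means editing one place rather than many.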
So find the right beachhead. Put yourself in a position where you can solve this small problem and solve it well so that you can move forward. And what this will allow you to do is develop trust within the organization. It'll allow you to build up the social and sort of technical infrastructure that you need to solve these problems, like a CI process or just learning how to do code review with other people.
Separating users and maintainers
The second main point is to separate users and maintainers. And this is in order to build delightful products, you need somebody who is defining what delight means and you need somebody who's trying to make the product do that, right? So in other words, you need a user and you need a maintainer. So ideally, every system in your team or your organization should have a written-down list of who is a maintainer of that product and everybody else is a user.
These two groups abide by a basic and holy contract. That contract is that users get coddled and maintainers get opinions, right? So you have to pick one of these things. And we all want to kind of sit between these. Like, we want to be coddled, and we also want to have opinions. We want to make these decisions, but we don't really want to deal with all the consequences that might happen down the road from those decisions. We don't want to write the code, we don't want to document it or write tests. What this says is basically we're not doing that anymore with these sort of low-technical problems. We're going to have maintainers and we're going to have users.
And if you're a user and you want to be a maintainer, you're welcome to, but you're going to get all of the other stuff that goes along with that to make sure that this product is delightful. So being a good maintainer means that you're responsible for making it delightful. That's your goal, right? You want to make this product really, really good, really, really easy to use. You want to make it a joy. With that, you have the responsibility and the authority to do that, right? So you're the decision-maker over what that product does and how it does it, right?
You decide how it's going to accomplish this goal, and you kind of judge yourself based on whether you're accomplishing it. So along with this means that you don't kind of ask the users to do your job for you. This comes up a lot for internal products with documentation, where the first time that we ask somebody to set up a system, and if the documentation is wrong, we ask that person to fix the documentation, when usually, actually, they're the worst person to document something because they understand it the least. So this is sort of something where you say, like, as a maintainer, I'm going to do all of this stuff to make sure that this thing is really good, and I'm not going to expect the user to be able to do any of the things that I should do.
Being a good user is basically the opposite of this, right? You get to define what delight means, and importantly, you have to sort of turn off the part of your brain that can understand computer systems really, really well. So as developers, like, it's tempting to basically be like, oh, I have a bug with a system I don't maintain. I'm going to go read the source code and figure out what that bug is. And that's rarely a good use of your time, both because you have your own systems to maintain, and also because you're not actually fixing the problem for the sort of, like, lowest common denominator user, right? You want to have the feedback be like, actually, this was hard to use. I had to think hard to get this to set up. Please fix that, right? That's not delightful if I have to work too hard at it.
It's also good to ask about the problem, not the solution. So rather than proposing, like, you should change this technical thing in your system, say, like, as a user, I have this problem, right? There's a reason why product managers use that a lot. It's a really good way of phrasing it. And it's just important to not be embarrassed about complaining about something and to really try to never read source code.
Empathizing with the debtor
So the last point is to empathize with the debtor. And I think the important thing here is whenever you're doing these projects, you kind of have to, like, remember that what you're doing is kind of inherently threatening and sort of insulting, right? Because you're saying to somebody who has worked hard on something, right, that the thing that they've developed, like, doesn't work anymore, right? Or maybe it never worked, but it doesn't work anymore.
And that is really a problem because you're also asking them to learn something. And when people are sort of threatened or afraid, it turns out that they can't learn anything, right? Those two parts of your mind, like, they can't work together. So you need to have people be kind of, like, comfortable and feel supported and understood in order to get them to change and also in order for you to learn from them, right? Because the existing products, like, they're not all bad. They've done some good jobs, right? So it's important to, like, to really, like, try to take their perspective to understand, like, where they're coming from, why does this system do things the way it does, and how can we kind of together make it better?
And then more often than not, this is you, right? Like, you're looking at your own projects from two years ago, paying back technical debt, and you kind of don't want to look at it because you feel bad about yourself, and that's unpleasant, so you never fix the tests, right? Like, you know, so having this, like, I'm kind of using this phrase to, like, talk about other people, but really a lot of the times it's you, right? Like, you're the person on the other side of this when you have some other person coming in and saying, like, this doesn't work, or you're looking at your old stuff and it doesn't work. And all this really applies in the same way, right? Like, you're trying to learn a new thing, and if you feel threatened and afraid, you're not going to learn it, right?
There's this... I don't know if you're familiar with Marie Kondo, but she has this thing where you, like, if you're getting rid of your old pants, like, you pick up your pants and you say, you know, thank you, pants. Like, you know, you covered my butt for a lot. You know, like, you did a good job, right? And then that helps you let go, because you're sort of like, these don't look good on me anymore, but thank you.
So I really think that one way of developing this kind of empathy is to look at technical debt as a good sign. And the reason it's a good sign is because the only reason you ever care about paying back debt is if you haven't declared bankruptcy, right? So, you know, technical debt means, like, I haven't declared technical or regular bankruptcy, right? So there's still something there to work on, right? Otherwise I just throw it away.
So you can look at all these code sort of artifacts, these things that, you know, you're like, oh, this is awful, this is really fragile, it breaks all the time, it's hard to learn. And sort of just remember that actually this is the work of, like, thoughtful people who are doing their best with the knowledge and resources that they have available to them. And that's really all we can ask of people. And similarly, the code, the fact that you're looking at that code or that product means that it got you to the next step, right? It did something.
And even if that thing is just basically showing you what doesn't work, right? And that's really all we can ask of these products is just to take us one step forward so that we can, like, fix it, right? It's kind of a constant process of just basically fixing broken things and then breaking new things that need fixing later.
So going back to this slide, this code base was developed at a time when Socure was really kind of in a slight, like, emergency. Like, we really needed a product to get to market, right? And this did that, right? We have a market-leading identity product now because of what this code did, right? Even if it's a bit of a tangled mess, it got us there, and that's why I'm here sort of talking about this.
So thank you. That's all I really have. I think it's a sort of important thing that we should focus on. We are hiring, so please talk to me if you would like to work on identity fraud. And I'm on Twitter. Thank you.
Q&A
Thank you, Gordon. We have time for maybe one question. Regarding the user versus maintainer roles, what are tips for teams with high turnover? And then another question is... Where a team has a user that is highly likely to become a future maintainer.
So for the turnover, I think basically turnover is good because the more turnover you have, the more education feedback you get. So in that sense, it's important, basically, like, if you're able to manage high turnover, that's a good sign that you're able to educate people quickly. So that's actually kind of like a beneficial thing.
For the maintainer user things, I think kind of giving people, like, good first projects is really helpful. One of the things that I found very helpful, personally, is if somebody else writes tests for me. So if you're a senior person, the best thing you can do is write documentation or tests and have a junior person fill in the code. Like, the code is actually the easiest part. Setting the expectations for what that code needs to do is the hard part.
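The division of labor described in this answer might look like the following sketch in base R. The function `days_between()` and its expected behavior are hypothetical, chosen only to show tests being written before the code; a real project would more likely use a testing package than bare `stopifnot()`.

```r
# Written first by the senior person: the tests state what the
# hypothetical function `days_between()` is expected to do.
test_days_between <- function(days_between) {
  stopifnot(
    days_between("2020-01-01", "2020-01-31") == 30,
    days_between("2020-01-31", "2020-01-01") == -30,
    days_between("2020-02-28", "2020-03-01") == 2  # 2020 is a leap year
  )
  invisible(TRUE)
}

# Filled in afterwards by the junior person, until the tests pass.
days_between <- function(start, end) {
  as.numeric(as.Date(end) - as.Date(start))
}

test_days_between(days_between)
```

The tests carry the hard decisions, such as whether the result is signed and how leap years are counted, so the person filling in the code does not have to guess them.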
