Transcript

This transcript was generated automatically and may contain errors.

Welcome to The Test Set. Here we talk with some of the brightest thinkers and tinkerers in statistical analysis, scientific computing, and machine learning. Digging into what makes them tick, plus the insights, experiments, and OMG moments that shape the field. On this episode we'll talk with Phillip Cloud about some of the nuances of adopting AI for software development. And if you've heard the podcast up to this point, you probably know that we've wholly surrendered ourselves to AI to code for us. So I find Phillip's perspective really refreshing. And he's got the bona fides to match. He's a principal engineer at NVIDIA, he leads the Ibis project in Python, and he's one of the earliest pandas contributors. He's tried Claude Code, he's tried Cursor Agent, and he's walked away still searching for the right AI tool and use cases. Phillip's the type of developer who doesn't use a mouse and wants to only use his keyboard in the terminal. And I find that's the hallmark of a really great developer. So in this episode, we'll talk about how Phillip approaches software development, what AI tools get right and what they get wrong, and what 15 years of open source experience has taught him about hype cycles and tool adoption.

Hey, welcome to The Test Set. I'm joined here with Phillip Cloud, who's a principal software engineer at NVIDIA, and lead on the Ibis project, which is a neat Python project that can run SQL on a lot of different databases, and an OG contributor to pandas. So he's really come up through a lot of interesting Python open source projects, and I'm so excited to talk to you. And I'm joined by my co-hosts, Wes McKinney, who's a principal architect at Posit, and Hadley Wickham, who's chief scientist at Posit. Phillip, so glad to have you on.

Yeah, great. Great to be here. Great to kind of sync up with everyone.

Yeah, I will say we had a little prep for this in that when I talked to Wes in an early interview, he warned me that you're a wizard at puns. And I did notice that came up a little bit in some of your responses prepping for this, that I think you said, you want to be known as a decent human with lots of open source contributions, liked puns, and be loved. So I'm excited to see, I mean, I think all those things are great, but I'm excited to see what happens on the pun axis, particularly.

What I've always said is that Phillip has a pun-oriented way of working. So, you know, he's the person that you can always count on to never miss a pun opportunity, often in the least expected ways. And I feel like I'm on the lookout for puns, but his eye is next level. So it's brought a lot of humor and amusement to our years of working together. I mean, I've worked with Phillip in some capacity since, I don't know, I'm not sure if it's exactly 2011 or 2012, but we're closing in on 15 years. So it's been a really long time, spanning many projects. Phillip and Jeff Reback were two of the first pandas core team members to join the project formally and to have write access to the repository. So we go way back.

I just wanted to put in a plug for R because I think if you like puns, that's a much better community than the Python community. Shots fired.

Phillip's path into open source

So I know Phillip, you have like a lot of really interesting opinions on like tools and AI. I'd be super curious to talk about a little bit like how you got into open source and projects like Ibis and a little bit more than into some of the tools and, and your thoughts around some of that, but maybe you could like catch, catch me up on kind of your path into working on open source tools and tools like Ibis.

Yeah. So I guess I won't go through the whole backstory, but I started at just the end of undergrad. I was working at an eye movement lab, and people who work in eye movement labs are essentially physicists doing some kind of, it's sort of biology. It was officially in the biology department, but most of the people working in the lab were physicists. And some of it was also involved with the psychology department. So it was this very mixed-discipline research, but we did everything in MATLAB and this other system called LabVIEW, which is this crazy National Instruments thing that you program using circuit diagrams. It's actually pretty cool because it kind of gives you automatic data flow programming without having to think about how to parallelize anything, which is kind of nice.

And so the people in the lab were definitely responsible for me getting interested in programming via MATLAB. And then I sort of just poked around with that and kind of looked around. You know, back in the pre-AI days, you just had to Google for stuff, and I kind of Googled around and found some other things similar to MATLAB. I don't even know, it's kind of a deep cut, but NumPy used to be split into two packages, Numarray and Numeric, I think is what they were called. And I was sort of getting into programming right around the time when Travis Oliphant came onto the scene and was kind of unifying that into NumPy.

And so I kind of got into Python through that. I built a few things that we had built in the lab in Python, some like analysis applications, kind of mostly around plotting the various things we were looking at. And then kind of fast forward, I was in grad school studying computational neuroscience and a lot of the, the specific kind of data analysis I was doing was very, Pandas was a very useful tool for what we were doing, mostly because of a feature that I think is not particularly well liked by the Pandas developers, which is column multi-indexes.

It's kind of gross in the implementation, but it was super useful and it's a bit niche. The thing we were studying was sleep behavior in anesthetized rats, and the electrodes situated in the brain were organized onto these four shanks. And so I wanted to do analyses that were between the shanks and across them. Each electrode on the shank was a time series, like 22 kilohertz or something like that, recording voltage. And so it was a pretty natural fit to be like, okay, I'm going to dig into, you know, the first shank, or I'm going to dig into the first four electrodes on all shanks or whatever, that sort of thing. So column multi-indexes were a pretty natural fit and no other package at the time had anything like that.
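For readers who have not used column multi-indexes, the layout described above might look roughly like the sketch below. It is only an illustration: the number of electrodes per shank, the sample count, and the synthetic voltages are all assumptions, not details from the lab's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical layout: 4 shanks, 8 electrodes per shank (the electrode count is assumed).
n_shanks, n_electrodes, n_samples = 4, 8, 5
columns = pd.MultiIndex.from_product(
    [range(n_shanks), range(n_electrodes)], names=["shank", "electrode"]
)

# Synthetic voltages standing in for real recordings; one row per time sample.
rng = np.random.default_rng(0)
recordings = pd.DataFrame(
    rng.normal(size=(n_samples, len(columns))), columns=columns
)

# "Dig into the first shank": every electrode on shank 0.
first_shank = recordings.xs(0, axis=1, level="shank")

# "The first four electrodes on all shanks": mask on the electrode level.
electrode_ids = recordings.columns.get_level_values("electrode")
first_four = recordings.loc[:, electrode_ids < 4]

# An across-shank summary: mean voltage per shank at each time sample.
per_shank_mean = recordings.T.groupby(level="shank").mean().T
```

Each selection keeps the time axis intact, so `first_shank` has one column per electrode and `per_shank_mean` has one column per shank.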

And so that's how I came into pandas. So that's how I got involved in open source. Yeah. And then I just sort of started contributing to pandas. It was a very welcoming community. I think the first thing I did in pandas was read_html, that abomination. I don't remember if I personally needed that or if I was just kind of combing the issue backlog and saw, you know, this, I don't know anything about software, let me do this. I mean, it turns out that these basic things like reading CSV files, reading HTML files, are tremendously useful. And so it's kind of funny that most of the people on this call have spent a not insignificant amount of their development time building these kinds of things. I think many people regard them as so elementary and yet, you know, you can't do anything else if you don't have them.

Developer productivity and the rise of TUIs

Okay. So one, one thing I'm curious about is like, I, one thing we talked about beforehand was asking about the most exciting development now. And I sort of jokingly said, and is it AI? But I thought, I thought your response was really poignant that you said like, you're not sure it's AI, but AI is interesting. And then you talked a little bit about like developer productivity and how like the focus on that over the years. I'd be curious to hear if you could say a little bit about that, like what's in terms of the most exciting development now, your thoughts on sort of measuring and tracking developer productivity over the last five years.

Yeah. I mean, I don't know if I have anything concrete other than just, you know, anecdotes and observations of what I've seen. But I think over the years, because many more people are writing software, it's become profitable in some way for companies to make developers more productive, whatever that means. Like before, you used to have to be a wizard to use, I don't know, Visual C++ or whatever, or Visual Studio. It came with, you know, 30,000 knobs. And there was just a lot of stuff to do. And then, I don't know, I think there was kind of a little bit of a sea change when VS Code came out. And VS Code, of course, is based on Electron, and Atom was somewhere in there as well. It really is the WebKit or browser-based application that kind of gave rise, I think, to a lot of this.

And then people started, you know, writing all these plugins. And I think some of that ethos kind of filtered down into, I guess what I would call more primitive tools, like tools that didn't have a super nice UI on them, right? Like command line tools. And now your command line tool better have a sweet TUI, or else no one's going to use it, right? And it better be pretty and be able to deal with color, like, you know, ANSI colors and things like that. And so, yeah, I think it's interesting that people have been spending a ton of time making software development just better and easier to do.

Yeah, yeah, that's fair. And I guess, TUI, if we had to explain what a terminal user interface is to like an alien, what, like, how would you explain what a terminal user interface is?

I'd say it's like, throw out the last 30 years of what we've learned about designing user interfaces and, like, let's go back to this nostalgic age of the 1980s.

Yeah, yeah. I don't know, like, my, my theory is that like, we're there, we have this like nostalgia for these interfaces, because like, as software, and like, I think part of it's like, that's sort of our ages, like, that's a, you know, nostalgias of like, whatever age you were when you're a kid. And then there's also like, everyone's a little nervous and uncertain about AI. And that like, forces us, like to want to go back to this, like, comfortable place.

I mean, I don't know, I really enjoy TUIs personally. When they went out of fashion, I was a little bit sad, because when I was in college, I actually did all my email in, you know, a TUI. I would SSH into the server and do my email there. And eventually, in 2006, I switched over to Gmail and never looked back, I guess. Right now, I'm in the process of actually building an email offline sync and archival tool with a TUI, so I can rebuild those glory days of doing email stuff in the terminal. I hope to turn that into an open source project in the future.

But, you know, I worked in finance. And so I had up close and personal experience of the glory of the Bloomberg terminal. And that's probably the most commercially successful and ubiquitous software that has remained a TUI since inception. And, you know, users pay, I don't know what a Bloomberg terminal costs, but it costs a lot of money each year, like tens of thousands of dollars. It's just like a financial operating system that financial professionals use. And it's, you know, a TUI. And so I feel like the AI sphere is kind of almost turning the rest of the world into one big Bloomberg terminal, which is interesting.

For a while, the creator of this tool called VisiData worked at Voltron Data, where I was at before NVIDIA. And that thing is friggin awesome. The attention to detail in the interface is not like anything I've ever seen in a command line tool. And I think it's all built with curses. Most people who interact with curses interact with it basically in some kind of installer tool, right? I mean, there's a bunch of stuff that's written in curses, because for a long time, that was the only game in town when it came to creating a terminal user interface. But it does lead to sort of, you know, everything being a block, because that's kind of how it works. But VisiData is, yeah, it's amazing. I would definitely recommend people check that out. I had totally forgotten about it. But it's great. It's like a quick data analysis terminal interface.

The thing that's cool about it, or one of the things that I think was really awesome about it, is that it's kind of got two modes. Saul Pwanson, the creator of it, he's definitely the kind of person that's like, I want to be able to do all the most powerful things in one keystroke, right. But he also realizes that not everybody's like that. And so he has built a system where VisiData actually allows you to sort of point and click stuff as well from the terminal. It has clickable menus and sort of cascading menus, you know, you can open other things, those things open submenus, and so forth. So regular stuff, something you might see in a windowing system UI. But once you get into the single-stroke world, it's really hard to go back.

And similar to certain editors, you can chain things together. It has this really nice composability aspect to it that you can't really get from the menus. You can kind of learn them by using the menus, because next to the menu item it usually has the keystroke that will do that thing. But you can also, similar to Vim, give input to parts of those commands as you're chaining them. So it's really nice. And I think, if I recall, on the topic of IDEs and developer productivity and TUIs, I think I saw you're a Neovim user, is that right? That's correct. That's your daily driver.

Yep. Yep. It was Vim for a while, then I switched to Neovim. I don't remember if there was like a specifically very rational reason. I think someone was like, hey, you should try this. And like, I don't know, it started up a little bit faster. So I was like, cool, let's, let's do it.

Yeah. But that means you're like all hands on keyboard, never on mouse. Is that?

I hate the mouse. I freaking hate the mouse. Yeah. I mean, most of my time is spent in, well, the way I'm set up is I have a 49 inch monitor and it's split into three. One third is a browser. And then two thirds is devoted to a terminal, which is running tmux. And that's split again, down the middle. And then I use a lot of temporary splits, but I almost always have one single vertical split.

Yeah. Nice. I know it's like tedious, but there's something so magical about developers' very particular setups, like where they lay things out and how they set it up. I feel like it's interesting to hear the split and the Neovim workflow.

There's a couple of other things I've tried that combine some of that. Neovim has a terminal mode, which has some neat features. The main neat feature is that I can use all the same Vim keystrokes inside of a terminal, and scrolling is not this weird thing that I have to set a buffer for and all this stuff. But I don't know, compared to tmux, it's like one extra set of key commands to do a split in Vim.

Yeah. And I guess for context, tmux is like the ultimate pane-splitting session runner. So there's a lot of opening panes and opening windows inside of things in these applications. So it's very like things inside of things. I mean, tmux is also really useful if you have a remote connection to a machine and, let's say, you're traveling or something and you need to close your laptop. If you connect to a remote machine and you're running a tmux session there, it will keep going. And whenever you reopen your laptop and reconnect to the internet, you can just reattach to that session. And none of your work is lost. And all of the state in the terminals is preserved. There's another Unix tool called screen, which is kind of like a lower tech version of tmux. And so I think some people like to use screen even still. But, you know, tmux is a lot more full-featured. I've been weirdly resistant to using some of these things. And I'm still, you know, just using kind of derpy terminal emulators and splitting them. And a lot of the time I'm just like, I should be using tmux. I say that to myself probably once a month and still, you know, I just haven't been able to go quite down that particular rabbit hole.

I, on the other hand, I've never split terminal, so.

You're a straight, feel free to judge me for that.

We have the full, we have the full spectrum here.

It's true. I used to use tmux a lot as a grad student when I had to go into a computing cluster. But now, actually, surprisingly, I picked up tmux again a few weeks ago, because I feel like using Claude Code kind of brought tmux back, where I was like, I want to be able to reattach and to run bash on the side and stuff. So I was surprised that it's having kind of a renaissance with the Claude Code terminal.

Coding agents and AI skepticism

For me, the big unlock with the coding agents has been that they solve the configuration problem. Like, I hate any kind of configuration or fiddling, like yak shaving. Some developers really actually enjoy that, discovering all the knobs and bells and whistles and configuring things to perfection. And Phillip is somebody I know who's really good at that. You know, he was always the person who was like Arch Linux, with super knowledge of all these details. I see the way that he works and I'm just like, I could never do that. But I think maybe you use NixOS now, I would guess. You know, all the cool kids are using NixOS, I think.

I also got too mainstream.

Yeah, it was too easy. But I actually fiddle less nowadays. But basically, you know, for me, what's amazing about the coding agents is that I can delegate all that to them. I'll be like, my terminal is misbehaving. Look at the config file for my terminal and please fix it. Like, I don't want to mess with this. And it figures it out on my behalf, which for somebody like me, who's just resistant to this type of configuration, has been a great anxiety reducer.

What I really like is you can also describe the problem in really dumb words. Like, I'm sure there's a specific technical word for what is going wrong, but I don't know what it is. And so I'm just like, oh, it's really scrolly. Please fix it. And I'd have had to feel dumb about using the wrong word if it was a human being. I do feel like when it says back to you the word for what you just described in very rambly terms, it's like, oh, are you talking about this? And you're like, that's exactly right.

Yeah. I think when I started doing some of the coding agent stuff, I probably gave it too much. I got frustrated because it utterly failed at whatever it was. I was like, oh, help me debug this race condition. And it was like, here's a hundred random things to try, all of which are related to the ways in which race conditions can show up. And I've done that a few times now and it's gotten better. It's gotten better at making suggestions that are closer to the ballpark. But even with a complete GDB backtrace, it confidently asserts what the problem is, and all the times that's happened, it hasn't actually been the problem.

I guess one of the things I was really interested in talking with you about on this podcast is your general feeling, as somebody who's worked for a long period of time, like over a decade, on building open source libraries and tools for data analysis. One thing that's weighed on me greatly in the last year, maybe a little bit more than that, is essentially the ways in which AI and coding agents are going to change the way that we produce and build open source libraries and tools for people. You think about the old way of building pandas or building Ibis or building Numba or building NumPy. I feel like the whole interaction model between open source developers and their users is going to change in a way that's permanent and never going to go back. And so part of me, whenever I'm working on something, is like, am I building software for a human? Or am I building software for the agents that they're using? And how should that change the way that I'm building the software? How should that change the way that I'm building the documentation? And I don't think we have a clear picture of what that's going to look like. But as somebody who's been very much in the weeds on that, I'm curious where your thoughts are, how you're employing coding agents in your work, how it's changed your work in the last year, year and a half, and where you see things going.

So one of the things I think that's fundamentally different is that oftentimes we're writing code in open source, and inside of companies as well, for other humans to read, right? Most code is read a lot more often than it's written. And so the things that a human values when reading code, like not repeating the same thing a million times, I don't think an agent cares a hoot about repetition, right? It can digest repetitive information essentially without any kind of response to it whatsoever, emotional, cognitive, whatever. It's not like, oh man, this is duplicated in 10 places. What if I have to update everything? You know, I duplicated this error message in 10 places. What if I have to update a parameter that goes into it, or something like a default value? Now I have to change that in 10 places. The agent just goes, yeah. All right. Change it in 10 places. Done.

So, I don't know, one possibility is that code gets messier because people aren't writing code for people to read anymore. They're writing code for agents to consume, and agents will fix the things that are caused by that particular problem.

So I don't know, totally speculative. It's just a thing that, I forget who I was talking with about this, but this kind of came up, where the things that we value as humans, many of those things, the agents aren't affected by, which is a double-edged sword in a lot of cases.

I think there's another thing that you mentioned at the end of that. Yeah. I was just curious how your work has changed. Because I feel like I was a late adopter of these things and really, up until adopting the first generation of coding agents, you know, Claude Code and friends, I was honestly very AI skeptical. I still have never used Cursor, true story. I'd used ChatGPT and Claude a little bit to generate little scripts and do little things. And I saw some value in that, but I was far from being, you know, AI pilled, so to speak. It wasn't really until the terminal coding agents, having access to a CLI and being able to do all this stuff without feeling like I'm in an IDE and whatnot, that it really clicked for me. And I think now it's figuring out what's an appropriate use. When is human reasoning and manual code editing and writing needed? Where's the line between delegation and, you know, where should you be spending your time versus supervising agents? Is the time that we're spending watching the agents work useful, or should we be building orchestrators to spend less time watching the lines of code fly by?

But I know you also work on, you know, systems code and data processing code, where the agents will make lots of errors and will just count things wrong or compute things wrong, and that can often only be caught through really aggressive unit testing and actual test-driven development, like write the unit tests first before you write the code. And I found that's one way of creating the right guardrails for the agent. Just treat the agent like an inexperienced developer who misses every edge case and makes lots of mistakes. I'm just curious what you've learned and how your approach has evolved and what does your stack and your workflow look like day-to-day now?
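As a rough illustration of the test-first guardrail described here, the sketch below writes the edge-case tests before the implementation they constrain. The function name `count_valid` and its behavior are hypothetical, chosen only because counting is the kind of thing the speaker says agents get wrong.

```python
import math


def test_count_valid_edge_cases():
    # Written first: these pin down the edge cases before any implementation exists.
    assert count_valid([]) == 0                       # empty input
    assert count_valid([None, None]) == 0             # nothing valid at all
    assert count_valid([float("nan")]) == 0           # NaN is not a valid value
    assert count_valid([0, 0.0, False]) == 3          # falsy values still count
    assert count_valid([1.5, None, float("nan"), 2]) == 2


def count_valid(values):
    """Count entries that are neither None nor NaN (hypothetical example function)."""
    return sum(
        1
        for value in values
        if value is not None
        and not (isinstance(value, float) and math.isnan(value))
    )


# Run the tests; an implementation that used truthiness would fail the falsy case.
test_count_valid_edge_cases()
```

The falsy-values case is the kind of edge an implementation based on `if value:` would silently miss, which is why it is worth writing down before any code is generated.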

I might be more of an AI skeptic than you. That's fine.

I tried Cursor, the UI, and now there's Cursor Agent. I'll get to that in a second. But the Cursor UI was like, okay, it's another UI. I'm kind of like you in that I want to be able to do everything from the terminal and I don't want to spin up a whole thing. Because now I have to adjust every single thing that I've spent the last five to ten years molding my brain to, which is just the various setup. Everyone has this. So I didn't really like that. So I stopped using Cursor. I think another friend of mine was like, you should try Claude Code. He said something similar, like treat it like a super junior developer. And of course, I immediately ignored that and gave it a really challenging problem, which it did not do well on.

And so I was like, okay, not for me. Later, I tried Cursor Agent, which is the CLI version of Cursor. Somewhat new, I think maybe six months or a year old. I can't remember. I tried Claude Code a bit, and Claude Code does the LLM thing where it's kind of like, I don't know. It's got an attitude that I don't really want when I'm telling a robot to do something. It's like, great job. Or like, all right, great. Let's go surfing on the lake or whatever it is. And I'm like, okay, can we dispense with that? And you can tell it, hey, stop doing that, whatever that is. But Cursor Agent's default mode is to just kind of be like sociopathically focused on the task that you give it with like no bells and whistles. And I was like, great. You can't even tell it, hey, make the prose a bit more flowery. It won't. And so I was like, okay, I'm trying to get some work done. I'm going to use Cursor Agent. Of course, it still had the same kind of problems, in that it would confidently assert that it had solved the problem when it had not really solved it.

It's hyper-focused on work and has no niceties, which it's funny because you think about when you're talking to another person, you just automatically engage those things, right? Like you don't demand that they do something and then get frustrated when they come back and talk to you like a human about whatever problem they had or like, but for whatever reason, when I'm like working with the agent, I want to treat it kind of like a computer where it does a thing. It does exactly what I tell it to. And then, you know, I get back a thing and I get back a response or whatever. And I go from there. But I guess, I guess what I've found is that I'm maybe, maybe because I'm, I've what, you know, maybe in my own way, I guess I've like poorly used them to start, but I've, I've started only using them for like lower stakes things, like a, like nothing that's actually going to generate some code that somebody else is going to run. That's not everyone, you know, like I know some of my coworkers use it for that, use it to like produce actual code and things like that. And there's, there's a spectrum, right? Like maybe some scripts or whatever to manage this or that thing. You don't, they're a little bit lower stakes.

But currently, the way that I use them is to kind of smooth over the rough edges of some of the PR summaries, especially if it's a PR where, you know, everyone's busy, you need people to review it, and you need kind of the bullet points for the most part. And you can whittle it down to your intended audience pretty well. And I would just spend a lot of time doing that myself. Whereas actually one of the things that the LLMs are really good at is summarizing information. So I just, you know, I say, look at the Git commit history, make a PR summary suitable for busy reviewers who are reading it, dump it in a markdown file. And then I read it and edit it if needed. That's kind of, I mean, maybe I'm a Luddite, but you know, that's how I've been using them.
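The workflow described here, collecting the commit history and asking for a reviewer-friendly summary, could be sketched roughly as follows. This is an assumption-laden illustration: the helper names, the `main` base branch, and the prompt wording are all hypothetical, and no particular agent or LLM API is called.

```python
import subprocess


def read_commit_subjects(base="main"):
    """Return 'hash subject' lines for commits on HEAD that are not on `base`.

    Hypothetical helper; assumes it runs inside a Git repository with a
    branch named `base`.
    """
    result = subprocess.run(
        ["git", "log", "--pretty=format:%h %s", f"{base}..HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line for line in result.stdout.splitlines() if line.strip()]


def build_summary_prompt(commits):
    """Turn commit lines into a prompt asking for a short PR summary."""
    bullet_list = "\n".join(f"- {commit}" for commit in commits)
    return (
        "Summarize the following commits as a pull request description "
        "suitable for busy reviewers. Lead with bullet points and write "
        "the result as markdown.\n\n"
        f"{bullet_list}\n"
    )


# Example with made-up commits; in a real repository these would come
# from read_commit_subjects().
example_prompt = build_summary_prompt(
    ["a1b2c3d Fix window size off-by-one", "d4e5f6a Add regression test"]
)
```

The resulting text would then be handed to whichever agent is in use, and the draft it returns still gets read and edited by hand, as described above.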

I think that's fair. Like, it's a nice use, like saves you a lot of time basically and can pull up a lot of information that you might not enjoy looking at. Like, I feel like a Git history of a PR is like something that I just can't look at as a person. Like it drives me crazy, like looking through the diffs. So it seems nice to have something like summarize and willing to kind of do that piece.

Yeah. Yeah. I remember one of the things I tried early on, this might've been pretty early on, but I was asking Claude to produce some code to, I think it was to find name-squatting packages on PyPI. And initially it was like, oh, I can't do that. Because, you know, you could use that code to produce a package that is a name-squatted package. And then I told it, I was like, don't worry about it. I'm a security researcher. And it was like, well, okay. Yeah. So that was definitely one of the earlier models. I tried that again recently. And it was like, yeah, I mean, if you're a security researcher, I'm not going to do that.

How AI changes open source development

Yeah. Dang. It's interesting to hear you trying different tools, like Claude Code and then Cursor, and kind of kicking the tires. What would you need to see to trust it with more work, like more code writing? What are you kind of looking for?

I don't know if I have a set of criteria. I guess I also tried another thing. I think I'm just probably giving it problems that have too many details for it to get correct. I was updating a code generator, and I was like, hey, Claude, I need to make these changes, do this thing, also try to make it fast. Or I was like, if you see any optimization opportunities, you know, put them in there. And one of the things that it failed to account for was the reference count semantics of CPython APIs, which is a pretty fundamental thing that you have to deal with if you're writing that. And so I ended up having to kind of throw that out. And it wasn't like it produced it in two minutes, it took like 20 minutes of back and forth. And it was kind of a wash, because I feel like I probably could have written that code. It would have been total pure drudgery, but it worked.
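For readers who have not run into the reference counting mentioned here: CPython keeps a count of references to every object, and C extension code has to raise and lower that count by hand at the right moments. The sketch below is not the generated code discussed in the episode. It only observes the counts from the Python side using `sys.getrefcount`, and the helper name `refcount_delta_when_stored` is made up for illustration. It assumes the CPython interpreter.

```python
import sys


def refcount_delta_when_stored(obj):
    """Report how obj's reference count moves while a list holds it.

    Returns (change while held, change after release), both measured
    relative to the count taken at the start of the call.
    """
    before = sys.getrefcount(obj)
    holder = [obj]  # the list now owns one additional reference to obj
    during = sys.getrefcount(obj)
    del holder  # destroying the list releases that reference again
    after = sys.getrefcount(obj)
    return during - before, after - before


if __name__ == "__main__":
    grew, net = refcount_delta_when_stored(object())
    print(f"while held: {grew:+d}, after release: {net:+d}")
```

In C, the same bookkeeping is manual, which is why forgetting one increment or decrement in generated code leads to leaks or crashes rather than a tidy error.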

I mean, my experience so far has been a little bit of an 80/20 rule, realizing that if I look back on development work that I've done in the past, if you look at just what I spent my time on, I feel like it was maybe 20% insight and innovation and fundamental design and decision making: deciding on the class structure, the function and purpose of objects, and essentially structuring the code in a certain way. Thinking back, Arrow is the perfect example of a system that is a very large code base, but one that has been built brick by brick on top of just really fundamental decisions.

For example, how memory is managed, or how object lifetimes are managed in Arrow, has had a pervasive effect on the entire shape and form of the library. But outside of that, I don't know what fraction of the time it is, but it's a little bit of an 80/20 rule, in that there's a lot of drudgery that takes place: the maintenance of the developer tools, the CMake files, the systems and scripts that support testing and automation and CI/CD and releasing. And I think about all the human labor that's put into a lot of that drudgery, like maintaining Linux packaging scripts by hand and all these things. And so that's the stuff that I'm really interested in delegating, that kind of drudgery work, like coming up with more test cases to exercise edge cases in something new that you built.

And so the goal for me is to spend, like, the 20% of time writing lines of code by hand, or looking line by line, function by function, really zoomed in, looking at the nitty-gritty details of how something works and getting it right. But then all the other stuff that I used to spend time on, you know, it's not perfect yet, but I feel like I'll never have to write package release scripts ever again. That type of work is something I'm not very good at. I always found it tedious and fiddly, and so I never enjoyed it. But for now, building little Python packages and open source tools, to never have to write a release script and remember the right argument order of Linux, of Unix commands, or how different Git things work, to be able to have a one-liner, you know, release this package and push it to PyPI and do the tagging and create the GitHub release and all that. That for me has been a great relief, to know that I just don't have to do that kind of work ever again.

Yeah, you're making me think about this thing that I have wanted to do. So Numba-CUDA's test suite is all originally based on Numba's test suite, which was written against the unittest framework, which was kind of the gold standard before pytest. Well, there was nose. It was like unittest, then nose, then pytest.

That's a name that I have not heard in a long time.

It's been a while.

Yeah.

It's like pre-pytest. Yeah.

Yeah, yeah. And that's definitely a thing where I'm just like, yeah, I could do this, and it would be very satisfying at the end to have completed it: to port the unittest test suite to pytest. Because there's a bunch of stuff. For example, if you subclass unittest.TestCase, your methods can't use pytest parametrize. That's annoying, right? Because what you want to do is leave the classes in place so you don't have to change everything. But the base class, that's where all the self.assert-whatever methods live. All the sort of pre-pytest-parsing-magic stuff: the way you used to make assertions readable, rather than just assert False and then an error message, was to have all these methods that make specific assertions, like assertEqual, and it'll print, like, A is not equal to B, here are the values or whatever. pytest does this whole thing where it parses your test code and knows what your code is doing. So to get those assertion methods, you have to subclass unittest.TestCase. Anyway, that's definitely a thing that would impress me: if Claude Code or Cursor, one of these agents, could port the Numba-CUDA test suite from its current state to purely pytest-based and get all the, like, sub-fixtures correct, and perhaps even improve it. I would be super impressed by that. Because that's the thing I don't want to do, but it would make developing Numba-CUDA easier in a bunch of different ways.

I think what I would recommend for that is to try using Steve Yegge's Beads library, which is basically like an embedded, lightweight task and memory system for agents. For that type of large-scale porting exercise, from what I've seen through heavy agent use, you have to be really careful about any type of porting or file copying or things like that. Because essentially, while it's in the process of ingesting code that's to be ported, stuff can get memory-holed in the process of passing through the LLM. And so basically what I think you would have to do is break down the test suite into bite-sized pieces and then set up a validation loop where, at each step, you do not allow the agent to move forward unless it has essentially verified that every test name before and after is the same, or that there's a map where maybe the name has changed in a deterministic way. Or at least that the test counts haven't changed. So essentially you never allow the LLM to modify the test count, or maybe if it's introducing pytest parametrize, obviously the test count changes, because now there are many test cases being generated by the matrix of parameters. But I think if you created the right guardrails, and then you used something like Beads to do the task organization, so that you aren't just saying, hey, Claude, port this 40,000-line test suite. Like, that is destined to fail.
But that's a good example of where you have to set up the right structure to enable the LLM to work in bite-sized pieces, but with guardrails, such that, you know, kind of think of it as the horse blinders for the LLM: you have one job, and it is to do this one thing. And you are not allowed to move forward until you prove to me that you have not destroyed anything.
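The test-name guardrail described here could be sketched roughly as follows. This is an illustration under stated assumptions, not a tool mentioned in the episode: the function names `collect_test_names` and `check_port`, the `renames` mapping, and the toy `BEFORE` and `AFTER` sources are all hypothetical. It uses only the standard library `ast` module to list test functions and methods, so it counts definitions and would not see the extra cases produced by parametrization.

```python
import ast

FUNC_TYPES = (ast.FunctionDef, ast.AsyncFunctionDef)


def collect_test_names(source):
    """Return the set of test names ('func' or 'Class.method') in source."""
    names = set()
    for node in ast.parse(source).body:
        if isinstance(node, FUNC_TYPES) and node.name.startswith("test"):
            names.add(node.name)
        elif isinstance(node, ast.ClassDef):
            for item in node.body:
                if isinstance(item, FUNC_TYPES) and item.name.startswith("test"):
                    names.add(f"{node.name}.{item.name}")
    return names


def check_port(before_source, after_source, renames=None):
    """Compare test names before and after a port.

    renames maps old names to new names for deliberate, known renames.
    Returns (missing, unexpected) as sorted lists; both should be empty
    before the agent is allowed to move on to the next piece.
    """
    renames = renames or {}
    expected = {renames.get(n, n) for n in collect_test_names(before_source)}
    actual = collect_test_names(after_source)
    return sorted(expected - actual), sorted(actual - expected)


# Toy example: the "ported" file silently dropped one test.
BEFORE = '''
import unittest

class TestAdd(unittest.TestCase):
    def test_ints(self):
        self.assertEqual(1 + 1, 2)

    def test_floats(self):
        self.assertEqual(0.5 + 0.5, 1.0)
'''

AFTER = '''
class TestAdd:
    def test_ints(self):
        assert 1 + 1 == 2
'''

if __name__ == "__main__":
    missing, unexpected = check_port(BEFORE, AFTER)
    print("missing:", missing)
    print("unexpected:", unexpected)
```

A check like this is deliberately dumb and deterministic, which is the point: the agent cannot talk its way past it.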

I think Steve Yegge is actually one of the reasons I got interested in coding agents in the first place, because he was talking about the problem of large-scale porting exercises and how to organize language porting, like if you're porting a large code base from one language to another. Let's say you were going from Ruby to Go, for example. It turns out agents are really good at writing Go code. And so if you have an old code base that's written in Python or Ruby or something like that, and you want to port it to a systems language that is more maintainable and faster and things like that, then Go is a pretty good choice. And so I'm seeing a lot of people choosing to do that type of porting exercise.

It's an interesting scenario. I mean, I think my one tiny blurb is that sometimes I come at it from the exact opposite, which is asking Claude Code to show me there's a way to incrementally port, like just prototype or demonstrate there's some kind of shim or dual setup Claude Code can create, where there's a sense of, we could incrementally roll stuff over to this. Which has really surprised me at times, when it's like, oh yeah, there actually is this approach that I didn't really realize, from it being really good at reading the docs on the two different systems. But yeah, I agree. Otherwise, the big ports really can spin out, where things get lost or omitted halfway through. But it seems like an exciting one, like a very niche, painful port. So yeah, fingers crossed you churn out something nice.

Woodworking, hot takes, and pineapple on pizza

Another thing, and this is not necessarily related to programming, but I've tried to use it for a few woodworking things, and it's just really bad at things that require physical intuition. Like even just telling you the right trigonometry for a particular cut. Unless you give it a picture, you have to be very specific about things like orientation and what's around whatever you're doing, or else, a couple of times it's given me impossible math responses where I'm just like, no, actually, you know, two 45-degree angles, there's got to be a 90 in there somewhere. And it was telling me, oh no, it's a 65-degree angle. Like, what?

Yeah, counting stuff, tough one. But I think your point about woodworking is a good segue into some of the things you mentioned, like what you can't live without. You gave a controversial opinion, which I have to admit I don't know a lot about, which is FHS. Is that, that's file system hierarchies? Is it that you want to tear down the hierarchy?

I think it's been responsible for just innumerable hairs being pulled out or falling out. It all falls under environment management stuff, because that is the thing that has led to lots of global state in the operating system. Because many, many tools and applications assume that /usr/lib or /usr/bin is a place they can just dump stuff in, and that the other stuff those programs want to use lives there.

Yeah, the person with opinions on the file hierarchy specification is exactly the person I trust wholeheartedly with all my software. So my life's in your hands.

I don't know about that. I don't know if you should, because as far as I know, there's only one operating system that's managed to actually exist without it. And I didn't come to NixOS because I was like, let's tear down the Filesystem Hierarchy Standard. I later realized that the unifying principle of NixOS is that everything is a unique path, based on the hash of the package's inputs and so forth. I mean, there's a bunch of interesting details there, but none of its stuff lives in FHS, including the program loader, from what I can remember. Which is the thing that underlies, you know, runs everything, right?
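The idea of a path derived from a hash of a package's inputs can be sketched in a few lines. This is only an illustration of the principle: Nix's real store paths use its own serialization and hash encoding, which this does not reproduce. The function name `store_path`, the choice of JSON plus SHA-256, the 32-character truncation, and the example inputs are all assumptions made for the sketch.

```python
import hashlib
import json


def store_path(name, inputs, store_root="/nix/store"):
    """Derive a path from a hash of the package's inputs (illustrative only).

    inputs is any JSON-serializable description of what goes into the
    build: sources, dependencies, build flags, and so on.
    """
    # Sort keys so the same inputs always serialize to the same bytes.
    canonical = json.dumps(inputs, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(canonical).hexdigest()[:32]
    return f"{store_root}/{digest}-{name}"


if __name__ == "__main__":
    a = store_path("hello", {"version": "2.12", "deps": ["libc"]})
    b = store_path("hello", {"version": "2.12", "deps": ["libc", "zlib"]})
    print(a)
    print(b)
    print("same path:", a == b)
```

Because any change to the inputs changes the path, two builds of the same package can sit side by side instead of competing for a shared location such as /usr/lib.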

I'm going to churn through some of these hot takes just so we can jam these. I feel like there are some quality takes in here. You said one of your favorite ways to unwind, you brought up woodworking. You also said yard work. Is there a best kind of yard work to unwind to, if you had to recommend one?

I don't know. I've been cutting down small trees in my backyard. Incredible. And that's like incredibly satisfying.

With a chainsaw? What's the?

No, with a reciprocating saw, which is just like a, it's like a one-handed thing. They make two-handed ones, but the one I have is one-handed, and it's just got a blade that kind of goes like this, back and forth. And it's purely for demolition. You would never use it for anything where you care about the cut.

Yeah, nice. And do you have an infinite number of trees that you have to work through? Or is there like a set number that you're?

It's finite. It's, yeah, OK. I'm only cutting down the ones that I'm considering to be a nuisance, which is, OK, not any of the major hardwoods that are back there. It's like holly trees.

Yeah, I love that. And then maybe the last thing to ask: to the question of pineapple on pizza, you said, of course not. Can you clarify that take?

You're a strong no-pineappler.

Yeah, I think, I'm not a native New Yorker, but I lived there for 20 years, and so you just become infused with opinions about pizza.