
Isabel Zimmerman - Making GUI Data Exploration Reproducible with Python
Language: English
Speaker: Isabel Zimmerman
Talk Title: Making GUI Data Exploration Reproducible with Python

Interactive data exploration tools are excellent for visualizing the data as you are cleaning it, but when a data practitioner analyzes data through drag-and-drop interfaces, the path to reproducibility becomes opaque. This project bridges that gap by capturing UI interactions in the Positron IDE and converting them into clean, readable code across pandas, polars, SQL, and multiple R syntaxes. This talk will include a demonstration of exploring data with a UI and converting that exploration into reproducible code. We'll walk through the architecture that makes this possible, from tracking UI changes to generating semantically equivalent code across different data manipulation libraries. We'll also discuss the challenges and considerations that went into the design. This work addresses a critical need in the data science community: tools that enhance usability and reproducibility. Whether you're building data science tools, analyzing data, or simply frustrated by the gap between exploration and reproduction, this talk will show how thoughtful design can make reproducible science as easy as point-and-click.
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
Hi there. I'm Isabel Zimmerman. I'm a senior software engineer. I'm also a very big cookie fan. I love to make them. I love to eat them. I grew up in the kitchen with my mom and she would tell me how long to soften the butter and how much cocoa to add and when to take the finished product out of the oven. When I moved out of my family home and started hosting on my own, when my friends would ask me for a recipe, I had to just tell them that I measured from the heart.
Using a graphical user interface tool for cleaning data is a little bit like baking without having a recipe. It works really well for one person, but the second that you need to share or reproduce your results, things get tricky. The details matter and without writing them down, good luck getting the same cookies twice. I mean, we've all been there. You're exploring a data set and you reach for the GUI tool because it's just easier. You click, you filter, you visualize. But then someone asks, you know, how did you get these results? And you realize it's really hard to reproduce what you just did.
Today, I want to talk about how we can bridge this gap between the ease of GUI tools and the rigor of reproducible code. So, GUI-based data analysis tools are incredibly appealing. There's something intuitive about, you know, pointing and clicking. You get immediate visual feedback as you explore your data. And maybe most importantly, it really lowers the barrier to entry to explore data. You don't need to remember exact function names or syntax. It makes data exploration way more accessible to more people, which is fundamentally a good thing.
The reproducibility problem
But here's the problem. Reproducibility is extremely important in data science. If you're doing academic research, if you're doing business analytics, or you're building a model, someone needs to be able to recreate your work. Your collaborators need to understand what you did. And future you, six months from now, when you can't remember why you made certain decisions, really needs a script that documents and reproduces your results.
Think about it like a recipe. When you make cookies from scratch, you can write down the recipe so you can make it again, you can teach it to others, and you can tweak it and improve it. Reproducibility isn't just something that's nice to have. It's essential. And so, we've been living with this false choice. You know, you can either have this easy-to-use GUI tool, or you can have reproducible code-based workflows, but not both. Things like Excel are easier, but harder to reproduce. Things like Jupyter Notebooks are really reproducible, but have a steeper learning curve. But what if we didn't have to choose? What if we could have the ease of a GUI that automatically generates reproducible code for everything you do?
That's the solution I want to show you today. Automated code generation from UI interactions. This is sort of like your kitchen helper that not only is going to weigh your ingredients, but automatically write down the recipe as you go. This is a project that I worked on and built for the Positron IDE, and it has really interesting implications for other Python projects.
Demo: exploring data in Positron
Let's get into a demo. So, I'm going to be using the Positron IDE to do a little bit of data analysis. So, first, I'm going to import the pandas package, and I'm going to use this to read in a CSV that has a bunch of recipes. I can use this button to view our data in our GUI explorer. So, we can see here that we've got 14,000 rows, 16 columns. Here's all of the columns here with a little bit of information about them. Let's say I'm looking for a recipe for friends tonight.
So, maybe I want to have something whose ingredients contain chocolate chips. I can apply this filter, and now I can see in my ingredients column, all of these will have the string value of chocolate chips. Maybe this is also something that I need to make for tonight, after work today. So, let's make sure that the total time to prepare this recipe is less than or equal to 60 minutes. Now, this seems to be a good set of options. Now, I only have 250, but I really am not going to make 250 desserts. I only need one. So, to find the very best one, let's maybe sort by the average rating. So, we want the top rated recipes, and maybe also see which ones have the most ratings. So, I have the highest rated recipes with the most total ratings that take less than 60 minutes to make, and contain chocolate chips. Looks like banana bread mug cake in a minute is our winner today.
Now, this was a simple GUI type interface to interact with this dataset. Now, I'm going to be using this special convert to code button to actually generate all of these filters in Pandas. I can copy it right here, and if I want to, I can run this code in my Python terminal, and I can see it still has the same 250 rows and 16 columns that we saw in our GUI interface, and if we see the top set of data is, again, our banana bread mug cake in a minute. So, that means I can copy and paste this code into a Python script, I can share it with others, and it was automatically generated from the GUI interface in the data explorer in Positron.
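The code the demo generates isn't shown verbatim in the transcript, but based on the operations described (a case-insensitive contains filter, a time cutoff, and a two-column sort), it would look roughly like the sketch below. The column names and the tiny stand-in dataset here are guesses from the talk's narration, not the real dataset's schema.

```python
import pandas as pd

# Tiny stand-in for the demo's recipes CSV (the real dataset has
# ~14,000 rows and 16 columns); column names are illustrative guesses.
all_recipes = pd.DataFrame({
    "name": ["banana bread mug cake in a minute", "slow-cooked brisket", "classic cookies"],
    "ingredients": ["flour, chocolate chips, banana", "beef, salt, pepper", "butter, Chocolate Chips"],
    "total_time": [10, 300, 45],
    "rating_average": [4.9, 4.7, 4.8],
    "rating_count": [1200, 50, 900],
})

# Filter: ingredients contain "chocolate chips" (case-insensitive, NAs excluded)
filter_mask = all_recipes["ingredients"].str.contains("chocolate chips", case=False, na=False)
# Filter: total preparation time is at most 60 minutes
filter_mask = filter_mask & (all_recipes["total_time"] <= 60)

# Sort the surviving rows by average rating, then by number of ratings, descending
result = all_recipes[filter_mask].sort_values(
    by=["rating_average", "rating_count"], ascending=[False, False]
)
# The top row is the mug cake, mirroring the demo's winner
print(result["name"].iloc[0])
```

Running this in a terminal reproduces the same filtered, sorted view that the GUI showed, which is exactly the point: the script, not the clicks, becomes the shareable artifact.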
How it works under the hood
So, now that you've seen it in action, let's talk about how this actually works under the hood. So, the first thing that we need to do is use the UI to capture what the user is doing. In Positron's case, the front end is built with TypeScript and React, but actually, all of this is interacting with an ipykernel instance that's running in that console. So, whenever we click a filter button, enter a value, select a column, all of these interactions are sent to the Python runtime via Jupyter comms and ipykernel. So, actually, every single time I add a filter, I get some sort of information, like the column name, the type display, the condition, and the parameters. So, I get to see that I'm looking for columns that have chocolate chips inside them. This is all sent as JSON, so I'm able to intercept this information and act upon it in Python.
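A filter message arriving over the comm might look something like the payload below. The field names here are invented for exposition based on what the talk describes (column name, type display, condition, parameters); Positron's actual comm schema may differ.

```python
import json

# Illustrative shape of one filter event sent over the Jupyter comm
# when the user adds a "contains chocolate chips" filter in the GUI.
# Field names are guesses for exposition, not Positron's real schema.
raw_message = """
{
  "column": {"name": "ingredients", "type_display": "string"},
  "condition": "search",
  "params": {"search_type": "contains", "term": "chocolate chips", "case_sensitive": false}
}
"""
filter_spec = json.loads(raw_message)
print(filter_spec["column"]["name"], filter_spec["params"]["term"])
```

Because the payload is structured JSON rather than free text, the Python side can dispatch on the condition and parameters deterministically.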
So, this conversion layer is all written in Python. It ingests all the information, all that JSON that we got from the Jupyter comm, and it puts it into these code templates. These templates are just classes that do a bit of string manipulation, as well as delegating different tasks to a Pandas or a Polars back end, depending on what type of data frame the user is using. So, it's possible to do this sort of code conversion, because the Positron IDE has these Jupyter comms that have the really structured input and output with very rich information, and also because the Pandas and Polars data frame libraries have very modular ways to put together the filters and sorts.
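The "classes that do a bit of string manipulation and delegate to a pandas or polars back end" could be sketched roughly as follows. All class and method names here are invented for illustration; Positron's real converter classes are certainly shaped differently.

```python
# Sketch of a converter template: it ingests the dict parsed from the
# Jupyter comm JSON and renders a code string, delegating the
# syntax differences to a pandas or polars backend.

class PandasBackend:
    def contains_filter(self, df_name, column, term):
        return (f'{df_name}["{column}"].str.contains'
                f'("{term}", case=False, na=False)')

class PolarsBackend:
    def contains_filter(self, df_name, column, term):
        # Renders polars syntax for the same semantic operation
        return (f'{df_name}.filter(pl.col("{column}")'
                f'.str.to_lowercase().str.contains("{term}"))')

class FilterTemplate:
    def __init__(self, backend):
        self.backend = backend

    def render(self, df_name, spec):
        # spec is the dict decoded from the Jupyter comm JSON
        if spec["condition"] == "search":
            return self.backend.contains_filter(
                df_name, spec["column"]["name"], spec["params"]["term"])
        raise ValueError(f"unsupported condition: {spec['condition']}")

spec = {"condition": "search",
        "column": {"name": "ingredients"},
        "params": {"term": "chocolate chips"}}
print(FilterTemplate(PandasBackend()).render("all_recipes", spec))
```

Swapping in `PolarsBackend` yields semantically equivalent polars code from the same spec, which is the modularity the talk attributes to the design.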
So, we can see here the code that we generated earlier. You can see there's something called a filter mask, which we're using to filter our data frame to have different ingredients. And if we go through each part of the filter mask, we can see that we're using the data frame name of all recipes, a certain column of ingredients, and then an operation. In this case, the operation is that the column has a string that contains chocolate chips. It's not case sensitive, and it doesn't include NAs. We can also see where we filtered for greater than or less than or equal to 60 minutes. Again, we have the very structured data frame name, data frame column, and some sort of operation. This maps really well to the information that we're getting from that Jupyter comm and those parameters. And we're able to do pretty complex things in a simple way by using classes in Python that ingest this JSON and translate it and put it into these sort of templates in Python strings.
Same as the filter mask, to do this sorting in Python, it's a method chain where we call .sort_values, and we can pass in a list of column names and a list of ascending-or-descending flags, one per column name. This mix of structured information, along with string manipulation, allows us to make very rich templates. We're able to test this as well by having mock Jupyter comm inputs and making sure that the output is what we would expect. Once all of this information is sent through the converter classes in Python, we're able to have this beautiful string that users can run out of the box. I'm able to see the type of data frame over that Jupyter comm message before I even generate this code. So, I'm only showing users the type of code syntax for the data frame library they're using. That's all to say, if you're using a data frame that's built in Pandas, you will only be getting Pandas code. If you have a Polars data frame, you will only be getting Polars code.
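The testing approach described, feeding mock comm inputs through a template and asserting on the exact code string, can be sketched like this. The sort-template function and the payload shape are invented for illustration.

```python
# Sketch of testing a sort template with a mock Jupyter comm payload:
# structured input in, exact code string out.

def render_sort(df_name, sort_keys):
    """sort_keys: list of {"column": str, "ascending": bool} dicts,
    mirroring the structured sort info sent over the comm."""
    cols = ", ".join(f'"{k["column"]}"' for k in sort_keys)
    asc = ", ".join(str(k["ascending"]) for k in sort_keys)
    return f"{df_name}.sort_values(by=[{cols}], ascending=[{asc}])"

# Mock comm input: sort by rating, then by rating count, both descending
mock_comm_input = [
    {"column": "rating_average", "ascending": False},
    {"column": "rating_count", "ascending": False},
]
expected = ('all_recipes.sort_values(by=["rating_average", "rating_count"], '
            'ascending=[False, False])')
assert render_sort("all_recipes", mock_comm_input) == expected
```

Because the output is a plain string, these tests need no running kernel and no real data frame, only the mock payload and the expected code.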
Design considerations
There are a few considerations we have to take into account. The first one is why not just use AI? I mean, it's 2025, and LLMs can write code. But I think the main problem here is that AI is not deterministic. The same input is going to produce different outputs, which can be confusing to users. Also, we want guaranteed correctness. Because this GUI is fairly simple, we already know all of the types of operations that are available. We already know the type of column that's available. We can just use these code templates to fill in the blanks, and we can make sure that this is correct 100% of the time. An LLM might generate code that's mostly right, but using this template, we're able to guarantee our output. Speed and cost matter too. People don't want to pay every single time they generate this code, and making API calls to an LLM hosted somewhere else takes longer. So this deterministic, rule-based system worked really well for this highly structured UI component.
Now, we can think about, you know, it might be hard once we get into more complex filters. There are a few things that really helped out with this. Some of them are on the side of the GUI interface. All of the filters in Positron are AND filters, which means that the order doesn't matter. So, we can filter for chocolate chip recipes and then for things that are less than 60 minutes long, or do things in the opposite order, and it will give the same information. This simplifies a lot of our filters. So, perhaps if this continues to build out and Positron supports OR logic, I'll have to update this code to support the new UI. But for now, we don't have to deal with nested filters as much.
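The claim that ANDed filters are order-independent is easy to demonstrate directly in pandas, since chained boolean filters commute. The toy data below is illustrative.

```python
import pandas as pd

# Because every filter in the UI is ANDed, the order in which the
# filters are applied cannot change the result. A quick demonstration:
df = pd.DataFrame({
    "ingredients": ["chocolate chips, flour", "beef, salt", "chocolate chips, oats"],
    "total_time": [10, 30, 120],
})

# Order 1: filter by ingredient first, then by time
order1 = df[df["ingredients"].str.contains("chocolate chips", na=False)]
order1 = order1[order1["total_time"] <= 60]

# Order 2: filter by time first, then by ingredient
order2 = df[df["total_time"] <= 60]
order2 = order2[order2["ingredients"].str.contains("chocolate chips", na=False)]

assert order1.equals(order2)  # same rows, same index, either way
```

With OR or nested conditions this guarantee disappears, which is why supporting that UI would require grouping logic in the generated code, not just concatenated masks.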
Another thing to think about was what syntax to show. Should we create the most concise code or should it be more readable? Do we always want to use method chains or break operations up into multiple steps? So, my philosophy when I was building this was that the generated code should be idiomatic for each backend. It should look like what an expert would write. It could also be a teachable moment so that people can see how to express operations in the code. So, we want something that scales well but is also readable. And most crucially, it needs to be copy and paste ready. Users should be able to take this code and use it immediately and generate the exact same results without any sort of modification. Think of it like writing a recipe. You want it to be clear enough that a beginner can follow it but also concise enough that an experienced baker doesn't get bored reading it or even make it written in a way that people can modify this code easily. Sometimes these goals conflict and we have to make careful choices. The tradeoff between ease of use and reproducibility is not fundamental. It's a design choice.
With thoughtful engineering, we can build tools that are both easy to use and automatically reproducible. Users don't always have to remember syntax. They don't always have to take notes about what they did. The code is just there, ready to save, share, and rerun. Similar code-template scenarios could be really helpful in other Python packages, making things a little bit easier to use by automating the easy parts and giving users a template to work from that they can modify for themselves later. Thank you so much for joining me on this talk. You can read more about Positron at this link. You can find me on my website here and I've also shared with you one of my favorite chocolate chip cookie recipes. Thanks all. Bye.


