Resources

How I got unstuck with Python (Julia Silge, Posit) | posit::conf(2025)

Speaker(s): Julia Silge

Abstract: Python as a language is known for being explicit, simple, readable, and beautiful. At the same time, the tooling around using and writing this language has not always made people feel productive and delighted. I know this has been true for me! In this talk, learn about recent improvements in tooling for Python that have finally addressed my own persistent challenges. Posit’s new IDE, Positron, provides a next generation environment for Python data practice, and this new IDE plays nicely with modern language tooling from the Python community. Whether you are Python curious or looking for ways to improve your Python workflows, hear about how I finally got myself unstuck with the most popular programming language in the world.

Materials: https://github.com/juliasilge/get-unstuck-with-python

Transcript

This transcript was generated automatically and may contain errors.

There we go. All right. I am so pleased to be here to tell you about how I got unstuck with Python. My name is Julia Silge, and today I'm an engineering manager at Posit. I work these days on Positron, which is our new next-generation data science IDE. It works both for Python and for R. But I want to tell you, not about who I am today, but about who I was 10 years ago.

10 years ago, I was making a career transition into data science from what I did before, which was scientific computing, physics, and astronomy. And I learned Python. I was trying to learn Python. That's my default assumption about what I would learn. It was what I heard about being recommended that I should do. And when I heard people say what they liked about Python, I heard people say things like, oh, it's explicit and simple and readable and beautiful. And I thought, fantastic! I love all those things. That sounds right up my alley. This is going to go so well.

I'm sad to share with you, it did not go well. It went really badly, actually. I did not get up and running with Python quickly. I felt stuck. I felt blocked. In full disclosure, I felt kind of bad about myself because it was not going well. And I thought about all the school I had gone to and all the sort of computational experience I had, and I thought, why am I not able to kind of get going in this new kind of skill?

You don't have to feel too bad for 10 years ago me, because I pretty quickly discovered R. I got exposed to the wonderful open source R community. I had jobs, got jobs that were rewarding and interesting, and I'm really happy with where my career has taken me since that time.

I do, however, think that it's good to look back and kind of ask the question, why? Like, why did it go so badly? What about my own experiences and what the ecosystem was like led to me having such a rough experience? With the hindsight of 10 years and actually having come back and at least gotten moderately competent with Python in those intervening years, looking back, I think there's two things that together conspired to have me have this very bad experience. The first one is I had challenges around managing Python environments, and the second one is that it was really rough for me to choose what kind of tool to use to actually write Python code, what kind of IDE to use.

So this talk is a talk of good news, because in the intervening 10 years, the tooling in both of these areas has changed a lot, and there are much better solutions. So let's dig into that first one first. This is a topic upon which much ink has been spilt, and for good reason. The Python ecosystem has some specific characteristics that lead to this being particularly challenging, even for people who come in with really great computational experience in other languages.

Python dependency challenges

So one thing that is important to know if you are trying to get unstuck with Python is that you can end up in situations called dependency hell. Dependency hell is a phrase that means you are trying to use two packages at the same time that have conflicting dependencies. Maybe one package needs a version 1 of a dependency, and another package needs version 2, and so you literally cannot install them into the same environment to use together. This is different from the experience that many of us are used to coming from CRAN, because CRAN not only will make sure packages work in isolation, but CRAN actually tests all the packages together. And on any given day, if you try to install package A from CRAN and package B from CRAN, you can do that on any given day. In Python, from PyPI, PyPI will say, okay, here's package A, here's package B. They have conflicting dependencies, and PyPI says, good luck. This is your problem to deal with.
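To make the conflict concrete, here is a minimal sketch of the situation described above. It is not how a real resolver works, and the package names, the dependency name, and the version numbers are all hypothetical, chosen only to mirror the "version 1 versus version 2" example.

```python
# Hypothetical index: one shared dependency that exists in two versions.
AVAILABLE_VERSIONS = {"shared_dep": [1, 2]}

# Hypothetical requirements, mirroring the example in the talk.
REQUIREMENTS = {
    "package_a": {"shared_dep": {1}},  # package_a only works with version 1
    "package_b": {"shared_dep": {2}},  # package_b only works with version 2
}


def compatible_versions(packages):
    """Return, per dependency, the versions acceptable to every requested package."""
    allowed = {}
    for pkg in packages:
        for dep, versions in REQUIREMENTS[pkg].items():
            # Start from everything available, then narrow with each requirement.
            candidates = allowed.get(dep, set(AVAILABLE_VERSIONS[dep]))
            allowed[dep] = candidates & versions
    return allowed


def can_install_together(packages):
    """True only if every dependency still has at least one acceptable version."""
    return all(compatible_versions(packages).values())


print(can_install_together(["package_a"]))
print(can_install_together(["package_a", "package_b"]))
```

Each package installs fine on its own, but asking for both leaves no version of the shared dependency that satisfies everyone, which is the point at which you are left to sort it out yourself.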

Another thing that's unique about the Python dependency situation is that there is no default or built-in package manager, and this is different from ecosystems like JavaScript with npm or the Rust ecosystem with Cargo. There's no equivalent to this kind of thing in Python. Instead of having one of these, Python has a whole bunch. If you've heard of things like Conda and Poetry and pip-tools and pyenv, these are all different tools in the package management space, but they don't all do the same thing. Instead, think of them as an overlapping Venn diagram of different kinds of tools that have a different perspective on what the important thing to do is. You can actually use some of those things I just mentioned together at the same time to do different pieces, whether it's dependency resolution or making a virtual environment. So ten years ago, I'm trying to get up and running, and my Python situation was a disaster. I was like, how can I not even install packages? This is bananas. I cannot believe how much trouble I'm having here.

Modern solutions: uv and environment management

I said this is a talk of good news, and I am really happy to be able to say that, although this has been a long-running problem in the Python ecosystem, the newest tools in this space are actually good. They're actually good. So the first tool that really made me sort of sit up and pay attention was a tool called pyenv, and it is a great tool for installing Python and making virtual environments, and I was like, wow, this is actually pleasant to use and works as I expect. But today what I would recommend if you're trying to get going is a tool called uv. What's unique about uv is it is an all-in-one solution. So it provides tooling for everything that you need to do in the package management space. It will do dependency resolution. It will install packages for you. It will help you make virtual environments. Like it will do all of it, and it is blazing fast.

Now, I'm someone for whom performance is often not the thing I care about most in the software that I'm using, but uv is so performant and so well-designed and pleasant to use that it reduces the pain of Python environment management work to a level I honestly would not have believed 10 years ago. Like it is truly so much improved.

There are a lot of different ways you can use uv. I'm just going to show you a very lightweight way that I like to use it. If I'm starting a new Python project, I'll make a new directory. I'll do uv venv. I'll activate that virtual environment, and then I'll say uv pip install with the packages that I need, and I'm up and running. Like I literally can then start running Python code and use it then. So instead of being stuck, I'm ready to go.
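The lightweight workflow above can be written down as a short sequence of terminal commands. The sketch below only builds and prints those commands rather than running them, so it works even where uv is not installed. The project name and package list are hypothetical, and the activation line assumes a macOS or Linux shell (the activation script lives at a different path on Windows).

```python
import shlex


def uv_setup_commands(project_name, packages):
    """Build the terminal commands for the lightweight uv workflow, in order."""
    quoted_packages = " ".join(shlex.quote(p) for p in packages)
    return [
        f"mkdir -p {shlex.quote(project_name)}",  # make a new directory
        f"cd {shlex.quote(project_name)}",
        "uv venv",                                # creates a .venv folder by default
        "source .venv/bin/activate",              # assumption: macOS/Linux shell
        f"uv pip install {quoted_packages}",      # install what the project needs
    ]


# Hypothetical project name and packages, for illustration only.
for command in uv_setup_commands("my-project", ["pandas", "requests"]):
    print(command)
```

Pasting the printed lines into a terminal, one at a time, should reproduce the steps from the talk, assuming uv is already installed.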

Choosing the right IDE for data science

The second piece that I now realize really impacted how much trouble I was having was picking a tool, an IDE, a code editor. Python is a general-purpose programming language, and what that means is that the code editors and the IDEs that often end up being recommended to you when you're trying to get started with Python are built to support the wide range of work that is done with Python, whether that is stuff to do with the Django web stack or, you know, generic scripting, or if you're trying to do data science. What this means is that the editors are not specifically built for doing data science, and trying to do data science kind of stuff often feels like being a second-class citizen.

I really ran into this personally when I was trying to get going ten years ago. I installed a few things that I heard about and had a hard time using them for the exploratory iterative work of doing data science. The thing I actually invested the most time in back then was trying to use Emacs as a data science IDE. I'm like in Org mode trying to install extensions, and it did not go well. I think what this most often looks like for people today is attempting to use VS Code as a data science IDE. You're like, okay, I'm going to install certain sets of extensions to try to get support for exploring my data or seeing a plot, but they're not very integrated, and so you end up with like a lackluster experience that doesn't support the kind of tasks you need to do.

Ten years ago, I also gave a go at Jupyter Notebooks, and this is a tool that many people use for that iterative exploratory work. I came from a background in scientific computing that had really ingrained in me the importance of some reproducible practices, and when I was first exposed to Jupyter Notebooks and got experience with the execution model and how state is managed, I bounced off of them pretty hard. I was like, this is not a good fit for me and how I know I need to work. So I really struggled to be productive, even just like how am I supposed to write this code? How am I supposed to run this code, and what kind of tool can I sit down with and do something?
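One way to see why the execution model and state management can matter is a small simulation. This is not Jupyter itself, just a sketch that runs a few hypothetical "cells" against one shared namespace, the way a notebook kernel keeps one live session, first top to bottom and then with one cell accidentally run twice.

```python
def run_cells(cells, order):
    """Execute code cells in the given order against one shared namespace."""
    namespace = {}
    for index in order:
        exec(cells[index], namespace)
    return namespace


# Hypothetical notebook cells, for illustration only.
cells = [
    "price = 100",
    "price = price * 2",
    "total = price + 5",
]

top_to_bottom = run_cells(cells, [0, 1, 2])
out_of_order = run_cells(cells, [0, 1, 1, 2])  # second cell run twice

print(top_to_bottom["total"])  # 205
print(out_of_order["total"])   # 405
```

The cells on screen are identical in both cases, but the result depends on the history of what was run, which is hard to reconstruct after the fact.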

Positron for data science

This is a talk of good news, like I said, and I am really excited about what Positron specifically brings to this. So I do work on the Positron team, so I know I don't speak with a lot of neutrality here, but real talk. I have become more productive using Python for this exploratory, iterative kind of work since I started using it than I ever was before. It is built on top of the Code OSS infrastructure, but it brings a perspective on what data science work is like, with new features and new ways that what you work on is integrated together, and that makes for a better experience.

Something I haven't addressed yet is like what will you do if you need to decide? How do you go about kind of deciding? So yesterday, there was a great talk that I would encourage you to check out that really digs into this. It is by Isabel Zimmerman yesterday about what are different IDEs like, for what purposes are they best fit. But as kind of like a one-line takeaway for you here, think carefully about what your work is like. What is your work like, and how do you think about where on a spectrum from maybe more engineering-like to more statistical analysis-like, more exploratory to more operationalized, where are you on that spectrum, and so what kind of tool should you choose? Choose a tool that is right for the kind of work that you are doing.

Why try Python again?

Something you may be wondering is why did I try again at all? Like what made me keep trying or pick it up again? And I will be honest, it's not for things that I'm really comfortable with in R. I loved R, I love R, I use R, so I don't pick up Python to do something that I'm really good at in R. Basically, if it's something that I would use the tidyverse or ggplot2 for, I don't pick up Python for those things. There are things that do push me to Python, though. Some of them are external. Like, for example, I collaborate with people who are Python first, or I need to work with someone on a Python package, and so I need to be able to read and write and run Python code to work with people. I bet a lot of you in here have felt some of that external pressure, like, oh, should I learn Python for my career or do I need to learn it to work with someone else?

Some things that push me and make me interested in Python are more internal. I have discovered tools in Python that legitimately are a delight to use. This is a pretty opinionated list I'm about to put up here, but one that does this for me is the requests package, so it's a package for making HTTP requests, and it's legitimately wonderful. Another one is Pydantic. This is a package for data validation, and it is a delight to use. It's a delight to write the code to use. It's just very ergonomic, very, very comfortable. This one is a bit of a meta tool. Ruff is a formatter and a linter for Python, and I find it much more pleasant for me to write Python when I use this tool for that. Ruff happens to be made by the same people who make uv, so shout out to them. They are killing it lately.

So if you want to get unstuck with Python, I've got a couple of resources here for you. So this QR code will take you to my slides here. If you want to install Positron, you can get documentation and installers from our website. You can join us on GitHub with any questions that you may have, like, if you're having trouble getting started. You can go to Astral's docs to install uv and get going with it, and like I said, you can also get Ruff from them there.

Overall, my goal here is not to convince you that you should use Python, but I do hope that I can convince you today that if you have ever had trouble getting going with Python, it's not because the language itself is not readable, simple, beautiful. Those things are all true about Python, but there have been real usability challenges around using Python. Those have been real challenges, and if you had trouble, if you felt stuck, it almost certainly was not your fault, and there are more modern tools in this space that are much better than the tools that have been available in the past, and they can help you get unstuck with Python just like they did me. Thank you.

Q&A

Thank you so much, Julia. We have a few questions here from Slido. Could you briefly describe the lifetime of a Python project in Positron, e.g., create a new directory, uv pip install?

So, we do have in-IDE support for stepping you through creating a new Python package, including using uv to make a new virtual environment. There are different options there if you're not a uv user, so you can take a more UI-centric approach of bringing up the command that's, like, make a new folder from a template. It will set up the virtual environment and get you going with that. If you want to take a more terminal-driven approach, my pretty lightweight thing to do is either to make a new folder or to, you know, clone something from GitHub, and then I go into that folder, I type uv venv, I activate it, I install it, and then, like, I can open that thing in Positron. It all just comes up, and I can run the code right away.

Amazing. Perfect. Some IDE environments, like Databricks Notebook, allow users to use R and Python interchangeably in the same notebook to support the use case of something Python does better and other things that R does better. Would Positron allow a DS to do this?

So, the best way to do that today is to have a Quarto file that has mixed R and Python chunks, and they end up talking back and forth to each other using reticulate. Another thing I do is if I have a project that involves both R and Python code, I will have a folder that has both, and then I can have R and Python consoles running at the same time in the same workspace, the same folder. And so I will use that when I'm in a situation where maybe I am dealing with, like, a file, and I'm going to do one thing with it in R and one thing in Python. If it's more like I'm going to go R, Python, R, Python, I might use Quarto and reticulate, but if it's a bit more that I'm working on separate tasks, you just have it all running, and you can switch back and forth and have a pretty flexible workflow that way.

Amazing. Thanks. Another user is asking about analysis that you use Python for first instead of reaching for R, or something that you have a better experience in with Python.

Yeah, yeah. So, if I am starting with a new dataset, I still reach for R, because I'm so effective with the tidyverse, and the way the tidyverse works with my brain is, like, I just reach for it first, because it's a much better fit for me. Things that I reach for Python first for are often if I want to interact with an API somewhere back and forth, and I need to get something from an API, and then I need to reshape that data, or, like, save it in a file, and move it around. So, R is a really specifically data sciencey, statsy language, and I don't tend to reach for Python if what I need to do is ultimately quite statsy. But if I'm doing something that I might put in the category of dealing with data, but in a less stats-forward way, I often will reach for Python if I need to kind of pass data back and forth with an API there.
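As a rough illustration of that "get something from an API, reshape it, save it" kind of task, here is a minimal sketch using only the standard library. The response body is a hypothetical, hard-coded stand-in for what an HTTP client would return, and the field names are assumptions made up for the example.

```python
import csv
import io
import json

# Hypothetical API response body; in practice this would come from an HTTP client.
RESPONSE_BODY = """
{"results": [
  {"id": 1, "user": {"name": "Ada"}, "score": 9.5},
  {"id": 2, "user": {"name": "Grace"}, "score": 8.0}
]}
"""


def flatten_records(body):
    """Reshape the nested JSON payload into flat rows."""
    payload = json.loads(body)
    return [
        {"id": item["id"], "name": item["user"]["name"], "score": item["score"]}
        for item in payload["results"]
    ]


def to_csv(rows):
    """Serialize the flat rows as CSV text, ready to be written to a file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "name", "score"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = flatten_records(RESPONSE_BODY)
csv_text = to_csv(rows)
print(csv_text)
```

Writing `csv_text` to disk is then a one-line file write, and the same rows could be handed to whatever tool does the statistical work afterwards.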

Amazing. Thanks. I think we have time for one more. What components of Jupyter did you, quote, unquote, bounce off of? I'm having the same experience and can't quite put my finger on it.

Yeah. So, I don't want to speak overly badly about a tool. I'll speak a little badly. No. I know it's a tool that many people love and are productive with. I came from a background that really ingrained in me some habits around reproducible practices, like, how will we manage state? Some hygiene around inputs and outputs. And Jupyter notebooks have an execution model where, for example, it's very easy to get into a situation where you've executed a lot of things out of order and you don't actually know the state of what's in the notebook at any given time. And for me, that was not a good fit for how I think about data and the kind of habits that I had, habits that I have found quite protective, in that my results can be reliable. So, I know it's a tool that many people have been quite productive with. But for me, it was just really not a good fit. I noped out pretty hard pretty fast because of what they're like.

Great. Thank you so much. Thank you.