Resources

Julia Silge - Keynote PyCon Colombia 2025

Julia Silge
Engineering manager at Posit PBC, leading development of open-source software for data science in Python and R. Data scientist with expertise in machine learning and text mining. PhD in astrophysics and author of books on data science.
GitHub: https://github.com/juliasilge
More about PyCon Colombia at http://www.pycon.co

Sep 12, 2025
45 min


Transcript

This transcript was generated automatically and may contain errors.

Wonderful. Thank you so much for that great introduction. Thank you.

This has been such a fantastic day. The talks today, from the keynote this morning to all the session talks, have been so exciting. And I am excited to get to end our day here by giving you a bit of an introduction to Positron, a new IDE that I am working on along with my team.

I'm going to kick off by telling you a little bit about who I am, so that you can understand the perspective from which I am coming to speak. So, a very long time ago, I was an astrophysicist. A medium long time ago, I was a practicing data scientist. I worked in the non-profit sphere and also in tech companies as a data scientist, working with data to help make better decisions: doing analysis, writing reports, making interactive apps, training models.

And then more recently, in the last five years or so, I've shifted to being a tool builder for data science. Someone who works on the tools we can build to make the work of data scientists more productive, more fluent, and simpler.

So, I work at a company called Posit PBC, the company formerly known as RStudio. It's the company that made the RStudio IDE. And today I'm talking to you about a new IDE. So, that's a little bit about me.

And now I want to find out a little bit about you, so that I can understand your background and how I should talk about these things. Because I know people use Python for all kinds of things, right? People use Python for everything from web development with the Django stack to building APIs. So, the first thing, I'm going to ask you to raise your hands here, so I can understand. Can you tell me, do you mainly use Python for data work?

Okay, okay, great, awesome. Thank you very much. So, it seems like most of the people in here, you use Python mostly for working with data. Which is great, because that's what I'm going to talk about here.

Have you ever used RStudio? Have you ever opened up RStudio and used it? Have you ever used VS Code? Have you ever used a VS Code-like IDE or editor, such as Cursor, Windsurf, or VS Codium? Or one of the other forks?

The iterative nature of data science

I want to start off by talking about how the process of data science is, in many ways, different from the process of what you might call general purpose software engineering. The process of data analysis, working with data, is iterative and exploratory. So, we definitely use code. We write code to analyze data. But the process of writing code for the specific purpose of analyzing data, it's substantially different from what someone is doing who is writing code, maybe to build a website. Or maybe to make a mobile app or something.

This diagram outlines one model for what that process of data science may be like. The first thing you do is read in some data. Maybe you read from a CSV, or maybe you read from a database. You have some data in memory, in Python. And then you start in what is, I believe, an iterative set of steps. Maybe you reshape the data. You munge the data. Maybe you make a visualization to see what's in there. Then you use that to train a first version of a model. And you realize, oh, maybe I should use this variable I have with some feature engineering. And you go around.

Or maybe you're like, ah, this doesn't make sense. Let me go back to that variable I was using and look in more detail. And, oh, I have to correct for this way that the data was recorded. And so, then you transform again. You make another visualization. Then you go back to your model again. And this process, the important thing is you don't know what the next step is until you see what's really in the data.
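Purely as an illustration, one pass through that loop might look like this in Python. The data here is made up, and the visualization step is left as a comment, since it only makes sense in a live session:

```python
import numpy as np
import pandas as pd

# 1. Read in some data. Normally this would be pd.read_csv("data.csv")
#    or a database query; here it's a tiny made-up frame.
df = pd.DataFrame({"x": [1, 2, 3, 4, 5],
                   "y": [2.1, 3.9, 6.2, 8.1, 9.8]})

# 2. Transform / munge: after looking at the data, add a derived feature.
df["log_y"] = np.log(df["y"])

# 3. Visualize to see what's in there (in a live session: df.plot(x="x", y="y")).

# 4. Fit a first, simple model; what you see here decides the next step.
slope, intercept = np.polyfit(df["x"], df["y"], 1)
print(round(slope, 2))  # about 2, so y grows roughly linearly; iterate from here
```

The point is not the specific code but the shape of the loop: each step is chosen after seeing the output of the previous one.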

And so, in this process, you would probably make mistakes if you said ahead of time exactly what you would do and did not stray from that path. Because when we're doing statistical analysis, machine learning, data analysis of various kinds, we're coming up against the real world. And we have to iterate because of the actual content of the data that we're working with. This is unique to the work of people doing data analytics, data science, machine learning. It's a defining characteristic of what this kind of work is like.

Eventually, you do decide you're done. Or at least you're like, ah, I got to stop. I can't keep iterating on this model or app or plot or whatever it is you're trying to make. You decide you're done. And then you do something with that analysis. Largely, we communicate about it. We write a report. We build an interactive app of some kind. Maybe we deploy a model and then have to tell people about the model. But this idea of what data science is like, what working with data is like, is an important thing for us to keep in mind as we're thinking about our practices, as we're thinking about our tools and the kind of tools we choose to use.

The garden of forking paths

A result of this iterative, exploratory nature of data work is that we end up making a lot of decisions, and each decision we make depends on the last decision we made. When people talk about this, they use the idea of a labyrinth or a maze, and there's a paper called The Garden of Forking Paths. It's the idea that I'm walking through a garden, and there are all these paths that keep forking, and I choose to go one way, and that means I end up in a different place at the end than if I had chosen to go a different direction.

So when we are in this iterative process, the decisions we make while doing data analysis change where we end up at the end. This paper specifically talks about the idea of a labyrinth, or a garden of forking paths, in the context of how we use p-values in a statistical sense. It talks about how dangerous it is to naively use p-values when you've made a lot of decisions.

But forking paths are a good thing. It is good to make decisions and to base our next decision on the last one we made. It is good to analyze data in different ways and to use what we learn about the data to do the next thing. The mistake actually would be to choose just one path ahead of time, to say, I don't care what I see in the data, this is just what I'm doing, and not use the information in the data itself. And another mistake, the one this paper really focuses on, would be, for example, to compute p-values without accounting for all the different choices we made.

So the garden of forking paths is not a problem. It is the only option we have to correctly analyze data. But it means that as we move through a path, we end up at a place that would have been different than if we took a different set of paths. So this is part of why we have this very exploratory kind of iterative nature of data work.


Reproducibility and tension in data work

Now, at the same time that that is happening, we as data scientists, people working with data, know we need really reproducible practices. And it turns out that the nature of computational data analysis has really exposed limitations in our ability to evaluate findings. So this paper explores what it means for work to be reproducible, from a research or academic perspective. And it talks about a spectrum that starts at, hey, I'm just telling you my results. Picture this as: you ran some code in a Jupyter notebook, the cells are all out of order relative to the order you actually ran them in, you made a plot, and you posted the plot in Slack. That's one end of the spectrum.

And then there's a full spectrum of reproducibility that takes us through, okay, I check my code into GitHub. And I use good practices about how I write my code. And, oh, maybe I even can go all the way over that side to maybe a fully reproducible or replicable kind of process.

So by the very nature of our data work, we have an iterative, exploratory process. At the same time, there's tension because we need these reproducible practices. That paper was about academic research, but the same thing applies in a company or in industry. And so we end up in this situation: the process of data science is inherently iterative and exploratory, and at the same time, we know we need to adopt more reproducible practices. This is to maintain the integrity of our work. To show the people we work with that what we're doing can be relied on. That it is efficacious, that it does what we say it does.

So these characteristics of data analysis work are in tension. And what I want to talk to you about today is this tension. And how the processes that we adopt and the tools we choose to use can bring those tensions into balance. So it is with this background and context that I want to tell you about a new IDE that I and my team have been working on. That is built specifically with this kind of tension in mind.

Introducing Positron

So this is a screenshot of Positron, this new data science IDE. Positron has been available for beta testing for about the last year. And just this past week, actually, we moved to stable releases, which is really exciting. So Positron is no longer a beta product. It's now a generally available product.

So I'm using the term IDE. That stands for integrated development environment: a piece of software that supports you, as someone who writes code, in developing software yourself. In this example, someone is writing a report using Quarto and Python. On the left, you see the source code for the report that they're writing. And on the right, you see the rendered, finished version of the report that you might publish or send to a coworker.

So there are a lot of IDEs out there in the world. IDEs, code editors, depending on what you might want to call them. And so I'm going to tell you today about what's new and different about Positron. I would be very happy if you tried it out. But more importantly, I want us to think about choosing the right tool for the job you need to do, whatever that job is. Whether it's very iterative and exploratory, or if you're someone who does more traditional software engineering, you can think about what the right kind of tool is for the kind of work that you do.

The first thing I want to say more about is what I mean when I tell you Positron is a next-generation IDE specifically for data science. The company that I work at, Posit, is the company that makes Quarto. The developers at our company are the ones who make Quarto. We make Shiny, the Tidyverse, RStudio, and open-source packages in Python and R for things like making tables, doing machine learning, and MLOps. So we think a lot about this. We are all about data science, the process of data science.

And we think, we would even say we know, that someone who is writing code for data analysis or machine learning is different from someone who is doing software engineering. We are huge proponents of code-first data science. What I mean when I say that is that this person writes code. They're not using a low-code or no-code tool. They're not pointing and clicking in a GUI. They're writing code. Tools that are built for a typical software engineer who is writing code are often not a good fit for someone who is writing code to analyze data, because of that need for iteration and exploration.

So all across our org, pretty much every single thing we make or do or build is informed by how deeply we know and believe kind of these two things. Code first data science. People who do data science are different. And we think we can make you, if you are a data practitioner, more productive with these tools because they're specifically built for the kinds of things you need to do. So this is why we're building this thing. Because there isn't something out there that is quite like this. A platform where you can do all of your data science.

A polyglot IDE for Python and R

The second thing I want to dig into is that Positron is what we call a polyglot or multilingual IDE. Currently, it comes with first-class data science support for both Python and R. A lot of environments that were built for data analysis are specifically built around one language runtime for scientific computing. Tools that you have probably heard of or used yourself that fall into this category include RStudio, MATLAB, and Spyder, the Python IDE. There are a lot of these where all of the UI is strongly linked to one kind of runtime for scientific computing. And there are real limits to these kinds of tools or IDEs.

Because it turns out a high proportion of people use multiple languages for data science, or for their projects in general. This can happen literally on the same project: one project can involve multiple languages. It can happen over a week as you move from one project to another: this project is in one language, that project is in another. And it almost certainly is going to happen over the span of your career. I know that it has in mine. When I think back to the kinds of programming languages I used when I was in astronomy, I see some real shifts in the languages I have used over the course of my career. And most people at about the same place in their careers as I am say the same thing. Very few of us picked one language and then used it our whole career.

So I bet many of you are familiar with this. Maybe you build Python packages that involve C code, so you're combining Python and C. Maybe you combine Python and JavaScript: you make complex interactive apps, and so you use Python plus a bit of JavaScript. Maybe you're someone who uses Python together with SQL. You write SQL queries against your data, and then you use Python for machine learning. Maybe, like me, you're someone who uses both R and Python.
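As a tiny illustration of that Python-plus-SQL combination, using only the standard library's sqlite3 module and made-up data:

```python
import sqlite3

# An in-memory database stands in for wherever your data actually lives.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (product TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?)",
                [("a", 10.0), ("b", 5.5), ("a", 7.5)])

# SQL does the aggregation on the data side...
rows = con.execute(
    "SELECT product, SUM(amount) AS total FROM orders "
    "GROUP BY product ORDER BY product"
).fetchall()

# ...and Python picks up from there (feature engineering, modeling, plotting).
totals = dict(rows)
print(totals)  # {'a': 17.5, 'b': 5.5}
```

Two languages, one analysis: exactly the kind of mixing that a single-runtime IDE makes awkward.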

So by contrast with those kinds of IDEs or editors, Positron is built with front-end, user-facing features that are about the tasks you need to do when you're analyzing data or doing other kinds of data-specific work. And then there are back-end language packs that provide the engines for all those features and tasks. So here you see some of these. You see a fully interactive console, much more fully featured than a typical Python REPL. A plots pane where you can interact with the plots that you make. A variables pane where you see all the variables you've defined. All in addition to the place where you're actually writing the code. In this case, these are all backed by Python. But these front-end features are architecturally separate from the runtime that serves as the engine. And that means that today we have support for Python and R, but we can add support for Julia, or for other scientific computing languages that become important in data science.

Three categories of data practice

So data practice, as I think about it based on my experience, tends to involve three main categories of work that are qualitatively different from general-purpose software engineering. One category is exploratory data analysis. That's what I've really been talking about here, that iterative process. Another category is reproducible authoring. This is where we need to write some kind of report, or make a website, or make some kind of interactive app that depends on the data we are presenting or analyzing or communicating about. We do not want to be in a situation where we're copying and pasting plots from inside our editor into a Word document. We do not ever want to be in that situation. Instead, we want to build reproducible documents, so that the whole document can create the report or the website or whatever it is you need to make. And the third category is publishing data artifacts.

And this is often a challenge for people doing data work, because it sits at the edge between iterative scientific data work and the more traditional software world: you're trying to deploy something.

As we dig into this a little more, I might say this more than once, but I really hope a big takeaway for all of you is that it's not bad or wrong that people doing data work are different from people doing traditional software engineering work. If you're a data science person and you feel like you work a little differently from the software engineers around you, that doesn't mean what they do is better. It's not bad or wrong that it's different. It's just different. And that's reflective of the reality that the work actually is different. But you do need tools that are built specifically for the tasks you need to do, whether that is data work or more traditional engineering work.

Positron's key features

So I'm going to walk through a few more of the components that make Positron unique, so you can understand how each one links to this iterative, exploratory kind of work. First, I want to highlight that because we are in a world where we're writing code, there is definitely a place in this IDE where you write code in a very traditional way. But not only do you need somewhere to write code, you also need somewhere to execute code interactively. You could think of this as a sandbox or a playground. It's a good place to test out code quickly and see the results in a truly interactive way, so that you get that quick feedback loop of what's going on with your code. Not because you are doing all your analysis in this interactive console, but because you have to see what the results are to know what the next step is. Because that is the nature of what data work is like.

You can use keyboard shortcuts to run code from the file into the console. You get things like autocomplete and syntax highlighting in the console, because again, this is real code that you're writing and running. It's a fully featured console that's connected to the other pieces of the IDE, like the plots and variables panes.

I'm not going to go into this in great detail right now, but here in the console there's enhanced environment management support: interpreter management designed with an eye to these data science tasks. So whatever Python interpreter you have, you can easily understand what's going on with it.

So we said that what's going on down here in the console is connected to the rest of the IDE. This section over here is the variables pane, and it gives you a lot of information about every variable you have defined in the session you're running. You get information about the type of each variable. Some of them show a little table icon because they're rectangular data structures; you'll see a little cylinder if you're dealing with a database connection. The variables pane is connected, as you see here, to the console. But if you're someone who uses Jupyter notebooks, it's also connected there: it has another tab for each notebook session that you have. If you're running more than one Jupyter notebook at a time, it's super handy for understanding which session you're working with and what's going on in it.

This section down here is where you work with the plots that you have. You can dynamically create and update plots: you run code in the console, and then you see plots here. I am someone who finds this way easier for interactive work than, say, a Jupyter notebook; they're two different models for how to do this exploratory, interactive kind of work. It comes in handy when you have a lot of plots, because of how easy it is to switch back and forth between them. You can see your iterations and decide what you like. And although I bad-mouthed it a little bit ago, there is a button here where you can just copy that plot and then go paste it in Slack. There are a lot of UI affordances here for exporting at certain sizes, and for all the things that we know happen with plots in real life.

Another feature here in Positron is the data explorer. We think of this as a tool for debugging your data. It supports Pandas data frames and Polars data frames. And it's built in a way that is performant, so that if you have something up to, say, tens of millions of rows, and very wide data, you can scroll around without hanging the rest of the UI. For each column, you get summary statistics and sparklines where you can see what's going on. When you're in that iterative process of, how do I know what's in my data so that I can decide what code to write, this is a really good way to support that process.

So you may see this and feel like, well, I've got a question here. Because you may say to me, you were talking about code-first data science, right? How does UI like this fit into that way of working? One of the things we're working on right now, actually this month, is this: let's say you did some sort of filter and sort in here, as if you were in a spreadsheet. We want a way to export the code that will get you that same filtering and sorting. So we are all in on making sure that things are reproducible, by making sure that we're focusing on code-first practices.
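That export-to-code feature was described as in progress, so the following is only an illustration of the idea, with made-up data: a filter and a sort done by pointing and clicking in a data explorer corresponds to a couple of lines of reproducible pandas code, something like:

```python
import pandas as pd

df = pd.DataFrame({"city": ["Bogotá", "Medellín", "Cali", "Bogotá"],
                   "sales": [120, 95, 60, 140]})

# The code equivalent of filtering to one city and sorting by sales
# descending, as you might do interactively in a spreadsheet-like UI:
result = df[df["city"] == "Bogotá"].sort_values("sales", ascending=False)
print(result["sales"].tolist())  # [140, 120]
```

Capturing the interactive steps as code is what keeps the exploration reproducible.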

Another component here is a viewer pane for viewing locally running content. This is a fully interactive pane. Here on the left, we have a Streamlit app, and it is running right there in the IDE on the right. You can view localhost URLs. You can open HTML files right there in the IDE. You can run interactive apps like this one, or FastAPI or Flask or Shiny apps, or other kinds of HTML content. This is also where you can view a rendered document, such as a rendered Quarto document, as you can see here. I find this hugely impactful for my own work, because you can end up in a flow state where you edit the app or the document you're working on and automatically see it update, versus having to go somewhere else to look at it. You end up in a very tight feedback loop between the code you're writing and the result you're seeing.

Familiar and extensible: built on Code OSS

So as I'm walking through these features, I want to highlight that Positron is both familiar and extensible. It's familiar to some of you in this room because it's pretty directly inspired by RStudio. Many of the people working on Positron are literally the same people who have worked on RStudio. And here you see a really RStudio-like interface, with a console and a source editor and a fully featured help pane. But it's being used with Python, not with R, which is all that would be possible in RStudio.

This help feature makes it really easy to find help for any function or package or topic right from within the IDE, so you don't have to go and get distracted by the internet looking things up. You get to see everything right there inside the product. It supports links between help topics across different packages. When you see examples, they have syntax highlighting, which is really nice. I have found this helpful as a user, but it's also helpful as a package developer. Because if you're working on your Python package and you build and reload it locally, you can see the help you just wrote right there, in a way that keeps you in that tight feedback loop, in that flow state, without having to leave the IDE.

It's also familiar to almost everyone in this room because it is built on the open-source components that are used to make Visual Studio Code. Positron is a fork of the open-source project that Microsoft uses to make their proprietary builds of VS Code, similar to how Cursor and Windsurf are forks of that project. Now, we forked Code OSS, as that project is called, for slightly different reasons than Cursor and Windsurf did. We forked it because it lets us focus on the data science tasks and features, which is what we care about and what we really have institutional muscle around. We do not have to spend our time working on the general features that every kind of person who writes code needs.

So as an example, we have not changed the experience around the source editor. We have not changed the experience around how you interact with Git. We have built an entirely new, truly interactive console. We have built a new connected, integrated way to build with your plots. So building on Code OSS really allows us to focus on the things we care about, and we kind of make a deal that we largely accept how things work in VS Code for other kinds of features.

Building on Code OSS also opens up the wide world of VS Code-compatible extensions for users who want a data science-specific IDE. If you're familiar with these, they range from any theme you can imagine, so you can make your editor look however you want, to extensions with more substantive functionality, like connecting to Databricks or wherever your data is. It also means you can build extensions yourself that will work both in VS Code and in Positron.

Now, if you're familiar with extensions of that nature, you may be thinking, wait a minute, why didn't you just make extensions? Why didn't you make really good data science extensions? We have thought pretty carefully about this: about what features need to be part of the core product itself, and what we can support through integration with extensions. At my company, we actually make a bunch of extensions. We make an extension for Quarto, an extension for Shiny. We make an extension for Posit Connect, which is what's shown here; Connect is a publishing platform for data science. So this person has made a dashboard, and they're in the process of publishing this Python dashboard to Posit Connect. So we're all in on extensions, and we're huge fans of the extension story around these kinds of IDEs.

However, it turns out that not everything we need for a first-class data science experience can be built into extensions. That's because of the very good reasons that the extension API is limited. In particular, extensions cannot talk to each other, so we cannot get the kind of fully integrated experience across the components that I just showed you. For these deeply iterative workflows, and that tension I talked about between iteration and reproducibility, solving the problem just with extensions, I'm going to argue, doesn't play out that well. And I bet this is resonating with some of you. If you do a lot of iterative, exploratory data work, and you have tried to make VS Code into a data science IDE with an extension pack or the Data Wrangler, you may have found, okay, I'm getting there, but this doesn't update when that updates, and I can't see this from there. That's exactly why we forked instead of building sets of extensions.

AI and LLMs in Positron

So if I'm up here in the year 2025 talking about a new IDE, I'm confident that a question at the top of many of your minds is how, and in what ways, Positron is set up to work with LLMs. So I'm going to say it one more time: Positron is not a general-purpose software engineering IDE. We don't see tools like Cursor and Windsurf as our direct competitors, because those are built, again, for general-purpose software engineering, and we're just not in that business. Those are not the users we are focused on. Instead, our goals around LLM code assistance and AI integrations are to build an IDE specifically for data science, for statistical analysis.

I'm going to take a moment to talk about how our company has been going through some pretty deep exploration of what the rise of these new models means for our users: the users of our free and open-source products, our customers, the people who pay us for our enterprise software, and even for us ourselves. What does it mean for us at our company to work in the era of these AI models?

So our company, one of the ways we've been exploring that is with internal hackathons. They're a week long. You end up in a small group with people who you maybe don't work with on a daily basis, and at the beginning of the week, there's an intro, here are what these models are like, and then each person commits to spending a certain number of hours working on a small project. So this is my project, my very small, little experimental project.

I'm very interested in text analysis and NLP, and have been for a very long time. So I was interested, A, in how these models can be used for text analysis, how they can be used not only to generate text but to analyze text. And B, I'm someone who's pretty concerned about the way these models can be used for disinformation at scale. And I wondered, if I thought about it, could I come up with a way that these models might be used to counteract disinformation?

And so this is a very small project, right? But I built a dashboard that takes as its input a URL, any URL. It gets the content from the URL and summarizes it. It goes to Wikipedia and finds which Wikipedia pages are about the same topic, and gets the content from Wikipedia. Then at the end, you ask the model: you have this text, which is the text from the original URL, and you have this text, which is from Wikipedia. Can you tell me how well they agree? You ask it to give a score, and then it writes a short paragraph about whether they agree or disagree. The example shown is a website that spreads vaccine disinformation, and this particular page argues that the CDC should study a link between vaccines and autism.
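The shape of that pipeline can be sketched as below. Everything here is a hypothetical stand-in, not the actual project code: fetch_text and ask_model are placeholder functions, and a real version would use an HTTP client, the Wikipedia API, and a call to an actual language model.

```python
def fetch_text(url: str) -> str:
    """Placeholder: a real version would fetch the page and extract its text."""
    return f"main text of {url}"

def ask_model(prompt: str) -> str:
    """Placeholder: a real version would call a language model API."""
    return "score: 2/10 -- the texts largely disagree"

def check_against_wikipedia(url: str) -> str:
    article = fetch_text(url)                      # 1. get content from the URL
    summary = ask_model(f"Summarize:\n{article}")  # 2. summarize it
    wiki = fetch_text("https://en.wikipedia.org/wiki/...")  # 3. the matched topic page
    return ask_model(                              # 4. ask how well the two texts agree
        f"Text A:\n{summary}\n\nText B:\n{wiki}\n\n"
        "Give a score for how well these texts agree, then a short paragraph."
    )

print(check_against_wikipedia("https://example.com/article"))
```

The interesting part is the composition: the model is used at two points, once to summarize and once to compare, with ordinary data-fetching code in between.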

And so what my little project did was help me get my hands dirty with the models and help to understand what is it that they can do and what is it that they actually cannot do right now. And it's been interesting to see people at my company in general start to wrestle with these issues. And just briefly, I'll say, people who are skeptics stay skeptical, but they are more concrete about what they are skeptical about. People who are enthusiasts, they stay really enthusiastic about these kinds of models, but they get more specific about what they are excited about. So it's been very interesting seeing my company really wrestle with this, and like what are we going to do? Like what are we going to do with the tools that we make?

This paper came out this past spring, and I highly recommend that you read it. It's an opinion paper or a position paper, and it makes the argument that AI is a normal technology. When they say AI is a normal technology, they're not saying, oh, it's no big deal. They're saying it's a normal technology in the same way that the Internet is a normal technology or electricity is a normal technology. So it's going to have a huge impact on how we work and learn and teach. To view it as a normal technology is not to understate its impact, but to contrast it both with the very utopian views of what these models can bring and with the very dystopian views, and instead to give us tools from what we have learned. What did we learn from the transition from before the Internet to having the Internet? What can we learn from history about bringing electricity? How did that happen? Who benefited? Who was left behind? We can learn from these things that happened in the past to help us make better decisions about what's going on now with this transition to AI.


So how we think about AI and LLMs is deeply informed by all this stuff we're thinking about and this particular perspective on data science that we have. So we're asking questions like, what do people who are doing statistical analysis need when it comes to LLMs? To be clear, we're not training new models. We're not training, like, Copilot but for data science. We instead are building tools of various kinds on top of the models. And in the IDE, in Positron, this looks like taking advantage of that deep integration so that we can increase the context available to the LLM. This includes things like chat participants that are data-science-aware, that know what is going on with data science projects, to scaffold you toward getting what you need done more quickly.

So this is what Positron Assistant looks like. It is currently available in preview in Positron, and it can be run in three modes. The first mode is ask. This is a chat style: you ask and it answers, sort of ChatGPT style. The second is edit, where you're in a file and you ask the model things about the file and ask it to propose edits for you. And the third mode is agent, an agentic mode where you opt in to giving the assistant permission to run code on your behalf. So the assistant gets to run code in the console, and it sees the results and I see the results. We both see the results of what happens here.

And this ends up giving us more context and some pretty successful results. So here, down here, I'm saying, hey, I want to read in some data available at a certain CSV file. And if you'll notice up there, I imported Polars and I imported Seaborn. And so what's going to happen is it's going to try to use Polars to read in the data. So this is my prompt, this is what I asked it to do. And then you will notice that it says, okay, great, here is some code, and it automatically ran it because it's in this agent mode. But look what happened: an error. It didn't work. It didn't read in my CSV.

Now, because we have this deeply integrated experience, what happens is that, I mean, obviously I see the error. I'm the one looking at the computer. But also, the assistant sees the error. And notice, I did not give any input between that first step and this step. It sees the error, and it's like, oh, the error indicates that we didn't handle the NA values that are in the CSV, so let me try again. And then if I hit that run code button, it runs it again and it succeeds. I also want to highlight that it used Polars. And that's not because I prompted it; it's because I imported it. It has deeply integrated knowledge of what's already going on in the IDE, so that it can generate the code that you actually want it to generate.

So let's get that ugly error off the screen. Who wants to see errors in code? Here, this is just us going a little further, and I'll highlight a few things. I ask a question up at the top, like, hey, for these Pokémon that are in this data set, how are height and weight related? And then the assistant generates some code that uses Polars and then uses Seaborn to make plots, because that's what I had already imported. I didn't explicitly tell it to; it's using the context here. And also, the assistant can literally view the plot. If you opt in to the right permissions, the plot itself can get sent to the model, because most of these models that are good at generating code are multimodal and can look at images, and so they can describe plots for you, tell you what's going on, notice outliers, notice problems, and whatnot.
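The plot in question is the kind of thing Seaborn produces in one call. This is a minimal sketch with made-up numbers standing in for the Pokémon data set (the column names `height` and `weight` are my assumption about the data shown on the slide):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display

import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the Pokémon data set from the talk.
pokemon = pd.DataFrame(
    {"height": [0.7, 1.0, 2.0, 1.7], "weight": [6.9, 13.0, 100.0, 90.5]}
)

# One scatterplot relating the two variables; Seaborn labels the axes
# from the column names automatically.
ax = sns.scatterplot(data=pokemon, x="height", y="weight")
```

The resulting figure is exactly the sort of image a multimodal model can then be asked to describe.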

So Positron Assistant, I think, is a really interesting piece of the IDE, because it highlights the outcomes that we're getting from the architecture we've chosen and the set of trade-offs that we're making. We are making trade-offs on the side of: this is not for everyone. This is for people doing data science. And that means it's powerful for the people doing data science, because we're not trying to be everything to everyone; we're trying to make the best thing for people doing data science. And instead of making everything an extension, we are asking what can actually improve the experience by not being an extension and instead being part of the IDE as a whole.

And with that, I will wrap up and say, hey, if this makes you curious, if you want to try out Positron, you can go to positron.posit.co. That's where you can get documentation and installers. If you run into bugs, if you have questions, if you want to give us feedback, join us on GitHub at posit-dev/positron. And, I mean, I don't know, if any of you are customers of ours, Positron is available in preview in Posit Workbench, which is one of our enterprise products for big companies who need these kinds of features.

And if you're sitting in here and you're maybe not a data science person, maybe you are one of these people who uses Python more for traditional software engineering, I do think there is a takeaway here for all of us: not all software is built for the same purposes. Software tools that bring us joy are not one-size-fits-all. Rather, we can build tools that are specifically for the kind of work that we need to do. And with that, I will say thank you very much.
