
10 Years of Data Science Tools...and What Happens Next (Jonathan McPherson) | posit::conf(2025)
Speaker(s): Jonathan McPherson
Abstract: In this talk, I'll reflect on a decade of work on RStudio and the principles of tool-building that have led it to become the standard data science environment for R. We'll talk about how those same principles have guided the development of Positron, a new data science environment from Posit, and how you can apply them to your own tool-building work.
Slides: https://github.com/rstudio/rstudio-conf/blob/main/2025/jonathanmcpherson/10%20Years%20of%20Data%20Science%20Tools.key
Transcript
This transcript was generated automatically and may contain errors.
Good morning and welcome to PositConf 2025. So my name is Hadley Wickham. It's my great pleasure to welcome you to PositConf here in Atlanta. And my job is just to take a couple of minutes to help you be as successful as possible with conf so you have a great time.
To that end, the first point I want to make is if you don't already have it, make sure to grab the app. This is where you can find all the information about what's going on, including the talks and all the other things that are happening. This is also where you can find a link to our Slido. This is where you will ask questions of the speakers, and it also has a link to a Discord if you want to engage with any of the other participants online or in person.
New this year, we also have a really fun competition, both online and in person. Do some fun activities around conf, collect points, and you can get some pretty cool swag at the end of it.
On your way to this keynote, you've all already walked through the lounge. This is the place if you want to ask questions about Posit, about our products, our open source packages, whatever you want. New this year, we also have a bunch of live demos going on. Again, you can find out about those in the app. This is a great way if you want to find out what's new with Posit products this year.
But at Posit Conf, we also want to make sure that everyone feels safe, comfortable, and welcome. And to that end, when you registered, one of the things you did was sign a code of conduct. That's really important to us. And if at any point you feel unsafe or notice anyone else may be feeling unsafe, please reach out to any Posit employee. You can also go to the registration desk or email conf at Posit.co.
If you want to know how to recognize a Posit employee, well, you can spot us by our t-shirts. Apart from me, obviously.
And we've done a bunch of things to hopefully make you feel comfortable. So please respect the keep your distance pins or the hugs okay pins. If you see someone wearing a red lanyard, that means they prefer not to be photographed. So please keep them out of any photos. And finally, everyone has their pronouns listed on their badges.
We also have a bunch of rooms available to you, regardless of whether that's a quiet zone just to hang out, whether you want to meditate or pray. We have a lactation room and gender-neutral bathrooms on every floor.
And that brings me to really the only rule of Posit Conf, and that's the Pac-Man rule. Whenever you are standing in a circle with your friends, please make sure to leave Pac-Man's mouth open so new people can join your group. We want Posit Conf to feel welcoming to everyone, regardless of whether it's your fifth Posit Conf or your first Posit Conf.
And I would kind of encourage you, if you're a Posit Conf veteran, you know, please do your best. Like if you see someone hanging out and they look a bit lonely, you know, go up and strike up a conversation with them. Please take that as like your kind of mission from me to make everyone feel as welcome here as possible.
If you're looking for like-minded individuals, a great way to find those people is our Birds of a Feather session happening at lunch. You can spot them with the big flags. New this year, you can also create your own Birds of a Feather session. You'll find an easel with some stickies outside the entrance to the dining room.
Now I don't know about you, but one of the things I am most excited about at this Posit Conf is our evening event at the Georgia Aquarium. It is an amazing aquarium. It is going to be so much fun. The one thing I want to point out is please, please don't forget to bring your badge. We cannot let you in without it.
So without much further ado, I would like to introduce our first keynote speaker, Jonathan McPherson. Jonathan is a long-term colleague of mine at Posit. And as you probably, as you may know, I like to write little poems about our keynote speakers with the help of AI, because I'm not very good at poems.
Architect of code, Jonathan builds Positron, visions made precise. At RStudio's heart, he helped shape enduring tools guiding many hands. From Redmond to now, wisdom gathered through the years, foundations endure. Please join me in welcoming Jonathan.
Jonathan's opening: tools and the brain
Good morning, everyone. So like Hadley said, my name is Jonathan. And today we're going to talk about data science tools. But before we really get into it, I'd like to show you a picture of my brain.
This is a knowledge graph. It's made up of notes. I've taken notes all of my life, but I first started doing it a little bit more obsessively back in 2016. And in this knowledge graph, every little dot that you see is a note that I took. Every connection between dots is a connection between notes. And so this is kind of a picture of all those notes put together.
If you look at this picture, you'll probably see a couple of clumps starting to emerge of notes that are related to each other. This one over here, for example, is books that I've read. I like to read. This one over here is people I've interviewed. And I think this little one up here may be house projects that I will never finish.
This graph comes from a program called Obsidian. Before Obsidian, I used a program called Vimwiki. But over the last ten years, as I've been accumulating all these notes, I noticed that something was happening to the way that I think and process information. In particular, I noticed that I had stopped thinking and then writing, and started using writing as a way to think. I was moving to thinking by writing. This tool actually did something to my brain.
I'm not the first person to notice something like this happening. Another person who noticed this happening was this guy. This is Friedrich Nietzsche. He is famous for two things. First of all, for being one of the most influential German philosophers of the 19th century. And secondly, for having an absolutely stupendous mustache.
Back in the year 1881, Nietzsche's eyesight began to fail him. That was a real problem, because Nietzsche was an author. He wrote his books, and he wrote them longhand. Because he was unable to see what he was writing, he was compelled to purchase a typewriter in order to continue writing. This is the typewriter that he got. This is called the Malling-Hansen writing ball. It may already be familiar to those of you who love ergonomic keyboards.
As Nietzsche began to write with this thing, it actually had an effect on what he wrote. Critics also noticed this. Nietzsche's first book that he wrote after he got this typewriter was called The Gay Science. One critic noted that it changed his prose. He said that Nietzsche's prose changed from arguments to aphorisms, from thoughts to puns, and from rhetoric to more of a telegram style. Nietzsche himself noticed this as well. It led him to make this observation, which I think is very profound. He said that our writing tools are also working on our thoughts.
If this is true, then it means that tools are not really interchangeable. I think in the world of data, in the world of software, we live in a world very full of symbols and manipulations, and it's easy to think that one tool can just kind of be plugged in and substituted for another as some kind of layer of abstraction. But in this talk, I'm going to argue that that's actually not the case. Tools actually change the output. This would be a very different talk if I gave it with PowerPoint, for example, instead of Keynote here. I'm going to argue that the tools that you use and the tools that you make matter quite a lot.
RStudio's growth
But I'm going to guess you did not come here to learn about ergonomic typewriters nor mustaches. We came here to talk about data science, so let's talk about RStudio. I had the immense honor of joining RStudio back in 2013. This is what our website looked like back then. At the time, we were a very small company, about 10 people, and I was so excited to tell all my friends and family about my new job at this little startup, but everyone that I told, all my friends and family, when I told them about my new job, they all asked me exactly the same question, and I would guess it's the same question that you've heard from many of your friends and family, namely, what is RStudio?
So, that was 2013, you know, fast forward now to 2025, and RStudio has never done better. Just this year, in 2025, we crossed over 5 million runs of the RStudio IDE every week. I just want to call out a couple of interesting things about this graph that you might have noticed. One of them is that it's really seasonal, you know. If you look at the graph, you can kind of see a dip in the summertime and a really big dip around Christmastime. RStudio is used a lot in work environments and academic environments, which kind of explains this.
The other really interesting thing about these dips around the holidays is that, I don't know if you noticed this, they don't go all the way to zero. In fact, if you look at the axis here, this thing starts at 2 million runs. So, some of you are using RStudio on Christmas, and so, whoever you are, you're awesome.
Making tools for people
So I had the huge privilege of having a front row seat to the growth of RStudio during this time, and so, for the rest of this talk, I want to share about some things that we learned about making software and data science tools. I'll talk a bit about how we applied those learnings to Positron, and how you can also apply them to your own data science tools.
So, the first thing I want to talk about is making tools for people, because at the end of the day, people are one of the only things that matter. Making tools for people might seem kind of obvious, but I think you have probably used a tool that seems like it was perhaps designed by and for an octopus.
So the first thing you've got to do if you want to make tools for people is you need to listen to them. And again, I think this seems very obvious, but especially as organizational size and complexity grows, this can be kind of hard to do. Here's a real life example. So I used to work at this company. I don't think I can say its name for legal reasons, but let's just say that it rhymes with Microsoft.
And our users wanted to store multiple values in one database cell. We made a desktop software product. And you can maybe make your own conclusions about whether or not this is a good idea. But it is what people wanted to do. And so, you know, they told the sales team, hey, I'd like to buy the product if only it had the feature to store multiple values in one database cell. And the sales and marketing team, of course, had to talk to the account rep and say, hey, we're having a hard time selling this without this multiple values in one database cell thing.
Of course, they then needed to talk to the product planning team. And on the game of telephone went, you know, they've got to talk to product management. You know, product management needs to talk to the engineering lead. And finally, we actually get to the engineer, i.e. me, who's actually going to do this work. This whole process took, I don't know, maybe about two years, right? That's a long time.
You know, fast forward to, you know, 2013 when I joined RStudio, and I'll probably never forget this experience. It's one of my first weeks at RStudio. And I was on our discussion forums. And somebody mentioned that they had found a bug in the data viewer. And I suddenly realized that all those people that were between me and the person my software was being written for were gone. I could actually just go and just fix the problem. And this whole process took, like, maybe about two days.
So I guess my first piece of advice here is that you should connect the people who make the tool directly to the people who use the tool. This, again, is easier in a 10-person startup than it is when your company gets big. And this has been true, you know, almost as long as we've had technology companies.
Some of you might remember the movie Office Space, where you have... Maybe you have someone at your company that kind of thinks that this is their role. They kind of interface between customers and engineers. And some people almost think it's some kind of law of physics that, like, the time of communication is, like, some, like, computable function of, like, your number of employees times, like, your layers of management and so forth. But I'm here to tell you this actually doesn't have to be true. It's not a law of physics. You can actually connect these groups of people.
In fact, it's one of the things that we've tried to do at Posit as the company has grown. I think we're now maybe, like, 350 people. And if you, today, for example, go to Positron's discussion forum and you start a new, like, written thread on Positron, you will get a response, probably within a day, from an actual person who works on Positron. These people over here, with a couple of exceptions, are literally Positron engineers who work on it. This is a thing that you can do.
Listening vs. watching users
So it's really important to bring together the people that are using your software with the people that are making it. Like, that direct connection is invaluable. But listening to your users alone can actually result in really bad software.
Users approach you with feedback. You listen to them. You do what they asked you to do. You build exactly the right tool. This is how a lot of people kind of approach software development. And we've learned that this actually is not always effective.
So we're all, I think, statisticians in the audience. Can anyone tell me a problem or a bias that might creep in with the very first step of this process? I would argue that this is a sampling bias. This is sort of like having a survey where the first question is, do you love responding to surveys? You know, if you only listen to the users who approach you, you are actually going to get a very unrepresentative sample of all of the users who are using your tool.
Let me show you a very specific example from RStudio. So if you go on RStudio today and you look at the GitHub issues in RStudio and just sort them by the ones that people want the most, you'll find that the most upvoted issue, number one, with 104 upvotes, is support for more Linux packaging systems. And so it seems like this is the number one thing that users want us to do. And this is actually a really great idea. But I'll point out that we should consider the source. You know, where are we looking at this? This is our GitHub issue tracker, which means that everybody on it is a GitHub user. And GitHub users, as some of you may know, actually tend to prefer alternative operating systems. You want to guess how many Linux users as a percentage there are of RStudio? It's about 5%.
So again, if you use Linux, this is not a dunk on you. You are awesome. This feature is a great idea. What I want to point out here is that if you only listen to the loudest voices in the room and you only listen to the people that come and talk to you, you're actually not going to build a product that's good for everybody. It's not really enough to simply listen to your users.
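The sampling-bias argument can be made concrete with a small simulation. This is only a sketch with made-up numbers: the 5% Linux share comes from the talk, but the population size and the per-group probabilities of leaving feedback are assumptions chosen purely for illustration.

```python
import random


def simulate_feedback(n_users=100_000, linux_share=0.05,
                      p_report_linux=0.20, p_report_other=0.01, seed=0):
    """Return (Linux share of all users, Linux share of users who give feedback).

    The reporting probabilities are hypothetical; they only encode the idea
    that one group is far more likely to show up in the issue tracker.
    """
    rng = random.Random(seed)
    linux_total = 0
    reporters = 0
    linux_reporters = 0
    for _ in range(n_users):
        is_linux = rng.random() < linux_share
        linux_total += is_linux
        p_report = p_report_linux if is_linux else p_report_other
        if rng.random() < p_report:
            reporters += 1
            linux_reporters += is_linux
    return linux_total / n_users, linux_reporters / reporters


overall, among_reporters = simulate_feedback()
print(f"Linux share of all users:       {overall:.1%}")
print(f"Linux share of feedback givers: {among_reporters:.1%}")
```

Under these assumed rates, the expected Linux share among feedback givers is 0.05 × 0.20 / (0.05 × 0.20 + 0.95 × 0.01), or roughly half, even though only about one user in twenty runs Linux. Counting upvotes alone would badly misjudge the population.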
So when you listen to them, you can introduce another kind of bias. I would just call this one missing values. Because even though users will talk to you, it's also true that users mostly don't know what they want. You know, some of my favorite features to work on in RStudio have been features that literally nobody asked for. This is one that I worked on a few years back. It was one of my favorite little pet projects. It became a very popular feature. But literally no one asked for this. Your users don't necessarily know what they want. And so you need to kind of fill in the blanks when you're listening to them.
So if you want to avoid this bias, what you need to do is actually watch people use your tool. And I promise you. I promise you if you watch people use your tool, you will always be surprised 100% of the time. People do things that you have no idea that they would do.
I just want to be clear here. We do not collect telemetry on individual users at Posit. That's just not our thing. So everything that I'm about to say is based on things that were shared on YouTube, observations of colleagues, and so forth. Turns out people really love recording themselves using RStudio and putting it on YouTube.
Here's an example of somebody using RStudio on YouTube. One of the things that we noticed when we watched people using RStudio is this. So you can put your hand up or not. But how many of you have ever opened RStudio, one of the most advanced statistical environments ever invented, in order to add two numbers? People love doing this. Yeah, they treat it like a calculator, right? This is a thing that people love. And you'll never see this on GitHub issues because no one's going to open an issue and say, I wish this thing was a calculator.
And so by watching people do this, we actually were able to bring that same feature to Positron. So when you open up Positron, you'll notice that it has exactly the same feature. When you open up, it will try very hard to make sure that it is a calculator. You can type stuff into it and get an evaluation. Again, just kind of based on watching people.
The same thing is true of the data viewer, right? So RStudio has got a really nice data viewer. And one thing we noticed from watching people use the system is that people have this data viewer open all the time. It's a fixture. In fact, the amount of time that people spend using this data viewer is way out of proportion to the amount of time that we spent making this data viewer. This thing is the result of maybe like two or three months of effort.
And so based on that observation, when we built Positron, we were like, we are going to make a really, really good data viewer. You can go play with this today in the Positron lounge if you're interested. But this thing has support for a million rows and a million columns. It's super high performance. There's really nice column profiles. It has excellent searching and sorting and filtering and row and column pinning and all kinds of great stuff. And again, this is based on the observation that the data viewer is where people spend a lot of their time. It's not something you can learn by just listening to the problems that people report.
Hiding complexity
So it's really important to listen to people and to watch them. And another thing you've got to do if you want to make tools like this is you need to take things that are complicated and make them simple.
So again, this is the Malling-Hansen writing ball. And I just want to invite you to stare at this thing for a minute. And I think the longer you look at it, the more complicated this thing looks. I could probably tell my children that this is a steam powered radio telescope. And I think they would believe me. But if you use a keyboard today, it actually probably looks a lot more like the keyboard on the right than the one on the left. And I just ask you, you know, where did that complexity go?
And someone much smarter than me once pointed out that complexity is a little bit like entropy, which is to say that you can never get rid of it. And the best you can ever hope to do is just kind of move it around. In fact, if you look at this somewhat simpler keyboard, you'll find that it's not actually simpler. It's considerably more complicated than that thing that Nietzsche was using. It's just that all the complexity is hidden from you. The complexity is over there. It's hidden where you don't have to deal with it. The tool has taken the complexity away from you and kind of embedded it in the tool itself.
It's probably not too hard to see where I'm going here with data science tools. An example from the RStudio world is breakpoints. In RStudio, you can click on a line number and you get a cute little red dot. And the next time that line of code runs, it will stop there and you can look around. This functionality is built on a lot of complexity. In fact, did anyone know that R itself has a setBreakpoint function? I'm guessing that most of you did not know this because even though R has this, it's really complicated to use. RStudio takes that complexity away and puts it in the tool so you can just kind of click in the gutter.
Search is another good example of this. If you use search or search and replace in RStudio or, as shown here, in Positron, the interface almost could not be simpler. You just literally type what you're looking for and you see a list of results. And again, this is not something that you have to use an IDE to do. You can do this at the command line with something like grep. But if you really want to do this well, you need to understand things like text encodings and recursion and ignore files and regular expressions. And we take all that complexity away from you. We put it in the tool so you can just type a word.
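To give a feel for the complexity hiding behind a search box, here is a minimal sketch of a project-wide search. It is not how RStudio or Positron implement search; the `search` function and the `IGNORED_DIRS` set are hypothetical, and a real tool would read ignore rules from files such as .gitignore and handle many more encodings.

```python
import re
from pathlib import Path

# Hypothetical ignore list; real tools read rules from ignore files instead.
IGNORED_DIRS = {".git", "node_modules", "renv"}


def search(root, pattern, ignored_dirs=IGNORED_DIRS):
    """Yield (path, line_number, line) for each line under root matching pattern."""
    regex = re.compile(pattern)                      # regular expressions
    for path in sorted(Path(root).rglob("*")):       # recursion into subfolders
        if not path.is_file():
            continue
        if any(part in ignored_dirs for part in path.relative_to(root).parts):
            continue                                 # ignore rules
        try:
            text = path.read_text(encoding="utf-8")  # text encodings
        except (UnicodeDecodeError, OSError):
            continue                                 # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield path, number, line
```

Even this toy version has to make decisions about recursion, ignored folders, encodings, and pattern syntax, which is exactly the work the tool takes off the user's hands.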
And so that's kind of the thing that you need to remember here: you can't get rid of complexity, but when at all possible, you can take it away from people and put it in your tool.
Meeting people where they are
I love this. This is a piece of art, not a real fork. This is from a collection by a Greek artist named Katerina Kamprani. And I love this thing, right? Because you can almost imagine how this could happen. You know, you could imagine someone coming to you and saying, hey, we work at a company that makes silverware. These forks are great, but could you make them a little more flexible? Or you know, wine glasses, they've been done. What if we made them, I don't know, more collaborative?
And I'm sure you've used tools that were made this way, right? They're optimizing for one particular attribute. And certainly this is sturdier than regular flatware, but it's also really uncomfortable to use. And so I think a lot of times as toolmakers, we make tools that are designed to achieve our own goals and we kind of forget that we're making things really uncomfortable for people using them. It is really important to meet people where they are.
And again, let me give you a couple of concrete examples of this. For example, you may have heard it said unto you that you should make a separate folder for each project. This is really good advice. If you make a separate folder for each project, you will be able to isolate dependencies. You'll be much more successful with source control software. You'll be able to reuse templates and all kinds of other great stuff. This is almost universally a good idea. But we don't make you do it.
Both RStudio and Positron open up in this mode where there's no project and they don't try to force you to make one. And this is because we've learned that sometimes people, when they open up the software, maybe they're not looking to use a project. Maybe they just want to add two numbers. Sometimes you just kind of want a quick scratch pad, right? A project is a good idea, but forcing users into that workflow from the get-go is not very ergonomic.
Here's another thing you may have heard. You should start with a fresh workspace every time. You probably heard this. And this is very good advice. This makes sure that you're always recreating the data from scratch and not having these long-running things that you can't reproduce. But we don't make you do it.
In RStudio, these are the current options, and you can argue whether these are the correct presets or not. Positron actually does this a little bit differently. We do not enforce this workflow. We do allow you to save and load the workspace, again, because sometimes you're not building a beautiful monument to reproducibility. Sometimes you just need a little bit of a scratch pad. And so we've kind of compromised a little bit on enforcing reproducibility in order to make a tool that feels more ergonomic.
Here's another thing you might have heard. Don't change the working directory. This is also very good advice. If you're manually changing the working directory, it usually implies that the code that you're writing is not going to be very portable. It's going to have problems with paths. Generally speaking, you want to make sure you're computing your path from a project root. But not only do we let you do this, we actually made a button that does it. Again, in the service of the fact that we understand that people need to do this. It's really important that your tools be ergonomic and sometimes at the expense of ideological purity.
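The advice to compute paths from a project root, rather than changing the working directory, can be sketched as follows. This is an illustration only: the `find_project_root` and `project_path` helpers and the marker file names are assumptions, not something RStudio or Positron provide in this form (R users may know the same idea from the here package).

```python
from pathlib import Path


def find_project_root(start, markers=(".git", "pyproject.toml", ".Rproj.user")):
    """Walk upward from start until a folder containing a marker is found.

    The marker names are hypothetical examples of files that tend to sit
    at the top of a project.
    """
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if any((candidate / marker).exists() for marker in markers):
            return candidate
    raise FileNotFoundError(f"no project root found above {start}")


def project_path(start, *parts):
    """Build a path from the project root instead of the working directory."""
    return find_project_root(start).joinpath(*parts)
```

A script could then call something like `project_path(__file__, "data", "raw.csv")` and get the same file regardless of which directory it was launched from, which is what makes the code portable.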
Empowering users through extensibility
So it's really important to listen to people. It's really important that you make your tool easy to use and simple. Another thing that you've really got to do is empower your users. Back in 2016, we created these things called RStudio add-ins. And the RStudio add-in API is actually fairly low-powered as extension APIs go. But the community took it and made just kind of an explosion of creativity with it.
We have this, this is one of my favorite add-ins called Datapasta. What this thing does is it, you should try it if you haven't, it lets you copy data from anything on the web and paste it into RStudio and it turns it into R code that creates that same data set. There's one called Shortcuts that literally lets you make anything into an RStudio shortcut. There's a really nice one called RegExplain. And what this one does is explains regular expressions. There's a color picker one. There's a styler one for styling your R code. There's a theme assister and there's a bazillion of these. And the community has just really surprised us with their creativity.
This is also true in VS Code land. And that is one of the reasons that Positron is built on VS Code is because VS Code has got something like 80,000 extensions. They have really managed to make it very easy for people to contribute experiences to the editor surface. And these extensions, they really run the gamut of possibility. All the way from something like a C and C++ development compilation and inspection environment to a cute little cat that you can adopt and put in your IDE.
In fact, if you'll allow me to nerd out just a little bit, bear with me for a minute here. I think that VS Code's extension model is actually a core reason for it being so successful. Prior to VS Code, a lot of editors, and by the way, I blame Emacs for this, a lot of editors load their extensions right into the editor itself. Emacs started this trend, as far as I know, and it gives the extensions a lot of power and literally makes the editor something that you can extend infinitely in a way that's as native as the editor itself. Emacs actually is a really great IDE. It's just missing a good text editor.
And this is the approach that Atom takes, for example, and the problem with this approach is that when you do this, you allow the extensions to compromise the core IDE experience. And so VS Code works a little differently. Extensions actually run separately from the IDE. They run in this thing called the extension host, and they only get to talk to the UI through what are effectively remote procedure calls. And this design has allowed VS Code, I think, to make a very good tradeoff between making a system that is extensible by the community, but also having a reputation for reliability and performance. This is why VS Code has 80,000 extensions and is still fast, and it's one of the design principles that we adopted for Positron.
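The idea of extensions talking to the UI only through remote procedure calls can be sketched in a few lines. This is a toy model, not VS Code's or Positron's actual API: the `EditorHost` and `HostProxy` classes and their method names are hypothetical, and the "wire" here is just a function call carrying JSON strings, where a real system would cross a process boundary.

```python
import json


class EditorHost:
    """Stands in for the editor UI; only methods listed in EXPOSED are callable."""

    EXPOSED = {"show_message", "get_selection"}

    def __init__(self):
        self.messages = []
        self.selection = "mean(x)"

    def show_message(self, text):
        self.messages.append(text)
        return True

    def get_selection(self):
        return self.selection

    def handle(self, raw_request):
        """Decode one JSON request, dispatch it, and return a JSON response."""
        request = json.loads(raw_request)
        request_id = request.get("id")
        method = request.get("method")
        if method not in self.EXPOSED:
            return json.dumps({"id": request_id, "error": f"unknown method: {method}"})
        try:
            result = getattr(self, method)(*request.get("params", []))
        except Exception as exc:  # a faulty call is reported, not allowed to crash the host
            return json.dumps({"id": request_id, "error": str(exc)})
        return json.dumps({"id": request_id, "result": result})


class HostProxy:
    """What an extension sees: it can only send serialized messages."""

    def __init__(self, send):
        self._send = send
        self._next_id = 0

    def call(self, method, *params):
        self._next_id += 1
        raw = json.dumps({"id": self._next_id, "method": method, "params": list(params)})
        response = json.loads(self._send(raw))
        if "error" in response:
            raise RuntimeError(response["error"])
        return response["result"]


host = EditorHost()
proxy = HostProxy(host.handle)  # a real system would send this across processes
proxy.call("show_message", "Selected: " + proxy.call("get_selection"))
```

Because the extension never holds a reference to the editor's internals, the host decides exactly which operations exist and can survive a misbehaving extension. That is the tradeoff the talk describes: less raw power for extensions in exchange for a core that stays fast and reliable.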
So in Positron, we kind of took this idea and we extended it. Just like VS Code has an API that allows you to build extensions, Positron has that API. We also added another API that makes it possible to add language packs. So in Positron, the R and the Python systems are actually extensions. I find it's easiest to think of Positron as almost like a data science workbench that has these kind of core features like a data viewer and a console. And then R and Python are more like extensions that plug into that workbench and give you specific language capabilities. But this thing is built on an API that allows anyone to make these, and we hope that you as a community are able to take this thing and build your own creative solutions on top of it.
This is another way that I like to think about this. If you think about the parts of the product that you build, certainly the core functionality is something that belongs inside your product. But if you allow users to build extensions to your system, what you'll find is that they can build out kind of this long-tail functionality that is not or maybe even shouldn't be part of the core of your system. So it is very important to give people the ability to extend your tool. I promise you will always be surprised at their creativity if you do this.
Making tools for results
So it's really important to make tools for people, but also for results. So if you think about the whole equation of like a person uses a tool, the thing that they make is a result, right? And when I say result here, I'm talking both about an artifact or an output like a PDF document or a PowerPoint presentation or a report, a dashboard, anything like that, as well as an effect, right? Less time searching for information or better organized information.
So I want to remind you what the effect of this thing was on the output or the result of Nietzsche's writing. It created aphorisms and puns and telegrams out of his writing. And so I think what this implies is that a tool's output is shaped by its design. And I would also argue, if you'll allow me to get philosophical for just a minute, that the world that we experience is actually made up of the output of many tools, like all those books that are out there that were written with that typewriter. But even like think about the stuff that you're experiencing right now, like these monitors, like this stage, the chair you're sitting in, your laptop, your lanyard, all these things are actually the end result of many, many tool outputs.
And so if it's true that a tool's output is shaped by its design and the world we experience is made from the output of many tools, it follows that a tool is a statement about what you want the world to become. I think many of us in this audience are tool makers, and it's important to realize that the tools that we make, because again the world is made up of these things, are actually statements about how we want the world to be.
So I want you to think about what statement this thing is making about what it wants the world to be. I would argue that what this thing is saying is that writing should be closer to thinking. You know, this thing removes a step between writing and thinking because you no longer need to form the letters by hand, you just press a button and the letter appears, right? It moves your thoughts closer to your output. And as a result, what you write becomes closer to what you think.
Imagine something as simple as a garden variety leaf blower. This is also a tool that says something about the world, you know? You might think that this says something like, sidewalks should be clean, but what this tool actually says is that no one deserves to sleep after 6 a.m.
Or even think about this VS Code Pets extension, right? What does this tool say about the world? What does this tool want the world to become? I would say this tool says the world should be more fun and whimsical, right? Even something as simple as this has a point of view.
A few slides ago, I showed you this diagram of Positron's architecture. And this API that we made that lets you plug different languages into Positron is also a tool. And because it's a tool, it says something about what we want the world to be, right? We think that there should be room for many languages and they should work together.
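To make the idea of a pluggable language API concrete, here is a minimal sketch in Python. It is not Positron's actual API: the names `LanguageRuntime`, `Workbench`, `PythonRuntime`, and `EchoRuntime` are all hypothetical, and the only point is that the workbench knows about an interface, not about any particular language.

```python
from typing import Dict, Protocol


class LanguageRuntime(Protocol):
    """Hypothetical interface that a language plug-in would implement."""

    name: str

    def execute(self, code: str) -> str:
        ...


class PythonRuntime:
    name = "python"

    def execute(self, code: str) -> str:
        # Evaluate a single expression. A real runtime would talk to a
        # separate kernel process instead of calling eval() in-process.
        return repr(eval(code, {}))


class EchoRuntime:
    """Stand-in for any other language: it only reports what it was asked to run."""

    name = "echo"

    def execute(self, code: str) -> str:
        return f"[echo] {code}"


class Workbench:
    """Hypothetical host that routes code to whichever runtime is registered."""

    def __init__(self) -> None:
        self._runtimes: Dict[str, LanguageRuntime] = {}

    def register(self, runtime: LanguageRuntime) -> None:
        self._runtimes[runtime.name] = runtime

    def run(self, language: str, code: str) -> str:
        if language not in self._runtimes:
            raise KeyError(f"no runtime registered for {language!r}")
        return self._runtimes[language].execute(code)


bench = Workbench()
bench.register(PythonRuntime())
bench.register(EchoRuntime())
print(bench.run("python", "1 + 2"))
print(bench.run("echo", "summary(df)"))
```

Adding another language in this sketch means writing one more class with a `name` and an `execute` method; the workbench itself does not change.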
What Positron's tools say about the world
So I want to talk for a couple minutes about the tools at Positron and the things we want them to say to the world. One thing that our tools always want to say is that science should be reproducible. So I've worked on IDEs. Let me talk a minute about how we do that in our IDEs. For example, again, looking at RStudio's data viewer, you'll notice one thing that this data viewer doesn't have is an edit button. And the reason it doesn't have this isn't because no one's ever asked for it or because it wouldn't be handy. It's because it is so easy, if you have an edit button, to start editing your data by hand and creating a workflow that's not at all reproducible.
I know you're thinking, wait a minute, Jonathan. A few slides ago, you told me to put in stuff that users want. And the truth is that this is not actually an easy decision. You kind of have to decide on the tradeoffs here. In Positron, we've tried hard to make this even more reproducible by adding a button that actually converts the sorting and filtering that you've done in the data viewer into code. So you can take the buttons that you clicked and the sorts and filters that you added and actually create code that does those same steps so that you can do them again and again.
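As a rough illustration of turning clicks into code, the sketch below records sort and filter actions as small objects and renders them as pandas statements. This is an assumption-laden toy, not how Positron's data viewer is implemented: the `Filter` and `Sort` classes and the `steps_to_code` function are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List, Union

ALLOWED_OPS = {"==", "!=", "<", "<=", ">", ">="}


@dataclass
class Filter:
    column: str
    op: str
    value: object


@dataclass
class Sort:
    column: str
    ascending: bool = True


Step = Union[Filter, Sort]


def steps_to_code(frame_name: str, steps: List[Step]) -> str:
    """Render recorded viewer actions as a sequence of pandas statements."""
    lines = [f"result = {frame_name}"]
    for step in steps:
        if isinstance(step, Filter):
            if step.op not in ALLOWED_OPS:
                raise ValueError(f"unsupported operator: {step.op!r}")
            # repr() quotes strings and leaves numbers bare.
            lines.append(
                f"result = result[result[{step.column!r}] {step.op} {step.value!r}]"
            )
        elif isinstance(step, Sort):
            lines.append(
                f"result = result.sort_values({step.column!r}, ascending={step.ascending})"
            )
    return "\n".join(lines)


# Two "clicks" in a hypothetical viewer: one filter, one sort.
clicks = [Filter("year", ">=", 2015), Sort("score", ascending=False)]
print(steps_to_code("df", clicks))
```

The generated text can be pasted into a script, so the interactive exploration leaves behind something that can be rerun and reviewed.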
I want to take you back for a minute just to rstudio::conf 2019. You may recall at the time, these low code and no code tools were kind of having a moment. And Tareef gave this presentation about basically just coming out in support of code and saying, we actually love code. It's repeatable. It's inspectable. It's reusable. It's diffable. And in the age of AI, I actually think a couple more bullets should be added here. Code is ingestible. Code is what these models read. And if you have expressed your tools and your workflow in the form of code, the models can actually read that code and understand what you're doing. Similarly, code is disgorgeable, is that a word? These models also write code. So if you are kind of embracing a code first data science workflow, you've actually kind of already positioned yourself to get the best out of what this new generation of AI tools has to offer.
So science should be reproducible. How about this one? Science should be free and open. This is something you've heard us say a lot. When you hear it, you probably think a lot about our open source packages and models. But this is also true of our IDEs. You know, when we built RStudio, everything was kind of packaged into one unit, right? We have R execution and debugging and code formatting and all this stuff was kind of in one bit. But in Positron, we've tried to make the system more free and open.
So again, I'd like to think of Positron as like a data science workbench. And for the R components of Positron, we actually built a couple of reusable systems that can be used outside of Positron. So Positron's code execution engine and its R formatter and so forth, they're actually separate subsystems that we've contributed to the community to use in other ways. This is not just theoretical. For example, I don't know if any of you use the Zed text editor, but if you go and look at how to run R code in Zed, what they'll tell you is to go get the Ark kernel, which is Positron's code execution engine, and plug it into Zed so you can run R code there.
Have you used tree-sitter-r? I'm guessing you think you haven't, but if you've ever looked for R code on GitHub in the last few months, you have. The R code navigation system on GitHub is actually powered in part by code that we wrote for Positron. We've contributed that back to the community and back to GitHub. The same thing is true of that R code formatter. So this is the VS Code plug-in, you can get this today, it's on the marketplace. You can use Positron's code formatter inside of VS Code, and you know what? You can also use it inside of RStudio. It works great there.
RStudio and Positron: complementary tools
I just want to say one more thing here before we wrap up. I have kind of experienced a lot of feedback about what's going to happen to RStudio in the world of Positron, and people seem to think that we're going to remove RStudio in favor of Positron, which is not true. So let me kind of tell you how I think about this.
This is the picture that kind of comes to my mind. If you're like me, you actually have both of these tools in your toolbox. I've got a Phillips screwdriver that I often reach for, but I also have one of these screwdrivers that has multiple bits that I can kind of swap out as I need. And the truth is that some of these things are useful or not at different times.
One thing that is true about making tools is that as you make a tool more and more generic, it kind of becomes less and less good at any one particular thing. RStudio is a great tool for R in part because it only does R. It is very well built for that purpose, and if that is all you need, then it is always going to be better than a tool that's more like a Swiss army knife that's got a bunch of attachments hanging off of it. Sometimes you do need the Swiss army knife, and I often find myself reaching for Positron. But RStudio is very fit for purpose, and a lot of times when I'm reaching in my toolbox what I want is the tool built for that specific task.
Summary
So just to kind of remind you of what I've talked about here, I think it's really important to make tools for people. You need to listen to them, you need to watch them, not in a creepy way. You need to empower them. People are one of the most important components of a good tool. You need to make tools for results. I want you to remember that you need to shape the tool according to the good that you want it to do.
And finally, if you literally remember nothing else from this keynote, this is the thing I want you to remember as a tool builder, and that is that a tool is a statement about what you want the world to become. As people who build tools, this is an enormous responsibility, but it is also a huge, huge privilege.
Q&A
And with that, I think we'll take some questions. Thank you.
So if you want to ask any questions, you can ask them on Slido. Again, that link is in the conference app. I am going to pick the questions based on the number of votes, plus whether I think they're cool or not. So I'm going to ask one. It's kind of a fun one. It didn't get so many votes, but I think it's interesting. What's the most surprising thing you've seen that you never expected someone would do with RStudio or Positron?
The most surprising thing that I would never have expected anyone would do with RStudio and Positron. I think it's not too surprising to see the actual things that people do with it. I have been very surprised to see where people want to run this thing. Someone tried to make it work on a Raspberry Pi, and I was like, all right. I have seen people try to put it onto supercomputing clusters where it had no business being, and a lot of other really, really unique places.
Did you see that Hacker News article about running a web server on a disposable vape? You know, I did see that article about running a web server on a disposable vape, and now I kind of want to put Positron on it.
A quick question. Is tighter R integration on the roadmap, like detecting the R version from the renv lock file? That's a good question. So the answer is yes. So we're continuing to innovate and develop Positron's R capabilities. I want you to remember the slide I shared a minute ago that had a picture of a screwdriver with a bunch of attachments on it. So the R support in Positron is like, it's a little attachment that goes into Positron. And so it is not yet as fully developed as RStudio is, and to be totally frank, it probably never will be. But we are making it better all the time. And yeah, better support for detecting the R version that you want to use automatically is definitely on the roadmap.
What are some other languages that you see as kind of up and coming in data science? And is there anything on the roadmap for us at Posit in terms of supporting those things? That's a really good question. So there are a lot of languages that are really interesting for data science. Honestly, I think one of the most interesting languages for data science coming up right now is JavaScript. To people who do data science, it seems like a weird fit, but honestly, so did Python. So people are doing a lot of really interesting data science right now with literally just JavaScript.
So, in fact, I've experimented a little bit with putting a JavaScript system into Positron so you can literally just write JavaScript code. It's not actually available in any build, but I think that JavaScript as a language is very interesting for data science. And I like SQL. Is better SQL support on the roadmap too? Yes, SQL support is also very interesting to us and on our roadmap.
The other thing, and don't stone me to death for this, that I think is kind of interesting possibly is like SAS support. Does anyone have a rock? I just think that's sort of really interesting that we're seeing a lot of companies want to move from SAS to other languages, but in some ways, with VS Code and Positron and the alternative SAS engines like WPS and the language engine, there are sort of interesting ways of running that very old code on very modern systems, and I just think that would be hilarious if we added that.
Does hiding complexity do a disservice to the user and ultimately create a dependence on the tool creator and lead towards more users who don't know what they want? That's a really good question. Who asked that? It's Scott. Scott, that's a great question.
And so I think that this kind of requires a nuanced answer, and that is that it is true that it's not always the right call to hide complexity. And you kind of have to decide at what level of abstraction you want your tool to sit. And you also have to think about whether or not it's important for your user to understand what's happening. If you think about the tools that we have today for data science, really like the whole software stack, I think really nobody understands the software stack all the way down to zero. At the end of the day, every tool is abstracting away some complexity from you.
For example, if you use R, it's really not fair to say, well, are we really doing users a disservice by not showing them the raw assembly? Probably not, right? Like there's actually never a need for a user to know that, right? On the other hand, there are some places in RStudio, and I didn't have these in my slides, but for example, when you import data using RStudio, it actually doesn't just import it. What it does is it generates the code for you to run to do the import. It uses it to kind of teach you the code to write. So it is certainly true that you can create an artificial dependence that way. It is also true that you can use your tool to help teach people how to operate at a lower level of abstraction.
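The "generate the code, then run it" pattern described for data import can be sketched like this. It is a toy under stated assumptions, not RStudio's import dialog: `import_code` and `run_import` are hypothetical helpers, and the generated snippet uses Python's standard `csv` module only so that the example is self-contained.

```python
import os
import tempfile


def import_code(path: str, delimiter: str = ",", variable: str = "rows") -> str:
    """Return Python source that performs the import, instead of importing directly."""
    return (
        "import csv\n"
        f"with open({path!r}, newline='') as handle:\n"
        f"    {variable} = list(csv.DictReader(handle, delimiter={delimiter!r}))\n"
    )


def run_import(path: str, delimiter: str = ","):
    """Run the generated code and return both the data and the code shown to the user."""
    code = import_code(path, delimiter)
    namespace: dict = {}
    exec(code, namespace)  # the tool executes exactly the text it displays
    return namespace["rows"], code


# Demo with a throwaway file standing in for the user's data.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("name,score\nada,3\ngrace,5\n")

rows, shown_code = run_import(tmp.name)
os.remove(tmp.name)

print(shown_code)
print(rows)
```

Because the tool only ever runs the snippet it displays, the user can copy that snippet into a script and stop depending on the dialog.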

