
Updates from Posit, with Hadley Wickham, Charlotte Wickham, George Stagg, and James Blair
6:45 Hadley introduces the conference. 10:37 Hadley on Posit, PBC. Who are we, and what we do. 20:41 Charlotte Wickham on Quarto, an open-source scientific and technical publishing system. 31:05 George Stagg on webR, R for WebAssembly. Execute R code in your web browser. 43:34 James Blair with the latest on Posit's partnerships with Databricks and Snowflake. Please join us for our first Posit Conf 2024 keynote, where we’ll tell you about our mission, our products, and some of the exciting things we’ve been working on over the last year. Hadley Wickham, Chief Scientist, will talk briefly about Posit’s mission and products, before introducing the three speakers who will update you on some of the coolest projects we’ve worked on over the last year. James Blair, Senior Product Manager, will give you the latest on our partnerships with Databricks and Snowflake, and how we’re building seamless integrations that let you focus on data science instead of dealing with technical details. Charlotte Wickham, Developer Educator, will show you what’s new in Quarto, focusing on new ways to build beautiful PDFs with Typst. Finally, George Stagg, Senior Software Engineer, will tell you about the latest innovations in teaching using webR, a tool that lets you compile your R code into standalone HTML files. Talk by Hadley Wickham, James Blair, Charlotte Wickham, and George Stagg
Transcript
This transcript was generated automatically and may contain errors.
Good morning everyone. Welcome to PositConf 2024.
I'm so excited to see so many of you all here in the room, and I know there are many more joining us online as well.
A few important pieces of information. The number one thing you need to know about is the app that has everything that's going on, regardless of whether it's our awesome speakers, or some of those fun social events we have planned for you.
Also, two important links in the app. The first is in many of the sessions, you'll be able to ask questions of the speakers through Slido. And if you want to engage with others online, whether they're here in person or not, you can use our Discord server.
For the first time this year, we're trying out something fun. We have a little bit of a conference competition going on. You can enter to win prizes, like free tickets to next year's Conf, or sticker packs, or signed books.
The competition is going to ask you some fun questions about you, and we're going to present those in a shiny Quarto app, of course. You'll be able to see those statistics live, updating in the lounge. If you haven't heard of the lounge, you've already walked through it on the way to the keynote. But this is the best place to sit down, grab a coffee, talk to our sponsors, and if you have any questions about Posit products, open source, or commercial, we'll have a bunch of folks there to answer any questions that you might have.
If you need to identify a Posit employee, the shirt color is a little subtler than it has been in the past, but we are wearing a bluish green, and you all are wearing a greenish green.
We really want everyone to feel safe and welcome at Posit Conf, so as part of your registration process, you've already agreed to our code of conduct. If you experience any problems, or see anyone else having problems, please let us know. You can approach any Posit employee, you can come to the registration desk, or you can email conf at posit.co.
We've also given you a bunch of information available on your lanyards and badges, so if you want to indicate whether you'd prefer hugs, or handshakes, or no contact, you can grab one of our contact pins. We ask you to respect folks' pronouns on their badges, and finally, if you see anyone wearing a red lanyard, that's an indication that they'd like to not appear in any photographs.
We also have a bunch of spaces reserved if you have kids with you, and they need somewhere to run around while you watch the talks on your computer. We have a family room on floor six. We also have a quiet zone there if you want to do some work or just chill out. On the fifth floor, we've got a meditation and prayer room, and a lactation room, and every room has gender-neutral bathrooms.
That brings me to the most important rule of PositConf, and that is the Pac-Man rule. The Pac-Man rule is if you're standing in a group, please make sure you leave Pac-Man's mouth open so that new people can join the group.
We know this is a big conference, and it can be intimidating to meet people, so at lunch today and tomorrow, we have birds of a feather stations. This makes it easy to sit down and talk to someone because you've got a guaranteed shared common interest.
About Posit, PBC
Hopefully, you get the idea that community is important to us, and community is important to us not just at Conf, but to the company as a whole.
I wanted to take a couple of minutes of your time while I'm up here in front of you all to talk a little bit about who we are, what we do, and then I'm going to introduce three of my colleagues who are going to tell you some of the particularly cool stuff we've been working on in the last few months.
So, to understand who we are, I think it's worth looking at our name. Now, this is not a particularly complicated name, but there are two parts, and I wanted to talk first about the second part, this PBC, or Public Benefit Corp, or B Corp. So, what is a PBC?
Well, if you imagine all organizations falling on some sort of continuum, on one side, you've got corporations. In the US, at least, the sole goal of an LLC is to maximize shareholder revenue, shareholder value.
And this is not the worst thing, right? Corporations have been a tremendous economic engine for the world over the last hundred years, but at Posit, we don't just want to make money, we want to do something good in the world as well.
And one way to do something good in the world is to be a charity. But being a charity is really hard, because you have to ask people for money, and they have to give you money out of the goodness of their hearts. And while there are many incredibly valuable charities, at Posit, we believe we can provide you some more immediate benefits as well, alongside this longer-term mission to create really amazing open-source tools for data science.
We can give you some tools that actually pay off right away, and that you're hopefully willing to pay us for. And so that's why we're a PBC. A PBC kind of lies in the middle. It has responsibilities towards its shareholders, but also to its employees, to the environment, and to the community.
And every PBC has to have some kind of fundamental mission as well. And so this is our mission, to create free and open-source software for data science, scientific research, and technical communication. This mission is something that's so important to us that is fundamentally encoded in our corporate DNA.
So that's the PBC side of the name. Well, what about the Posit side? Well, many of you know we used to be called RStudio, and that's because we mostly cared about R. In the very early days, certainly we were 100% R, but it hasn't been true for a very long time, because even over the last 10 years, we've started to produce tools for Python, and SQL, and Julia, and other languages.
And you may have heard JJ and Tareef talk about our goal of being a 100-year company. And when you think about that kind of time span, no matter how much you might love any programming language today, it seems pretty unlikely that it's going to be around in 100 years' time.
So for Posit to exist for 100 years, we need to be willing to embrace many different languages over time. The other thing we need, if we want to be around for a long time, is money, and probably bags of it.
And that brings me to the kind of economic engine at the heart of Posit, which we call the virtuous cycle.
So the virtuous cycle starts with us creating free and open-source software that anyone, anywhere in the world can use, regardless of means. And millions of people use those tools. And then some of those people go on to join organizations, larger organizations that have kind of special needs, and that leads them to our commercial products. And by buying our commercial products, you both give us the money to support further investment in free and open-source software, but you also give us data. Data about what are the challenges facing data scientists today that we can solve in both our open-source tools and our commercial tools.
So the way we think about the split between the free and the paid is generally the tools we provide for doing data science are free. So that's tools like the Tidyverse and Tidymodels for doing data science in R or many of the hundreds of R packages that we maintain. It's also true for tools like Shiny, which allows you to create interactive apps for R and Python, or Quarto, which allows you to create data-driven documents using R and Python and Julia and other languages.
When you want to bring those tools into your organization, that's where our commercial tools come into play. Because large organizations have special needs. They have needs for things like authentication and monitoring, security, for scaling and tuning and data governance.
And we also charge for products that live in the cloud, regardless of whether those are our own products, like shinyapps.io or Posit Cloud, or tools from our partners like Databricks and Snowflake, or for generalized compute services like AWS or Azure or Google Cloud Platform.
So at the heart of Posit, there are really three beliefs. First of all, we believe in open source, that the era of using proprietary software to do data science is over. That open source provides such a benefit, is such a great way of doing software development, because it brings together expertise from many, many people all around the world.
We also believe in code, and clearly not getting our fonts correct. We believe if you're doing data science every day, that the benefits of learning to code massively outweigh the costs of learning to code.
Now, this is certainly not true for everyone, right? Not everyone uses data science every day, and there's a bunch of different user interfaces that might appeal to different types of people. But fundamentally, what we believe in and the audience that we want to serve are people who write code.
And so what we want to do is take open source, code-first data science, and make it ready for the enterprise, so that you can take the same tools that you used for free when you're learning, and regardless of whether you join a 10-person little business or a 100,000-person massive organization, you can take those tools and continue to use them effectively.
So what are those tools? Well, the first one, and to my mind at least, the most important, is Connect, because it connects data scientists to decision makers. Connect makes it easy to deploy whatever you create as a data scientist, regardless of whether that's a Shiny app or a Flask API or a Quarto document or an R Markdown document, and get that out of your hands and into the hands of the people, the decision makers in your organization.
Posit Workbench is about helping data scientists, teams of data scientists, collaborate using shared compute and providing seamless authorization and authentication so you can just get the data you need without having to fuss around trying to find the right database driver or randomly Google and finding Stack Overflow answers from five years ago. And Workbench helps you do that regardless of whether you use RStudio or JupyterLab, Jupyter Notebooks, or VS Code.
Posit Package Manager is all about bringing open-source R packages and Python packages into the walled garden of your IT organization so that your IT folks don't need to worry about you installing random code from the internet and you can still use all of the tools that you're used to.
As James mentioned in the intro video, Posit Academy is all about creating new data scientists using a project-based apprenticeship program that brings together data scientist mentors from Posit with mentors inside your organization so that folks learn the way that you all do data science.
And then finally, we have Posit Cloud, which solves problems for organizations, often small and medium businesses and education, that don't want to run their own IT organization. It allows you to compute in the cloud and publish your results to share without having to have a big team of people in your organization running your own hardware.
So that's Posit, PBC. We are a company that has a mission. We want to make free and open-source software to make data science, scientific communication, and technical publishing better. And we do that by selling products that hopefully also make your lives better inside companies as well. So this is a mission that really aligns with my personal mission and is one of the reasons that I joined RStudio, and that I continue to work at Posit today.
Next up, I'd love to introduce three of my colleagues who are going to speak about some of the cool stuff that we've been doing as a company lately. In reverse order, we have James Blair, who's gonna talk about Databricks and Snowflake. George Stagg is gonna talk about WebR and Charlotte Wickham is gonna talk about Quarto.
Quarto updates
It's a pleasure to be here. Coming to a conf keynote always takes me back to being a little kid on Christmas morning. I get so excited about waiting to see what presents are waiting for me under the tree.
And back then, just like now, there's always someone else there trying to spoil it, my big brother Hadley.
So the present for me two and a half years ago was Quarto. But the real present has been seeing what you've built with it. And we've seen some absolutely beautiful websites like this one from Real World Data Science. And I've seen you exquisitely wrap up your reports with custom output like this one from Megan Hall and this one from the R for the Rest of Us team. You can learn more about both of those if you head to the Pour Some Glitter on It session tomorrow.
I really love to see people use Quarto for documentation. It's a little bit more like getting a pair of socks for Christmas. It's like slightly underwhelming when you open them but every day you put them on, you get a little bit of joy. And the less time you spend worrying about how to build documentation, the more time you get to spend writing it and the more people you get using what you've built. And in this case, that's a tool for spatial equity and you can learn more about that from Gabriel Morrison in the data science case studies session.
I'm here today to talk about two things that the Quarto team has been working on in the last year.
And the first comes from looking back, looking back to the R Markdown ecosystem and seeing what was missing in Quarto. And that was dashboards. So Quarto dashboards are an HTML format designed for the full page layout of plots and tables and values. And this is one I found out in the wild, somebody already using it for the Shelby County Department of Housing.
The fundamental unit in a Quarto dashboard is a card. And these cards get laid out in rows and columns.
What does it look like to build a Quarto dashboard? Take a look at this HTML document. It's got a table, it's got a plot, it's got a map. What does it take to turn that into a dashboard? Well, you simply swap out format HTML for format dashboard. All of the computational outputs are going to get put into cards and then those cards get laid out on the page.
By default, that happens in a single column. You can see they're all stacked on top of each other. What happens if you want a two-row layout? All you have to do is add two markdown headings. It's as simple as that.
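A minimal sketch of the swap described above. The title, the cell contents, and the text of the headings are placeholders rather than the document shown on the slides; the `format: dashboard` option and the use of level-two headings to define rows follow the Quarto dashboard guide.

````markdown
---
title: "My dashboard"   # placeholder title
format: dashboard       # previously: format: html
---

## Row

```{r}
# the output of each cell is placed in its own card
knitr::kable(head(mtcars))
```

## Row

```{r}
# two cells under the same heading share the row, side by side
plot(mtcars$wt, mtcars$mpg)
```

```{r}
plot(mtcars$hp, mtcars$mpg)
```
````

Without the two `## Row` headings, the three cards would stack in a single column, as in the default layout mentioned above.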
You can put anything in a Quarto dashboard that you could normally put in a Quarto document. But there are some specific dashboard components. So, for example, there are these value boxes. These are ways to have a single value but in a visually attractive way. And there is this idea of input panels, which is a way to group together interactive inputs into sidebars or toolbars.
You can also add interactivity in a Quarto dashboard any way you can in a Quarto document, including native support for Observable JS. You can use frameworks like Leaflet and Plotly and threejs through htmlwidgets in R and Jupyter Widgets in Python. And, of course, you can use Shiny. You can add Shiny and Shiny for Python inputs and outputs directly in code cells.
You can learn more about dashboards by heading to quarto.org and checking out our guide. If you want to see a dashboard in action at Conf, head to the "It's Not R or Python, It's R and Python" session. You can hear Nic Crane and Alenka Frim talk about a dashboard they've built that's got both Python and R components.
Now, if you've been paying really close attention, you'll notice I've just recommended three sessions. They're all happening tomorrow at 1 p.m. And life is never easy. There's actually a fourth session also with a great Quarto talk. So you can also head along to level up your data science skills to hear Cynthia Huang talk about some interesting things she's been using Quarto for in terms of knowledge management.
And I don't know what I'm going to be doing at 1 p.m. tomorrow. Probably just sitting somewhere completely confused and disabled by too much choice.
Quarto and the future of PDF
The next thing I want to talk about comes from looking towards the future. In particular, the future of PDF.
Let me tell you about our PBC report. This is a report that Posit makes every two years that tracks our progress toward our mission. And it is primarily a PDF document. Here's the cover of that document. Inside, there's some pages that have this dark blue hex background. There's some pages that have a white background but with some full bleed photos. Text is laid out in two columns. Some pages have a banner at the top. There's plots in there. This year, we made that report with Quarto.
And if you've made PDFs with Quarto before, you know that they go through LaTeX on that journey. But we did not make this report by going through LaTeX. We used Typst. Typst is a new open source framework for producing PDFs. And it's already supported by Quarto. To use Typst, you simply swap out format PDF for format Typst.
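As a sketch, the swap is a one-line change in the document's YAML header (the title here is a placeholder):

````markdown
---
title: "Report"   # placeholder title
format: typst     # previously: format: pdf
---
````

Rendering the document with `quarto render` then produces the PDF through Typst rather than LaTeX.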
We think Typst is the future of PDF because it's really easy to get started. Quarto bundles Typst, so there's no separate install to manage. You can do it today. Typst is really fast. So rendering a PDF via Typst is much faster than rendering a PDF via LaTeX. We think it's easier to learn. There's great documentation on the Typst site. And the syntax feels a lot more familiar to people that are coming from languages like R or Python or Julia. And it's very customizable. So our report being an example of that. It took a little work to get that customization, but you can produce really beautiful documents.
So we think it's the future of PDF. It maybe isn't quite the present of PDF. Typst is still developing, and our support in Quarto is still developing too. But I do want to show you something really cool you can do in Typst right now.
So take a look at this table. This is an HTML table. It's built using the Great Tables package in Python. And it's an example from Grant Chalmers. It's on the Great Tables documentation site. And I think it's a beautiful table.
And GreatTables and GT and a lot of table packages really focus on absolutely beautiful HTML output. So what happens if you switch to Typst? Well, behind the scenes, Quarto takes all the CSS that makes that HTML table really beautiful, and it converts it to Typst properties. And what that means is when you get your PDF out, that table looks just as good as it did in HTML, but now it's in a PDF. And not only that, like Typst is sort of good enough to know that, hey, this table's pretty long. It's not going to fit on one page. I'll break it over two pages, and I'll repeat the headers and footers and make it all just work for you. It's really cool.
Community contributions to Quarto
The last thing I wanted to talk about is not what we've been working on, but what other people have been working on that enhances Quarto for everyone.
So when Quarto was built, it was great because it brought all these features of R Markdown to Python users without requiring them to install R. But if you were a Julia user, the first thing you would see is, hey, you need to install Jupyter, and that's a Python thing. That is no longer true. There is a native Julia engine, so if you want to run Julia code without having to have an R install or a Python install, you can now do that. And that was an entirely external contribution from Julius Krumbiegel.
The other contributions I want to talk about come in the form of extensions. And there are Quarto extensions to add QR codes to your slides, and there are Quarto extensions to add countdowns to your slides. There's already a Quarto extension that gives you Tufte-style output via Typst. And there are so many extensions that you should check out this other contribution. This is a Quarto extension listing built by Mickaël Canouil. And it's a great place to go to see what's new in the extension world and what's popular in the extension world.
If you want to see some Quarto extensions here at Conf, you can see James Goldie talking about an extension he's built to bring the power of Svelte graphics to Quarto documents. And you can also hear Andrew Bray talk about an extension he's worked on with James to bring scrollytelling to Quarto documents.
If I haven't put enough on your schedule already, you can find the Quarto team in the lounge today at noon.
It's been an absolute pleasure telling you what we've been working on and showing what you've been working on. And I hope that together we can keep Quarto as the gift that keeps on giving.
Now you're going to hear from someone that's already given many gifts to the Quarto community. Please welcome George Stagg.
WebR and Quarto Live
Hello, my name is George Stagg and I'm the lead developer of the WebR system, a way to run R code directly inside your web browser.
So I've been working for Posit now for about two years and I'm still really excited to tell people that. Posit's mission of making great open source software really resonates with me. I've been making open source software for over 10 years now. I started in research software developing for HPC and over that time my approach to open source has always been no matter what I do, no matter what decisions I make, they must always be in the interest of the user.
And it was that philosophy that originally led me to create WebR. At the time I was supporting a group of students who were remote up and down the UK and we needed to evaluate R code in a consistent environment and capture those results. To me this felt like the best solution for the job.
If you've not seen WebR before, here's an example of a WebR application. It's a website you can visit and when you do an R console pops up and you can type code in and you can do visualizations and you can really experiment with R in a very easy way, very quickly. This is really great for new users of R who don't have something like RStudio installed. And Hadley earlier talked about the idea of code first data science. What could be more code first than opening a web browser and typing in some R code?
So how does this work? The main technology is called WebAssembly and what WebAssembly is, is a portable binary code format. So what does that mean? Really this just means that if you take a piece of software and compile it, you get a big list of instructions to execute. The word portable here just means that those instructions, the binary is the same on every machine. And that's done by running that binary inside a web browser. It's really designed to run high performance applications in this way. It does work with most modern web browsers on many devices.
And one of the things it's really great at is security because it's really designed to run on the web. It runs your code in a containerized sandbox environment. In a web browser, you interact with WebAssembly through JavaScript and that has positives and negatives. It means that WebR itself is a JavaScript application. So you need to write JavaScript to use it. And a positive of that is that it means it works great with other web technologies. A downside of that is that you probably don't love writing JavaScript code. At least not in the same way that the people in front of me love writing R code or they love writing Python code.
So for that reason, in addition to working on WebR, we produce tools built upon WebR to make your job easier.
Tools like ShinyLive. At PositConf last year, Joe Cheng had a talk introducing ShinyLive for R built upon the WebR technology. And over the year since then, we've been continuing to update the system. There's been several releases. And you can use ShinyLive now in a web browser to interact with live Shiny examples. You can export existing apps using an R package. Or you can use a Quarto extension to embed Shiny applications directly into your Quarto documents.
Again, the website example is great. You can go on to a URL and type some Shiny code and it just appears. And I've actually seen this used at workshops to teach people how to use Shiny and to share examples.
And that takes me on to one of the demographics that have really taken ShinyLive and run with it. And that's educators. They've really embraced the idea of WebAssembly-powered educational content. And for that reason, we thought to ourselves, well, is there more we can do? Can we get more WebAssembly into Quarto documents? And the answer to that is yes. And I've been working on a new extension for Quarto over the last six months.
And we're calling it Quarto Live. Quarto Live is a way to get interactive code examples in R in your Quarto documents. You can directly interact with that code and you can create code exercises so that if you're an educator, you can create very easy ways for students to interact with your code and get quality feedback.
We've designed Quarto Live to be really, really easy to use both for authors and for the students and learners.
So what does it look like? So let's say you have a Quarto document and it looks a bit like this. You've got some R code and it's producing a visualization. Here we have some air quality data from 1970s New York and it's plotting some temperatures and some ozone levels. Now, let's say you want to turn this into an interactive example. Much like in Charlotte's examples, you change the format into a live format and you change your R block into a webr block. And if you have the Quarto Live extension installed, that's it. You do nothing else. It just works.
You can see the output looks pretty much identical. That's by design. And the only real difference is that your source code above your output is now an editable code block that you can interact with and click the buttons to rerun your code. Now, this is great for students because someone who's interacting with this content for the first time may be interested. They want to know more. They want to play with this example. Previously, they would have to copy that code and open RStudio and switch to a different window. But now they can literally just change the code to plot wind instead of temperature, hit the button, and it's there in the same page inside their document.
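A minimal sketch of that change, assuming the extension has been installed from the r-wasm/quarto-live repository (for example with `quarto add r-wasm/quarto-live`). The `live-html` format name, the `{webr}` cell, and the include line for the knitr engine are taken from the extension's documentation and may differ between versions; the plotting code is a stand-in for the example on the slides, using R's built-in `airquality` data.

````markdown
---
format: live-html   # previously: format: html
engine: knitr
---

{{< include ./_extensions/r-wasm/live/_knitr.qmd >}}

```{webr}
# previously an ordinary {r} cell; now editable and runnable in the browser
plot(airquality$Temp, airquality$Ozone,
     xlab = "Temperature (F)", ylab = "Ozone (ppb)")
```
````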
By the way, one of the things I'm really excited about this is that those are no longer pictures that are distributed with your document. They're generated dynamically on the fly every time you open that document. That means they're kind of reproducible by default, right? Because the fact that you can see that image at all means the R code in your document has executed successfully and produced some output.
If you're an educator and you want to augment these code examples into an exercise, you give it an exercise label. At that point, you can then add things to make that exercise more and more interactive. Here's an example where we're asking a student to plot some data, right? And what we're going to do is we're going to augment that with some setup code. So we create another WebR block. We tell it it's a setup block and this is the exercise that it's linked to. And what we've done is we've set up that data set for the students so they don't have to worry about it. They don't have to worry about loading that data. It's just there. They can just directly answer that question without having to worry about the extra parts. And that's because a setup block will run automatically every time when that student hits the run button.
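As a sketch, an exercise with a linked setup block might look like the following. The `exercise` and `setup` cell options are those described in the Quarto Live documentation; the label `ex_plot` and the `reviews` data frame are made up for illustration.

````markdown
```{webr}
#| setup: true
#| exercise: ex_plot
# runs before the student's code, so `reviews` already exists for them
reviews <- data.frame(day = 1:5, rating = c(4, 5, 3, 5, 4))
```

```{webr}
#| exercise: ex_plot
# the student fills in the blank
plot(reviews$day, ______)
```
````

The shared label is what ties the setup block to the exercise.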
So let's say the student tries to run their code. They see, okay, there's a blank. I need to answer this question. But they're struggling. They're new to this. They're not quite sure what's going on. Can we help? Well, we could add a hint and we do that using a Quarto block. Here's a Quarto block with the class hint and we've, again, linked it to our exercise using that same shared exercise label. Inside there, you can write your hint and you'll notice it's not added to the document straight away. Instead, Quarto Live adds a button to the exercise saying hint and when the student clicks it, then they get the hint text contained in your block. And if the student's still struggling, you can do the same thing with a solution block. You can write your Quarto code inside a solution block and when the student hits that button, they see that solution text.
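A sketch of a hint and a solution attached to an exercise. The `.hint` and `.solution` classes and the `exercise` attribute follow the Quarto Live documentation; the label `ex_sum` and the exercise itself are illustrative only.

````markdown
```{webr}
#| exercise: ex_sum
1 + 2 + 3 + ______
```

::: {.hint exercise="ex_sum"}
The four numbers should add up to ten.
:::

::: {.solution exercise="ex_sum"}
Fill the blank with `4`:

```r
1 + 2 + 3 + 4
```
:::
````

Neither block is rendered inline; each appears behind its own button on the exercise.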
What's really great about this is that this is just Quarto syntax. There's nothing new here. All of the tools that you know how to write in Quarto to make great content will work inside these blocks. Tools like call-out blocks, tools like collapsible panels, syntax highlighted code blocks, code annotations to really explain how a piece of code works. All of these things work inside a solution block and a hint block. So you can make all of these tools work with your exercises straight away.
Now that the student has seen the solution, they can then obviously continue to interact with the code exercise, paste in that code and run it and, ah, I'd recognize that logo anywhere.
A nice, well, maybe not so nice, an interesting thing about educational content is that lots of students now are not really interacting with this content using a laptop or a desktop computer with a big screen. They're interacting with this content on screens that look more like this. And we've really designed WebR to work as best as it can on these constrained devices so that all those tools like visualizing and plotting and producing graphics, they work on something like a mobile phone or tablet, even if these devices themselves can't install something like R or RStudio. And much in the same way that Quarto itself resizes very nicely on these small screens, Quarto Live has again also been designed so that these exercises work well in these kind of environments. The idea here is that we want to be able to support education for any user, no matter what kind of device they're using.
Some more quick examples of the things you can do with Quarto Live. If you've ever used the learnr or gradethis system, you may be interested in writing algorithms to grade students' code. You can do that. Here's an example where we ask students to do some filtering and summarizing of some customer reviews of a cat bed. And if the student makes a mistake, we can give some encouraging feedback to say, you know, try again, you're almost there. If the student gets the answer right, we can give some celebratory feedback to say, great job.
What's really great about this is that the R code that the student writes is really being evaluated. And your grading algorithms are also really being evaluated. And then these are all written in R code. So you can catch certain mistakes that you know students are going to make. Say, you know students are going to use a double ampersand here when they should use one. In addition to the error text from R, which may not be entirely clear to a new student what that means, you can add targeted feedback to say, oh, this is what you've done wrong and this is how you can fix it.
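A minimal sketch of custom grading written in R. The `check: true` cell option, the `.result` variable holding the value of the student's code, and the `list(correct = ..., message = ...)` return value follow the Quarto Live documentation; the exercise and the feedback messages are illustrative, not the cat bed example from the talk.

````markdown
```{webr}
#| exercise: ex_total
1 + 2 + 3 + ______
```

```{webr}
#| exercise: ex_total
#| check: true
# evaluated in the browser after the student's code has run
if (identical(.result, 10)) {
  list(correct = TRUE, message = "Great job!")
} else {
  list(correct = FALSE, message = "Try again, you're almost there.")
}
```
````

Because the check is ordinary R code, it can branch on whatever mistakes an author expects and return a more targeted message for each.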
The tools that make this work are very dynamic and very general. They're not just designed to work with exercises like this. In a Quarto Live document, you can use observable JS inputs



