
Keynote, Winston Chang: Lessons and opportunities with Shiny for Python
About the keynote: For the past 10 years, the R community has been able to use Shiny to bring interactive data analyses to the web. Last summer, we announced that Shiny would also be available for Python. In this talk, I’ll discuss what we’ve learned along the way, as well as the opportunities that Shiny for Python opens up. We are at the beginning of an exciting new era for Shiny, and I hope that the community of old and new Shiny users makes the most of it! Speaker's bio: Winston Chang is a software engineer at Posit, PBC, who has served in various roles on the Shiny team, including team lead. He has contributed to many widely-used packages in the R ecosystem, including Shiny, devtools, and ggplot2, and is the creator of several packages, including shinydashboard, R6, and profvis. Recently he has been busy working on Shiny and Shinylive for Python. Winston has a Ph.D. in psychology from Northwestern University and is the author of the R Graphics Cookbook, published by O’Reilly Media.
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
Today, I'm going to be talking about lessons and opportunities with Shiny for Python. I should give you a little background here. Shiny for R has been around for over 10 years now.
I didn't work on it in the very beginning. It was Joe Cheng who created it, but I joined the team not too long after. So it's been over ten years now. About a year and a half ago, the Shiny team at what was then RStudio, and is now Posit, started working on Shiny for Python. We kept it private for a while, but after about a year's work, we were ready to announce it to the world, which we did at last year's rstudio::conf.
So, today, I'm going to talk about some of the things that I've learned in that time, a lot of things that I've learned from working on Shiny for Python in particular, and some of the opportunities that I think exist now that I've observed along the way.
Ten years is a long time
Okay, so the first lesson is that 10 years is a long time. It's a long time to develop a code base. It's a long time to build tools surrounding Shiny. The Shiny-related stuff that Posit makes, it's not just Shiny. There's a web server layer, httpuv. There's bslib, which Carson just gave a talk on. There's shinytest, shinytest2, shinydashboard, all sorts of different things that are built surrounding Shiny.
It's also a long time for others to build tools that enhance Shiny. So, you know, a lot of, many of the speakers at this conference have built all sorts of really, really cool packages that help you extend Shiny and make it look better and let you do new things with it. And 10 years is also, it's also a long time for a community of users to grow and get to know each other and build connections.
So, you know, sometimes I think about Shiny for Python and, you know, in the Python world, for Shiny for Python, we're just getting started. And, you know, all this stuff that is great about Shiny for R, all the stuff that has taken time to build, it's not there yet in Python because we, you know, we just, we only announced it about half a year ago. So, and these things don't just, they don't just happen overnight. They take time and they take effort. So, hopefully with time, we'll be able to build these things. And not just us, you know, at Posit, but as a community, we'll be able to build all sorts of things together.
Naming things is hard
So, the next thing is a little bit more technical. So, naming things is hard. This is a lesson that if you've been working on software for, you know, for any period of time, you've probably learned this. Naming things is hard, especially if these are things that other people are going to be using. The good thing is that it's easier the second time around.
So, in the early days of Shiny, the names of various functions were based on what was common in the R ecosystem at the time. So, we ended up with names like these. We have textInput, textOutput, verbatimTextOutput, plotOutput, renderPlot, uiOutput, and renderUI, with this interesting different capitalization of UI here as opposed to there.
So, over time, we started to wish that we'd used some different names. As you can see, we use camel case for most of the names, which is at odds with the snake case used by tidyverse packages. These days, we try to use the same naming convention that the tidyverse uses. You probably saw that in Carson's talk about bslib. That's how we name our functions.
The other thing is that we wish we had names that were more discoverable and auto-completable. So, if you're writing your Shiny application and you want to know what kinds of outputs there are, you can't just type output and then hit tab and have a list pop up. You have to know you want a textOutput, or you have to know that you want a verbatimTextOutput. You have to start typing T-E-X-T and then hit tab, and then you'll see that there's textOutput. But that still won't show you that there's this other one, verbatimTextOutput.
So, when we worked on Shiny for Python, we actually talked quite a bit about what we should name things. On the left, we have the names of these various R components, and on the right, we have the Python names. So, we have sliderInput, textInput, dateInput, and actionButton. That one's a little bit different: it doesn't say input on it. In Python, I've prefaced them all with ui. because they're in the ui submodule of Shiny. Typically you'd write them out like this, although you don't have to in Python. But you would type ui.input_, and then if you hit tab, you'll get a list of all these different types of inputs, including the action button, ui.input_action_button.
So, I think that's, you know, that's a big, that's a big benefit for people that are learning, or even if you're experienced and you, you know, you have been away from, been away from writing Shiny stuff for a while. When you come back to it, this is, this will help just help you get things done faster.
And there's a couple other things that I want to touch on in terms of naming. So, for reactive components, so like the, you know, actual reactivity stuff. So, I've long wished that we had names that are easier to talk about and easier to understand and more, more in line with present-day standards.
So, in particular, the two functions that I'm talking about are reactive and observe. So, with reactive, this one has always been a little bit difficult to talk about because you can say this is a, well, in our documentation, we call it a reactive expression. But the function name is just reactive. So, sometimes we say, oh, this is a reactive, like as a, as a noun. But we also, you say that things are reactive, like an adjective, like reactive values are reactive. This reactive expression is reactive and observers are also reactive. Like they, they're all part of this reactive computation model. So, that is a little bit confusing to write about.
And the other thing is that a reactive expression isn't exactly an expression. Typically, when you're talking about computer stuff and you say you have an expression, you have to explicitly evaluate it. But this kind of reactive expression is actually more like a function. You invoke it to get the value.
And observe is also a little confusing. I think people don't really know what observe means. You can explain to people that, for observe, you give it an expression, and we're not interested in the return value of that expression. We're only interested in the side effect of that code.
So, in Python, we've used some different names. We've used, we used reactive calc to represent the same thing as a reactive expression in R or this reactive function name. And for observe, we're calling it a reactive effect. So, so this lets us use the name reactive as an adjective. Like this, this thing is reactive. This calc is reactive. An effect is reactive. A reactive value is also reactive. And then there's not that ambiguity of, of like, are we talking about, you know, this adjective reactive or a noun reactive?
So, this is a reactive calc, which is short for reactive calculation. This returns a value. And this is a reactive effect, where you give it a function that you want to execute for its side effects. That's in line with React, the JavaScript web framework, if you've ever used it. There's something there called useEffect, and you give that a function that executes for its side effects. And again, you do still have to understand what a side effect in a function is, but at least it's right there in the name. It is informative about what that's for.
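To make the calc versus effect distinction concrete, here is a toy model in plain Python. It is only a sketch of the idea, not how Shiny is implemented: the class names Value, Calc, and Effect and the helper functions are hypothetical, and effects here rerun immediately instead of being scheduled the way a real framework would.

```python
# Toy reactive model: a sketch of the concepts, NOT Shiny's implementation.
# Value, Calc, Effect, _track, and _notify are hypothetical names.

_running = []  # stack of the calcs/effects that are currently executing


def _track(dependents):
    """Record whoever is currently running as a dependent."""
    if _running:
        dependents.add(_running[-1])


def _notify(owner):
    """Invalidate everything that read `owner` since it last changed."""
    dependents, owner._dependents = owner._dependents, set()
    for dep in dependents:
        dep.invalidate()


class Value:
    """A reactive value: call it to read, use set() to change it."""

    def __init__(self, value):
        self._value = value
        self._dependents = set()

    def __call__(self):
        _track(self._dependents)
        return self._value

    def set(self, value):
        self._value = value
        _notify(self)


class Calc:
    """Invoked like a function; returns a value and caches it."""

    def __init__(self, fn):
        self._fn = fn
        self._valid = False
        self._cached = None
        self._dependents = set()

    def __call__(self):
        _track(self._dependents)
        if not self._valid:
            _running.append(self)
            try:
                self._cached = self._fn()
            finally:
                _running.pop()
            self._valid = True
        return self._cached

    def invalidate(self):
        self._valid = False
        _notify(self)


class Effect:
    """Runs a function purely for its side effects; no return value is kept."""

    def __init__(self, fn):
        self._fn = fn
        self.invalidate()  # run once up front to discover dependencies

    def invalidate(self):
        _running.append(self)
        try:
            self._fn()
        finally:
            _running.pop()


n = Value(2)
log = []

doubled = Calc(lambda: n() * 2)                        # returns a value
Effect(lambda: log.append(f"doubled is {doubled()}"))  # side effect only

n.set(5)
print(log)  # ['doubled is 4', 'doubled is 10']
```

The calc is something you invoke to get a value, and the effect is something that just runs when what it read has changed, which is the split the two names are meant to convey.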
Learning Python as an R user
Okay. So, the next thing that I learned working on Shiny for Python is that learning Python is not that hard if you have a reason to learn it. So, now this actually shouldn't, you know, this shouldn't really be surprising for people because Python has a reputation as being one of the easiest programming languages to learn.
So, you know, you might be wondering, it's not that hard, but you know, you might be wondering, well, will I be a total beginner again, like programming language beginner when I, if I start learning Python, or will my R knowledge transfer over to Python? And this actually, this is a more interesting question than it might seem on the surface because R is kind of an unusual language.
So, I'll just point out a few of the things that stand out about R as weird. In R you have one-based indexing. If you want to get the first element of a vector or the first row of a data frame, you start counting at one, and in Python you start at zero. And most programming languages are like Python. In R, there are no scalar values. There are just length-one vectors.
And in Python, there's this, there's a, you know, a distinction between scalar and vector objects. So, if you've got a single, you know, a single value versus like a whole, like a, a vector of values.
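A small illustration of those two differences from the Python side, using only built-in types. The variable names and numbers are made up for the example.

```python
values = [10, 20, 30]

first = values[0]   # Python counts from zero; in R this would be values[1]
last = values[-1]   # in Python, -1 means "last"; in R, values[-1] drops the first element

scalar = 10         # a plain int: a single value, not a container
vector = [10]       # a list that happens to hold one element

print(first, last)          # 10 30
print(scalar == vector)     # False: a scalar and a one-element list are different things
print(scalar == vector[0])  # True
```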
Next. So, R is a Lisp-like functional language and Python is an imperative object oriented language. Now this thing about R being a Lisp-like language, if you look deep into the source code of R, you'll see that it's actually implemented just like Lisp. It's, it's, yeah, it's, it's very different from other programming languages or other common programming languages.
Okay. And also, in R, data frames and statistical tools are deeply embedded into the language. With Python, all that stuff comes from third-party modules. Now, I think if you're an R user, some of these things are actually not that hard to deal with, like one-based indexing versus zero-based indexing.
Now for the next one. R being a functional language and Python being an imperative, object-oriented language, they're actually quite different at the core, but at the user level, the two languages aren't so different. Both of them are somewhat functional languages and also both sort of object-oriented languages. What I mean by being functional languages is that Python supports first-class functions. A first-class function is a function as an object that you can pass to another function, which can then invoke that function later, or you can compose those functions. R does that, and Python can do that as well.
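As a quick sketch of what first-class functions look like in Python: functions are passed around as ordinary objects, invoked later, and composed. The helper names compose, add_one, and double are made up for this example.

```python
def compose(f, g):
    """Return a new function that applies g first, then f."""
    def composed(x):
        return f(g(x))
    return composed


def add_one(x):
    return x + 1


def double(x):
    return x * 2


# Functions are passed as arguments and returned like any other object.
double_then_add_one = compose(add_one, double)

print(double_then_add_one(5))        # 11
print(list(map(double, [1, 2, 3])))  # [2, 4, 6]
```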
But there is, there actually is one important difference. So, in R, most data structures are immutable and in Python, most data structures are mutable, meaning they can be changed. If you have a data structure, it can be changed from anywhere else in your program. Well, if that, that part of the program has a reference to the object.
So, I'm just going to illustrate this with some code from a Shiny app. This is the server code for a Shiny app. Let's say you've got a data frame. It's just got two columns of numbers. And then you have a renderTable. Let's say you want to render this table so that the y column prints with two digits after the decimal point, so the values look like this: 0.10. What I'm going to do is use the sprintf function. I give it my string format and then the column, and I store that back in the data frame. Then I return the data frame, but this time it has a column of strings, and the table that gets printed out will have numbers like that. And finally, I'll do a renderPlot where I just plot the x and y values as a scatter plot.
Now we can look at the same code in Python. If you haven't seen Shiny for Python code, well, here's your chance. It's not so different. You create a data frame, and this is using pandas. I say pd.DataFrame and I give it my values here. Then I do a render table, and I save that back in the y column of df. And then I return that data frame. Excuse me. And then I'll do the render plot. I can use the ggplot function here because this application uses plotnine, which is an implementation of ggplot2 for Python, and it works remarkably well. So, if you use Python and you haven't used plotnine, you should definitely check it out.
Anyway, if we do this, then we get a plot that looks a little different. This, this looks like more like a straight line. So, let's put these two plots side by side here. So, in R you have this sort of, you know, this thing that goes with an upward trend here. And in Python, it looks, it almost looks like a straight line. And if you look at these Y values, it's a little weird. It's just 0.1, 0.25, and 1.6. And these are evenly spaced.
So, you know, why did this happen? This is, this is something that is, you know, points to a really deep difference between R and Python. So, in, in the R version of the code, when we did this, created this column of strings, and then we saved it back into the data frame, when you do this assignment here, R actually makes a local copy of this data frame df when you write to it. So, it does it in a way that's, usually it tries, it does it in a way that's efficient, so it doesn't have to copy everything. But it makes a local copy of df within this scope here, those curly braces, and then it returns it. And that does not affect this df. So, this df refers to the original data frame.
Now, the way that Python works is like most programming languages. R is actually the weird one here, even though it behaves in a way that is convenient. In Python, when you do this assignment here, you say df["y"] equals this thing here. This actually alters the data frame in place. This df refers to some object, which is a data frame, and I'm assigning into its y column, so I'm altering that data frame in place. It doesn't make a local copy. So then, when I come and run ggplot, this df is actually the altered data frame where the y column has strings. And that's why it looks like this. These are actually categorical variables.
All right, now you can fix the code in Python. When you're working with df up here and you're modifying it, you say df.copy(), and you save that as df2. Then I do all this stuff: I assign into df2 and then return that. So, that doesn't alter the df that I'm plotting here. But you have to be aware in Python of whether this thing is a reference to an object that somebody else has. If I alter it, can I make a shallow copy, or do I have to make a deep copy? These are issues that you have to think about with Python. So, in this respect, R is actually a lot better to work with than Python is.
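The same pitfall can be reproduced without pandas. In this sketch a plain dict of lists stands in for the data frame, so it is an analogy and not the app code from the talk. The x values and the function names table_in_place and table_with_copy are made up, and the y values are the ones read out from the plot.

```python
import copy

# Stand-in for the data frame: a dict of columns (an assumption, not pandas).
df = {"x": [1, 2, 3], "y": [0.1, 0.25, 1.6]}


def table_in_place(data):
    """Format y as strings by assigning into the object that was passed in."""
    data["y"] = [f"{v:.2f}" for v in data["y"]]
    return data


def table_with_copy(data):
    """Same formatting, but on a private copy, so the caller's data is untouched."""
    data2 = copy.copy(data)  # shallow copy: a new dict sharing the same column lists
    data2["y"] = [f"{v:.2f}" for v in data2["y"]]
    return data2


formatted = table_with_copy(df)
print(formatted["y"])  # ['0.10', '0.25', '1.60']
print(df["y"])         # [0.1, 0.25, 1.6]  original is still numeric

table_in_place(df)
print(df["y"])         # ['0.10', '0.25', '1.60']  the shared object was altered

# Shallow versus deep: a shallow copy still shares the inner lists.
shallow = copy.copy(df)
deep = copy.deepcopy(df)
print(shallow["x"] is df["x"])  # True
print(deep["x"] is df["x"])     # False
```

A shallow copy is enough here because the function replaces a whole column. If it appended to or edited a column list in place, the original would change too, and a deep copy would be needed.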
Now the other big difference is that data frames and statistical tools in Python are in third-party modules. What I've found is that the basic manipulation of tabular data in Python with pandas is kind of clunky and unintuitive. Now, this is obviously just an opinion. This is my opinion. But I've talked to other people, and even people who aren't really R users seem to agree that working with tabular data in Python is kind of awkward. So, it's my opinion, but I'm also right. So, there you go.
And, yeah, one good thing about these things being in third-party modules for Python is that there are other competing data frame libraries out there. So, it's possible for something else to come along that is easier to work with, or faster, or whatever advantages you can think of. But at this time, the go-to library for manipulating tabular data in Python is still pandas.
Shiny's technology still stands up
Another lesson I learned is that Shiny's technology still stands up well. Coming into the Python space with a web framework, there are other things out there that already exist. For example, there's Dash and Streamlit. These are two popular web frameworks that have roughly the same level of abstraction and would be used by similar people that are doing data science.
And, you know, Shiny has been around, like I said, it's been around for 10 years. But I think, I think the technology still holds up really well. So, you know, if you compare it to Dash, now these are, these are, again, I'm going to just preface this by saying that these are, these are my opinions. This is subjective. So, take it with a grain of salt.
But Dash, to me, seems kind of hard to use for a non-web developer. If you don't already know a lot about how webpages are built, Dash is going to force you to learn those things if you want to make something nice. And if you look at the core of Dash, the computational model, it uses an explicit callback model instead of reactivity, which means that you have to be a lot more explicit about how the relationships between various components are set up, whereas Shiny can figure those things out automatically for you.
And the specific way that they've done it with Dash can result in a lot of data being sent back and forth between the client and the server, because they try not to keep any state on the server. So, which, you know, there's benefits to that, but there's also drawbacks.
All right. So, now, if we compare it to Streamlit, Streamlit is an interesting thing. It's gotten very, very popular, and there's a reason for that. It's super easy to get started making a Streamlit app. That's because you can start with a script doing your data analysis, and then you can turn that into a Streamlit application just by adding some lines of code, right? You don't have to rewrite the whole thing.
But the issue with that is that you quickly hit a ceiling of what you can do with Streamlit. This is a tradeoff that they chose to make, and it obviously works for some people. It works for a lot of people. But it can be quite limiting. What's really limiting about it is the execution model. Basically, every time an input changes, like if you move a slider or you type in some text, it'll re-execute the entire script.
And they have a lot of features that can make that more efficient, so that you don't have to do massive data processing every time somebody moves a slider, if you've set up your caching correctly. But there is a limit to what you can do with caching. And there's just a limit to what you can do in general with a model where it re-executes the whole thing every time. But it is easy to understand what's happening, so there is that. But again, I think Shiny's technology, the reactivity, holds up really well compared to what else is out there.
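To picture the rerun-everything model with caching, here is a toy sketch in plain Python. It does not use Streamlit or its API. The names script and load_data and the call counters are hypothetical, and functools.lru_cache stands in for whatever caching a framework provides.

```python
from functools import lru_cache

calls = {"load": 0, "script": 0}


@lru_cache(maxsize=None)
def load_data(source):
    """Pretend-expensive step; cached so reruns with the same argument skip it."""
    calls["load"] += 1
    return tuple(range(10))


def script(n_rows, title):
    """Hypothetical app script: runs top to bottom on every input change."""
    calls["script"] += 1
    data = load_data("example")
    return f"{title}: {data[:n_rows]}"


# Each input change reruns the whole script, even if only the title changed.
script(3, "Demo")
script(5, "Demo")
script(5, "Renamed")

print(calls)  # {'load': 1, 'script': 3}
```

The cache saves the expensive step, but the script body still runs in full every time, which is the part that caching cannot remove.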
Looks matter
So, the next lesson is that looks matter. What do I mean by that? Well, we know from talking to users that it's really important to be able to easily create nicer looking apps. And, you know, we are, this is, so, when Shiny came out, it was really impressive what, you know, the way that things looked, it was really impressive what you could do with a small amount of code. But now there's other things out there that can do, there's other frameworks that can do nice looking apps pretty easily.
And so, like I said, if you saw Carson's talk prior to this one, we're working on building the foundation to make it really easy to create nicer looking apps. But that's the foundational stuff right there, not the surface level. If you want it to be really easy to create nice looking apps, we're not quite there yet. But it's also why shinydashboard is really popular despite its age. shinydashboard has been around for a number of years now. And it looks okay. Maybe it looks a little bit dated, in my opinion. But it's still quite popular because it's really easy to just create something that looks professional.
Or at least professional enough. And you don't have to think about it too much. And, of course, there are many other third-party packages out there, like shiny.semantic from Appsilon and bs4Dash from RinteRface. But, yeah. So, looks matter. And this reminds me of something. I think Hadley Wickham said this once, maybe not quite in these exact words. But with ggplot2, people came for the beautiful plots, and they stayed for the beautiful conceptual model. I think that's a really useful way of thinking about things.
Balancing progress and backward compatibility
Next lesson. Balancing progress and backward compatibility is hard. So, for Shiny, we originally shipped it with, like, batteries included, in quotes there, which makes things easier for users to get started. So, we had everything that you needed in one package. All right. We had sliders, we had, you know, Bootstrap for the web page layout, and so on.
So, in the beginning, we actually had a different slider library than we use today. We used an older version of Bootstrap. We used an older version of jQuery. And changing these things is hard. The more users you have, the harder it is to change them. So, with the slider library, changing that was a challenge, although we did it a long time ago. Like, what if your new slider generates steps at different sizes from your old slider? Then people's apps might behave a little bit differently.
And we bit the bullet, and we just changed it at some point. But the more users you have, the harder it is to change that. Similarly, with Bootstrap version 2, at some point we switched to Bootstrap version 3. But that involved changes to markup and changes to CSS classes, and we provided a shim for people to do that. But now the Bootstrap project in general is on version 5, and that's what bslib is using. So, being able to move people on to that takes work. You can't just upgrade it and expect everything to keep working.
But the good news is that Shiny for Python sort of provides a laboratory, and gives us a place and a reason to try new things, and port some good ideas back to R. So, you know, there's motivation to try to change these things, like change function names. And if things work well, then, you know, we can take some of those good ideas and bring them back to Shiny for R.
The Python world is big
All right, next. The Python world is big. It is very big. So, I read a survey that Stack Overflow gave out in 2022. In their survey, they found that 48% of users use Python, and 4.7% use R. So, that's 10 times as many people that use Python. And that's almost half of the people that use Stack Overflow. So, that's an enormous percentage of programmers.
Now, it is true that in Python, like Python users cover many, many different domains. So, only some of those users do data science. And in R, almost all the users do data science. So, that means that, you know, not all Python users are necessarily people that would be interested in creating interactive data web applications. But enough of them are that it's a really large population.
So, in the R community, Shiny has a really large mindshare. In the Python community, it doesn't. That's something we have to earn. And like I said earlier, at this stage, I think we have the best core technology. And we're working on better documentation and app aesthetics to earn that mindshare in the Python community.
So, if you're thinking about learning Shiny for Python, remember that it is a really big world out there. And there's a lot of Python shops out there that might be interested in the things that you can create. So, you know, just there are, this could, you know, learning Shiny and building cool web apps can open up, could open up a lot of opportunities there.
The R community is great
All right. Next lesson. The R community is great. So, R, as I'm sure everyone here knows, has a very inclusive and welcoming community. And, you know, people in the R community really put a lot of work into this. It doesn't just sort of happen by itself. And in contrast, you know, Python, well, it has pockets that can feel like that. But that feeling is not as widespread. So, it's something that, you know, I always appreciate about the R community. And it's something that it's a feeling that we try to foster where we can.
WebAssembly and shinylive
Another lesson that we've learned is that WebAssembly is very, very cool. WebAssembly is what makes shinylive possible. If you don't know what shinylive is, I'll give you a really brief overview. It is Shiny, currently Shiny for Python, without a server running Python. Normally when you deploy a Shiny application, you need a server that runs Python or R, and that serves the application. The user loads their web browser and they connect to that server.
With shinylive, Python runs inside of the browser itself because it's compiled to WebAssembly, and browsers can run programs that are compiled to WebAssembly. So, you don't need a separate server running R or Python. This allows for deployment on static web hosts, which can scale to extreme amounts of traffic for extremely low cost. So, that's great for certain classes of applications. If you need to handle a lot of traffic and the application doesn't have certain needs, which I'll talk about in just a minute, shinylive is really great for that.
So, it's not appropriate for all use cases. Like, you can't keep any secrets. If somebody can run your application on their computer, you know, you don't want to put any, like, secret authorization tokens or anything like that in there because they'll have access to them. You can't make secure database connections. And it's not, you know, it's not necessarily good for large datasets because you might not want to transfer large amounts of data to each user's browser.
But even though it has those limits, the really good thing about it is that you can use the same knowledge of Shiny to create shinylive applications or traditional Shiny applications, whichever is appropriate. You're just writing a Shiny application. There are just a few things that differ between regular Shiny and shinylive applications. So, you can deploy it whichever way is appropriate. And this gives you much more range in the space of web applications that you can create.
So, just a little history about it. shinylive started off as an experiment. A few months into developing Shiny for Python, we took some time to explore, like, hey, what's happening in the Python data science world? And we came across Pyodide, which is a build of Python for WebAssembly. And I was like, okay, you know what? It'd be really cool if we could just run this in the browser, if we could run Shiny entirely in the browser without a server. And it worked out better than we expected. So, that was really exciting to see it come together.
And we learned something from that experiment, which is that WebAssembly is awesome. And it's great for educational material. Learners don't have to switch away to a different app or web page to run code. If you've looked at the Shiny for Python website, there's on each page, there's examples that you can run. And they're just running right in the user's web browser. So, people don't have to switch context to somewhere else. They don't have to, you know, you don't have to spin up a server somewhere, install an application. You just can run it right in the web page.
And also, from the perspective of somebody who creates the content, you don't have to worry about hosting, because you can put it on a static web host, which is very cheap, or about scaling, or security. All that stuff is not an issue. With security, we used to have people, and probably still do, running crypto miners in our RStudio Cloud instances. And that's just not an issue here. If somebody wants to run a crypto miner in shinylive, that's fine. They're running it on their own computer. So, good for them.
And this brings me to webR. This is R compiled to WebAssembly, running in the browser. This was created by George Stagg. As an organization, what we learned from shinylive helped convince us that it would be worthwhile to hire George to work on webR full time. When he started it, he wasn't working at Posit. But what we learned from working on shinylive was, oh, this is the real deal. This is worth getting George to work on full time. And you may have heard that webR had its first release, 0.1.0, last week. So, you should definitely go check it out, if you haven't already. It's really cool to see that you can run R code right in the web browser.
And, of course, the reason I'm talking about this is, yes, we are working on shinylive for R. I can't give a firm date for this, but it is something that we're working on.
Opportunities for R and Python users
Okay. So, for the R users out there that know Shiny, Shiny is a great reason to learn Python. So, here's what I mean by this. So, there's many groups out there that use Python and not R. And knowing Shiny well is a good way to have access to both Python and R shops. So, if you know Shiny for R, you can, I think you can quickly learn Shiny for Python, even if you don't know that much Python to begin with.
And then if you start working with a group that does Python, you can be an expert in something right from the beginning. So, you can be useful right from the beginning with a Python group. Whereas if you just know R, and then you want to start working with some Python people, you might otherwise have to start from the beginning and be a total novice. But if you learn Shiny for Python first, you can do something that's really useful and unique right away.
All right. And for Python users, Shiny gives you options. If you're a Python user and you learn Shiny for Python, you can create traditional web applications that run on a server. You can create shinylive web applications. And you can use your knowledge to write Shiny for R applications. Again, it's similar to what I was saying about R users learning Python. If you're a Python user and you want to learn R, you can transfer that Shiny knowledge over to R, be useful right away, and start working with R users.
And that's it. That's what I have to say. Thank you very much. Thank you for listening to what I have to say.


