
Barret Schloerke | {shinytest2}: Unit testing for Shiny applications | RStudio (2022)
Manually testing Shiny applications is often laborious, inconsistent, and doesn’t scale well. Whether you are developing new features, fixing bugs, or simply upgrading dependencies, it is critical to know when regressions are introduced. The new {shinytest2} R package provides a toolkit for unit testing Shiny apps and integrates seamlessly with {testthat}. Under the hood, it uses the new {chromote} R package to render apps in a headless Chrome browser, with features such as live preview and built-in debugging tools. In this talk, you’ll learn how to test Shiny apps by simply recording your actions as code and extending them to test more particular aspects of your app, resulting in fewer bugs and more confidence in future development. Talk materials are available at https://bit.ly/shinytest2-conf22 Session: I like big apps: Shiny apps that scale
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
So clap twice if you've ever written a Shiny app.
Clap twice if you remember all of your app's features.
And then clap twice if you'd feel comfortable having a co-worker rewrite part of your app. Still not a lot of you.
That's a big motivation for automated testing. When you're working with team members, automated tests mean we don't lose our app's features, and a co-worker can work on a new feature without destroying your existing ones.
The case for automated testing
And so we'll get into this throughout today. Shiny lets you build an interactive application in very few lines of code; if you were to do this in other languages, it would take a lot of work.
And then, as Anika said, there's this very familiar workflow within Shiny: we tweak some of the app, we add and adjust some of the reactivity, and then we click Run App. This is why we love R: we can manually look at it, we can manually iterate on it, and we can do this every thirty seconds, or ten seconds if you're really quick.
But what if you have, like, 20 apps? What if you have many team members? What if you can't quite keep it all in your head at once? With manual experimenting, things get lost in translation. Can you explain all of your app's features to your co-worker and trust them to manually test everything properly? I don't believe I could do that, and I'm very familiar with the infrastructure.
And there are no rules once you start doing more than just a toy application.
Introducing testthat
So, this kind of opens the door to a different package, and that's the testthat package. I think it's great: you can build your whole test suite and then build out your functionality. I remember many situations where we might write out a hundred tests first, then build the methods to make those tests pass. That's test-driven development, and you're very aware of what's going on.
You also get more robust code, because maybe you're testing for edge cases, which isn't something I would cover by default during a manual test. And you get better code structure, because you're breaking your functions down: if you test each Lego piece of your code, then when you put it all together, you can believe you have a strong code structure at the end. Very robust, because you know each component is doing its job well.
And then finally, you have a call to action. Right now, I'm getting very little sleep — I have a new baby boy. You can leave in a broken test and say, fix this in the morning, and then you don't need 45 minutes to warm up for the day; you can jump right back to where you were deep in your coding session.
So let's take a look at testthat and see an example. I talked about those formal tests. This would be, like, `expect_equal()`, where we can do a standard basic test: two times two, does it equal four? Yes. Great.
Here's a kind of nonstandard one to make it a little more robust: two times `NA_integer_` is an integer NA. A plain `NA` is actually a logical, and the test would fail without the `NA_integer_`. And then, new to testthat edition 3, is snapshot testing. With snapshot testing we don't do an `expect_equal(x, y)`; we say, keep track of this file and just make sure it doesn't change, because that file may contain lots of content that can't be put into code. So we can make an example file and call `expect_snapshot_file()` from testthat, and it will compare and say: did the file change, yes or no? If it didn't, the test passed.
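The three expectations described above could be sketched like this (the file name `report.csv` is an illustrative example):

```r
library(testthat)

test_that("multiplication works", {
  # Standard basic test: 2 * 2 == 4
  expect_equal(2 * 2, 4)
  # A plain NA is logical; 2L * NA_integer_ stays an integer NA,
  # so the typed constant NA_integer_ is what makes this pass:
  expect_identical(2L * NA_integer_, NA_integer_)
})

# Snapshot testing (testthat edition 3): write a file once, then
# assert it never changes on later runs.
test_that("report file is stable", {
  path <- tempfile(fileext = ".csv")
  write.csv(head(mtcars), path)
  expect_snapshot_file(path, "report.csv")
})
```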
So if we run `devtools::test()` within the IDE, you can see that those three tests pass very quickly, and you get a success. And testthat does a wonderful job of encouragement: as you test over and over again, you get a new random praise message every so often.
Introducing shinytest2
So I'd like to take these two worlds and combine them into one, and that's what we get with shinytest2.
shinytest2 is unit testing for Shiny applications. But I actually want to adjust that a little bit, because it's very difficult to write all of the unit tests first and then fill in the logic. Instead, it's more like regression testing for Shiny applications: you have existing app behavior that you've seen through your own manual testing, and you don't want to lose it.
Typically, we do something like three hours of work, and I don't want you to lose those three hours of work, so we insert a test to make sure that behavior does not regress or go away.
The extra helper methods within shinytest2 all leverage testthat's `expect_snapshot_file()`, so there's nothing magical shinytest2 is doing; it's just orchestrating everything into a nice package. I'll show you a demo of that in a second.
So, shinytest2 uses a chromote-driven browser, typically Chrome, and allows you to execute headless browser tests. The browser doesn't need to open a window; tests can run as a GitHub Action or even hidden in your terminal. But one of the beautiful things is that you can view that live test application, and I'll show a demo of that later.
What about shinytest?
So, elephant in the room: what about shinytest? If you're familiar with it, shinytest is entering maintenance mode. Its headless browser, PhantomJS, has been end-of-life for a few years now and is not really usable. It's not compatible with Bootstrap 5, so it's impossible to test apps built on it.
shinytest2 also uses a different headless browser, so all of your prior screenshots, if you had them, would fail, and that's not something I want to force on everyone. Also, because of testthat, we have a different file structure, and since we had the opportunity, we streamlined it. It's a little bit different, but it's a little easier to leverage testthat, and there's a different R API as well. So that's why there are two separate packages.
A toy example
So, before we get into some deeper code, let's look at a very small toy example Shiny app. What it's going to do is ask for a name, we're then going to greet that name, and we're going to say, hello, Barret. That's it. Very toy app, very complicated, I know. But it's nice, because we have a text input, we have an action button, and we also have a text output. These are very common components within your Shiny applications, and they may be located in very different places, but this is a good example of what happens.
If I were to set that up as a test, we would write it with testthat. So we'd say, test that the app says hello, Barret, in the test file there. And if we insert some shinytest2 code — you don't need to know it exactly, but we can walk through it — inside the test we set up our app driver. This is the thing that orchestrates everything: it gets chromote going, gets your Shiny app going, and then handles all the interactions. Then we tell the app, hey, set the input: name is Barret. When that's done, click the greet button, and when everything has finished, call expect values, because I want to make sure my inputs and outputs are consistent.
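The test being walked through looks roughly like this — a sketch, assuming the toy app uses input/button/output IDs `name` and `greet` and lives where `AppDriver` can find it by default:

```r
library(shinytest2)

test_that("app says hello, Barret", {
  # AppDriver orchestrates chromote and the Shiny app process
  app <- AppDriver$new(name = "greeting")
  # Fill in the text input, then click the action button
  app$set_inputs(name = "Barret")
  app$click("greet")
  # Snapshot all inputs and outputs so they stay consistent over time
  app$expect_values()
})
```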
So, let's run that test, because we want to run these tests over and over, rapidly and constantly. In real time it took a few seconds; not too bad. Now the three hours of work you put in is captured and can be replayed in a few seconds, making sure that work is never lost.
Recording tests
But writing tests is hard, says everyone but Hadley. The good news is there's a function called `record_test()`. And I think this is great, because you don't need to write those tests yourself: we can capture all of the Shiny interactions and record them as a test.
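Launching the recorder is a one-liner (the app path here is illustrative):

```r
# Opens the app alongside the test recorder; your clicks and inputs
# in the browser are written out as a shinytest2 test file under
# tests/testthat/.
shinytest2::record_test("path/to/app")
```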
And I really want to stress this, because I love the McDonald's quote: if you have time to lean, you have time to clean. But in this case, if you have time to rest, you have time to record a test.
And this is because recording these tests does not take long. It's just like your manual testing; it takes just a few seconds. This whole GIF here is just ten seconds: it's the very familiar greeting app, I do the interactions, then I click expect values, and an expectation is added. That's it.
So it's ten seconds for me to go in and do the manual manipulations that I want; I then save, or expect, my Shiny values, and then I click save. And I think it's just great: your app is live on one side, and you can see the generated code update as you go. When we're all done, after at least one expectation has been made, we can save.
The code that was recorded is actually what is stored — I didn't doctor this one — and it is oddly similar to the one we looked at before.
Viewing the headless browser with chromote
But remember earlier I was talking about chromote? With chromote, you can view your headless browser. If you're familiar with shinytest, it was a black box: you could only poke it from the outside and pray; there was not much you could do. And our apps are getting more and more complicated, which is awesome — it's wonderful to see.
But if it's a headless browser, it means you can't see it like our regular browsers. chromote, however, allows you to view it: there's a method you can run interactively called `app$view()`.
And this is what it will do. In this GIF, I have the test on the right that I'm running interactively in my console. We `library(shinytest2)`, I create my app driver with the working directory of the app, and I call `app$view()`. This opens up a Chrome browser on your local machine. You can see the app there, and on the right side you can also see all of your familiar Chrome debugging tools, which is great.
You can also see the console at the bottom, where shinytest2 will print out some information about what's happening, at least to my knowledge. If I call set inputs, you can see the app update live; you didn't have to do anything, you didn't have to refresh. When we clicked greet, it actually updated live; you can see the output there.
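The interactive session in the GIF goes roughly like this — run in the console, not inside `test_that()`; the app path and the IDs `name`, `greet`, and `greeting` are illustrative:

```r
library(shinytest2)

app <- AppDriver$new("path/to/app")
app$view()                        # opens a visible Chrome window with DevTools

app$set_inputs(name = "Barret")   # watch the input update live in the window
app$click("greet")                # the output updates without a refresh
app$get_value(output = "greeting")
```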
I think that is just so powerful, because you can be like, why is this test not working? Run, run, run, insert `app$view()` in the console, and you can see exactly where it's at. You know your app, so you'll be able to discover what's wrong, whereas shinytest2 by itself can only tell you so much about what's going on.
Tips and suggestions
So, I have some suggestions as you leave the Shiny Shire on your way to test. Number one is to use `shiny::exportTestValues()`. This is a function I was actually unaware of — and I'm on the Shiny team — and I really want to give it more limelight, because it is very powerful. We don't need to test just inputs and outputs; there's actually a third group called exports. You can export reactive values and reactive expressions; it doesn't need to be an input or something that has been rendered as an output. And this is great, because test mode allows this to work, and you can expose more information and more internal logic than is visible in the app you use day-to-day.
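A minimal sketch of what that looks like in a server function (the reactive name `filtered` and output ID `tbl` are illustrative):

```r
library(shiny)

server <- function(input, output, session) {
  # An internal reactive that is neither an input nor a rendered output
  filtered <- reactive({
    subset(mtcars, cyl == input$cyl)
  })

  output$tbl <- renderTable(filtered())

  # Only evaluated when the app runs in test mode; shinytest2's
  # expect_values()/get_values() can then see it as an export.
  exportTestValues(filtered = filtered())
}
```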
Also, try to limit tests to objects under your own control. If you have a data structure and you send it into DT, and DT updates its rendered output because it needs to — nothing wrong with that — your output might change anyway, which would produce false positives. So maybe just test your data frame going in, and then trust that DT will do its job.
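One way to follow that advice — a sketch, assuming the app exports its data frame as `filtered` via `exportTestValues()` and has an input `cyl` (both illustrative names):

```r
library(shinytest2)

test_that("the data going into DT is stable", {
  app <- AppDriver$new("path/to/app")  # illustrative path
  app$set_inputs(cyl = 6)
  # Snapshot only the exported data frame, not DT's rendered widget,
  # so a DT upgrade can't produce a spurious failure:
  app$expect_values(export = "filtered")
})
```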
Another one: if you use `expect_snapshot_file()`, be careful about what's going on in those snapshots. All a snapshot does is say, I agree with myself; it doesn't mean that it's correct. If you agree with yourself and you happen to be correct, that's good, but there's that little logical loophole. To be safe, you can always use manual or explicit expectations, such as `expect_equal()`, because comparing to a static value will always be truthful.
And then, lastly, minimize the number of screenshot expectations. Screenshots are worth a thousand tests: they do test everything, but that also makes them very brittle and open to a lot of spurious failures. You also cannot compare them across operating systems or R versions, so it gets very unwieldy very quickly. So unless screenshots are absolutely necessary, I would argue against them.
Recap
So, to recap: regression testing for your app, because you want to keep that behavior over time. If you have time to rest, you have time to record a test. Interactively view your application whenever you run into problems. And hopefully Gandalf will change his tune from "you shall not pass" to "pass."


