Resources

George Stagg | WebR: R compiled for WebAssembly and running in the browser | RStudio (2022)

In this talk I introduce webR, a port of R to WebAssembly using Emscripten. WebR brings a full R environment to the browser, enabling R code execution, numerical analysis, loading packages and more. No local or cloud-based R servers are required as all computation is performed within the browser. I give a brief overview of our build process for webR, describing the toolchain and some of the issues we encountered. A publicly available web-based R session is demonstrated, with package and plotting support. Talk materials are available at https://github.com/rstudio/rstudio-conf/blob/master/2022/georgestagg/webr%20-%20George%20Stagg.pdf

Session: Lightning Talks

Oct 24, 2022
5 min

Transcript

This transcript was generated automatically and may contain errors.

So WebR is a version of R compiled to run in a browser. You might want to use the containerization or sandboxing abilities of a browser. You might want to run sort of literate programming type things where you weave software and documentation together into a single object, or even reproducible output. You might have some data science you want in a nice package that is online, or you might just be interested in universal binaries, the idea of running a piece of software on lots of different machines that could even be mobile phones or tablets.

These are all great reasons that I'm not going to talk about.

Today I'm going to talk about why I wrote WebR, and it was actually for my previous life as an academic.

So when I worked for Newcastle University, one of the things I did was I taught R to students. And at the end of the semester, one of the things we do is we evaluate what the students have learned. And we do that by asking them to write some R code, and we evaluate how well that R code runs.

Now, during the pandemic, we had to do that on devices that we had no control over, people's home devices, things where R might not even be installed. We even had students doing exams, very important exams, on a mobile phone, which is not the best thing to do.

So the question was, how do we capture that R code and make sure they're doing the right thing?

The server approach and its failure

Our first idea was actually quite good. We set a server up that sits and listens for students' R code, runs it, works out what the answer is, and then sends it back. So this way, we could give the student a website, the standard way of evaluating and assessing our students' work. And we just plugged this service into it, so it sends the code away, remembers what the student did, marks it, and then sends the answer back.

And at first glance, it's really nice, because the student's just writing R code, and they're just seeing the results. They're happy. So we tested it. It worked. We put it into production, and the first thing that happened was the server crashed. It fell over, and I had a room full of very angry students.

So that was not ideal. This was just before the Christmas break. So I went away over Christmas, and I thought, okay, well, what can I do to solve this? And one of the ideas I had was to instead evaluate that R code locally inside the web browser instead of using a server. And this makes sense, because it means you're dividing the load over lots of different machines instead of depending on that one machine.

WebAssembly and Emscripten

So the way this works is that there's a technology called WebAssembly that's been around a few years now, but is getting more popular very recently. What it is is a portable binary code format. It's a bit like JavaScript, but when it's in a certain format, you can't read it properly. It runs in the browser, and what it does is enable high-performance applications to run in a web browser at almost native speed. It's supported by most modern browsers, and you can kind of think of it as Java, but good.
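To make "portable binary code format" a little more concrete, here is a minimal sketch that is not from the talk. It hand-assembles a tiny WebAssembly module exporting an `add` function and walks its sections. The names `WASM_ADD`, `read_uleb128` and `list_sections` are made up for this illustration; the byte layout (the `\0asm` magic, a version field, then sections with an id and a LEB128-encoded size) follows the WebAssembly binary format.

```python
# Hand-assembled bytes of a tiny WebAssembly module exporting add(i32, i32) -> i32.
# Illustrative only; this module is not part of webR.
WASM_ADD = bytes([
    0x00, 0x61, 0x73, 0x6D,  # magic number: "\0asm"
    0x01, 0x00, 0x00, 0x00,  # binary format version 1 (little-endian)
    0x01, 0x07, 0x01, 0x60, 0x02, 0x7F, 0x7F, 0x01, 0x7F,  # type section
    0x03, 0x02, 0x01, 0x00,  # function section
    0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00,  # export section ("add")
    0x0A, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B,  # code section
])

# Only the section ids used by the module above.
SECTION_NAMES = {1: "type", 3: "function", 7: "export", 10: "code"}


def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if (byte & 0x80) == 0:
            return result, pos
        shift += 7


def list_sections(module):
    """Return (version, [(section_name, size_in_bytes), ...]) for a wasm binary."""
    if module[:4] != b"\x00asm":
        raise ValueError("not a WebAssembly binary")
    version = int.from_bytes(module[4:8], "little")
    sections = []
    pos = 8
    while pos < len(module):
        section_id = module[pos]
        size, pos = read_uleb128(module, pos + 1)
        sections.append((SECTION_NAMES.get(section_id, str(section_id)), size))
        pos += size  # skip over the section contents
    return version, sections


if __name__ == "__main__":
    print(list_sections(WASM_ADD))
```

The same bytes load unchanged in any browser or runtime that supports WebAssembly, which is the portability the talk relies on.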

The next thing that would make this work is Emscripten. Emscripten is a compiler for WebAssembly, so it converts C code into WebAssembly. It's based on LLVM. Again, it's been around for a few years, and successfully used in a few projects, along with spoiler alert, sorry, Winston.

And also, Emscripten should work like this. There are two commands, emconfigure and emmake. You give it C code, and a WebAssembly package falls out. When it works, it's absolute magic. It's incredible. It just works. You get some C code, and it runs online.
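As a rough sketch of that two-command flow (not the actual webR build, which the talk describes as considerably messier), the wrappers can be driven like this for an autotools-style C project. `emconfigure` and `emmake` are real Emscripten tools; the helper names `emscripten_build_commands` and `run_build`, and the example flag `--disable-shared`, are assumptions made for illustration.

```python
import shutil
import subprocess


def emscripten_build_commands(configure_args=(), make_args=()):
    """Return the two wrapped commands for an autotools-style C project."""
    return [
        ["emconfigure", "./configure", *configure_args],  # configure using Emscripten's compilers
        ["emmake", "make", *make_args],                   # build with the same toolchain
    ]


def run_build(source_dir, configure_args=()):
    """Run both steps in source_dir; return False if Emscripten is not installed."""
    if shutil.which("emconfigure") is None or shutil.which("emmake") is None:
        return False
    for command in emscripten_build_commands(configure_args):
        subprocess.run(command, cwd=source_dir, check=True)
    return True


if __name__ == "__main__":
    # Hypothetical flag, shown only to illustrate how arguments are passed through.
    for command in emscripten_build_commands(["--disable-shared"]):
        print(" ".join(command))
```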

But there's a problem, because R uses Fortran code. Some of the code it has inherited was originally released in the 70s. And even worse, Emscripten can't directly compile Fortran code. Notice I say directly, because there's a trick. Unfortunately, the trick makes things a little bit more complicated.

It looks like that. I'm not going to go into detail. It's very messy. There is a link there if you're interested in learning more. I will point out, I think I've got time, really quickly, that there is a point in that graph that says choose your own chaos. And that's because you choose between using an ancient version of GCC or an unreleased version of LLVM. So pick your poison.

Demo and results

It does work. There's a website. You can go to it, and you can run it in the browser. I would say do it now, but the conference Wi-Fi is a bit iffy, and it is quite a big package that downloads. So it may not work. But when you get home, go to the website, try it out. You can load packages. I've built just enough of the tidyverse so you can load ggplot in the browser and make a plot. It looks like that. And, yeah. It just works. Thank you very much.