Resources

Aaron Jacobs | Introducing xrprof: A New Way to Profile R | RStudio (2021)

Tracking down performance issues in R code usually means using R's built-in Rprof() profiler or one of the packages built around it. But the changing nature of the R community (towards more deployed applications) makes local profiling workflows frustrating, which is why I have written a new profiler: xrprof. xrprof is compatible with existing R tools, but unlike them it can be used to profile R code that is already running -- in fact, it is designed to be safe to point at R code running "in production". xrprof also works seamlessly when R is run inside Docker, and can even be run in complex environments like Kubernetes clusters. Taking inspiration from the {jointprof} package, xrprof can also show function calls at the C/C++ level alongside those from R. This can be immensely useful for diagnosing problems in packages that make heavy use of compiled code.

About Aaron: Aaron Jacobs is a Senior Data Scientist on the R&D team at Crescendo, a technology company in the sports betting space with a large internal R ecosystem. Prior to Crescendo he worked in Canadian public policy research. Aaron has a strong interest in the engineering side of data science and the emerging use of R "in production". He is the author of several CRAN and GitHub packages, as well as xrprof -- a new R profiling tool.

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Hi, my name is Aaron Jacobs. I work at Crescendo Technology in Toronto, which is a company focused on the technology side of sports betting. I'm also the author of xrprof, which is a new profiling tool for the R ecosystem that I'm hoping to tell you about today. It's an honor and a privilege to be part of this year's unusual RStudio conference.

Now, from my perspective, there's been an important change in the R ecosystem in the last five or so years. R has traditionally been a language for data analysis, visualization, summary, exploration locally, interactively, with some reporting. But with the emergence of frameworks like Shiny and Plumber, it's now being used more and more to build real applications. Now, on the one hand, this has created huge opportunities for our users to expand the scope of what they can do, and the value that they can bring to themselves and their organizations.

But with this increased opportunity has come much greater complexity and responsibility. The reality is that for many R users, deploying these applications means that the applications have users other than their authors for the first time. And often they're running in environments that are not the same local development environments that we're used to. This means that you now get bug reports like: hey, why is your Shiny app slow? Or: this report is taking an hour to run, but it used to take two minutes. What's happening?

Profiling in R

Now, traditionally, to answer performance related questions like this, you would make use of a profiling tool. Profiling itself is a term from computer science. It effectively means that we collect data about where our program is spending its time. Effectively, which functions are being called most often.

Now, for R users, this is actually kind of a comfortable situation because profiling produces data. And R users know how to deal with data. We can use all of our favorite tools for analysis, visualization, summary of profiling data, just like we could with any other kind. And profiling itself is a fundamentally empirical approach to performance analysis. Don't guess, measure. And then use those measurements to make informed decisions.


Profiling is such an important part of performance analysis that R actually has its own built-in sampling profiler. It's available through the Rprof() function. And there's a variety of functions in base R and the wider ecosystem that allow you to make use of Rprof's output format to produce, again, these analyses, these summaries, these visualizations. So there's the proftools package and the profvis package. There's also aprof and GUIProfiler, which are slightly less popular. These all do exactly this summary, analysis, visualization.
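As a minimal sketch of that built-in workflow (the file name and the workload here are arbitrary, just something slow enough to collect samples from):

```r
# Sample the call stack every 10 ms while some deliberately slow code runs.
Rprof("prof.out", interval = 0.01)
x <- lapply(1:200, function(i) sort(runif(1e5)))
Rprof(NULL)  # stop profiling

# Summarise where time was spent, by function.
summaryRprof("prof.out")$by.self
```

The same `prof.out` file is what proftools, profvis and friends consume to build their summaries and visualizations.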

And actually, because there's a wider profiling ecosystem beyond R, lots of these packages can produce conversions to popular formats used in external tools. So for example, the ability to convert Rprof's format for KCachegrind, Google's pprof tool, the speedscope format, or the original FlameGraph software: these are all available in the R ecosystem. So there's a big existing set of tools you can use for performance analysis built around the Rprof() function.

The problem with local profiling

Now, all of this sounds great, but let's get back to that whole Shiny performance issue for a moment. One of the interesting things about Shiny is that it's often used by coworkers or peers as an internal tool, which means that the users are often people who would essentially send you an email or a message on Slack if they ran into some kind of issue. And a performance bug report, for example, is basically going to come down to: hey, your app is slow. Can you make it faster so that I can do my job?

And of course, this is not really a helpful or informative thing for them to tell you. So you start asking follow-up questions like, hey, what were you doing? And they'll tell you, you know, I had clicked X and Y. You'll ask them things like, okay, that's not really enough information for me to recreate your issue. Do you remember what you set these inputs to?

And sometimes they can remember and tell you perfectly, but much more often they either don't remember, or they misremember and tell you they did one thing when really they did another. And you spend a long time going down this rabbit hole of trying to recreate their issue locally. And the fundamental reason for this is that when you're using Rprof, you have to do your profiling locally. So you're trying to reproduce an issue or an environment, and then profiling locally to collect data that you can use to analyze your program.

Now, there's all kinds of reasons why this is frustrating, but in addition to the frustration, there's also the problem that like your local environment may kind of differ from wherever your code ends up really running for these users. So a classic example of this is, say for myself, I often write and develop applications in Canada, but deploy them to servers that are in Europe for European users. This means that, you know, the kind of databases that they're connecting to, or the other network services that they might contact, might have very, very different latency profiles when they're actually deployed, as opposed to when I'm working on them locally.

And this can create both bugs and serious performance issues, where stuff that works fast for you locally doesn't work very fast at all for your users. So in addition to the difficulty of trying to recreate what your user was doing, you're also trying to recreate this environment. And overall, this is a fairly error-prone process, and it really slows down tracking down performance issues.

Introducing xrprof

So this disconnect between where code is really running and where you want to profile it is the fundamental motivation behind a new project that I've been working on called xrprof. Now, xrprof is not a clever name. It just means external Rprof, and it is exactly that. It's a standalone program that allows you to profile R code that is already running.

So the fundamental way of thinking about this is that Rprof allows you to profile by modifying your R code, usually locally in an interactive session. xrprof, on the other hand, is designed around profiling R code that is already running, without modification. And this is very powerful if, for example, you want to profile code running in a production environment, say on Shiny Server or RStudio Connect, or R applications running under Docker. xrprof can see and understand R programs running in all of those conditions and profile them for you without modifying them. This allows you to do your profiling with real users and real data, which makes it much easier to track down performance issues because you have data that's close to the source.
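As a rough sketch of what that looks like in practice (the `-p` and `-F` flags below follow my reading of the xrprof README, and the PID is a placeholder; attaching to another process generally requires root or elevated ptrace permissions, so check the project's documentation for your system):

```shell
# 1. Find the PID of the running R process (e.g. a Shiny app's R session).
pgrep -x R

# 2. Sample that process at 50 Hz, writing Rprof-compatible output
#    that the usual R tooling can read.
sudo xrprof -p 12345 -F 50 > Rprof.out
```

Because the output is ordinary Rprof-format text, you can then hand `Rprof.out` straight to `summaryRprof()` or your favourite flame-graph converter.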

Now, xrprof is fundamentally designed to be a compatible drop-in replacement for Rprof, which means it produces the same output format, and it works with all of the existing R ecosystem tools, and by extension, through conversions to other formats, all of the wider profiling world outside of R. Because of this compatibility, it's really easy to use, and it should be very familiar if you've worked with profiling data in R before.

Profiling C and C++ code

Now, there's another piece of this puzzle. One of the other major things that's happening in the R ecosystem is that more and more R packages are making use of C and C++ code. Now, it's always been possible to embed native code in R. It's actually one of the strengths of R as a programming language. But I think partly because of democratizing tools like Rcpp, which have made it much easier for R users to write C++ code, it's now much more common to include C++ in an R package.

This is great on the one hand for performance, but it does create a problem, which is that Rprof can only see what's happening inside the R code. It can only see R functions. That means that the more of this work happens in C and C++, the more of it is totally invisible to Rprof.

And this creates serious issues when the C and C++ code is actually the source of the performance problem. Now, I know you might be thinking: hold on, C and C++ are much faster than R. And this is true, but in practice, C and C++ code can contain exactly the same kinds of poor performance decisions you can make in R. You can do stuff you don't really need to do. You can do things much more often than you really need to do them. Or you can choose an inefficient data structure to represent the problem, or an inefficient algorithm to solve it.

So, just because code is written in C and C++ doesn't mean that it somehow becomes free of these issues. And I think that if you were trying to track down a C or C++ related performance issue in an R package, it's actually been kind of painful so far. Either you just guess what the problem is and keep trying various approaches, or you try and use some sort of C or C++ level profiling tool and an R level profiling tool at the same time, and then try and cross-reference what's happening. This is actually in practice a very inexact and frustrating process. There's just not enough information on either side for you to be able to knit these things together.

And this secondary frustration was actually a big motivation for xrprof as well. xrprof can actually see what is happening in your C and C++ code, and it can show that alongside your R-level functions, so you can see a more holistic picture of what's going on. I think this is a really powerful tool, especially for R package developers who need to do performance analysis that crosses this language barrier. And don't forget that R itself is actually written in C. So, in practice, sometimes when you're having performance issues that touch base R code, the issue is actually in the C code.

A real-world example: mongolite

So, again, what does this kind of look like? This is a more involved example. And in fact, it was a very real issue that I first encountered a couple of years ago.

So, at work, we have a whole bunch of semi-structured data in a database called MongoDB. And there's an existing R package called mongolite, which can be used to query MongoDB. And it actually works really, really well the vast majority of the time.

Now, if you're not familiar with MongoDB, effectively what it does is store data as a kind of JSON. Which means that tabular data, the kind you would see in a CSV or in a SQL table, is essentially represented as a JavaScript array of JavaScript objects. So this effectively looks like a big long list of dictionaries of key-value pairs, where every row is represented as a dictionary.
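To make that representation concrete, here is a tiny example using the jsonlite package (the JSON library that mongolite builds on); the data is made up:

```r
library(jsonlite)

# A two-row table stored MongoDB-style: an array of objects,
# with one object (a dictionary of key-value pairs) per row.
json <- '[{"name": "a", "n": 1}, {"name": "b", "n": 2}]'

# fromJSON() simplifies the array of objects back into a data frame
# with columns "name" and "n".
df <- fromJSON(json)
df
```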

Okay. Now, it turns out that there's a known performance issue with mongolite for large queries with lots of columns: they can take a long time. A surprisingly long time, at least. And when I first encountered this a couple of years ago, I wanted to figure out why.

So, I profiled it using Rprof. But I discovered that I couldn't really see anything. And the reason for this is that all of the interesting parts of mongolite are actually implemented in C, which meant it was very hard to figure out what was going on.

And so, at the time, I did exactly this very awkward dance of using C- and C++-level tools and R-level tools and trying to cross-reference them. And it was very painful and took me a long time. And I was never really sure if I was measuring the right thing. So, when I redid the analysis for this talk, it was kind of comforting that what had taken me several days a few years ago took me something like 90 seconds with xrprof.

And so, this is what happens when you use Rprof. This is a flame graph of what you would see, which is not that informative, because you just don't see what's happening inside a package that's mostly implemented in C. But if you can see the C code, this is what you will see: here the R functions are in blue and the C functions are in orange.

Now, of course, I know this is much too small for you to actually be able to read function names. And that's not really the point. So, I'll just point out a couple of things that I was able to pick up on once I had this view. The first is that there's a certain amount of time that R is communicating with Mongo itself. And this is important because effectively this represents the total upper bound of how fast we could ever possibly make this R code. Because the reality is we still need to spend time transmitting data from the database over the network. And there's nothing really we can do about the time that it takes to do that. Other than, say, querying less data. So, this kind of serves as a total upper bound.

But there were two kind of odd things that I noticed when I looked at these profiles. The first is that R is spending a lot of time allocating memory. And if you don't know what that means, well, mostly it doesn't matter, because allocating memory is normally really fast. For memory allocation to show up on a profile, it needs to be happening a lot, or at least happening really inefficiently. So when I saw this, it immediately piqued my interest, and it made me wonder what could be causing this many allocations. The second thing I noticed is that mongolite seems to be spending a ton of time transposing lists and checking for uniqueness.

Which seems like an odd operation to be doing. Because, again, checking uniqueness is normally pretty fast. In order to spend so much time doing it, you have to be doing it a lot.

So, originally, I puzzled over this for a while before I realized that, ultimately, it totally made sense. Remember how I said that JSON represents tabular data as an array of objects? Well, it turns out that the way mongolite works is that when it reads in rows, it takes these JSON objects and turns them into R lists, which is a very natural representation of a row. Basically, if you have 20 columns, that gives you a list of length 20, every element of that list can have a different type, and the names of your list are the names of your columns. And then what happens at the end is, basically, it takes all of these lists and smooshes them together into a data frame.

Now, this is kind of interesting, because if we were to create a data frame in advance, knowing how many rows we were getting, we would essentially just do one allocation for every column. So, if we had 20 columns, we'd do 20 allocations. But if we read in a million rows of 20 columns using this row-by-row approach, we do something like 20 million allocations. So you can see where some of this was showing up in the profile: instead of doing 20 allocations, we're doing 20 million.
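You can see the same effect in plain R, quite apart from mongolite. A small sketch (the sizes and helper code here are my own, not mongolite's actual implementation):

```r
n_rows <- 2000
rows <- lapply(seq_len(n_rows), function(i) list(a = i + 0, b = i * 2))

# Row-wise: turn each list into a one-row data frame, then bind them all.
# Every row costs several small allocations, and rbind() keeps re-copying
# and re-checking names/types as it grows the result.
t_rowwise <- system.time(df1 <- do.call(rbind, lapply(rows, as.data.frame)))

# Column-wise: one vector allocation per column, filled in a single pass.
t_colwise <- system.time(df2 <- data.frame(
  a = vapply(rows, function(r) r$a, numeric(1)),
  b = vapply(rows, function(r) r$b, numeric(1))
))

all.equal(df1$a, df2$a)  # same contents either way
t_rowwise
t_colwise
```

The two approaches produce the same table, but the row-wise version spends most of its time on exactly the kind of allocation and checking work that showed up in the profile.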

And the second thing is, because we have all these lists, when we smoosh them together, we have to do a bunch of things like check that all of the names are the same and all of the types are the same. And we have to do this for every single row we add, against all the data we've previously added. So, again, you can see how this adds up. Now, this story doesn't have a super happy ending. I did propose changing mongolite to use a different algorithm. The problem was that it was a very complex patch, something like 700 lines of new C code, so it wasn't accepted into mongolite. But it was about five times faster. Which just goes to show that writing something in a fast language like C or C++ doesn't mean you can't run into exactly the same kinds of performance issues you'd encounter in R code.


So, hopefully, that serves as some inspiration if you're trying to do remote profiling of any kind of production code, or profiling, local or remote, of R code that uses lots of C or C++.

Now, I should say that xrprof itself is open source. It's available on GitHub. Most of the development this year was generously funded by the R Consortium. And I'm currently looking for users to try it out, see if it works for you, see if you can break it, and let me know. I'm hoping that xrprof can become a new tool in the R profiling toolbox.

I'm Aaron Jacobs. You can find me on Twitter, on GitHub, and on my own site. I work at Crescendo Technology in Toronto. And if this stuff piques your interest, we're always hiring. Thanks for listening, and I hope you enjoyed this and all of the great talks at this year's RStudio conference.