Resources

Brandon Farr | 15 Years of R in Quantitative Finance | Posit (2020)

Use of R in the investment industry is established and growing. This talk will discuss changes seen in 15 years of practice within asset management firms. I hope discussion of lessons learned and recommendations will benefit those currently in finance and those interested in hearing how the flexibility of R manifests in the financial world.


Transcript

This transcript was generated automatically and may contain errors.

Good afternoon and welcome. My name is Brandon Farr. I'm the lead quantitative researcher at Copper Rock Capital Partners. We are a Boston-based long-only equity firm serving the institutional marketplace. We have clients from around the world with two-thirds of the money we manage coming from outside the United States.

I started using R in 2005 because it was the most powerful analytical tool I'd ever come across. Each year, R has become more powerful. Fifteen years ago, R use in finance was rare. It's become more common, but what has been the impact?

Let's take a look at this quote from AllianceBernstein: "We estimate that at least 99% of so-called fundamental models for individual companies manifest themselves in Excel." So our industry is still relying on a tool built for the data challenges of the 20th century. People use Excel because it's easy to learn. But as we all know, it then takes on a life of its own: ad hoc solutions end up becoming the basis for important decisions. But better data analysis requires reproducibility and scalability.

So in preparing for today, I decided to focus on what makes R compelling for those cases. What's going to dent that 99%? I'm going to focus on two important use cases. First, using R to do better analysis. By better analysis, I mean you spend more time interpreting your data than manipulating it. Second, using R to build what you cannot buy in the marketplace. I'll use Shiny to make my case.

Working with financial data

We start with data. Good analysis is based on sound data. The financial world has many providers supplying data to help us understand companies and markets; Bloomberg and FactSet are a couple of the providers that I frequently use. Here's Apple's price chart in the Bloomberg terminal. In the terminal, I can change the time period, change the data frequency, and overlay calculations onto this chart. This is great, but it doesn't scale. How do I look at 10 stocks? How do I look at a portfolio of stocks?

To do scalable analysis, I usually need to import and analyze what I'll call reported data. Mine typically comes from FactSet and is formatted for printing, not for processing. You can see the header is messy, dates are formatted as text, and rows are grouped together. But R allows us to do good data analysis even when we start with messy data.

The limits of Excel for scalable analysis

Let's look at a typical scenario. As an investment team, we need to know what is driving markets. Here's data from the fourth quarter of last year. These are sector performance reports for the U.S. and global markets. In the spreadsheet I get from FactSet, this data comes separated into tabs, and the stock data is grouped by sector and hidden because everything has been collapsed.

Now, tables of data are useful, but visualizations are better. With a chart, we can quickly see that healthcare and technology drove both the U.S. and world markets in the fourth quarter. This is hard to pull from tabular data, especially if it's on separate tabs. So how would we visualize this in Excel? Well, I would do something like this. I'd gather the data into a single tab, making sure everything's aligned so that if I'm labeling data, the labels are appropriately placed. I'd create a plot referencing the group totals. And then I'd customize the plot.

I'm sure most of you have done something similar to this and are aware that pitfalls lie ahead. It's my experience that anything that's useful leads to further questions. Is our Excel process up to the task of answering further questions? Okay, fine, but what worked in the emerging markets? Did healthcare and technology lead in the fourth quarter of 2018 as well? How do we answer these questions with our Excel process?

Well, I'll argue that just one additional tab of data is going to require me to rework my entire process. If I need to turn this around quickly, which I often do, I can only hope that I'll be accurate. To ensure that I'm accurate, I probably need to start over. The Excel solution just feels fragile. My sense is that even if you are very good at plumbing in Excel, extending an established process is burdensome, fraught with error, and probably requires a lot of repetition. This is a major hurdle to doing better data analysis.

Doing better analysis in R

So how does R help us do this analysis in a better way? Well, in R, I'm going to complete the same steps, but I'm going to do it in a structured way with code. I'll specify the data file. I will read and prepare the market data. I will collect the group information, and then I'll plot those groups. R allows us to break a process into logical, distinct steps. With proper naming, the logic of the process is explicit and readable. With care, when we need to extend this process, nothing needs to change.

Let's take a minute to look at how easy setting up robust processes has become in R. This is important because we are trying to convince ourselves and others to do the same sort of analysis in R, and to do more of it. Remember, we're trying to put a dent in that 99%. This is the heart of our process. Here we see good examples of names indicating function and a logical separation of tasks.

Let's look at the function prepareMarketData. Our function reads the tab names from the file and puts them into a tbl. Next, we make calls to readGroupedData with the help of mutate and map. After this step, we have a tbl with two rows representing the two markets in our analysis, with all of the underlying data conveniently stored in list columns. This is a very powerful structure and an example of something that wasn't possible 15 years ago, even in R.
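The pattern described here can be sketched as follows. The file contents, tab names, and column layout below are illustrative assumptions, not the speaker's actual code; to keep the sketch self-contained, it fabricates a small two-tab workbook first.

```r
library(writexl)  # used here only to fabricate a demo workbook
library(readxl)
library(dplyr)
library(purrr)

# Stand-in for the FactSet export: one tab per market (assumed layout).
tmp <- tempfile(fileext = ".xlsx")
write_xlsx(list(
  US    = data.frame(sector = c("Health Care", "Technology"), return_pct = c(10.2, 9.7)),
  World = data.frame(sector = c("Health Care", "Technology"), return_pct = c(9.8, 8.4))
), tmp)

# The heart of the pattern: tab names become the rows of a tbl,
# and each tab's full data is read into a list column via map().
markets <- tibble(market = excel_sheets(tmp)) |>
  mutate(data = map(market, \(s) read_excel(tmp, sheet = s)))

markets  # two rows, one per market, underlying data nested in a list column
```

The payoff is that the market name and its data travel together in one object, so nothing has to be matched up by hand later.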

But how did readGroupedData handle all of the messiness of our data? Though reported data is often messy, it is usually consistent. By taking advantage of the patterns we observe in the spreadsheet file, we can systematically handle the collapsed data. Here we see sector and company names occupy separate columns in the spreadsheet, and the sector group row stands on its own. Healthcare, shown here, is not carried down the column. So in this representation, group membership is implied. With R, we make it explicit.
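One way to make that implied membership explicit is with `tidyr::fill()`, which carries each standalone sector name down to the company rows beneath it. The sector names, companies, and numbers below are made up for illustration:

```r
library(dplyr)
library(tidyr)

# Mimics the reported layout: a sector-total row stands alone,
# and the company rows below it carry no sector label.
raw <- tibble::tribble(
  ~sector,       ~company,  ~return_pct,
  "Health Care", NA,        10.2,   # group-total row
  NA,            "Pfizer",   8.1,
  NA,            "Merck",   11.5,
  "Technology",  NA,         9.7,   # group-total row
  NA,            "Apple",   12.3
)

tidied <- raw |>
  mutate(is_group_total = !is.na(sector)) |>  # flag the totals before filling
  fill(sector)                                # carry sector names down the column
```

After this step every row knows its sector, and the group-total rows are explicitly flagged rather than implied by layout.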

This is a good example of how clear it is in code when we are manipulating data. All we need to look for, in this case, is mutate commands. Copy-and-paste manipulation, by contrast, is documented nowhere. Additionally, a function like this is a perfect candidate for a package, so it can be reused across many analyses.

collectGroups is just a simple function that unnests the list columns and then filters down to just the group-total rows. If you haven't seen this pattern, it's a very powerful one. Note that the natural relationships between the market names and the underlying market data are maintained automatically. Finally, we pipe to a plotting function that captures all of our logic.
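A sketch of that unnest-then-filter step on hand-built data (the column names and values are assumptions); note how each market name stays attached to its rows with no joins required:

```r
library(dplyr)
library(tidyr)

# Two markets, with their underlying rows stored in a list column.
markets <- tibble(
  market = c("US", "World"),
  data = list(
    tibble(sector = c("Health Care", "Health Care"),
           company = c(NA, "Pfizer"),
           is_group_total = c(TRUE, FALSE),
           return_pct = c(10.2, 8.1)),
    tibble(sector = c("Health Care", "Health Care"),
           company = c(NA, "Roche"),
           is_group_total = c(TRUE, FALSE),
           return_pct = c(9.8, 7.4))
  )
)

group_totals <- markets |>
  unnest(data) |>         # one row per company/sector per market
  filter(is_group_total)  # keep only the group-total rows
```

Because the market column is replicated during the unnest, the result is immediately ready to pipe into a plotting function faceted or colored by market.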

So, we have encapsulated the entirety of our data process into four functions, none more than seven or eight lines long, and all very readable. This is better analysis with R. We get reliability. Our process is replicable and efficient because we aren't copying and pasting. Errors or future changes can be traced to the impacted steps and corrected or implemented. In my experience, processes set up in this way typically fail not because of coding problems, but because of data changes.

A quick aside. Last week, I was asked the specifics of how I performed a what-if analysis on some of our historic portfolio holdings. It had been months since I had done the work, and yet within two minutes of reviewing the code, I was able to answer the question accurately and confidently. If I had done the work in Excel, I would have had no chance.
Building Cognos: a custom Shiny app

Next, I'd like to show you how we have used R and Shiny at Copper Rock to build something we couldn't buy. Our quantitative model covers over 10,000 stocks. In the past, we delivered this to analysts and portfolio managers in an Excel spreadsheet. Here's an example of one of the weekly files. It only contains the most recent data, and it's frustrating to filter: when you apply filters in Excel, you can't see the logic; the best you can tell is whether a column has a filter applied. It's difficult to see the relationships between groups of stocks, and it's difficult to integrate this data with other data from our vendors.

So we created Cognos. Cognos is a Shiny app built to meet our team's specific analytical needs. Nothing available from Bloomberg or FactSet matches it. Even though we are a small, focused shop, R and Shiny have allowed us to create this powerful custom application.

Here's a snapshot of Microsoft through the lens of our quantitative model. Our model data indicated here comes from a Microsoft Azure database. We pull in data from Bloomberg via their API, and MSCI ESG data is sourced from flat files. All of this data provides the context that was missing from our Excel-based solution. Data is no longer siloed. Analysts don't have to make the connection. Cognos is doing it for them.

Just seeing the data in the middle of the screen moved our conversations from conjecture to data-based reality. We no longer had to rely on the recollection that, oh, the model has always liked Microsoft. Cognos can tell us, and the orange line tells us it hasn't always liked Microsoft.

On this screen, you see two important components of our quantitative model represented in the scatter plot. Higher scores should indicate better stocks. We've found it most productive to reserve our fundamental research for highly ranked stocks. So by focusing on stocks in the first quadrant, analysts can better allocate their research time. With the interactivity of Shiny apps, our analysts quickly navigate their way through a massive universe. Cognos really helps them kick-start their analysis. It is used by our team all day long.
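As a rough illustration of what a screen like this might look like in Shiny: the component names, scores, and stock universe below are invented, and this is a minimal sketch, not Cognos itself.

```r
library(shiny)
library(ggplot2)

# Fabricated universe: two model-component scores per stock (assumption).
set.seed(1)
universe <- data.frame(
  ticker  = paste0("STK", 1:200),
  score_a = rnorm(200),
  score_b = rnorm(200)
)

ui <- fluidPage(
  sliderInput("cutoff", "Minimum score (both components):",
              min = -3, max = 3, value = 0, step = 0.1),
  plotOutput("quadrant"),
  tableOutput("shortlist")
)

server <- function(input, output) {
  # Stocks in the "first quadrant" relative to the chosen cutoff.
  picks <- reactive(subset(universe,
                           score_a >= input$cutoff & score_b >= input$cutoff))

  output$quadrant <- renderPlot({
    ggplot(universe, aes(score_a, score_b)) +
      geom_point(colour = "grey70") +
      geom_point(data = picks(), colour = "steelblue") +
      geom_hline(yintercept = input$cutoff) +
      geom_vline(xintercept = input$cutoff)
  })

  output$shortlist <- renderTable(head(picks(), 10))
}

app <- shinyApp(ui, server)
# runApp(app)  # launch interactively
```

Even this toy version shows the idea: moving one slider instantly re-filters a large universe and redraws the quadrant, which is exactly the kind of interactivity a static spreadsheet can't offer.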

R and Shiny helped us build something we couldn't buy. Cognos has become our hub for data delivery. It is naturally extensible, needs no training, and requires very few IT resources.

Closing thoughts

So we walked through an example of how easy it is to set up a process to do better analysis in R. Then we looked at how Shiny apps can help you build something you can't buy, matched to exactly the way your team works. I think it is a great example of how a relatively simple Shiny app can have a significant impact. Recall, my focus for today was to make a dent in the 99%. I think these two use cases are examples of how we might start. I encourage you all to chip in.

Q&A

What package, new package, are you most excited about?

Well, as I said, I'm just excited about all the changes happening in the tidyverse and dplyr. That pattern of reading either the tabs of an Excel spreadsheet, or the files you'd get from calling dir() on a directory, and using that as the starting point for a table, which you then use map with to pull all the data in. It allows you to naturally keep file locations, names, and data all together, and then it flows naturally through the rest of the process. I literally just started using that a month or two ago and enjoy it quite a bit.

I've used R for 15 years. When I started, I don't think ggplot2 was out; we had plyr and reshape. What can happen as you work is that you get set in one way of doing things, and it's difficult to get off those patterns. So this pattern may be old to you, but to me it's been a great refresher for the work that I do.

What led you to the use of list columns instead of multiple tables at different levels?

Two thoughts. Attending RStudioConf and listening to Jenny Bryan talk about them with the Lego characters. But it's also just, to me, it just makes it easy to think about the data. Because if I need to get into the details, I can go get it. But very often I want to work at that summary level. And if something's spread across multiple tables, then to do that work, I might need to do joins just even for exploratory analysis.

And you mentioned that you're working with FactSet data, and I assume other vendors. When the format of that data changes, is that a challenge, and how do you deal with it? Yeah, so I mentioned that this process as I laid it out is typically robust to coding errors, but not to data file formats changing. For example, depending on when a file is generated, FactSet will have either seven or eight rows in the header. If that number is hard-coded in a bunch of different places, a format change kills the code. But by separating the reading and processing code into a single function and reusing that function, if the format changes I make the change once and use it going forward. It's also an obvious thing to expose as a parameter in the function call.
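The parameterization described here might look something like this sketch; the function name, the default of seven header rows, and the file layout are assumptions for illustration:

```r
library(readxl)

# Centralize the vendor-format assumption in one reusable reader.
# If the vendor moves from 7 to 8 header rows, change the default once
# (or override it per call) instead of hunting through every script.
read_report <- function(path, sheet, header_rows = 7) {
  read_excel(path, sheet = sheet, skip = header_rows)
}

# Hypothetical usage:
# us_data <- read_report("q4_sector_report.xlsx", sheet = "US")
# us_data <- read_report("q4_sector_report.xlsx", sheet = "US", header_rows = 8)
```

Every downstream script calls read_report() rather than read_excel() directly, so the format knowledge lives in exactly one place.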

And have you considered, almost getting political here, democratizing R across the firm outside of your immediate group? For example, replacing Excel usage by analysts. And if you have or have not, where's the friction you're seeing? Sure. So Cognos is the first step there. When I got to Copper Rock, the quantitative model that we delivered every week as a 10,000-row spreadsheet was actually printed out. People would print it front and back, open it up, and use rulers to scan down and find their stocks. That was the level of user I was starting with, so jumping right into coding was not going to happen. But I think Cognos is the first step.

And then I've started more in meetings, because of the readability of dplyr code, actually dynamically answering questions with the code rather than saying, I'll get back to you. And I hope that as people see me do that, the code's up on the screen, they'll start to become more interested.

There's a related question here about delivering data to non-R users in the org. Is Excel the only way that you're doing that, or are there other ways? We do some via Microsoft Outlook emails, but it's very rudimentary: a flat text file, so it's not very polished. I forget the name of the RStudio mailing package... Blastula? Right, blastula. I haven't used that yet. So it's Excel and it's Cognos. Cognos is on 10 desktops, which is roughly half of our firm, so it's a pretty good deployment.

What was your workflow when designing the UI for the Shiny app? How did you make sure the data was presented in a useful way to the bottom-up analysts? That's a good question. I think I tried it about three or four times, and I bounced back and forth between flexdashboard and standard Shiny. I have two levels of tabs, because there's a group of information related to stocks and a group related to our portfolios, and each of those has tabs underneath it. I had a hard time making that work in flexdashboard, so I went back and coded it in Shiny from the beginning.

But I use this tool in our morning meeting every day. It's up there. We bounce back and forth between Bloomberg and this tool. And so I certainly get a lot of live feedback with people who are having issues with it, or as I hear there's other data they'd like to see there, I incorporate that into the design.