Resources

Hao Zhu | Empowering a data team with RStudio addins | RStudio (2019)

RStudio addins provide a mechanism to extend RStudio in various ways. Addins can interact with the RStudio IDE through the RStudio API. They can also provide users a graphical interface with the power of Shiny. In practice, we found them very useful for enhancing or streamlining interaction with data and computing infrastructure. In this talk, we will demonstrate how our team develops and uses RStudio addins to empower our work. You will see some internal tools created to help us manage database connections, and an addin which helps us access external cloud computing resources. We will also show an example of using the addins in rcrossref and citr to download and manage citation and literature databases during rmarkdown document development.

Materials: https://github.com/hebrewseniorlife/addin_demo

About the Author

Hao Zhu

Hao is a data analyst and software developer working at the Hinda and Arthur Marcus Institute for Aging Research. He completed his training at Boston University School of Medicine in the program on Clinical Investigation. His interests include research reproducibility, data visualization and machine learning. At the Marcus Institute, he works with different teams on various topics, ranging from smartphone motion sensors to MRI images, and helps researchers understand their data by creating analytical reports and web applications. At the same time, Hao leads the development of R packages in the Biostatistics Core. He has contributed multiple R packages to the open source R community, such as kableExtra and memor. He also has a passion for teaching and has mentored several students at the Marcus Institute.


Transcript

This transcript was generated automatically and may contain errors.

I'm Hao. A lot of you might know me as the developer of kableExtra, but today I won't talk about kableExtra. I will tell you how we can use RStudio addins to improve our lives. I mean, not only for ourselves, but also for the teams that you are working in.

So here are the links to the slides and also to some code examples we're going to discuss later. Feel free to write them down or take a picture. I hope I can make it bigger.

So before I start my talk, I want to give you guys a little bit of information about who we are and our technical infrastructure. I'm working in this institution and we are a group of aging researchers located in Boston. We don't have a very big data group, but we have very diverse backgrounds in terms of culture and training. We love everything RStudio makes and we are big fans of RStudio Server. We have our self-hosted RStudio Server and we also have Connect.

So how many people sitting here have used RStudio addins before? Pretty much. Yeah. Great. So for those who have never tried them before, RStudio addins are these little buttons on this drop-down menu that is accessible by clicking this Addins button on your toolbar. Here are some facts. So RStudio released this feature about two years ago and I won't read through all these things, but the primary goal of RStudio addins is to help us as data scientists, not your manager. Well, if your manager is using RStudio, he might use RStudio addins as well.

Something I want to shout out here is that RStudio addins have to be delivered in the form of R packages. One R package can hold multiple addins. But something I want to echo here is, like, don't be afraid of terms like developing an R package. I will show you later on, like, this thing is super, super simple. You can do it, like, within one minute and you can even try it out tonight after this talk.

So in this talk, first of all, we are going to start with a very basic introduction of how we can create an RStudio addin, and then I'm going to show you a few addins we created at our workplace and some addins we love from the community. So let's start.

Creating a simple RStudio addin

Last week I sent out a tweet about, let me show you, this addin called Switch RStudio Theme. If you click that, it goes to dark. If you click it again, it goes to white. This feature is super helpful for me because, I mean, if I'm doing all my programming job, I hope I'm, like, a dark theme hacker, but sometimes if I want to show my results to my coworkers, I would like to show the white, normal version of me.

So it seems like a lot of people on Twitter, they are very curious about, like, how can we make this thing? In that tweet I was saying, like, we can actually make this thing within one minute. So let's now see how we can do that.

So basically what we are trying to do here is, so this is options, like Global Options, where you can set the themes. What we are trying to do is, like, we are going to switch between these Clouds and Clouds Midnight themes. Midnight. There it goes.

So here I'm going to talk about the first step of creating this addin. The first thing I want to mention, I want to, like, put this into your mind, is that in the end, RStudio addins are just R functions. Everything you can do in an R function, you can just bundle it with a button in RStudio and make it an addin.

So we use this rstudioapi package. There's a function called getThemeInfo, and we see it tells us the name of the theme we are currently using, and there's an element called dark, and it tells me right now, well, when I was rendering this document, it was false. And then we did something very basic, and it should be the first lesson in your, like, R class, like how to do this. We assign this value to current theme. If this dark is true, then we go to Clouds. If dark is false, then we go to Clouds Midnight, and we assign that value to next theme. And after that, there is another function, a magic function, in the rstudioapi package called applyTheme, and we send that value to this function, so we can literally do that in three lines of code.
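The three steps just described can be sketched roughly as follows. This is a minimal sketch rather than the code from the slides: the names `next_theme` and `switch_theme` are hypothetical, and the decision is split into a small helper only so that it can be checked outside RStudio. `rstudioapi::getThemeInfo()` and `rstudioapi::applyTheme()` are the functions named in the talk.

```r
# Hypothetical helper: pick the theme to switch to, given whether the
# current theme is dark. Theme names are the built-in RStudio themes
# mentioned in the talk.
next_theme <- function(is_dark) {
  if (is_dark) "Clouds" else "Clouds Midnight"
}

# Hypothetical add-in function: only works inside RStudio, because it
# talks to the IDE through the rstudioapi package.
switch_theme <- function() {
  current_theme <- rstudioapi::getThemeInfo()      # list with $editor, $dark, ...
  rstudioapi::applyTheme(next_theme(current_theme$dark))
}
```

Running `switch_theme()` in the RStudio console should flip the editor theme, which is a quick way to try the function before binding it to a button.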

And after that, we just wrap it up in a function, and we save it somewhere in the R folder. So that's our first step. Very easy, right? Everyone gets it, right? And now you have a function that is working. In fact, if you run that function inside your console, it will just change your RStudio theme. And now you want RStudio to know that, okay, we have an add-in that is available, or you want to bind it with something.

How can we do that? Inside this, read after me, inst/rstudio/addins.dcf, write the following four lines. Name, which is the name you will see on the button. Description, which is a message you will see when you hover your mouse over it. And Binding, this is the most important one. It tells RStudio which R function it will run. And Interactive, well, if your add-in has an interface, then it's interactive, so true. Otherwise, false.
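For illustration, an `inst/rstudio/addins.dcf` entry with those four fields might look like the fragment below. The field names are the ones listed in the talk; the values, including the function name `switch_theme`, are assumptions made for this example.

```
Name: Switch RStudio Theme
Description: Toggle between a light and a dark editor theme.
Binding: switch_theme
Interactive: false
```

A package that holds several add-ins would repeat this four-line block once per add-in, with a blank line between entries.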

So, here's a minimal structure for an RStudio add-in, like, package. So, we have this R file we just talked about. And this addins.dcf, that's where we define these add-ins. And then we have a DESCRIPTION. For those who have ever tried to write an R package, you know that we need this DESCRIPTION file for it to be a package. Basically, it has some information, like, who wrote this package, how can we call it, things like that. Don't be worried about it. And there's even a function in usethis that will help you to set up a new package within one line of code. So, only three files. And literally, like, after we have these three files, you can have an add-in that is usable.
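To make the three-file layout concrete, here is a small sketch that writes such a skeleton into a temporary directory using only base R. The package name `myaddins` and all file contents are hypothetical. In practice the one-liner mentioned in the talk is presumably `usethis::create_package()`, which sets up DESCRIPTION and the R folder for you.

```r
# Build the three-file skeleton in a temporary directory (hypothetical names).
pkg <- file.path(tempdir(), "myaddins")
dir.create(file.path(pkg, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg, "inst", "rstudio"), recursive = TRUE, showWarnings = FALSE)

# 1. DESCRIPTION: the metadata that makes the folder a package.
writeLines(c(
  "Package: myaddins",
  "Title: Personal RStudio Addins",
  "Version: 0.0.1",
  "Description: A few small addins for personal use."
), file.path(pkg, "DESCRIPTION"))

# 2. R/switch_theme.R: the function the add-in will run.
writeLines(c(
  "switch_theme <- function() {",
  "  dark <- rstudioapi::getThemeInfo()$dark",
  '  rstudioapi::applyTheme(if (dark) "Clouds" else "Clouds Midnight")',
  "}"
), file.path(pkg, "R", "switch_theme.R"))

# 3. inst/rstudio/addins.dcf: registers the add-in with RStudio.
writeLines(c(
  "Name: Switch RStudio Theme",
  "Description: Toggle between a light and a dark editor theme.",
  "Binding: switch_theme",
  "Interactive: false"
), file.path(pkg, "inst", "rstudio", "addins.dcf"))

# List the files that were created.
list.files(pkg, recursive = TRUE)
```

After installing a package laid out like this (for example with `R CMD INSTALL`), the add-in should show up in the Addins menu, possibly after restarting RStudio.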


Bookmark and re-export add-ins

Another example I'm gonna show you here is, like, bookmarks. You might wonder why we need a bookmark. Because, I mean, we have bookmarks in the browser. It works perfectly. But if you are working inside a big group, or if you are, well, for example, at my workplace, I am an RStudio administrator. And after we set up this RStudio Server, there's certain information you want other people to know. For example, where can we access the shared drive? And where, like, where is our Git server? So, in the end, I wrote documentation using bookdown, and we published it on RStudio Connect.

However, I cannot control other people's, like, browser, right? I just cannot hack into their computer and do those things. Well, something I can do is, like, I can write up an R package and pack this link up as a button and pin it on the top of their add-in list. So, in the end, I can just send them an email and say, okay, do you see that button in RStudio? You click that and click the second link, and you will be brought to a documentation site. For you guys who have used bookdown before, like, bookdown even supports search. If you have any issues, you can just go ahead and search it.

Very simple feature, but it's proven to be very useful for a lot of our colleagues. Guess how many lines of code do we need? We only need one line of code. There is a function in the RStudio API, and just put that link in, and that's it. That's literally one line of code. That will solve your issue.
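A bookmark add-in of this kind could look roughly like the sketch below. The transcript does not clearly name the function that was used, so this example assumes base R's `utils::browseURL()`. The function name `open_team_docs` and the URL are placeholders.

```r
# Hypothetical bookmark add-in: opens the team documentation in a browser.
# The URL is a placeholder; utils::browseURL() is an assumption, not
# necessarily the function used in the talk.
open_team_docs <- function(url = "https://example.com/team-docs") {
  utils::browseURL(url)
}
```

Binding `open_team_docs` in `addins.dcf` then turns the link into a button that every colleague with the package installed can click.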

And one issue I have with RStudio add-ins, like, if you are really a big RStudio add-in fan and you install a lot of fancy add-ins, in the end, you will have something, well, this is my work machine, so not that crazy. You have an add-in list like this, like that. Although RStudio provides us a search feature, I hate to type, you know. So I want something that can help us to pin my favorite add-ins at the very top. For those people who are wearing green shirts, like, RStudio people, here is a feature request. Please, please do that. But for now, what you can do is you can re-export these awesome add-ins from other packages by simply, once again, another one line of code.

For example, I really like Yihui's xaringan package, and there is an Infinite Moon Reader add-in. So basically what I do is I take this function, well, I'm typing the function name here, and I assign it to this open xaringan function I just randomly created. And that's it. You can do that literally just by doing that.

So you may wonder how can I know which function was the function that was called by the add-in. One trick is, like, if you, for example, here, if you execute the original Infinite Moon Reader, you will see, okay, well, I'm not inside a bookdown folder, but you will see this is the function. Just copy it, paste it into a function that you name yourself, and then define, like, the existence of the add-in. And that's it. That's everything you need to do.
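As a sketch of the re-export trick: the Infinite Moon Reader add-in is backed by `xaringan::inf_mr()`, so a personal package can expose it under its own name. The name `open_xaringan` is hypothetical. The direct assignment from the talk is shown as a comment, with a wrapper as the live code so that the lookup only happens when the add-in is clicked.

```r
# Direct assignment, as described in the talk (requires xaringan installed):
# open_xaringan <- xaringan::inf_mr

# Equivalent wrapper that defers the lookup until the add-in is clicked.
open_xaringan <- function() {
  xaringan::inf_mr()
}
```

The re-exported function still needs its own entry in your package's `addins.dcf`, with `Binding: open_xaringan`, before it appears in the add-in list.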

Reference management with add-ins

So I already showed you three easy examples, but this one is kind of a complicated one. I put it here because I feel like reference management is always tricky for a lot of us when we are trying to write R Markdown. So our solution is to use a combination of rcrossref, which is from rOpenSci, and this citr from Frederik Aust. Sorry if I didn't pronounce it right.

So let me show you an example you will see. And in this demo I provided, you see, I created an internal package, and I export and re-export, so right now it's pinned at the top of my list. Also, one trick I want to mention here is, like, for your package, there's a very simple but kind of dirty trick. Start it with an A.

So the first step is, like, you start this add-in from rcrossref, and you see it has an interactive interface, which is created by Shiny. And the example I want to try here, for those who attended the R Markdown workshop, you might have already seen this, but let me search Hadley's tidy data. Okay, I find the first one, and you see there's a link. We can click this link to make sure that it's the right copy, and you see I got linked to this Journal of Statistical Software. So, okay, we find the right one, right?

And we can click this button, add to my citations, and after a while, since it's using all these API calls, it says added, well, it takes longer today, but, yeah. If we click this, we'll be able to see, okay, this BibTeX entry is already added here, just by searching.

And if we create a new R Markdown document, if we want to cite this citation, Hadley's tidy data, which is an awesome paper I recommend everybody to read, we can use this add-in from citr. So, basically, what it does is, like, it goes into this file, and then finds all these references. You can basically just click that and insert a citation. And that's it.

It's a great example to show you how you can accomplish this kind of complex task by using add-ins.

Other add-in use cases

We have several other add-in use cases. I will just quickly go through them. We don't have enough time to go through them one by one. So, for example, for the first one, as an RStudio admin, or, let's say, for those people who use ODBC to connect to different, like, databases, we know that we can set up our profile inside this hidden file, the .odbc.ini file, and then you can just use RStudio's connection panel to connect to different databases just by clicking.

It was extremely tricky, like, in any version of RStudio before the 1.2 preview. I mean, previously, it was not possible, well, it was not very easy to access hidden files. But right now, there is an option to turn it on. But still, it would be easier to tell your colleagues to click a button on the RStudio add-in panel, and that button will bring up this .odbc.ini file, and you can just send them these login credentials, and they will just put that information in, and they will be able to connect to the desired database without too much headache on training and things.
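An add-in that brings up the hidden file could be sketched like this. The function names are hypothetical, and the sketch assumes the file lives at `~/.odbc.ini` and is opened with `rstudioapi::navigateToFile()`. The talk does not say how the internal tool is actually implemented.

```r
# Hypothetical helper: location of the per-user ODBC configuration file.
odbc_ini_path <- function() {
  path.expand(file.path("~", ".odbc.ini"))
}

# Hypothetical add-in function: open the hidden file in the RStudio editor
# so a colleague can paste in their connection details.
edit_odbc_ini <- function() {
  rstudioapi::navigateToFile(odbc_ini_path())
}
```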

We also developed some add-ins for, well, we installed Octave, a free alternative to MATLAB, which charges money, on this server, and we have an add-in that basically just starts a terminal using the RStudio API, the terminal section. It starts a terminal, and then starts this Octave automatically for me. One trick is, like, usually if you want to send commands from the text editor to the R console, you hit Command-Enter, and for this thing, I think it's Alt-Command-Enter. I might get it wrong, but you can check the hotkey.
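A terminal-launching add-in along these lines might be sketched as follows, using the terminal functions of the rstudioapi package. The function name `start_octave` is hypothetical, and the sketch assumes `octave` is on the server's PATH.

```r
# Hypothetical add-in function: open a new RStudio terminal and start
# Octave in it. Only works inside RStudio.
start_octave <- function() {
  term_id <- rstudioapi::terminalCreate(caption = "Octave")
  rstudioapi::terminalSend(term_id, "octave\n")  # trailing newline runs the command
  invisible(term_id)
}
```

Returning the terminal id invisibly leaves room for follow-up calls, such as sending more commands to the same terminal later.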

Yeah, and something I want to mention here is this R2 cluster package that we are currently developing, because for those who are working in academia, like, sometimes we need to access, like, some high-performance cluster to do some really, like, computationally heavy work. This kind of thing is, like, if you use it every day, you know how to use it. If you just use it one time, then it's okay to go through all the documentation. But, I mean, the worst case is, like, you use it every once in a while, and you keep forgetting how to do that.

So, basically, we created this add-in that will help us. So, we use Shiny modules. So, basically, it will save your username and some, like, ports, these kinds of things, within your home directory, and use this information to help you to log in. And it has pre-populated commands here, and when you click this Play button, you will send these commands to the RStudio terminal and execute them. Like, you have the option whether or not you execute them, but, yeah, you get the idea. And also, it has, like, a panel to help us to request certain service. I mean, if you have used these HPCs before, you know that it's complicated. That's why we need this add-in.

Take-home messages

So, take-home messages. First thing, like, if I could only, like, present for, like, 10 seconds, I would only say this, saying it loud. Creating RStudio add-ins is really, really easy, and it's really, really fun. If you find that, okay, there are a lot of fancy add-ins, well, that's not because creating an add-in is difficult. It's because, well, it's either a Shiny app or it's just a lot of, like, amazing R work behind the scenes. It's not the add-in. The add-in is super easy. You can do it tonight.


Second, because it's so easy, you can even do it, like, for personal use. In fact, I feel like, for a lot of us, I mean, I started as a very plain, like, R user before. I did not develop R packages until, yeah, a few years ago. But if you have never created an R package, it is actually a very good starting point. It sounds weird, but it is a very good starting point because, in this case, since it's for your personal use, you don't need to worry about any documentation. You don't need to worry about namespace and a lot of these kinds of tricky things. Get started tonight and do it.

Also, if you are working in a big group, this is a great tool for teams to form standardized practice. Because, I mean, keeping good documentation, we always know that keeping good documentation is the key to success for a team. But if you compare the method, like, you give someone a page of steps to follow, or you write an add-in that packs up these ten steps on that page, which one is easier? Of course, I would rather just click a button, right?

And, really, in the end, I would just want to give credits to my teammates, especially this gentleman on the left side and our money provider. And, thank you.

Q&A

So, we have time for a few questions for Hao. My colleague over here will throw the throwable microphone over to whoever has a question.

Thanks, man. Definitely learned a lot from that talk. I was wondering if there are any other features that you feel like are lacking or that RStudio could improve with add-ins, apart from that one that you mentioned.

Well, some features, well, it's actually not about add-in itself, but rather RStudio API. Yeah, because add-ins can interact with the RStudio interface through the RStudio API. If we can have more options open through the API, although it might sound risky, but it might give us more options to do with the add-in. It will open up tons of fun and tons of possibilities.