Lauren Chadwick | Meet You Where You R | RStudio (2020)

At RStudio, we wake up and go to bed thinking about the positive impact that open source work and data science have had and can have on the world. To maximize this impact, we find three areas of investment absolutely critical to ensure our open source community keeps up with the world’s changes and outlives us all: 1. Find ways to make R more approachable. 2. Enable teams of all types & sizes (educational, professional, etc.) to be able to leverage the work they’re doing in R, and effortlessly communicate that work to others. 3. Extend the language so our open source community can continue to be at the forefront of innovation, no matter their preference of tool or language.

Transcript

This transcript was generated automatically and may contain errors.

Thank you. Thank you, Greg. Good afternoon, everyone. My name is Lauren Chadwick. I work on RStudio's customer success team. I really didn't want to kick off today with a typical this is my title, and this is what it means type of introduction. But it turns out, as it relates to the broader topic and what I want to share with you today, my role in customer success is one of the more relevant factors. Now, to this day, I've supported over 200 organizations, primarily companies in the life sciences umbrella, as well as a lot of our university accounts. And in that journey, I've gotten to really dig in and figure out what these people are doing with R, what problems they're solving, what challenges they're facing, what their architecture and their environment looks like, and what their forward-looking plans are for the language.

All of the good stuff, right? I get that high-level view over all of these teams that I work with. And it's my absolute passion to dig in and figure out how we can best support them and how we can make their lives a little bit easier. And regardless of where you are on your journey with R today, and regardless of what you know, what your story is, I can't get that full picture for you today. I can't figure out what the most interesting thing to you is in the first 20 minutes that I have here. But what I can say is that you're an asset to our community. You're somebody that RStudio is thinking a lot about and really wants to help and probably can help.

And that's what I want to talk to you about today. I've titled my talk Meet You Where You R in hopes to shed some light on the ways that RStudio can be leveraged at various points in a data science journey and how the teams and the universities that I'm working with are already doing that today. Now, in the eyes of RStudio, there's really three key areas that we're hyper focused on to ensure that the open source community, this right here, lives on for a really long time. The first one is approachability. How to make the R language approachable. How to make it easy to learn and easy to use R. The second one is communicating in production. How do you bridge the gap between the R user and the data scientist working in R every day and the stakeholder of the work that they're doing? And the third one is leveraging other tools. How do you extend the R language and the RStudio software to work well with other tools and languages and technologies that are valued by all of you and by the community?

Approachability and RStudio Cloud

And as you'll see as we go along, each of these three buckets kind of naturally maps to different points of a data science journey that a team might go through. Now, when you think about approachability, making the R language approachable, easy to learn, easy to use, a few things from RStudio that you might think about are things like the tidyverse or the cheat sheets that we all love or the webinars that RStudio frequently holds to deep dive into certain topics. Or perhaps one of our later investments, RStudio Cloud.

Now, can I get a show of hands here for anybody that knows what RStudio Cloud is or anybody that's used it? Great. A lot of you. Well, for those of you who don't know, RStudio Cloud is a free RStudio hosted platform designed for the purpose of teaching and learning R. In fact, if you go to rstudio.cloud right now and you scroll down half an inch, you'll see that our mission is to literally make it easy to share, teach, and learn data science using R. And the reason RStudio Cloud makes that so easy is because you don't have to download or install anything to be able to use it. All you need is an email address.

And once you've taken those few minutes and you've created an account with RStudio Cloud, you log in here, you'll see, like I'm showing on my screen, that you have the RStudio IDE at your disposal without having to install anything. So rather than your first experience with R being all about trying to download and get everything working on your desktop, you get to focus that time and energy on getting excited about learning R, getting excited about learning the IDE. And once you're out here, you have all of these different tools at your disposal to help make it even easier. For example, you have these primers, which are essentially interactive tutorials designed to help you learn the basics of the language. You have your user guide. You have your cheat sheets. Without even having to leave the RStudio Cloud platform, you have this wealth of resources at your fingertips to help make it easy.

And in conversations that I have with teams, I see this being used at all sorts of levels here. For example, at the individual level, I work with a lot of IT and admin resources that don't necessarily need to know R for their day-to-day work, but they want to better understand what they're bringing into their university or their organization. Just want to understand it a little bit better. At the academic level, RStudio is supporting over 650 universities that are using RStudio Cloud to teach data science in the classroom today. Over 650 universities. That number is constantly growing. On the corporate side of the world, my customer success team sees this being used pretty extensively as well. In my personal experience, I work with a lot of companies that are undergoing some form of a SAS to R migration and need to completely learn the R language from scratch. We also see companies and universities that have their own internal R meetup groups or their own conferences or R day or Shiny day. And to encourage participation in those sessions and encourage people to come and get their hands on R, they're using RStudio Cloud for that.

In fact, a great corporate example of a team that's using RStudio Cloud pretty extensively is Johnson & Johnson. Last month on December 4th, Johnson & Johnson did a public-facing webinar where they dove into their use case for RStudio Cloud and why they chose to use it and what they're using it for, for corporate trainings. And the reason they chose to use it, as you can see here, is because of simplicity. It's easy to use. And that's our whole goal with approachability is to make R easy to learn and easy to use.

Communicating in production with RStudio Connect

But maybe you're at a point in your journey where you don't need to learn R. You already know it. You're using it every day. You're working in R. You're making reports. You're making Shiny applications, dashboards, what have you. So how do we help you? And that brings us to our next area of investment, which is communicating in production. How do you bridge that gap between the R users and teams working in R every day and the stakeholders of the work that they're doing? Right? And we know that as a data scientist that's working in R every day, when you've built an R Markdown report or a dashboard or a great Shiny application, your very first thought is not going to be, how do I get this to everyone else? Right? Your first thought is going to be all the steps that it takes to make it interactive and make it look nice and make it work.

But at the end of the day, unless maybe you're a hobbyist, you're not about to make that dashboard or that R Markdown report or that application for no reason. You're doing it because somebody out there, whether it's your students, your university, your research team, your customers, your clients, somebody out there wants to interact with the work that you're doing to make a decision or to answer a question, to gain more insights. Right? And if you're anything like the teams I'm working with, you might already have ways of doing that today. Maybe you have a Git repository for all your work. Or maybe you're copying and pasting your code and your plots into PowerPoint slides before your meetings. Maybe if it comes to an R Markdown report or a notebook, you're emailing them around as attachments. All of your work is all over the place.

That's okay. That could work for some time. But what happens when you build more and more data products and more types of data products and more and more people become interested in the work that you're doing and want to be able to see it? And what happens when every single person on your team is doing the exact same thing as you are and suddenly everyone's work is throughout different inboxes, different emails, different folders, different Git repositories, different PowerPoint slides? That's a mess. That's inarguably messy, it's manual, it's time-consuming, and you're setting yourself up to lose sight of work that you spent a lot of time and energy on. And that's a mess that my team in customer success saw day in and day out when meeting with different universities and different companies. And that's the mess that we engineered RStudio Connect to help fix.

Now, RStudio Connect is a publishing platform. It's a central location for all the work that you're doing in R and all the work that you're doing in Python. And it's intended to be installed behind your university or your company's firewall. So RStudio doesn't have access to any of your data or any of the data products that you're putting on RStudio Connect. And to show you an example, I'm going to flip back over here to the RStudio IDE. The one that I'm showing here right now is RStudio Server Pro. This is our professional browser-based IDE. But you can use any IDE that you want, the free desktop, the free server, whatever your preference is. And as you can see here, I'm about to show you an application that I put a ton of blood and sweat and tears into. You ready?

Here it is, old faithful, hello world, my application, right? And as you can see, in this moment, this application can only be consumed by me, right? It's running in my laptop on my personal session. Nobody else can interact with it unless I pass my laptop around the room. As great of an option as it would be to pass my laptop around the room to all of you, you'll also notice that I have this little blue button here in the top right-hand corner. It's a publishing button. When I click that button, what it allows me to do is send my application or my report or whatever I've built here to RStudio Connect. So you can see now that I'm on RStudio Connect. I have the same application, but now it's in a position where other people can interact with it. And the cool thing about Connect is I, as the data scientist, the publisher of this amazing application, I get to control who exactly I want to be able to interact with it, whether it's just a few people on my team or my entire research group or my entire company, my entire customer base, right? I get to control that.

And once you're out here, there's all sorts of different bells and whistles designed for the different use cases that you might have for a publishing platform like this, whether it be an internal collaboration use case amongst your university or whether you're a company trying to deploy an application to hundreds of customers. We see it all. Connect was engineered with all of those different use cases in mind. And I don't have the time to go through each and every one of them today, but what I can show you is a few of our fan favorites here. For example, if I wanted to share my application with a group of people on my team, but I didn't want them to have to log into RStudio Connect, what I can do is assign a custom vanity URL to the application and share that with them. So if I share this URL down here and they click it, they can open the same application in a web browser here and interact with it without seeing all the background of RStudio Connect.

A lot of teams that I work with have a use case for R Markdown as well. So if you have an R Markdown report, you want it to automatically run every Monday at 8 a.m., you can do that with RStudio Connect and set it to run automatically on a recurring basis. And then from there, you get to decide whether or not you want an email to go out to a group of people based on certain parameters that are important to you. Now, every morning, Monday to Friday, I wake up and I have two refreshed R Markdown reports in my inbox ready to be searched. And that's awesome, right? Nobody had to touch that to make that happen.

So RStudio Connect is really designed to help organize your content in a way that makes sense for your use case and automate processes that are manual in nature. But maybe one of the biggest advantages of RStudio Connect is it's not only designed for the work that you're doing in R, it's also designed for the work that you're doing in Python. So for example, if you have a different part of your university or a different part of your company that's using Jupyter Notebooks or maybe you've made a reticulated asset that uses both R and Python under the hood, with that same deployment process, you can also send that to RStudio Connect. And to give you a glimpse, a lot of teams that we're working with, a lot of universities that we're working with have already found this successful in their workflows. In fact, in the last month, I've worked with two completely different universities and their corresponding med schools to help implement a company-wide, university-wide instance of RStudio Connect for all of their teams' R products and all of their teams' Python data products.

Leveraging other tools: R and Python together

And that takes me nicely to the third and final investment area that I want to share with you today, which is leveraging other tools. Now, to nobody's surprise, R is very special to us. If I had to guess, I'd guess that it also holds some degree of importance to all of you as well. And if not, you might be in the wrong hotel this week. But we can also come to terms with the idea that once you're using R and RStudio in your workflows, and once you bring them to production, that there's also going to be a lot of other tools and languages and technologies that are important to you. Right? For example, maybe you're using Docker or you're using Spark. You're trying to connect to databases. You're using Git or TensorFlow. Right? So, how do we as a company extend the R language and the RStudio software to work nicely with all of those other types of tools?

Now, for the sake of relevancy to the last example, let's take a deeper look into Python. Now, a lot of universities and a lot of companies have a use case for both R and Python. Both languages are incredibly powerful. They're used for different purposes. They're solving different problems. And we work with a lot of teams that have a use case for both. So, how do we help support those teams? How do we help make it so you can use R and Python in the same exact workflow or at least have them coexist nicely under the same roof?

Now, we already talked about how with RStudio Connect, you can share the Python work that you're doing. But what if we take a step back and look at the development side, the admin side? What about the IT and the admin resources that are responsible for overseeing two completely different environments? Or what about the multilingual user that's constantly flipping between development environments just to get their work done? How do we support those people? To answer that question, a few months ago, back in September, RStudio introduced Jupyter support in RStudio Server Pro, which is that professional browser-based IDE where I showed you that hideous application before. And now, universities and companies that have any mix of a use case for both R and Python have one development environment to log into to get their work done and just one environment to manage their workflows.

Flip back over here, I'm going to show you RStudio Server Pro. As you can see here, this time I'm on the home screen. If I click this new session button, you'll see that I have an editor option where I can choose from RStudio, JupyterLab, or Jupyter Notebooks. Now, let's say this time around I'm just a Python user. I don't care about R. I don't want to use it. I just want to start a Jupyter Notebook. So, I'm going to select Jupyter Notebook. If I click start session, you'll see that I have my Jupyter Notebook running inside of RStudio Server Pro. So, as a Python user, I don't even have to know that other R users or Python users are running RStudio sessions or JupyterLab sessions on the same server as me. All I care about is my Jupyter Notebook. If the time comes where I'm ready to share this with other people, you'll see that I have the same blue publishing button where I can share it out to RStudio Connect. So, we've come full circle.

Let's say for a second I'm a multilingual user using both R and Python. Just as easily, I can flip back, go to RStudio, the home screen, and start another RStudio session like I did before. And a lot of you may know that within RStudio itself, you can also do things like run Python scripts or add Python code chunks to your R Markdown reports or use packages like reticulate to embed Python code into your R session. There's all sorts of ways that you can leverage both languages in your workflows. And as new as a lot of this functionality is, we've already seen it as a pretty popular option amongst the teams that we're working with.

For example, I was just in Seattle a few days ago, meeting with a company, and they're looking to replace their JupyterHub solution with this solution because they see it as an opportunity to really reduce their footprint and because they know it's fully supported by RStudio. So, they know if anything goes wrong with their R environment or their Python environment, they have a team of people they can reach out to for help.

Closing thoughts

Now, I hope to at least have painted a high-level picture of the ways that RStudio is thinking about different individuals and data science teams at various parts of a data science journey. So, whether you relate to any of these specific examples or feel like you fall somewhere in between, just know that RStudio is thinking about you. We want to help you. We want you to live this value live. Because at the end of the day, those universities and companies investing in the work that we're doing are the sole reason that we're able to invest back so heavily in the open source and the reason we're able to codify it in our mission, like JJ said earlier. And we hope to be able to do this for a really long time, because we really want to support the open source community.

Now, after everything I wanted to share with you today, I imagined I wouldn't have a ton of time for questions, but I do want to let you know that the RStudio team will be out in the lounge over the next few days, hoping you guys come and talk to them, whether it be about this or about anything else that you're wondering what people are doing. I also wanted to share some resources and contact information in case you have questions or want to explore any of this at your leisure. Feel free to take a picture of this or write down anything. Thank you.