Resources

RStudio Team Deep Dive | In A Hosted Environment

You probably know that RStudio makes a free, open-source development environment for data scientists. It’s made with love and used by millions of people around the world. What you might not know is that we also make a professional platform called RStudio Team. In this live session, Tom will walk you through our RStudio Team trial, where you can learn how best to test drive:

- Scaling your data science work
- Seamlessly managing open-source data science environments
- Automating repetitive tasks
- Rapidly and securely sharing key insights and data science products with your entire organization
- And, optionally, integrating some of your favorite open-source packages into the trial experience

Leading organizations like NASA, Janssen Pharmaceuticals, the World Health Organization, financial institutions, government agencies, and insurance organizations around the globe use RStudio’s professional products to tackle world-changing problems, and we’re inviting you to learn how. You’ll learn how RStudio Team gives professional data science teams superpowers, with all of the bells and whistles that enterprises need.

If you don't have your own trial instance of RStudio Team to follow along (not required), feel free to request yours here: https://www.rstudio.com/products/team/evaluation3/

Additional resources here: https://docs.google.com/document/d/1HGt7LSohhyxpCvETvVEFHugrdaSnTcZaXbI0jV5g9ok/edit?usp=sharing


Transcript

This transcript was generated automatically and may contain errors.

All righty, happy Thursday, y'all. Thanks for joining me today. We're just getting started. It's right at 10 a.m. Central Standard Time for me here in Texas, but we'll wait a few minutes to let some folks roll in, and then we'll get started on our RStudio Team eval demo. We're going to have a bit of a live code, a little bit of slides, and definitely going to be taking questions as they come up in the chat. If you do have any questions, feel free to drop them into the YouTube chat here on the live stream that you're watching.

A lot of the examples I'm using, you can find the public source code here on my GitHub, so that's going to have some of the details for the slides as well as some of the example applications we're showing in R and Python. I also have my colleagues from RStudio who will be answering some questions, so if you see me talking and you see people answering questions in the chat, I'm lucky enough to have some great team members behind me. We also have Kelly O'Brien, who's the RStudio Connect product manager, so she might answer some additional questions there in the chat.

What we've been doing for a lot of these is: if you'd like to say hello, feel free to say hi to each other in the chat. Usually it's somewhere between 60 and 600 people in the group, so feel free to talk amongst yourselves or ask questions as well.

Overview of RStudio Team

So for today, again, like I mentioned, there are going to be a few slides we'll walk through, but the bulk of today is going to be walking through the RStudio Team evaluation. The team evaluation covers all three of our professional products: RStudio Workbench, RStudio Connect, and RStudio Package Manager. The evaluation is time-gated, in terms of it's just for trying out the products, and we provide the hosted environment for that.

If you did want to try it out, you just fill out this form and you can click start in the evaluation and then I'll walk through what the evaluation actually looks like, how you can use the products, and then really what today is about is understanding how these things fit into your workflow in an enterprise, as well as how you make some of these arguments for using open source. Even if you're not using our professional products, we'd love for you to use them, but even if you're not going to use them, how do you motivate folks to use R and Python and all the wonderful packages that you use in both languages within your organization?

So again, I'll jump back into the slides and we'll talk a little bit about RStudio Team, and then we'll hop over into the evaluation proper and have quite a bit of live code, kind of walking through the whole process. Again, for RStudio Team, we think of this as really a single home for R and Python for all data science teams. So regardless of whether you're 90% R and 10% Python, or 100% Python and no R, or some other mix in the middle, we've tried to build these products in a way that they serve both constituencies, so that both kinds of people are very happy and able to be productive with these products.

As far as the three products, if you weren't familiar with them and kind of the different people who would be interested, you know, maybe you're a data scientist or data analyst, maybe you're a decision maker, a business user, someone else kind of trying to make decisions with data, with data science, or maybe you're even an R admin or Python admin or IT admin that's more concerned about all this open source work that's being done and people are trying to be productive, but you're trying to support them and make things secure and scalable and have proper operations around it. So again, we kind of think of this as these are the three core personas that we're trying to solve problems for.

In terms of data scientists really want to use R and Python, they want to develop their applications and share them with the rest of the organization, whether that's for other data scientists or for business users trying to make decisions with the data science work they're doing. And then the IT team, whether you're an R admin, meaning that you're kind of in both worlds or a Python admin or a traditional IT admin, you're just trying to support all this and make things operate smoothly.

Again, the three products are RStudio Workbench, RStudio Package Manager, and RStudio Connect. RStudio Workbench is where you would do all of your data analysis and kind of writing your code. So you can write there in R, you can write there in Python, you have the RStudio IDE, as well as VS Code and Jupyter. So a nice mix of different environments you can write code in. RStudio Connect is where you'll publish all your results. So that could be R Markdown reports or Jupyter notebooks. It could also be interactive applications like Shiny or Dash or Flask or Streamlit in Python. And then lastly, RStudio Package Manager is supporting all those open source libraries. You can actually store copies of them on-premise or in your cloud, behind your firewall, wherever you need them, as well as your own internally developed packages.

So as far as what we're focusing on here at RStudio, we're focused on open source first, in terms of we're really trying to build an open-source model where we give away the vast majority of the software we create. So things like tidymodels, the tidyverse, and R Markdown are open-source packages that we're giving away, freely available to anyone who wants to use them. The professional products build on top of those to provide some of the things that organizations are more interested in: security, scalability, access control, as well as logging and authentication and all that.

We're also focused on being code first in terms of we really believe that to do, you know, really serious data science and do, you know, productive work, that code is going to make you better. And then learning even a little bit of code or learning a lot of code can help you be a more productive data scientist and have more control over what you're building. And lastly, centralized or even cloud-based. So this means that our products run on a server, whether that's on-premise, bare metal, you know, existing servers you have, or in a virtual private cloud like you might find on, say, AWS, the Amazon Marketplace, or Azure, or Google Cloud, or the other providers.

The problem RStudio Team solves

So we'll talk about one more slide, then jump into the live coding section, but really the problem we're trying to solve here is that there's this chasm, this gap, between what data science teams are doing and trying to do, and actually impacting the business, creating business value or affecting decisions in a positive way.

So data science teams create things, but it's stuck on their laptop, or they're trying to share something with their boss or their colleague, and they have to email it to each other, or they, you know, have to put into a shared drive, the person downloads it and moves it over.

So how do we cross this chasm and get the data science work that you're doing into the hands of either live decision makers, or into the automated decisions that people are doing inside of the software? So as a data scientist, you're either creating insights, or trying to share insights, or creating models that are going to be used downstream, and that's what you're doing today, but you're still trying to get them closer to either decision makers for automated decisions, or actual human interaction.

So in our world, you'll be creating these insights, or writing code in R and Python in a data science workbench that supports both languages. So log in one time, you have your full environment for R, you have your full environment for Python. So with R and Python, you can build amazing things. You can build applications, reports, APIs, you can send, you know, programmatic emails, as opposed to manual emails. There's a lot of power here in terms of things you can build.

Once these things are finished, you can now get them onto a deployment server, basically a centralized location, where if people want to access the things you're creating, they know to go here. They go to this location, they log in, it's access controlled, it automatically scales up to meet the user needs, all the things you want there. And this will deliver the actual applications, as well as, you know, scheduling emails, and other assets, or automations that go to live decision makers, or hosting things like APIs, so Plumber in R, or Flask, or FastAPI in Python, which integrate into other software, whether it's, you know, Java, or JavaScript, or your website, any other type of things there.

Walking through the hosted evaluation

So once you actually get an evaluation environment, this is a hosted environment, meaning that there is nothing for you to install for the evaluation. You just go to the URL, and you get access to the entire suite of our products for this temporary environment. So, you have access to RStudio Workbench, access to RStudio Connect, and RStudio Package Manager, which make up RStudio Team.

It's got some additional links, in terms of if you need some help, or if you want to learn a bit more about Team, or you have some questions, as well as access to a built-in mail server, so you can send emails from Connect, or through some of the automations there, and see how those work. So, really, part of the power here is that if you wanted to evaluate how you could advocate for using these open-source tools in your organization, rather than having to download the software, install it, and go through an IT process, you could actually evaluate it here in your browser. You can show some of the work you're doing to your colleagues, or to your decision makers, or to the stakeholders who own the budget, or your IT colleagues, and actually say: here are the things I could be doing. Can we move forward and try to make this a reality?

Something else to note: once you've got to this page, you're going to be logging into everything with the default username and password. I've changed the username for this environment, because I don't want anyone to accidentally log into my environment with this information, but for you, if you were to open up this environment, you could go to, say, RStudio Workbench, and it's going to have the default username and password, rstudio and rstudio. So, easy to remember and log in, and you can use the same credentials inside RStudio Connect.

As far as, the next question will probably be like, okay, cool, I actually want to see this in action, so let's open up RStudio Workbench, I'll accept this, and you'll note that it automatically logs me in, because I've already logged in here. If I wanted to, it will come with some basic examples of code, but I could always upload or write my own code here. It's a full RStudio environment, so let's go ahead and open an RStudio session.

I'm inside RStudio Workbench now, so it might look a little bit different from your traditional RStudio environment, but the core idea is I can open up a new RStudio session by clicking New Session. We'll say "my first session", I want to use RStudio, and I'll start that up, and it'll set me over into the happy environment that I know and love: RStudio, the IDE.

At this point, maybe I have a script I want to use. I could always bring it in from version control, so if I wanted to clone something from, say, a GitHub repository, I could do that, or you can actually upload files from your desktop. If you had code you wanted to upload, please note that this is not a production environment, in terms of you shouldn't try to use this evaluation as the environment where you do all your data science work today. This is an evaluation of how you could do some of that work, so you can try it out here and see if you want to move forward with evaluating it, say, on-premise, or actually moving forward with the professional products.

Popping back out here: obviously we've seen RStudio, and I'll go into a little bit more about what you can do there, but it's more than just RStudio, in terms of I can open up a new session and actually change the editor. So if I were a Python user, I might be more comfortable in, say, JupyterLab or Jupyter Notebook, where I could use the notebook environment to write Python code and publish those notebooks to Connect. If I were writing applications like Streamlit, Dash, or Flask, I might be more comfortable in VS Code, which is a great editor for Python, as well as things like JavaScript and some other languages. So those are also available to be opened up through RStudio Workbench.

There was a question in the chat, in terms of using additional clusters, so this environment is run, this hosted evaluation is run in AWS, in a single environment, so it only has access to that local cluster. If you did have something like an elastic cluster externally to that, something like Kubernetes, or Slurm, you could also launch sessions that are running in that elastic environment, so that is an option if you were running it on-premise, or in your own private cloud. For the hosted evaluation, you are limited to this local environment, or local cluster.

So, let's start up the Jupyter Notebook, and that hops us right into Jupyter Notebook. There's a "Welcome to Jupyter" notebook that we can open up, and now we're directly into a Jupyter Notebook. It's pre-filled with some code; you can play around here, run the different lines, and create some plots and things, but this is just showing you that all the different plots and assets you're working with work inside RStudio Workbench.

So, we've shown RStudio, we've shown Jupyter Notebooks, and let's, just for fun, we'll show a VS Code session as well, because I know a lot of folks that I worked with previously, that were using Python, really liked VS Code. So here, hopping in directly into VS Code, again, it's kind of just VS Code as it is, you can install some of the extensions that you want, or customize it to how you see fit.

All right, so we've shown RStudio, we've shown Jupyter, we've shown a lot of different things going on, so let's go back to the home page. You can see I still have all these different sessions open, and that's part of the power: I've logged in one time, to the one environment I have to log into, I've been authenticated, and now I have access to all these different coding environments, where I can do Python natively in Jupyter or VS Code, and I can do all my R work inside RStudio, the IDE.

Publishing R Markdown to RStudio Connect

So we can start off with, say, an R Markdown. That's one of my favorite packages in R, and it's very powerful. You can see I've uploaded a file here that actually hasn't been expanded, so let's go ahead and untar this bundle.

I'm going to look over at the chat real quick: can you disable certain types of environments where, say, Jupyter Notebooks are already available via other systems? Yeah, so there's a good question about whether all of these are mandatory. If you just wanted to use RStudio Workbench for RStudio, that's more than fine. Say you have a JupyterHub instance or something else, and you're not using Jupyter here; that's great. Part of what we're trying to solve is having an entire workbench, and that's why it was rebranded from RStudio Server Pro to RStudio Workbench: we're trying to serve different personas here, not only R, but R and Python together. For a lot of the IT teams, they wanted to manage one environment as opposed to managing three or four different custom environments.

So as far as these files: I've opened up the R Markdown file I'm working with here. It's a pretty traditional one, and it says I've got some packages I need to install, and that's more than fine. You can install the packages you'd like in here.

So let's say I wanted to install flexdashboard. I can install the flexdashboard package, and even though I'm in this hosted evaluation, I can still install packages as I see fit. Part of the benefit of working with Package Manager in here is we get really fast package installations. So let's install just three packages that aren't already available. If you've only been working on your desktop, you might be used to really, really fast package installs, because CRAN provides binary packages which install very quickly.

For a Linux environment like I'm running here in RStudio Workbench, packages normally have to be built from source ahead of time, and that's what Package Manager is providing: pre-built Linux binaries, so they install very quickly. So I was able to install those packages while I'm talking, and faster than I'm talking, because I'm using the binaries from RStudio Package Manager.
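As a rough sketch, pointing an R session at a Package Manager binary repository looks like this. The URL below is the public Package Manager instance and is an assumption here; the hosted evaluation, or a private install, would come preconfigured with its own URL:

```r
# Use a Package Manager repo that serves pre-built Linux binaries.
# The URL here is the public instance; your own server's URL would differ.
options(repos = c(
  CRAN = "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"
))

# These now install from binaries rather than compiling from source.
install.packages(c("flexdashboard", "plotly", "DT"))
```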

So I've written this R Markdown document. It's got a lot of dplyr code, some parameters, and different settings I can adjust. And say I want to share this with a colleague. Sure, I can take the R Markdown document and email it, or I can put it in a shared drive or something else, and they could render it or open up the HTML. But for a business user, they may not want to receive it that way. They're like: I don't know what R is, I just want to see the final report. And that's what Connect is providing.

So once you finalize your report, you can actually take this document and, with the blue publish button up here, take the R Markdown report and its supporting files in this directory and create a new asset on RStudio Connect. So I'll click publish, and this will actually take that document and all the dependencies, build everything, and install it on Connect for me. It's basically taking a snapshot of all the different packages in the environment I'm working in, and it's recreating that environment for me on RStudio Connect.

So now, again, in the time it took me to tell you about it, I now have a URL that I can share with my colleagues and I can limit access or grant access based on how secure I want it to be. For today's example, let's say that I want to show this to y'all. So I can actually copy this link that I just created, and we can paste it here, and you could actually go to this asset that I just created.

Now, I open up the sharing settings, so it kind of opens it up inside Connect, and this is our first kind of foray into Connect. We've published something, so it pulls it up here, and we can look at it, and it defaults to making it where only I, or the specific user who published it, can see it. As the publisher, I have the ability to change these sharing settings. So maybe I want to make it available to anyone within my organization. As long as they can log in, they can see this asset. Or what I just did is actually make it public, so anyone with this URL can open it.

If I were to limit it to specific users or groups, this is pulling in from my authentication, so I don't have to go around creating groups in one thing and creating groups in another thing. Whatever your IT team is using for authentication can most likely integrate with RStudio Connect, and then you can use that existing authentication to bring in the users or groups from there.

The other component is this is a parameterized R Markdown. So while it's showing one example for this code, it's got a nice interactive graphic and some interactive graphics down here, it also, Connect can handle those parameters inside it natively. So for your business users, they can come in here and change parameters and rerun the report. So they don't have to go ask you, hey, I saw you generate this report, can you generate a new version for me, and then they wait, and then you send them a new one, and you have this back and forth. They can actually answer some of their own questions and generate a new report or a new asset very quickly here.
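As a sketch, a parameterized R Markdown document declares its parameters in the YAML header, and Connect renders those as input controls for viewers. The parameter names below are hypothetical:

```yaml
---
title: "Regional Sales Report"
output: html_document
params:
  region: "North"
  year: 2021
---
```

Inside the document's R chunks, the current values are available as `params$region` and `params$year`, so re-running the report with different parameters needs no code changes.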

Something else that's really exciting in terms of what this can provide is I can schedule this document. Because I've published it with the source code and because Connect is a data science platform, it can actually re-execute R and Python code. So this R Markdown, maybe I want to run this every day. So again, rather than me having to manually generate a report, manually send an email, manually change parameters, I can just have this report update in place every day at a specific time.

I want it to run really early because my boss comes in at 630, and I want to make sure that it's available for her so she can see it whenever she comes in first thing in the morning. And I want it to send an email every time it sends so that she gets the kind of quick notes in her email, and they can come to Connect to see the details as well. So now I've saved it as a schedule in terms of this will now be executed every single day at 5 AM and will execute all these different things and regenerate this in place for me. So very, very powerful.

Multiple R versions and background jobs

So for RStudio Workbench, kind of going back, leaving Connect and hopping back to RStudio Workbench real quick, there's a question about using multiple versions of R. Yes, one of the benefits of using RStudio Workbench is using multiple versions of R in the same environment. So while this is using R 4.1, which is one of the latest versions of R, maybe I have a project from six months or a year ago that was actually using R 3.5, and I want to make sure that I can go back to that environment and work with it.

I still have project-specific libraries, in terms of because I've moved from R 4.1 to R 3.5, these packages need to be installed, but I can do that very quickly because, again, I'm using Package Manager to supply the packages to this environment. So I can install these dependencies; RStudio Workbench tells me, hey, these are missing, so let's install them really quickly.

Yeah, so another good question was about, you know, maybe I want to generate something that's, you know, outside of Connect. Like, it's great to have, you know, HTML content, but my boss or my colleague wants to see something in PowerPoint or see something in Word. So, yeah, absolutely. So we can actually open up, let's go back and do a new session while that's loading because I have my server, I can open up a new session.

So as a data scientist, I can be more productive because I have multiple sessions running on a bigger server, and there's more compute available, there's more RAM, there's more CPU, as opposed to my desktop, which is limited to whatever is there.

So the question was about maybe using PowerPoint. So, yes, you can generate PowerPoint documents with R Markdown. This one actually can create one based off of a template, and I can publish that to RStudio Connect. So let's publish it really quickly. So I'm going to publish that document to Connect so that, you know, people can go to Connect from their laptop and they can go look at it, and they can access the beautiful HTML content that I've created or the R Markdown report I've created, but it can also generate and attach a PowerPoint as an email.
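For reference, an R Markdown document can target PowerPoint directly through its output format; a minimal header might look like this (the template file name is hypothetical):

```yaml
---
title: "Quarterly Update"
output:
  powerpoint_presentation:
    reference_doc: template.pptx
---
```

Here `reference_doc` points at a branded .pptx whose master slides and theme are applied to the generated deck.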

So now I have a PDF report that's hosted on RStudio Connect. It's got some static graphics and other things, and if I email a copy of it, so let's just email this version to myself, it will not only attach the PDF, but it will also attach a PowerPoint file. So if we go back to the hosted evaluation, I'm in the home environment here, I can scroll down to web mail and open up that link, and now I'm in my inbox here, and you can see I've got a couple different emails, but this is the one that was just sent over.

And again, this is, you know, similar information. So not only is it able to embed a ggplot directly in the graphics inside the body of my email, but if you look at the attachments, it's also got a CSV file, so in terms of what was the output from my script, it's got that PDF report, and then it's generated this PowerPoint file. I can download that, and I can open it natively in PowerPoint because it's a PowerPoint file, and now I have this kind of branded, specific PowerPoint file that I've generated automatically with, you know, ggplot, with tables, with text, that I'm able to generate quickly into PowerPoint. So if I did need to meet someone in the spot where they want to be, where that's Microsoft Office, or if they're comfortable receiving HTML as an R Markdown document, you can do it multiple different ways.
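As a sketch of how a rendered report can shape the email Connect sends, RStudio Connect reads `rsc_email_*` fields set through `rmarkdown::output_metadata` during rendering. The subject line and file names below are hypothetical:

```r
# In a chunk of the R Markdown document, after the files are written:
rmarkdown::output_metadata$set(
  rsc_email_subject     = "Daily report",
  rsc_email_attachments = c("report.pptx", "output.csv")
)
```

Outside of Connect these fields are simply ignored, so the same document still renders normally on the desktop.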

Package versions and tidymodels

As far as using specific package versions: let's say we're inside RStudio. If I run install.packages("dplyr"), it's going to install whatever is the latest available on CRAN or on Package Manager; right now that's dplyr 1.0.7. If I wanted a very specific package version, I could use the renv package, which allows me to set up very specific, recreatable environments. You don't have to use this; it's optional, but it can allow you to not only install very specific packages, but capture all those packages at specific versions if, say, you were collaborating on a more complex document.
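A minimal sketch of that renv workflow, assuming a fresh project:

```r
# Create a project-local library and lockfile for this project.
renv::init()

# Install a specific version of a package (pkg@version syntax).
renv::install("dplyr@1.0.7")

# Record the exact versions in use into renv.lock.
renv::snapshot()

# A collaborator can later recreate the same library with:
# renv::restore()
```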

Another benefit of the Workbench environment is that not only do I have, like, all these different sessions I can open, but they are larger than my desktop in terms of my laptop may have, like, eight gigabytes of RAM and two cores. My server could have 256 gigabytes of RAM and 60 cores. You know, it can go as big as your budget is in terms of you can make a server much larger.

Per that example, let's say I have a background tuning script that I'm running. This is some tidymodels code: it's fitting an SVM model and doing a grid search, so it's going across a bunch of resamples and a bunch of different parameters and doing this grid tuning. Of course, I could just run this, and it'll take a few minutes to execute. Or, within RStudio, I can source this as a local job or as a launcher job.

A local job will execute in the same server environment that my, you know, session is running, but as a background. A launcher job would launch in a remote session if you had that attached. So, say, like a Kubernetes environment where it'll scale up to meet that need and scale back down.

Let's start it as a local job first. So when I start this up, it's going to do this background tuning, and it's going to spit out some information as it goes along. Importantly, this is happening in the background. So I can still, you know, I can still interact with my console. It's not locked up, but it's just telling me, hey, this background tuning, this long-running script I'm running interactively, is happening in the background.

So if I were to go here, it'll say succeeded at 10:37 a.m. (that's my local time), and you'll notice that there's nothing in my environment. What I did was have the script save its output out as a file, so when it was done, it's not cluttering up my environment; it saved everything in the background, along with a graphic looking at the different parameters and how they all fit. So just to show you: not only can you run it as a background job, but its side effects can be saved. You can save the model out, you can save graphics, you can even render an R Markdown document in the background.
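As a sketch, a background job like this can also be started programmatically. The script name "tune_svm.R" and its outputs are hypothetical; the point is that the script persists its results as side effects rather than relying on the session environment:

```r
# Launch a script as an RStudio background (local) job.
rstudioapi::jobRunScript(
  path       = "tune_svm.R",
  name       = "background tuning",
  workingDir = "."
)

# Inside tune_svm.R, results are saved to disk as side effects:
#   saveRDS(tune_results, "tune_results.rds")
#   ggsave("tuning_parameters.png", autoplot(tune_results))

# Later, in the interactive session, pull the results back in:
results <- readRDS("tune_results.rds")
```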

Publishing Python notebooks and Streamlit apps

So again, let's go to Jupyter real quick. I am far from an expert Python developer, but I do dabble in Python every now and then. Just as you'd expect, you know, if I have a Jupyter notebook and I want to get it onto Connect and I want to execute it on Connect, I have the same blue publish button. So I can publish this to RStudio Connect. It gives me a similar thing saying, do you want to publish with source code? So you can, you know, schedule it or rebuild it, or you just want to publish the HTML content from the notebook itself.

So let's go to content. Here's the document I just published, which again is just a fairly basic Python notebook, IPython notebook with matplotlib, some pandas graphics, and that nice seaborn swirl here. And just like I did with our markdown, I could schedule this to execute on a schedule. So again, for both R and Python, I can execute things as needed and evaluate them whenever I want them to be evaluated, whether that's daily, down to the minute, or even as infrequently as a year. So all that's able to be customized with the schedule.

For VS Code, again, some people are like, oh yeah, I really love VS Code, I want to learn more about it. Other people may not have used it before. It's a general purpose development environment, so not necessarily specific to data science, but can do things like JavaScript or Java or, you know, general web development, as well as using something like Python inside of it.

Now, more traditionally, within a VS Code environment, I could actually go into and edit, say, like, a Dash application or a Streamlit application. And here, again, as opposed to a Jupyter notebook where you're doing, you know, very specific cells and text and cells and text to go through things, this is a more traditional .py file, which is often more preferred if you're writing, like, an application like Dash or Streamlit, which are similar to Shiny applications but in Python.

So this is just kind of a bare-bones example taken from the Streamlit server, or from their examples. Let's go ahead and create a new terminal, and I'll also open up the README real quick. So the README has some information about the getting started, as well as some information about how I could publish this application.

So, number one, you do have to form a connection with the server: you'll generate an API key on Connect and then register that server locally. Publishing Streamlit applications is done through the rsconnect-python package. So if I were to run rsconnect --help, that tells me a little bit about the package and the different things it can do. And if I wanted to deploy this application, I can quickly copy the command over and say I want to publish this Streamlit application to Connect.
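As a rough sketch of that workflow, here's the shape of the rsconnect-python invocation built up as an argument list. The server URL, API key, and app directory below are placeholders, not values from the demo, and the exact flags may differ by version of rsconnect-python:

```python
import shlex

def rsconnect_deploy_cmd(server_url, api_key, app_dir):
    """Build the argv for deploying a Streamlit app via rsconnect-python.

    All three arguments are placeholders; in practice you'd generate
    the API key in Connect's UI first.
    """
    return [
        "rsconnect", "deploy", "streamlit",
        "--server", server_url,
        "--api-key", api_key,
        app_dir,
    ]

cmd = rsconnect_deploy_cmd("https://connect.example.com", "XXXX", "./streamlit-app")
print(shlex.join(cmd))
```

Building the command as a list like this also makes it easy to hand off to `subprocess.run` from an automation script, rather than typing it in a terminal each time.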

It goes very quickly because the packages it needs are already available. And I can open up that application I just published on RStudio Connect, and we'll let that load for a second. This is a Streamlit application that went from VS Code to Connect very, very quickly. And again, we can interact with it and get the graphic to change. Maybe I want to add Canada here. And now I have Canada added to my graphic. So again, you have the ability to do things in both R and Python.

Pins, environment variables, and Package Manager

As far as storing keys and other secure things, that's a great call. So if I go to a different asset I've created, a pin, this is an R Markdown document that has a very specific, beneficial side effect. Within Connect, I have this thing being executed, and I want to pin the code or the data to a file here on RStudio Connect. Within the Vars pane, you see I have a secret called RStudio Connect Key, which is essentially the API key that allows me to interact with RStudio Connect programmatically. This could be something else: database credentials, a username and password, whatever you want it to be.
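In Python content the same pattern applies: read the secret from the environment at run time rather than hardcoding it in the source. A minimal stdlib sketch; the variable name CONNECT_API_KEY is illustrative, not something Connect mandates:

```python
import os

def get_connect_api_key():
    """Fetch the Connect API key from the environment.

    On RStudio Connect you'd set this in the content's Vars pane;
    CONNECT_API_KEY is an illustrative name, not a required one.
    """
    key = os.environ.get("CONNECT_API_KEY")
    if not key:
        raise RuntimeError("CONNECT_API_KEY is not set")
    return key

# Stand-in value for demonstration only; never commit a real key.
os.environ["CONNECT_API_KEY"] = "demo-key"
print(get_connect_api_key())  # demo-key
```

Failing loudly when the variable is missing beats silently running with an empty credential, which tends to surface much later as a confusing authorization error.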

You can think of it as very similar to the .Renviron file that you use interactively inside RStudio Workbench. I'm using the RStudio Connect key so I can publish pins. This document that I have scheduled generates a nice table, which is great; it looks at some NFL data and creates a beautiful table. But the more interesting thing it's doing is taking that data and saving it out as a pin to RStudio Connect. A pin is just any file or data set that you want to store on RStudio Connect. So rather than emailing CSVs back and forth or putting them on a shared drive, you can use the pins package, which allows you to pull in data sets and share them among files or among colleagues.

You can also pin things like trained models. So let's say we ran that long model-training script with tidymodels; I could save the fitted model as an RDS file, upload it as a pin to RStudio Connect, and then it can be used downstream in other applications.
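In Python the analog of that RDS round trip is serializing the fitted object before pinning it. A stdlib-only sketch, with a plain dict standing in for a real trained model:

```python
import pickle

# Stand-in for a fitted model object (e.g., a scikit-learn estimator).
model = {"intercept": 1.5, "slope": 0.42}

# Serialize to bytes, much like saveRDS() snapshots an R object; these
# bytes are what you'd store as a pin for downstream apps to consume.
blob = pickle.dumps(model)

# Downstream, an application reads the pin back and deserializes it.
restored = pickle.loads(blob)
print(restored["slope"])  # 0.42
```

The usual caveat applies: only unpickle data from sources you trust, such as a pin your own scheduled job wrote.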

Now, the last part that I haven't really talked about much is Package Manager. We've been alluding to it a little bit throughout. What Package Manager does is supply R packages and Python packages to your environment. Take an R package like dplyr: you can install it directly from CRAN, sure, but Package Manager serves the package as a pre-compiled binary, ready to go. So when I install it from Package Manager, it's going to be much faster, and I don't have to build it from source, which helps avoid compilation errors.

Package Manager also lets me use specific versions. So I can install not only the latest version, 1.0.6, but also an older version from, say, 2015. So if I had a legacy project I was working on and there was some breaking change I wanted to avoid, I can install that old version.
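When you're pinning versions for a legacy project like that, it can also help to fail fast when the environment drifts. A small illustrative helper, not part of Package Manager itself; the `lookup` parameter is injectable purely so the check can be demonstrated without installing anything:

```python
from importlib import metadata

def require_version(dist_name, expected, lookup=metadata.version):
    """Raise if the installed distribution doesn't match the pinned version.

    `lookup` defaults to importlib.metadata.version; it's injectable so
    the check is easy to exercise without touching the real environment.
    """
    installed = lookup(dist_name)
    if installed != expected:
        raise RuntimeError(f"{dist_name}: expected {expected}, found {installed}")
    return installed

# Simulated lookup: a matching pin passes (a drifted one would raise).
ok = require_version("some-legacy-pkg", "1.0.6", lookup=lambda name: "1.0.6")
print(ok)  # 1.0.6
```

A guard like this at the top of a legacy script turns a subtle behavior change into an immediate, readable error.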

It also shows you system dependencies, which is for your IT admins. They're like, I don't know what R is, but what do I have to install to make it work? Specific packages may have system prerequisites that have to be installed in the environment, and your admin can install those on Connect and Workbench so that all the packages work as intended.

So let's do the sf package. Simple features is a package for making maps and doing geospatial analysis in R, and it's amazingly powerful, but it depends on a few other system libraries. You can see that when I go down to the install system prerequisites section, it actually lists the things your admin needs to install. Rather than you having to find all of these from scratch and say, oh, I think you need to install GDAL and some other stuff, your admin can just go in, copy this, and install those globally for everyone on RStudio Connect or RStudio Workbench.

IT and security considerations

So you've gone into the RStudio Team evaluation, you've created something amazing, you have these analyses or tools that you want to use, and you've gotten things off your laptop. That's the huge win: getting things into a production environment, sharing them across your organization, or embedding them in your other tools. The other hurdle, which we didn't really mention but are solving at the same time, is satisfying IT or your security team and their requirements.

So IT and security teams want very specific things. Maybe they don't know much about open source, they don't know R or Python, maybe they're Linux admins and that's all they know, and that's fine. They just want to know: can it integrate with my authentication? And the answer is yes. RStudio Workbench and RStudio Connect can integrate with single sign-on, so things like Okta, as well as more traditional authentication mechanisms like PAM or LDAP.

They also want to know that the assets you've created scale automatically. They don't want to go around creating environments for you, spinning things up and spinning things down. Connect handles spinning up new sessions to meet user demand as people visit the server, so it's handling some of the scaling for you.

They also want to know: okay, we're working with sensitive data, so I want to be able to audit everything that occurs. RStudio Workbench and Connect have auditing and monitoring of server resources and metrics, as well as user activity, so that requirement is covered as well.

And then a few different people asked about connections to databases, like actually seeing data from Postgres or Snowflake. Yes, you can connect to those and have them work on Workbench as well as RStudio Connect. So that is what RStudio Team provides: the ability to get things off your laptop and into production while maintaining security, scalability, and best practices, with the open-source software you already know and use in R and Python, in a single platform that can run on premises, on bare-metal servers, or in the virtual private cloud of your choice, like AWS, GCP, or Azure.

Git-based deployment

A very good question came in about version control, and that's a best practice, so I'm actually going to hop back out of here. Let's go into RStudio Connect, and I'm going to show you one more thing. Someone was saying, well, I have something in version control, and I don't want to use push-button deployment or terminal deployment. Yes, you can import directly from Git. I'll use an example from GitHub today, because that's where I have a lot of my code.

So let's just go to my GitHub and grab one of the repositories, the RStudio Team demo. I'll grab the URL for this; I've got a few different assets in here. I can paste this into RStudio Connect and say which branch I want to use. Let's use the dev branch, because I haven't pushed it to production yet, and then it shows the asset that you can publish directly from Git. So I'll title that Git Portfolio Dashboard, and now I deploy it.

It's going through the same process of rebuilding the environment, but I did all of it through version control; it pulled everything in directly, and this is the asset I just created in like 15 seconds, straight from version control. And part of the beauty here is that, because this was published from version control, Connect can say, I'm going to watch that repository and check for updates periodically. So this is a lightweight version of, not quite continuous integration, but semi-continuous integration, if I can use that term without Kelly getting mad at me. Connect is now watching that Git repository; it will rebuild if it sees changes, or I can force a build by clicking update now.
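Conceptually, that watch-the-repository behavior comes down to comparing the remote branch's HEAD commit with the last one built. Here's a rough stdlib sketch of the idea, my illustration rather than Connect's actual implementation; the `run` parameter is injectable so the logic can be exercised without network access:

```python
import subprocess

def remote_head_sha(repo_url, branch, run=subprocess.run):
    """Return the commit SHA the remote branch currently points at.

    By default this shells out to `git ls-remote`; `run` is injectable
    so the parsing logic can be tested without a real remote.
    """
    result = run(
        ["git", "ls-remote", repo_url, f"refs/heads/{branch}"],
        capture_output=True, text=True, check=True,
    )
    out = result.stdout.strip()
    return out.split()[0] if out else None

def needs_rebuild(last_built_sha, repo_url, branch, run=subprocess.run):
    """True when the remote branch has moved past the last deployed commit."""
    return remote_head_sha(repo_url, branch, run=run) != last_built_sha
```

In practice you'd call `needs_rebuild` with the SHA recorded at the last deployment; Connect layers the polling schedule and the environment rebuild on top of this kind of check.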

So I have the ability to publish with push-button deployment from RStudio or Jupyter. I can also publish from the R console or the terminal, and I can publish from version control, or bring in something like continuous integration or continuous deployment if I wanted to go that route as well.

RStudio's open source mission

So in terms of closing and wrapping up, and who we are at RStudio and what makes us a little bit different: a lot of the things we're building are free and open-source software, and we absolutely want as many people as possible to use them. That's what the tidyverse is, that's R Markdown, Shiny, tidymodels, and all the different software that we create and give away for free. The professional software builds on some of that, and it's what allows us to give away so much of the software we create. So if you do need things like authentication, or you're trying to use Shiny in an organization and your IT team says it's a non-starter unless you have authentication or scalability, that's where something like RStudio Connect can come in and solve that problem for you.

In terms of how our funding works, more than 50% of our engineering resources go to free and open-source software. So more than half of our engineering effort goes into software we give away for free. Our pro products fund this ongoing effort so that we can achieve this mission of creating and giving away free and open-source software. And additionally, we contribute to organizations like NumFOCUS to extend our impact into communities like Python's, where we're not necessarily developing as much today, though maybe we'll do more in the future, but where we want to contribute financially as well.

And lastly, something I'm really proud of is we're a public benefit corporation, meaning that all of our decisions have to be balanced with the best interest of the community, our customers, the internal employees, and shareholders. We're not just here to improve stock prices for some company, we're here to actually serve the community and do the best job that we can at that.

In terms of wrapping up, as we're getting close to the top of the hour: if you want to learn more, you can read all about Serious Data Science on our blog. These slides (a slightly different version of which I published earlier) are available at the link at the bottom, which I'll post in the chat again, and that has links to everything I talked about in the slide deck. There are TrustRadius reviews, and you can book a live meeting or just send us an email if you don't want to talk live; that's fine. And if you want to evaluate RStudio Team and do something similar to what I showed you today, fill out this form and we'll send you an evaluation as soon as possible.

So that's really it for today. I'll try to answer a few more questions as they come in, but we are at the top of the hour, and I know some people may have only scheduled this hour; I'll stick around for a few more minutes so we can answer some more. Kelly, thank you so much for answering a lot of the questions as they came up; that was a huge help. Again, many thanks to Kelly O'Brien, the Product Manager for RStudio Connect, for her time today and for helping answer so many of the questions you all had as community members.

Q&A: database connections, long-running jobs, and licensing

There was a good question about databases: do the database connections need to be created on both RStudio Workbench and Connect? Yes, because they're separate environments. You can create an interactive connection on RStudio Workbench; if I go into RStudio and open my report portfolio, I can form a connection here with all these different databases, but that's interactive, and I'm basically killing off that connection whenever I close my session. RStudio Connect is more like a service account, where you're interacting with production code, so you may even need different login credentials. Rather than my personal Thomas Mock account, I would use something like an RStudio Connect service account to do that.

Can you close your browser and let your long-running analysis continue? Yes. I think I saw a meme on Twitter about someone driving around with a laptop open in their car. If a session is going to take a long time, you can run it as a background job and set a long timeout so it stays open; sessions have controllable timeouts, so some can be forced to die off quickly while others last longer. The other option, and what a lot of customers actually do for scripts that run that long, is to run them via RStudio Connect. So while I can kick off relatively lightweight jobs here, like the local job I'm starting (I'll rerun this because it takes like two minutes), if it took four or five hours, I could build a workflow around it and have it run on RStudio Connect. Connect will handle the entire execution and then generate some type of output: maybe an R Markdown report about the model fit, maybe the model saved out as a pin, or maybe an email with information once it's done.

What's the difference between Team Standard and Team Enterprise? That's a good question. The main difference is how many servers you need. Smaller teams can basically use one server: I have Workbench, Connect, and Package Manager, and I only need one server for those products. Larger customers often want multiple servers, and that's where Enterprise is a better fit, because you get as many servers as you want. So you could have a pre-production environment, dev, alpha, or staging environments, as well as your production environment. Or you can do load balancing across multiple servers, or have high availability, so if one server goes down, you have another to fail over to. Standard is where a lot of teams start; if they just need one server for everything, that's more than fine. Enterprise is really for when you want as many servers as you need without having to count them, and you can go that route as well.

So we're a few minutes over and looks like the questions have slowed down a little bit. So thank you so much for your time today. It was a pleasure to have this time with you and thanks for hanging around with me for an hour. Again, this video will be up on YouTube immediately after I end the call. So you'll be able to access it or re-watch components if I went a little bit quickly. If you have any questions, please reach out to RStudio. There's a bunch of different links that I put up in the slide deck where you can access some of the information I showed today. You can create an evaluation, email us, book a live call, whatever you want to do. We're here to help out and answer some questions.

There was one more question that came in that I definitely want to answer. Our products do come with professional support built into the price. So if you buy one of the products, that comes with support and maintenance for RStudio Workbench, RStudio Connect, Package Manager, or Team, whatever you're purchasing. You also get access to what we call customer success managers; that's where I used to work before taking on this role. That's more like a colleague or a peer who knows a little bit about data science and can help with R- or Python-related questions and point you in the right direction. So again, thank you so much for your time. Have a great and safe weekend, and we'll see you next time. Thanks again.