Resources

Sean Lopp & Lou Bajuk | R & Python: A Data Science Love Story | RStudio (2020)

Many data science teams today leverage both R and Python in their work, but struggle to use them together. Data science leaders and their business partners find it difficult to make key data science content easily discoverable and available for decision-making, while IT admins and DevOps engineers grapple with how to efficiently support these teams without duplicating infrastructure. Even experienced data scientists familiar with both languages often struggle to combine them without painful context switching and manual translations.

In this webinar, you will learn how RStudio helps organizations tackle these challenges, with a focus on some of the recent additions to our products that have helped deepen the happy relationship between R and Python:

- Easily combine R and Python in a single data science project using a single IDE.
- Leverage a single infrastructure to launch and manage Jupyter Notebooks, JupyterLab, VS Code, and the RStudio IDE, while giving your team easy access to Kubernetes and other resources.
- Share and manage access to R- and Python-based interactive applications, dashboards, and APIs, all in a single place.

Webinar materials: https://rstudio.com/resources/webinars/r-python-a-data-science-love-story/

About Lou: Lou is a passionate advocate for data science software, and has had many years of experience in a variety of leadership roles in large and small software companies, including product marketing, product management, engineering, and customer success. In his spare time, his interests include books, cycling, science advocacy, great food, and theater.

About Sean: Sean has a degree in mathematics and statistics and worked as an analyst at the National Renewable Energy Lab before making the switch to customer success at RStudio. In his spare time he skis and mountain bikes, and is a proud Colorado native.

image: thumbnail.jpg

Transcript

This transcript was generated automatically and may contain errors.

Today, Sean and I are going to be talking about R and Python, a data science love story and why we're very excited to bring this news to you.

So starting off, giving a bit of an overview, we'll be talking about why we're really excited to bring R and Python together in RStudio and provide it to the data science teams that we work with.

In this webinar specifically, what we're going to want to show you is how data scientists can use R and Python together on a single data science project, how DevOps and IT can support this bilingual group of data scientists in their development efforts in a single development infrastructure, and how these bilingual data science teams can efficiently collaborate and share their work with their business stakeholders so that the insights that they generate can actually be used to make better data driven decisions.

We'll wrap up then talking a little bit about our ongoing community investment from RStudio in both the R and Python communities. And then we'll talk about how you can get more information on any of the points we discussed today, as well as answer as many questions as we can.

RStudio's focus on integration

So over the years, the focus of RStudio has been how do we help organizations really realize the full value of their data science investments, complementing the large investments in open source data science that we've made since the beginning of our company.

As part of that, it's been in our DNA to really focus on how do we integrate with all the tools and environments that data scientists need to use to do their work.

Early on, we introduced Shiny, which is a way of building interactive web-based visual applications using pure R code. We also have done a lot of integration with Spark through our sparklyr package. And we've also done a lot of work to make it easier to access data, use SQL and integrate with databases. So all along, we've had this focus on integrating with all the tools that a data scientist might need.

In the last few years, as we've talked to our data science customers, we've seen that there are a lot of bilingual data science teams, a lot of teams that really need to use both R and Python together in order to exploit the best of both environments.

And so over the last couple of years, we've added a focus on integrating with various capabilities on the Python side. That started off with integration with TensorFlow, which led to our development of the reticulate package as a way of providing general integration with Python from the R language, which Sean will be talking about in a moment.

And then most recently, over the last year or so, we've introduced a number of new capabilities around Python, including the support for Jupyter Notebooks and JupyterLab within our commercial products, which Sean will be talking about.

Challenges for bilingual data science teams

So in talking to our customers and other data science teams out there, we've seen a lot of common challenges for these bilingual data science teams, a lot of the same problems that they struggle with.

Data scientists, for example, are focused on using and finding the best tool for the job. And when they're trying to use R and Python together, that means they often need to switch context between these multiple different environments. And they struggle with the cognitive cost of switching back and forth.

To support that switching, DevOps and IT have to spend time and resources to maintain, manage, and scale these separate environments for R and Python in a cost-effective way, often struggling because they may not have deep experience with the open source tools that underlie both environments.

Data science leaders, the people who manage and really try and maximize the value and efficiency of these data science teams, wrestle with how to share results consistently and deliver value to the larger organization because ultimately the data science team is there to provide insights so they can help the organization make better decisions.

The data science leaders also want to focus on how do we provide the tools for collaboration between these R and Python users? Because if these users are siloed, then they might end up reinventing the same algorithms, the same processes, the same analyses in these different environments and wasting time and effort doing that.

And then the business stakeholders are ultimately not interested typically in the underlying details in whether or not the data science products that they receive are based on R and Python. They just want to make sure that data science is credible and that they're able to leverage those insights in order to make better decisions.

So these are the types of challenges that we've seen throughout many, many conversations with different data science teams. And so we're very excited today to bring you a view into how we help solve those challenges. And with that, I'll hand it off to Sean.

Combining R and Python as a data scientist

Awesome. Thank you so much, Lou. And again, thanks everyone for joining. We're going to start by looking at some of the ways that a data scientist might combine R and Python. And this is something that's near and dear to my heart because before joining RStudio, I sat exactly in this seat.

I was working with some really amazing engineers who were doing all of their stuff in Python. I came from more of a stats background, so I was more comfortable with R. And I found myself spending a lot of time on pretty tedious tasks, running Python scripts, saving data to disk, loading it back up in R to make a plot, debugging when type conversions didn't work, just things that weren't very fun.

And I'm excited today to show you, as a data scientist, how in RStudio with the open source reticulate package, you can really work around a lot of that and very efficiently combine these two languages together.

So we're going to start with a couple of examples. So I'm going to switch over to RStudio and actually want to show you a shiny application. And this is an application I inherited from that prior life.

Basically what it allows someone to do is compare the fuel economy of two different vehicles going from point A to point B. And so maybe as an example, we could look at a Ford Explorer versus a Toyota Highlander hybrid.

A simulation occurs behind the scenes here. And then we get the result. In this case, the Ford Explorer uses more gas than the hybrid.

So this is an interesting application because all of that simulation is occurring in Python. In fact, the code that I inherited from my engineers is this really huge thousands of lines of Python code that handles the simulations here.

And what we struggled with before the reticulate package was how to make this code accessible to others. No one on our team had the skill to build out a full-fledged web application framework in Python. And so what we would end up doing is taking requests from different stakeholders. We would run these functions manually, and then we would return the results. And it was a really time-intensive and laborious process.

So when the reticulate package came out, I was excited to be able to build using the amazing tools available in R around Shiny. I was excited to build this type of application that would allow those stakeholders to run experiments interactively themselves.

So I want to show you a little bit of the code just so you get a feel for what this looks like. Inside of the application, we're using the reticulate package to source this Python file.

And this Python file has a whole bunch of functions inside of it that do all the heavy lifting. You can see one of them here, a function called sim_drive.

And once you've sourced that file, all you have to do is call it, call that function as if it were an R function. You can see that call here.

And the reticulate package takes care of all the dirty work for you. So you don't have to worry about managing a subprocess yourself. You don't have to deal with type conversions. Our data frames on the R side are going to become those dictionaries that the Python function needs. And then the Python function is going to return results that are nicely handled back into a tidy data frame.

And so a lot of that heavy lifting is done for you. And it becomes really seamless to get started calling that Python code from the R interface.

Now, when you're doing that type of work, combining R and Python together, being able to do it seamlessly isn't enough; you also typically don't want to have to context switch.

And so that leads to the second thing I want to show you, which is that inside of RStudio, again, based on the open source work that we've done with the reticulate package, you're able to interact with Python files directly. And so this is a pretty simple Python file.

And what I can do inside of RStudio is execute the code kind of line by line. And in this case, we're using NumPy and matplotlib to create a bivariate distribution plot. You can see the matplotlib image shows up in the plots panel.

Our Python code is being executed line by line in the console. And we even get some of the really nice ergonomics that you would expect from an IDE, such as autocomplete in the Python context, as well as help.
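The exact script from the webinar isn't shown, but a minimal stand-in with the same shape, a NumPy and matplotlib bivariate distribution plot, might look like this. The file name, seed, and covariance values are illustrative assumptions:

```python
# bivariate.py -- sketch of the kind of Python file stepped through line by
# line in the demo. In RStudio the figure would appear in the Plots pane;
# here we use a non-interactive backend and save it to disk instead.
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for running outside an IDE
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# Draw correlated samples from a 2-D Gaussian.
mean = [0.0, 0.0]
cov = [[1.0, 0.6],
       [0.6, 1.0]]
x, y = rng.multivariate_normal(mean, cov, size=2000).T

# Render the bivariate distribution as a 2-D density (hexbin) plot.
fig, ax = plt.subplots()
ax.hexbin(x, y, gridsize=30)
ax.set_xlabel("x")
ax.set_ylabel("y")
fig.savefig("bivariate.png")
```

Each of these lines can be sent to the Python console individually from the RStudio editor, which is what the line-by-line execution in the demo shows.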

And this is really exciting because if you are inheriting those Python files or gluing them together with R code, it can be really nice to be able to interact with the Python files in one place without having to switch contexts to a totally different editor or a different tool.

Now, RStudio is by no means a full-fledged IDE for Python development at this time. But in my experience as a data scientist working with both languages, it meets my needs and has a lot of the things I'm looking for.

So that's how you might interact with a Python file directly. There's one other integration that we built into the IDE to make working with R and Python together pretty seamless, and that's through R Markdown.

R Markdown, if you're not familiar with it and you're coming from the Python side of the world, it's very similar in its goal to a Jupyter notebook. It allows you to do what we call literate programming, where you're going to combine text and prose alongside the output of code, all in a single scientific notebook or computational document.

That's what R Markdown allows you to do, but it's a little bit of a misnomer, R Markdown, because you can actually use a whole bunch of different languages inside of an R Markdown document. Of course, as you might guess, we're talking about Python today, so what I'm going to show you is how you can use Python and R inside of this document.

And we'll start by inserting a Python code chunk, and then let's load some data from the Seaborn Python library. And specifically, we're going to look at our favorite iris dataset.

You can see the IDE's autocomplete at work, helping me get these function calls right. And then we're going to use some Pandas syntax to subset this dataset that we've loaded, and specifically, we're going to look at one species of iris called setosa.

Actually, let's not overload our variable names here. We'll call this setosa. And we can execute this right inside of RStudio.

In the Python context, we can double check that we got things right here. And so, this does look like the iris dataset with the setosa species. So, we're looking good so far.
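In plain Python, that chunk looks roughly like the sketch below. The webinar loads iris via seaborn's `load_dataset`, which fetches the data over the network; a tiny inline stand-in is used here so the sketch is self-contained, with column names matching seaborn's:

```python
import pandas as pd

# Tiny inline stand-in for seaborn.load_dataset("iris").
iris = pd.DataFrame({
    "sepal_length": [5.1, 7.0, 6.3, 4.9],
    "sepal_width":  [3.5, 3.2, 3.3, 3.0],
    "species":      ["setosa", "versicolor", "virginica", "setosa"],
})

# Pandas boolean indexing to subset to one species, as in the demo.
setosa = iris[iris["species"] == "setosa"]
```

From an R chunk in the same R Markdown document, this Pandas data frame would then be reachable as `py$setosa`.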

What I want to show you, and where things get kind of magical and really exciting, is that from our R code chunk, we can now access everything we're just doing in the Python environment.

So, specifically, let's load the ggplot2 library. And we're going to take a look at plotting this dataset.

And the way this works is there's a special object in our environment called py that gives us access to the Python environment and all the variables or data in that environment. So, here py$setosa is going to give us access to that Pandas data frame.

And we can go ahead and plot it. From there, we can do everything as if we were in R.

So, we'll just make a simple plot here.

And what's really interesting about this is it allows us to use the languages for what they're really good at. So, as an example here, we're using Pandas, which is a pretty powerful tool for data manipulation. And then ggplot2, which is really excellent for visualization.

And so, we can run this R code. And, of course, we get an error here. Let's maybe see if we can figure out what's going on with this error real quick.

One of the things that's really nice is with that Python environment, I can actually use some of the IDE's features to explore what's going on. So, as an example, I can view the dataset that's in Python using the view command. And it looks like this is the dataset we saw. It's the iris data filtered down to the setosa species.

And actually, I think I know what I did wrong. It looks like on the Python version of this dataset, we have underscores. I used periods. So, if we replace those periods with underscores, there we go. We get our nice ggplot2 plot of the relationship between length and width.

So, those are just a couple of ways that a data scientist can combine R and Python inside of RStudio. All of that uses open source tools that are going to be available wherever you're running RStudio. At the heart of it is that reticulate package.

The reticulate package uses the Python and R C++ APIs. So, the data conversions are pretty fast. It does use a little bit more memory than if you were using a single language. But it allows you to take advantage of tools in each language. So, you can use R for what it's really good at and Python for what it's really good at.

And just to kind of recap what we saw there, there is a combination of R and Python in R Markdown that we just looked at. We also saw at the very beginning how you can use Python functions and execute Python code in the context of R scripts or functions, which is really useful if you want to build like a shiny app that pulls in Python code. And then you can also just edit and play with Python code directly in RStudio as the IDE.

Supporting bilingual teams with RStudio Server

Now, that's really useful for a data scientist to glue these two languages together. But what we see is that often it's not just a data scientist who has to worry about R and Python and the combination of the two. Especially in organizations, there's another really important team that gets involved with these two languages.

And that's the DevOps or IT administration group that is responsible for creating that environment where data scientists can be productive and have access to data and collaborate with one another.

What we tend to see working with a lot of these organizations as they're adopting R is that it can be a challenge to create an infrastructure that meets the needs of all these different languages.

And what happens in the worst case scenario is that data scientists will just continue doing work on their own desktops where they're most comfortable. And that becomes really hard to troubleshoot.

We've talked to IT admins who have hundreds of tickets in their system along the lines of this version of a Python package isn't installing and so I can't share a notebook with my colleague.

And so what we're excited to show you is RStudio's professional products, RStudio Server and RStudio Connect, which together kind of make up RStudio Team. And these professional products allow the DevOps and IT group to provide that single infrastructure where data science teams can use R and Python together. And they can do it in their favorite tools, whether that's RStudio or Jupyter without resorting back to their desktop.

That makes it easier for them to collaborate and work together. But it also means that IT only has to configure, maintain, scale a single environment instead of dealing with different tools for different languages across different teams.

So I want to show you what that looks like. I'm actually going back to RStudio and one of the things that those of you with really astute eyes might have noticed is that I'm actually using RStudio inside of a web browser, which might be a little bit different, especially if you're used to using RStudio as a desktop application.

The reason that you see RStudio presented in a web browser is because all the computation here is happening on a server. And what that means is that Lou, myself, and the other data scientists at RStudio are all working on a consistent platform. So it's easier for us to share work with one another. It's easier for us to get access to the production data that we use at RStudio.

And kind of one of my favorite parts is that I have a single entry point for all the different types of projects that I work on. So this is the RStudio server homepage. You can see the Shiny application, that project we were just in, is sitting idle now on the server. I have a job that I kicked off a couple of hours ago that's kind of running in the background. I don't have to worry about desktop updates interfering with that job.

And the kind of new thing that we wanted to showcase is that when you go to start a new session on RStudio, you now get to select what editor you want. So, of course, you can choose the RStudio IDE, and RStudio server will run that for you. But you can also pick between things like JupyterLab and the Jupyter Notebook.

I'll select Jupyter Notebook, and we'll kick this off in a second. But before we do that, I also want to call out another choice that's really powerful, which is where you want the computation to actually run.

So traditionally, you would normally run R and Python kind of on the same environment in the same location where RStudio server was running. But in one of our goals to make the server and this infrastructure more extensible and easier for IT, we've added the ability to offload some of that computation to other locations.

And so specifically what you're looking at here is an integration between RStudio Server and a Kubernetes cluster. Our Kubernetes cluster happens to be running on AWS, but you can be running Kubernetes on any of the cloud providers or even on premise yourself.

And what this allows me to do as a data scientist is without learning any new tools or new work benches, right from within RStudio server, I can pick the profile of tasks that I'm going to have. So what CPUs I need, the size of memory that I want. And then my IT group has put together a couple of images, Docker images that will control the environment where this Jupyter notebook is going to run.

So this gives a lot of flexibility for our team to specify exactly what we're looking for while still scaling elastically. And that's all inside of RStudio. We don't have to jump between different tools or different environments.

And so if I start this session, you'll see both the RStudio IDE session and the Jupyter Notebook on the homepage. I can fire up this Jupyter notebook and it's going to give me the same files, the same access that I had from RStudio.

And so like any data scientist, I didn't do a great job cleaning up my workspace. I have a whole bunch of things going on here, but I can open up a specific set of examples that I created for today.

All these examples in code are going to be online and we're happy to share them after the webinar. But I just want to show you what it looks like to run a Jupyter notebook here.

And it actually might be a little bit of a letdown because what it looks like is a Jupyter notebook. You're able to execute the code chunks inside of Jupyter. You can see the results. And we've tried really hard at RStudio to give you as short a path as possible to the tool that you're comfortable with.

So we don't require a steep learning curve around anything else. We just give you access to Jupyter right away.

Inside of this notebook, though, we are still in that server side environment, which is really nice because it means that the Python kernels that we have are kind of shared and can be shared amongst different data scientists. So if you're onboarding someone new to Python, they're going to have access to that shared environment that your IT group has set up, in this case, that shared Docker image, so that we're all kind of playing from the same playbook.

And while that's desirable a lot of the time, for advanced Python power users, you still have the ability to create your own kernels, set up your own virtualenv or conda environments for specific projects. So all of that same tooling is available to you.

The only difference between this Jupyter notebook launched through RStudio Server and a Jupyter notebook you might run yourself is two buttons that you'll see here. So one is this RStudio icon, and you can tell it really is a love story. The two icons are even sitting next to each other. And this just takes you back, if you want, to your homepage, where you can see all the different projects and contexts that you're working on.

Inside the Jupyter notebook, if we go back to that running notebook, the other icon that you'll notice is this publish button, and we'll talk about that in just a second. So remember that icon.

So to summarize, for DevOps and IT, one of the things that's really critical when you have R and Python users is being able to provide them a consistent entry point that has their favorite tools without introducing a whole bunch of headaches and hurdles.

And RStudio Team is an easy way to do that. In a single product, only configured and integrated once, you're able to give folks access to the RStudio IDE on the R side, which folks know and love, as well as their favorite tools on the Python side, such as Jupyter notebooks.

And we'll be extending that suite of editors as well, and that corresponds to an extension in the number of backends, too. So I showed you Kubernetes as one way to run and execute this content. There's also plugins if you want to run these different editors on more traditional HPC infrastructure that you might have at your organization.

Sharing results with RStudio Connect

So we've covered data scientists. We've covered IT and ops. There's a third set of important folks that Lou mentioned at the beginning who care about R and Python, and that's the data science team leaders or data science managers and ultimately the business stakeholders.

And we combine these together on a single slide because we believe their concerns are largely two sides of the same coin.

On the one hand, data science leaders, they're looking at how do I onboard new users into these different ecosystems where they can collaborate with the team without having to reinvent wheels or spend time doing translation.

I was talking to one manager at a pharmaceutical company, and they're trying to get more than 500 SAS users into these open source tools. So it becomes really important that as you do those trainings and onboardings, it's a seamless process.

Business stakeholders, meanwhile, they share some of the same goals of the data science leader. They really want to see a return on their investment in the data science team. And ultimately what they care about more so than whether you use R and Python is just getting insights from the team so they can make good decisions and not rely only on their intuition.

And critical to doing that is their ability to interact with the team quickly. We want to avoid slow iteration times and emails back and forth asking for requests between the different team members.

And so the last thing I'll kind of show you is how RStudio team in combination with R and Python allows you to address some of these challenges. And specifically, we're going to look at what it means to take some of those Python and R data products and put them into production on a tool that we call RStudio Connect.

And so I'm going to switch back here to the Shiny application that we started the demo with. And this application, as we mentioned at the beginning, one of the things that's really nice about a Shiny application is that it gives you the ability to allow stakeholders who may have specific domain expertise to play around with your code and run experiments.

The challenge here is that right now this application is kind of trapped on my development environment. It's not easy to share. If I close out of RStudio, the application goes away.

And so what we need to do is move it into production, into a place where others are going to be able to interact with it and rely on it. And in RStudio, it's easy to do that through that publish icon we were talking about.

So all I have to do to deploy this to my production Connect server is click publish. That server is running on premise; I'm not locked into a platform as a service, and I'm not sending my data outside my firewall. I'm just deploying to another server with a production intent. And what's going to happen is that the R and Python environments are going to be identified, all the different files and packages that are in use are going to be listed, and then sent to RStudio Connect, where that environment is going to be restored and the application is going to be run.

So I'll kind of, like those old cooking shows, this is the ingredients. And I'm going to jump to the end, pulling the final product out of the oven. But this is where the application will live and what it'll look like after it's deployed to RStudio Connect.

And so the application here will load for us. That same application we've been playing with the whole time, but now it's in a production context. And so what does that mean? Well, very critically, it means that I have a stable URL I can share with stakeholders, and then they can go and access this dashboard and not even need to know that it was written in R or in Python. They don't need to know the details. It just becomes a web application.

I also have a lot of those things that IT is going to care about. So I can specify specific users so that this environment is secure and locked down to specific users or groups. I can look at the logs so I don't have to email someone in IT to try to download a log file and send it to me. I can actually look at those and kind of iterate and debug in real time.

And then also, very critically, I can scale. So we've used RStudio Connect to scale to more than 10,000 users at a single time. So 10,000 concurrent users looking at a dashboard that involves both R and Python. So it's really good at managing all of those processes and connections for you.

So that's an example of a Shiny application that is going to allow business stakeholders to get value really quickly and to be able to iterate and run their own experiments. But dashboards and applications aren't the only thing that data science teams create.

In fact, one of the things that we were talking about at the beginning was those Jupyter notebooks. And so you can publish Jupyter notebooks to RStudio Connect. And that becomes a really powerful way for teams to share their work. You can have a notebook that documents your experiment, your thinking, and that'll be available to everyone.

And we have that support for Jupyter notebooks, but also for R Markdown. So it doesn't matter what language you're using. Your work can be shared in a single place that becomes a ground truth for the knowledge that the data science team has.

The other thing that you can do with these notebooks that gets really powerful is once they're deployed to Connect, Connect can run them on a schedule. And so I actually want to show you an example of an R Markdown document. This document is scheduled to run twice a week.

We can look at the historical versions of this document that have been published over time. But one of the things that's really interesting about this R Markdown document is that it's designed to communicate with stakeholders. So essentially what it does twice a week, on Mondays and Wednesdays, is check inventory supplies. It runs a forecast and a burndown analysis. And then if there's a discrepancy between the inventory and the forecast, it uses the RStudio Connect scheduler to send an email to stakeholders, in this case, the supply team.
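The core of that scheduled check, independent of which language implements it, is a simple conditional alert. A Python sketch, where the function name, data shapes, and tolerance are illustrative assumptions rather than the code from the demo:

```python
def inventory_alert(inventory, forecast, tolerance=0.1):
    """Return items whose on-hand inventory falls short of forecast demand.

    inventory, forecast: dicts mapping item name -> units.
    tolerance: allowed relative shortfall before alerting.

    A scheduled report on RStudio Connect could email exactly this
    shortfall list to the supply team whenever it is non-empty.
    """
    shortfalls = {}
    for item, needed in forecast.items():
        on_hand = inventory.get(item, 0)
        if on_hand < needed * (1 - tolerance):
            shortfalls[item] = needed - on_hand
    return shortfalls
```

An empty return value would mean "skip the email this run," which is how conditional alerting avoids spamming stakeholders when everything is on track.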

And that email is fully customized through the code. So you can include results, you can include plots. And those are going to show up in your stakeholders' inboxes so they can look at it right on their phone. And that's helpful in those cases where people want to get alerts or monitor something without having to constantly refer back to a dashboard.

That's available for R Markdown documents that are, again, allowing you to combine Python and R together. So it's a powerful way to operationalize some of that work that the team is doing.

But one other piece of content or pillar of content that I want to show you on Connect is how you might go about deploying a model. So often data science teams are responsible for creating models. And we've kind of seen how models could be shared with people in something like a dashboard or a notebook.

But often you also need to share those models with other services. Maybe it's a website, maybe it's an application written in something like Java. But those tools want to consume the smart insights that you've created in R and Python. And the way that you can do that easily is by hosting your model as a RESTful API.

So I just want to give you an example of what that looks like on RStudio Connect. The code for this API, so you can get a sense for what it looks like to actually write it, is something we'll share after the webinar.

But in this case, our model is a sentiment predictor. So essentially, the model takes as an input a string, so maybe something like the word debugging. And we call this model on Connect using a RESTful request. It looks like this. So this is something a software engineer would be really familiar with.

And then the model returns the sentiment: zero being negative if the phrase was not very happy, or one being really positive. In this case, "debugging" is not very positive. If we send the model something like "great beer," it turns out that "great beer" is quite a bit more positive, much closer to one.
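From the consuming side, a call like the one just described is an ordinary HTTP POST with a JSON body. Here is a hedged sketch of what that might look like from Python; the endpoint URL and the exact JSON field names are hypothetical (a real Connect deployment documents its own), but the request/response pattern is the same one a Java service or website would use.

```python
import json
import urllib.request

def score_sentiment(text, endpoint="https://connect.example.com/sentiment"):
    """POST a phrase to the (hypothetical) sentiment endpoint and return
    its score: values near 0 are negative, values near 1 are positive."""
    payload = json.dumps({"text": text}).encode("utf-8")
    req = urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["sentiment"]

# The response body is plain JSON, so any client can parse it.
# A canned example response (values are illustrative):
canned_response = '{"sentiment": 0.91}'   # e.g. for "great beer"
score = json.loads(canned_response)["sentiment"]
# A score close to 1 means the phrase reads as positive.
```

Because the interface is just HTTP and JSON, the consuming service doesn't need to know or care that the model behind it was built in R and Python.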

And so this model is using R, and it's also using Python; a Python library called spaCy is actually what's giving us the sentiment scores. And just like we saw with the Shiny application, RStudio Connect is handling the access controls, the logs, and scaling the number of processes to handle requests. So there's a lot of flexibility here as you operationalize these models.

And so to kind of summarize, really what we're after is making it easier for people to make important decisions with data. And that's going to require teams to use R and Python together.

And there are some considerations for team leaders and stakeholders when they set out to do that. The big ones are being able to access those insights consistently and reliably, which is really what production means. And for teams, it's being able to deliver that content regularly and seamlessly.

And so that's what RStudio Team, and specifically RStudio Connect, is designed to do: to help those decision makers really rely on those data insights and to ultimately get a return on the investment from the data scientists' work in any language.

So with that, I'm going to hand things back over to Lou to kind of recap what we've talked about and look ahead into the future a little bit.

Recap and community investment

Thank you, Sean. I appreciate that. That was a great series of demos illustrating the way that we help our customers tackle the major challenges of bilingual teams.

So just recapping what Sean just showed: through our products, we allow data scientists to combine R and Python in a single project using the reticulate package. We make it easy for these data scientists to launch Jupyter Notebooks or JupyterLab from the same infrastructure where they launch the RStudio IDE.

From the DevOps and IT perspective, we make it possible to provide all this capability in a single infrastructure for both R and Python and Jupyter, which means users can continue to use their favorite tools, whether it's the RStudio IDE or whether it's a Jupyter Notebook.

And very importantly, from the DevOps perspective, they're making their users happy while making their own lives simpler and less expensive by configuring, integrating, scaling, and securing a development infrastructure just once, as opposed to multiple times.

From the data science leader's point of view, we've shown how it's easy now to collaborate across your team, sharing your R and Python work between those team members, maximizing their productivity, and how to really repeatably, reproducibly deliver value to the business by delivering regularly updated reports, custom results, and self-serve applications through a single portal or directly in your stakeholders' inbox via email.

And from the business stakeholders' point of view, it's possible for them to access all these up-to-date interactive analyses, dashboards, and emails, making sure that these results are current since they can be updated on a scheduled basis, and so that they can get the answers, the insights they need when they want them in order to make better decisions.

All these capabilities around integrating with Jupyter Notebooks, supporting Jupyter environments, and supporting these diverse bilingual teams are available in our commercial products. Our commercial products are bundled together in RStudio Team, made up of three main components: RStudio Server Pro, which provides the development infrastructure for R and Python; RStudio Connect, which is a platform that allows data scientists to publish their results to business users and other collaborators so they can use those insights; and RStudio Package Manager, which manages all the complexity around R packages for the other two platforms.

And to date, well over 1,000 large organizations use our commercial products to solve their day-to-day data science challenges and really scale up their data science work into production, so they can leverage the value of their data science team and maximize the return on all the different data science investments they've made, whether it's R or Python or Spark or Kubernetes or something else.

Leverage all that value to get better answers, to make better decisions, and really maximize the impact of their data science work.

We've talked now a lot about what's available in our products, both open source and commercial. In addition to that, RStudio has for many years been a major supporter of the R community, and now we're supporters of the Python community as well.

The RStudio community is a great portal on our website for asking and answering questions around open source data science, R and Python, and our products. Great place to get information.

Every year, we sponsor rstudio::conf, which is the biggest gathering of open source data science users in the world. We've got our next conference coming up in just a couple of weeks, if any of you are going to be in the San Francisco area at that time; we have a few slots still available. If not, we're also going to be live streaming the presentations, and we'll provide that link at the end of the presentation.

The RStudio education team is devoted to helping train the next million open source data science users. We provide not only pointers to other learning materials and our own learning materials, but also a train-the-trainer certification, so that we can certify people in the open source data science community to teach R and really encourage them to scale out and provide training to others, because we feel that's the best way to spread the capabilities for using R across the community.

We're also a member of a number of cross-vendor groups that help support the data science community. We were one of the founders of the R Consortium, which focuses on delivering valuable infrastructure and supporting working groups for the R community.

We're now major sponsors of NumFOCUS as well, which is another cross-vendor group supporting investment in open source data science. They're the umbrella organization that provides the primary funding for the Jupyter project.

And we're helping incubate Ursa Labs, providing operational support and infrastructure for this industry-funded development group, which specializes in developing open source data science tools that cross languages, including most recently the Apache Arrow project.

Resources and Q&A

And so before we wrap up and head to questions, just a few links on where you can find more information, to get an overview of what we talked about today and links to a lot more detailed information. Your one-stop landing page is rstudio.com/python. Lots of information there.

You can also contact us to learn more. That's a great way if you want to get a detailed demo, if you want to get some questions answered or just do a deeper dive. That's the best way to do it. We will get to as many questions as we can today, but there are a ton, which is awesome. We appreciate the engagement.

If we don't get to your question today, we will follow up. The link to the webinar recording, as well as the slides and the scripts that Sean used today, will all be provided to the attendees. We've had several questions on that. As I mentioned, our conference is coming up in two weeks.

In the first section of his demo, Sean focused on using the reticulate package to call Python. And so here's a couple of links on the website for the package, samples, and a deeper dive webinar for using that.

We also got a number of questions around configuration and versioning and whatnot. And I'll hand a couple of those off to Sean in a minute. But for some of the deeper information there, again, our documentation is a great source for that. We can provide you information. And as always, the RStudio community is a great place to ask questions.

So with that, we'll dive into questions. As I said, we will do our best to get to all your questions. If we don't, we'll follow up. If you'd like to set up a conversation, this link is also in the slides, and clicking on it will give us a chance to directly set up a follow-up conversation with you.

So diving into the questions, as I said, there's a ton, and I appreciate it. I also want to apologize to everyone who suffered any audio problems. We had a handful of people with issues on that. If you had any audio problems, the full recording will be sent out along with the slides.

So Sean, we've got a number of questions here. One that comes up a few times is, can you clarify the difference between what's available in our open source products and the work we provide there and what's unique to our commercial products around support for Python and Jupyter?

Yeah, absolutely. So we're dedicated to making open source tools available for data scientists. And so everything