
Posit Cloud Essentials | Ep 1: Getting Started
On the last Tuesday of every month, we host an event, Posit Cloud Essentials, where we explore the ins and outs of Posit Cloud, diving into its key features, valuable tips, and real-world use cases. The event is open to all and hosted on YouTube, with a live Q&A during each month’s event.

This month, Alex Chisholm, Product Manager for Posit Cloud, walks through how to get started with a free Posit Cloud account, enabling you to conduct analysis, generate insights, and share findings from your web browser.

What is Posit Cloud? Posit Cloud makes it easy to move your entire workflow into a unified online experience, complete with project management and publishing capabilities. Use your favorite coding languages and environments and share your work seamlessly with others, all from the comfort of your own web browser.

No registration is required to join the events. Simply add the event to your calendar using the link below.

Create a free Posit Cloud account → https://posit.cloud/
Explore the source code used in this demo → https://posit.cloud/spaces/394911/join?access_code=v-iZm0epL-n-vNHpht4JAfH47gmqWdg0cM6hyll7
Add future Posit Cloud Essentials events to your calendar → http://evt.to/adahaeuow
Q&A from this demo can be found here → https://app.sli.do/event/q2aLBPfVRAvUCFryNs9YuL
Transcript
This transcript was generated automatically and may contain errors.
Hi everybody and welcome. My name is Alex Chisholm and I'm the product manager for Posit Cloud here at Posit. In this first installment of Posit Cloud Essentials, we're going to cover everything you need to know about getting started with a free account in just a few clicks. Posit Cloud is an online platform where you can do and share data science in your favorite coding languages, all from the comfort of your web browser. In essence, Posit Cloud makes it easy for you to manage a variety of data projects in siloed, reproducible environments that contain your code, your data, and your results, all of which can be shared.
A space in Posit Cloud serves as the building block for the tool.
You can create R projects in the RStudio IDE or work natively with Python and Jupyter. You can publish data applications and documents from cloud or from an external environment, such as RStudio Desktop or VS Code. You can also securely store database connections to power your projects and outputs using a set of 16 included professional drivers.
Finally, you can invite members to your space to collaborate on work or to simply showcase your results, all the while keeping your eye on usage analytics for specific pieces of content.
Posit Cloud is a freemium tool. This means that you're able to create a free account and gain access to all of its functionality. If you happen to run into limitations in terms of compute power or number of published outputs, feel free to explore the affordable and flexible paid plans.
In today's demo, we're going to create a free account, set up our first space, and then do some real work. Let's get started by going to posit.cloud.
Creating a free account
So when you get to posit.cloud, you're going to find a variety of options and information about the tool. The first thing you want to do is set up an account. You can click on get started to start the process. You'll see a variety of plans. Today we're just going to start with a free plan. When you click sign up, you're going to be able to either put in an email and password along with your first and last name, or you can sign up directly with a Google or GitHub account. I'm going to go ahead and use Google.
So I have an account already set up on Gmail for this demo. I'm going to click it, and we've now created my Posit Cloud account. It's going to redirect me into the tool for us to begin doing work. When you first log in, you're going to be brought to an area called your workspace. You might want to think about this area like a playground. You can create new projects here. But for this demo, we're going to create a more functional area, a new space, by clicking this button. And when I do this, why don't we go ahead and name it? We'll name it biostats.
So if I create this new space, you can see that I'm brought into something that looks relatively similar to that first area we went into, but there are more tabs and there is more functionality in here that we're going to use throughout the demo.
Setting up a data connection
The first thing I want to do is create a data connection. You don't have to do it this way, but since we're going to use this for our example, we might as well start there. So if you go across the top of the pane and click on data, you can see that there are no current data connections, and I can go ahead and add one. By clicking on driver, you can see all of the different types of database connections that we can save. I'm going to use Postgres, and I'll go ahead and name this connection RNAcentral. For this example, we're going to use a public data set, or database, I should say, and you can find more information on rnacentral.org, including the credentials to log in and start making queries against it. So for now, I'm just going to paste in all the information I need here.
First, I have my server, I've got my port number, I'm going to bring in the database name itself, and then finally, my username and password. You'll notice when I pasted in the password here that we don't actually see what it is. We're also going to be able to save this as an encrypted variable, which will become an environment variable when we open up RStudio in a few moments. So let's go ahead and hit OK. I now have my database connection stored.
Creating an RStudio project
I'm going to go back to content, and we'll create our first project. Clicking this blue button, I can create either an RStudio project or a Jupyter project. I can also create one going straight to a GitHub repo, but for now, we're going to go ahead and create a new RStudio project. So this should take around 20 seconds to deploy, and during this time, I might as well go ahead and rename my project, and we'll just call this RNA analysis.
So we're up and running. We have a fully functioning RStudio IDE session here, and we have to decide what we want to do next. And typically, one of the first things you're doing in data analysis, right, is connecting to some kind of data source. So we can go and retrieve that connection we made earlier. If I go in here to the upper right-hand panel, click on connections, and click new connection, you'll see that the connection we saved earlier is already here.
So I'm going to click this, and we'll see that RStudio is going to give us the ability to create some credentials to make the connection. First, I'll test it. So testing is going to go out and install the packages that I need for this specific type of Postgres database connection. You can see that, one, it is a success. Two, you can see that the script that's going to be generated for our connection masks the password, so we don't need to keep the password of this database in the script that we use. Why don't I go ahead and load this in a new R script, and we can load the packages that we need, odbc and also DBI. I can go ahead and make my connection.
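For readers following along, here is a minimal sketch of what a connection script of this kind could look like. It is not the script generated in the demo: the driver name and the RNA_DB_* environment variable names are assumptions for illustration, and the real values come from the saved data connection. It assumes the DBI and odbc packages are installed.

```r
# Collect connection settings from environment variables so that no
# credentials are written into the script itself.
# NOTE: the driver name and the RNA_DB_* variable names are hypothetical.
rna_connection_args <- function() {
  list(
    Driver   = "postgresql",
    Server   = Sys.getenv("RNA_DB_HOST"),
    Port     = Sys.getenv("RNA_DB_PORT", "5432"),
    Database = Sys.getenv("RNA_DB_NAME"),
    UID      = Sys.getenv("RNA_DB_USER"),
    PWD      = Sys.getenv("RNA_DB_PASSWORD")
  )
}

# Open the connection through the odbc driver using the settings above.
connect_rna <- function() {
  do.call(DBI::dbConnect, c(list(odbc::odbc()), rna_connection_args()))
}
```

Calling `con <- connect_rna()` would then open the connection. Reading the password with `Sys.getenv()` is what keeps it out of the saved script.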
Once that connection is made, you'll see again in the connections tab all of the different tables and schemas available to us from this connection. We want to do a little bit more. Let's start querying this, and one good package to do that is dbplyr. Maybe I'll save this script as connect. I don't yet have dbplyr installed, so let's go ahead and do that. dbplyr is going to allow us to use relatively straightforward commands to go out to the database, run some queries, and bring back data. Along with that installation, we get dplyr, which will help with some of the code to get this.
So let me also load that library, and now I'm just going to bring in, as an example, some code that says, let's make a new variable called results. I'm going to go out to a table within the database. I happen to know that it is the table called RNA. I only want to get the top 100 rows, but I do want to bring this back to me so I can look at what's in there. So if I load the libraries and then run this, you can see now we've stored a variable named results. If I click on the environment tab, I'll be able to see results. Clicking on it shows me the 100 records and the types of variables that we have in here.
So that's great. We were able to go out from our database connection, grab some real data, and now we can do something with it. I also want to add on the bottom here just a database disconnect if I were to run that script later on so we can kill our connection to that database.
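The query step described above can be sketched roughly as follows. This is an illustration rather than the code typed in the demo: the helper name `fetch_rows` is hypothetical, the table name `rna` is an assumption based on the narration, and the DBI, dplyr and dbplyr packages are assumed to be installed.

```r
# Pull the first `n` rows of a database table into a local data frame.
# dplyr::tbl() only builds a lazy reference to the table; nothing is
# downloaded until dplyr::collect() runs the translated SQL.
fetch_rows <- function(con, table = "rna", n = 100, columns = NULL) {
  query <- dplyr::tbl(con, table)
  if (!is.null(columns)) {
    # Optionally keep only the named columns (done on the database side).
    query <- dplyr::select(query, dplyr::all_of(columns))
  }
  dplyr::collect(utils::head(query, n))
}
```

With an open connection `con`, `results <- fetch_rows(con)` would give the 100-row data frame, and `DBI::dbDisconnect(con)` at the end of the script closes the connection.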
Building a Shiny app
Now what if I wanted to take the information that we saw and put it into an interactive report? I can go ahead in the same session and create a new Shiny web app. And this is going to ask me to install packages related to Shiny, which is relatively quick. And we're going to take the base example and we're going to swap in this real database connection to have a new example that we can use to test the functionality of both creating these interactive applications and then publishing it to cloud. So maybe we call the application RNA.
I'm now in a Shiny script and I can do a few things. First, I just want to bring this code sequence over to my app. I'm just going to paste it at the top, maybe clean it up a little bit, move the libraries next to each other. I still want to go out and get a connection to this database. Now that I'm going to move into the actual application, maybe I don't want just 100 records, maybe I want 2000 records. And I'll leave this disconnect here as well so that when the application goes out and grabs the data, it also closes the connection. So then we want to work this new data into this example, and we'll go through and just change a few things from the stock example. Maybe we call this RNA analysis and leave all of this. One important thing I'm going to do is change my data. I no longer want to take the sample data. I want to bring in results and I want that length variable. And then finally, we can just do a few things to change how it looks. Maybe I get rid of the x-axis label completely, and then I change the overall title to RNA length.
I think we've done everything that we need. I can also probably simplify my call to the database. I know I only need that length variable. Maybe I'll bring in the ID and length. All right, so now we have what I think is an updated app based upon that data connection that we made. Why don't we go ahead and preview this? And let's try to get this to show up in the viewer pane so that we all can see it here. I'm going to run my app.
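A rough sketch of the modified app is shown here. It follows the structure of the stock histogram example that RStudio creates for a new Shiny web app, but it is not the demo's exact code: the function names `bin_breaks` and `rna_app` are hypothetical, and the column name `len` for the length variable is an assumption. The data frame is passed in as an argument so that the sketch does not depend on a live database connection.

```r
library(shiny)

# Evenly spaced histogram breaks covering the range of `x`.
bin_breaks <- function(x, n_bins) {
  seq(min(x), max(x), length.out = n_bins + 1)
}

# Build the app around a data frame with a numeric `len` column
# (assumed column name for the RNA length variable).
rna_app <- function(results) {
  ui <- fluidPage(
    titlePanel("RNA analysis"),
    sidebarLayout(
      sidebarPanel(
        sliderInput("bins", "Number of bins:", min = 1, max = 50, value = 30)
      ),
      mainPanel(plotOutput("distPlot"))
    )
  )

  server <- function(input, output) {
    output$distPlot <- renderPlot({
      x <- results$len
      # Empty x-axis label and a custom title, as described in the demo.
      hist(x, breaks = bin_breaks(x, input$bins),
           col = "darkgray", border = "white",
           xlab = "", main = "RNA length")
    })
  }

  shinyApp(ui = ui, server = server)
}
```

In an app.R file, the query would run at the top to create `results`, and the last line would be `rna_app(results)` so that the file returns the app object.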
Yep, and it worked. We can see that we still have this slider to control the number of bins within our histogram. But from the data that we're pulling in, once again, live from this database, we can see that we kind of have a split distribution with a lot of records that are up in that 1500 range, and then several that are down at 500 and below. So that's great, right? We've gone through, we've made a Shiny app, we've brought in real data to it. What is fantastic about Posit Cloud is you can also now publish this app specifically to your space, either to go straight there next time when you want to see the maybe updated results, or if you want to share it with other people without having to let them into your project. So I'm going to cancel this, and now we're going to move over to publishing.
Publishing to Posit Cloud
This is the push button publisher within the RStudio IDE. If I click on this, it's going to ask me again to install some packages. These packages are going to assist with the publishing process. And once this is ready, it's going to ask me for a few different things. The first thing it's going to ask is which files you want to upload from this folder. Right now I only have this app.R file, so that's the only one I'm going to bring up. It's going to ask me where I'm going to publish this to. And what is fantastic is that just by setting up this new Posit Cloud account, we've already matched your tokens and your secrets and all those credentials that you need to publish automatically. So when you press publish now, it will put all of this interactive application into your own space. So I'll click this. Given the size of this and the amount of files that we're putting up, this will probably take a couple of minutes. So let me show you a few other things.
You might have noticed the RAM indicator at the top. So that's telling us how much of our available memory we're using. Right now we're at about 50 percent. If I click on the gear, we can see that we can add in a description that'll show up later. So analysis of RNA data maybe is what we want to put. I click access. I can see that only I can see this right now, but maybe we want everybody to see this. So I'm going to change it to everybody in biostats. And then finally, if you wanted to change the RAM or the compute or the background execution time, you can do that from these toggles. This is a free account. So if you tried to go over one gigabyte of RAM, for instance, it would tell you you can't do that and you would have to upgrade into a paid plan to get up to 16 and very soon 32 gigabytes of memory.
And now I think this is ready to launch, and what will happen is it'll try to put it up in a new tab. The browser blocks it initially, but if you hit try again, it pops up in a new tab. We have our interactive application going out, grabbing the data from the database and allowing the user to then change the number of bins associated with the histogram of the length variable. So this is pretty cool.
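The demo publishes with the push button in the IDE. For reference, a scripted equivalent might look like the sketch below, which uses `deployApp()` from the rsconnect package (one of the packages the publisher relies on). The wrapper name `publish_rna_app` and the app name `rna` are assumptions for illustration, and the call relies on a publishing account already being registered in the session.

```r
# Deploy a single-file Shiny app from `app_dir`.
# With one registered account, rsconnect uses it without further prompts.
publish_rna_app <- function(app_dir = ".", app_name = "rna") {
  rsconnect::deployApp(
    appDir   = app_dir,
    appFiles = "app.R",   # upload only the app script
    appName  = app_name
  )
}
```

Running `publish_rna_app()` from the project directory would start the same kind of upload that the button triggers.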
Managing members and usage
If I click back on biostats into my space, I'm going to see a few things. I now have two pieces of content. I have my Shiny application and I have the RNA analysis project. When we were going through the project, we already decided to make the project be available for all space members. You can see that this project has created one output, which is this RNA app, but you can see that the app itself is set to private right now. If I go to the dot dot dot here in settings, I can change the access rights for the application itself. So I can change it from you to everybody in biostats. Now, of course, the only person in biostats at the moment is me, the creator of the space. But if I go to members, I can start managing who has access and rights to come into my space, interact with my projects and view my outputs. There are a few ways that we can do this.
I could send an invitation, where I click on add member and put in your email address. You would get an email inviting you to the space. You'd have to sign up for Posit Cloud, but then you could become a member within this space. Or another common way is through sharing a link. And if you share a link, you can also select the role of the person who is going to be invited in and change the permissions associated with what their status would be, again, in this biostats space. Just for an example, I'm going to copy the sharing link and quickly go off screen to another account that I have. I'm going to get the invitation. I'm going to accept the invitation. And now that I've done that, I'm going to be able to refresh my page here. And you can see that in addition to the demo user, which is what we started with, I've now brought in a contributor named Alex Chisholm, who has a certain set of rights.
And going back to content, because I've made these two pieces of content available to everybody who's a space member, now Alex Chisholm on his account will be able to see these two pieces of content, open up my project, save it as a new one for himself, or interact directly with my application and start playing with that interactive dashboard that we made.
The final bit that I want to show you here, and another benefit of Posit Cloud, is that when you have a lot of people in a space and a lot of content that could be interacted with, it's nice to know who is finding value in what. So if you click on the usage tab, you're going to find a set of usage analytics for the specific space that you happen to be in. You can set it to calendar month or your billing month or the usage period if you happen to be on a paid plan. We just started this space, so you wouldn't expect a lot of activity. But if you go in here, you can see that both the demo user and Alex Chisholm, the other member, are accounted for. The person we invited hasn't interacted with anything yet, so they don't have any compute hours associated with the assets. But if I click on demo user, you're going to be able to find out exactly which pieces of content this person has been engaging with. And you can see most of the compute time has been associated with the project for RNA analysis, more so than the output itself.
I know we covered a lot of information in this short tour, but I think it touched on all of those major parts, those pillars, those building blocks of Posit Cloud that we started with today. We were able to create a new project and publish an output, both of which were based on a live data connection. We showed a little bit about member management within your space, and then looked at usage analytics to see what people are up to. So we've been able to do a lot. I hope you enjoyed that demonstration of Posit Cloud. We continue to innovate and add features and functionality. I encourage you to visit posit.cloud to create a free account today and reach out to us if you have any questions. And we're happy to take some here as well during Q&A. Thanks so much.
