Resources

Live Q&A: August 30th at 11:30 ET | Pins Workflow Demo Q&A Session

**IF YOU ARE HERE FIRST, PLEASE JOIN US IN THE DEMO ROOM AT 11 ET ON AUGUST 30TH - https://youtu.be/t8A-ysXinpE?feature=shared** The Q&A portion will begin at 11:30 ET in this YouTube Room and the demo *should* automatically bring you over here. Thanks so much for joining us! - Rachael

Please join Ryan Johnson, Isabel Zimmerman, and Rachael Dempsey here for live Q&A following the end-to-end workflow demo on pins on August 30th. If you end up here first, here's the link to the demo room: https://youtu.be/t8A-ysXinpE?feature=shared

Please use the YouTube Chat for Q&A or feel free to ask questions anonymously here: pos.it/demo-questions

If you'd like to add future end-to-end workflow demos to your calendar you can use this link: pos.it/team-demo

Follow-up links:
* Demo recording will be here: https://youtu.be/t8A-ysXinpE?feature=shared
* Posit Team: https://posit.co/products/enterprise/team/
* Talk to us directly: https://posit.co/schedule-a-call/?booking_calendar__c=RST_YT_Demo
* Posit Team demo resources: pos.it/demo-resources

Thanks for joining us!

Aug 31, 2023
34 min


Transcript

This transcript was generated automatically and may contain errors.

Hey everybody, we're going to wait a minute here for everybody to come over from the demo. Thank you so much for joining us. How's everyone's day going?

Let's see if it pushed everybody over here. I just re-shared the Q&A link in the chat as well. We'll give everybody, let's say two minutes here.

Hey Sanjay, hey Sergei. Nice to see you all in the YouTube chat here. Okay, I can see a good number of us are coming over here so I think we can get started. But hi everybody, so nice to see you here on YouTube. Hope everyone's having a great Wednesday. And thanks for joining us over here for the live Q&A.

Thank you so much Ryan for the great Pins demo as well. I guess we could all introduce ourselves here too. If I haven't met anybody yet, I'm Rachael Dempsey. I lead our customer marketing and pro community here at Posit. And Ryan and I are joined by a special guest here today. Isabel, if you want to introduce yourself.

Hi everyone, I am Isabel. I'm the maintainer of the pins for Python package and also the vetiver for Python package, which is a model ops package that lives very closely to pins. So I've got a lot of love for this nifty little tool that I get to use all the time. It's quite useful for my everyday work. So thanks for having me.

Ryan, I know you introduced yourself before in the demo. But why don't you introduce yourself again here.

Yeah, absolutely. So I'm Ryan. I'm a data science advisor here at Posit. And I'm the one whose voice you just heard for the last half hour. I really enjoy giving these demos. And I hope you all are finding them valuable. It's pretty hard to believe we've already been doing this for five months. But hopefully this structure works well for you.

I saw a lot of great feedback afterwards in the chat. And so we're certainly going to incorporate that feedback into future demos. I would love any additional feedback. And love all of your questions about Pins, workflows, Posit team, all that fun stuff.

Introductions and icebreaker

Well, I was talking to a few different community builders. And we were talking about fun questions to ask people in the community. And I would love to ask you all here, what's something that you're proud of this month? And maybe Ryan and Isabel, you could answer that for me too.

I think in a non-data science world, I'm so proud. I have a five-month-old puppy that is a little high energy. And she finally learned how to sit and down this month, which is very exciting. Maybe in a data science world, I have been a longtime contributor and user of Pins. But it was just within the last month that I was promoted to maintainer. So that is also something I'm very proud of and excited to continue working with.

I would say from like a non-data sciencey, non-Posit thing, my wife and I are expecting our first kid in December. And we procrastinated a lot in preparing for this kid. And I finally painted the baby room. And I am very proud of that.

Non-data science, I've been taking learning guitar more seriously the past few months. And I've actually been practicing a lot more. Work-related, we're working on releasing a new Posit Connect page that I've worked on a lot, the first iteration of it. And hopefully that's going to be going live this week. So I'm really excited about that.

Q&A on pins

Well, let's jump into some of the questions from earlier. As a reminder, you can ask questions anonymously, too. But you can also use the YouTube chat here. So on the screen it says the short link, pos.it/demo-questions. That will bring you to Slido where you can ask anonymously. And we'll be hanging out here for 15, 20, however many minutes you all are asking questions.

But one of the questions from earlier was, are pins mainly for sharing data between users? Or can R code be shared or pinned as well?

Pins are really great for sharing typically objects. So things like data sets, models, plots, objects in R and Python. In terms of code itself, we would certainly suggest some type of version control using Git, for example, publishing your source code to GitHub, which makes it super easy to share that source code with other users. So I wouldn't necessarily say pins is a great way for sharing code. But it is a fantastic way for sharing the objects that you're creating in R and Python.

I would completely agree with that. If you're looking to use the best tool for the job, I would say version control for code. And then pins for a little slice of data or not necessarily large data sets. This is not a replacement for a database. But being able to share code quickly with others, probably not for pins.

So, Isabel, I know you helped me answer this question in Slido. But the question was, why do you paste the API key in the Jupyter notebook? Is that a security risk?

So this is a demo. This is not a security risk for us in this very moment. It is best practice to be using a package called dotenv in the Python world. In the R world, it will automatically load from your environment when you call board_connect. But in Python, if you use that dotenv package, you can use a .env file and it will load in all of your environment variables. Then you don't have anything in plain text. And you can share your screen with others and they won't see your API keys.
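To make the answer above concrete, here is a minimal sketch of the pattern: the key lives in a `.env` file and the notebook only ever refers to it by name. This is not the dotenv package itself. `load_env_file` is a hypothetical standard-library stand-in for what such a package does, the server URL and key are placeholders, and the variable names `CONNECT_SERVER` and `CONNECT_API_KEY` are assumed here to be the ones the Connect board looks for.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into os.environ.

    Hypothetical stand-in for what a dotenv package does. Variables that
    are already set in the environment are left untouched.
    """
    loaded = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("\"'")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


# Demo with a throwaway file and placeholder values (not real credentials).
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text(
        "# keep this file out of version control\n"
        "CONNECT_SERVER=https://connect.example.com\n"
        'CONNECT_API_KEY="not-a-real-key"\n'
    )
    settings = load_env_file(env_path)

# The notebook now refers to the key by name only, never by value.
api_key = os.environ["CONNECT_API_KEY"]
print("API key loaded:", bool(api_key))
```

With the variables in the environment, the board can be created without a key appearing in any notebook cell, and the `.env` file itself stays out of version control and off the shared screen.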

Another question I'll copy over to show on the screen here. Someone is wondering about the speed with large data sets or objects and the relative merits of different object load methods: local, directly from pins, or loaded into memory from a pin.

Pins is really great for sharing these like final products that you've built. Whether that be like a cleaned up data set or a polished model or just really pretty plot that you want to share with other people or other pieces of content, for example. It is not a substitute for storing terabytes of data, for example. And you should really consider using a database for stuff like that. So I always personally like, you know, if I have a raw data set, I clean it up. It's nice and pretty. And I want that data set to feed into a lot of different pieces of content. That's a perfect use case for pins, in my opinion.

There is a suggested limit on the pins documentation of 500 megabytes. That is what we have tested up to. And we know pins is super equipped to handle. So anything after that is a little bit at your own risk for the speed.

I'm going to jump over to some of the YouTube questions here. And I see Rebecca asked, are there other features? So pins is newish. Are there other features in the pipeline? I see the delete old versions mentioned.

So between the two packages, there's a little bit of work right now on getting pins for Python, which is even newer than pins for R, up to speed. Specifically looking at versioned and non-versioned boards, looking at being able to deparse boards. So move these into different files for like pipelining things with data burn specific. I don't know if there's really anything big on the horizon, but we're always looking for community input, especially when people are looking for new locations of boards. There are other locations beyond Posit Connect like AWS or Google Cloud. GitHub was one, Kaggle was one, as well as local folders.

So is it 500 megabytes per pin object or 500 total in pin? Per pin object.

Harlan, I see you had a question here. Can the data stay on the data server without having to be transferred to the pin server? This is just to keep data security model consistent without having to have two data servers.

It's really going to be up to your use case. A lot of folks, you know, take their data from a database. They pull it in and clean it up and they can reload it to that same database. And that's totally fine. That might work well for your team. But if you have Posit Connect and you have these polished data sets and you want to share that content with other pieces of content on the Connect server, you know, I think it makes a lot of sense to publish it to Posit Connect so that pin can be easily made available. But I don't think there's really any right answer to that specific question. It's kind of based on your team's need.

And it might be helpful. I know I answered another question with this, but we're also happy to chat directly with you, like one-on-one, so we can talk about your specific use case.

I see, Joseph, you asked a question that says, is there any way to schedule a workflow to run based on an event? For example, every time a new data set is downloaded to a directory.

Yeah, so for the Connect server, at the moment, it's really based on timed scheduling. So you set a time and then you can set it to every day, every second. It could be as infrequent as every year, two years, if you had a workflow for that, which would be kind of odd. But how most folks kind of get around that is they can have it, you know, check for updates every minute and kind of program these if statements. If yes, it can do something. If not, it does nothing. But currently, within Posit Connect, at least, there's no trigger to set off a workflow, other than just kind of a scheduled report.
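The polling workaround described above can be sketched as follows. This is an illustrative standard-library sketch, not a Posit Connect feature: the function names, the `seen.json` state file, and the `incoming` directory are all hypothetical. The script would run on a timed schedule and do real work only when the directory contains files it has not recorded before.

```python
import json
import tempfile
from pathlib import Path


def find_new_files(watch_dir, state_file):
    """Return files in watch_dir not seen on a previous run, and record them."""
    state_path = Path(state_file)
    seen = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    current = {p.name for p in Path(watch_dir).iterdir() if p.is_file()}
    new_files = sorted(current - seen)
    # Remember everything currently present for the next scheduled run.
    state_path.write_text(json.dumps(sorted(current)))
    return new_files


def run_if_new_data(watch_dir, state_file, process):
    """Scheduled entry point: do the work only when something new arrived."""
    new_files = find_new_files(watch_dir, state_file)
    if new_files:
        process(new_files)
        return True
    return False  # nothing new, so this scheduled run is a no-op


# Simulate three scheduled runs against a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    watch_dir = Path(tmp) / "incoming"
    watch_dir.mkdir()
    state_file = Path(tmp) / "seen.json"
    processed = []

    first = run_if_new_data(watch_dir, state_file, processed.extend)   # empty directory
    (watch_dir / "sales.csv").write_text("id,amount\n1,10\n")
    second = run_if_new_data(watch_dir, state_file, processed.extend)  # new file arrived
    third = run_if_new_data(watch_dir, state_file, processed.extend)   # already handled

print(first, second, third, processed)
```

Keeping the state file outside the watched directory avoids the script mistaking its own bookkeeping for new data. In a real workflow `process` would be the step that cleans the data and writes the pin.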

Emails and Blastula

I see there's a few questions on sending emails. And so maybe we'll try and group some of these together. But one was, is there a way to send the email with the content in the body of the email rather than as an HTML attachment?

Yeah, I can speak to this one. So the answer is yes. So when we went through today's demo, I created a Quarto document and I placed that on Connect. I set that to run. And as you might've seen, when you send that email, it's typically going to be a direct link to that content on the Connect server. However, with R Markdown, there's another package called Blastula. And some of you may have heard of this package before, but what that actually allows you to do is to embed plots, tables, text directly in that email rather than a direct link. So the answer is yes, you can embed content from an R Markdown document using the Blastula package.

I see George asked a question over on Slido and said, pins is awesome. If I used it, I would probably have a hundred plus pins. How would all of these be managed, searched, or categorized?

There's a few different ways you can do this, especially with using Connect. So Connect has some built-in organization tools, including the ability to add labels to various pieces of content, including pins. So for example, if out of those hundred plus pins you had five or 10 of them associated with a specific project, you can give those a label and search for them easily within the Connect instance. There's also other tools. Some of you may have heard of a package called connectwidgets, which allows you to take various pieces of content. It doesn't have to be a pin. It can be a pin, a Shiny app, a Quarto or R Markdown document, and essentially create a single document that organizes all that content. So you'd certainly have options for organizing your content on the Connect instance, and I would definitely suggest using the various labels and also connectwidgets.

Can you use the usual pin_read in a Shiny app, or does it have to be pin_reactive_read?

I'm not a huge Shiny user, but I believe the way this works is pin_read is if you want to read it once, and then pin_reactive_read is if you want it to update every single time some reactive component changes.

Yeah, that's a perfect answer. That's exactly what I was going to say.

Arthur I see asked in YouTube, any plans to offer pins for other languages? Not currently, but if someone's interested in starting their own pins for another language, we would certainly love to support you in your journey.

Pin versioning and archival

What is the pin archival process?

I think it's referring to probably more like the pin versioning. And so as you take a pin and that data changes, as most data does, you can pin it to the same location and it doesn't overwrite the original data. It just creates another version on top of it. And so you can essentially have multiple archived versions of the same pin in one location rather than having like 20,000 data sets. And I know as we've been teasing this demo on like LinkedIn and stuff, we talked about how do you take a data set or anything and make it like the final version? Like, what do you call it? Is it data_final, data_final_1, data_final_2, data_final_im_totally_serious.csv? The versioning allows you to just have it in one location and the most recent version will always be up to date, but you can always go back to previous versions as well. And you can kind of set thresholds like how many you want to save.

And Isabel, I believe there are ways to trim certain versions as well. Is that correct?

That is correct. You can use something called pin_versions_prune, I believe is the function that you would be looking for, to remove old versions of a pin at a certain interval.

Rebecca, I think I maybe missed this question earlier, but you said, I see the 500 megabyte in the documentation. Is there a way, or maybe it exists to throw a warning if you're trying to pin something larger than that?

I don't know off the top of my head if it does throw a warning or not. Connect will fail if you pin something larger or like so large that it can't be handled. I don't know if we would ever have like a hard error just because that is a little bit of a fuzzy, like if you pin something that's 501 megabytes, I don't want your entire pin to fail, but this is great feedback that maybe it should be either something subtle or maybe a few more times in documentation or somewhere because yes, it is hard to remember all those nitty gritty pieces as you're trying to just complete the end to end workflow.
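The idea raised in this exchange, a soft warning rather than a hard error, could look something like the sketch below. It is purely illustrative and not part of pins: `check_pin_size` is a hypothetical helper you would call on a serialized file before pinning it, and treating the 500 megabyte guidance as 500 * 1024 * 1024 bytes is an assumption.

```python
import tempfile
import warnings
from pathlib import Path

# Assumption: the 500 megabyte guidance expressed in bytes.
SUGGESTED_LIMIT_BYTES = 500 * 1024 * 1024


def check_pin_size(path, limit_bytes=SUGGESTED_LIMIT_BYTES):
    """Warn, without raising, when a file to be pinned exceeds limit_bytes.

    Returns the file size in bytes so callers can log it.
    """
    size = Path(path).stat().st_size
    if size > limit_bytes:
        warnings.warn(
            f"{Path(path).name} is {size} bytes, above the suggested "
            f"{limit_bytes}-byte limit; pinning may be slow or fail.",
            stacklevel=2,
        )
    return size


# Demo: the same small file checked against the default and a tiny limit.
with tempfile.TemporaryDirectory() as tmp:
    data_path = Path(tmp) / "cleaned.csv"
    data_path.write_text("id,value\n" + "1,2\n" * 100)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        size_ok = check_pin_size(data_path)                   # under the default limit
        size_big = check_pin_size(data_path, limit_bytes=50)  # tiny limit triggers the warning
    warning_count = len(caught)

print(f"{size_ok} bytes checked, {warning_count} warning(s) raised")
```

Using a warning keeps a pin that is slightly over the guidance from failing outright, which matches the concern about 501 megabytes voiced above.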

Sharing boards and collaborating

How would one go about sharing a board in a project that has multiple collaborators that all need the same data? How does the project know about the board?

So every time you're creating a new document or a new artifact, which I think is what is being referred to as a project in this question, you're going to have one line of code, or two if you want to spread it across multiple lines, that says something like board = board_connect() or board = board_folder(). And that is how that specific piece of code knows about the board. And you will run that every single time you create a new artifact.

Another question from YouTube was, so if you're on a team and you do the work in Posit Workbench, how do you pin the output for your team instead of it just being used by yourself?

Yeah, so that's actually kind of like a perfect use case for Posit Connect. So Connect, you know, the take-home message is that it takes anything you publish to it, a pin, a Shiny app, a Quarto document, anything, and makes it easy to share, but you can also totally customize who has access to it and who you can share it with. I'm not sure if we covered this in a lot of detail in today's demo, but once it's hosted on Connect, there are sharing settings. And so if you want to take that pinned output and you want to share it with your teammate, or other people that have access to Posit Connect, you can be very explicit and kind of call them out individually. You can open it up to everyone who has access to the Connect server, or you can open it up to the world as well. You have a lot of options for how you can share this pinned object with your colleagues.

Can somebody download a pinned data set or plot directly from the board? They said, I can see this being useful for a not so technical supervisor.

With Posit Connect, you absolutely can. When you click on the pinned object, there will be a link to directly download the pinned object. I'm not sure for other boards, but I know you can at least do that for Connect.

I see another question from Christopher, which is, if I create a board locally with the intention to share with other users, does the board need to be created in a shared drive or can other users connect to a board in my file system?

My understanding is that if you do intend to share that with others, it would have to be on a shared drive. But you can use things like Dropbox and S3 Buckets for more collaborative boards, including Posit Connect, or you can share to a local file, either on your personal computer or if that file is on a shared network, you can share it that way.

Posit Conference and wrap-up

Another question is about, and we didn't mention conference yet, but the Posit Conference is coming up in just a few weeks here and would love to see many of you there. But one of the questions was, will some of this be included in the Posit Conference in Chicago? If so, which sessions?

The only thing I can say, we do have a kind of like a professional tool lounge where we can just hang out. You don't have to be a customer of Posit. If you, you know, love Posit and you love data science, a lot of Posit employees are just hanging out and we love talking about this stuff. But I actually will be there. And I think at least twice we will be hosting some demos as well, some live demos. And Rachael and I were actually just talking about this yesterday. And we're thinking about incorporating pins into one of these demos, since obviously, as you all can see, there's a lot of interest in this package and these various workflows that it enables. So that's the only thing I can speak of. I'm sure there's sessions and talks that are going to talk about pins.

I just realized we were over the top of the hour and didn't even notice how quickly that went by. But I think we have gotten to most of the questions. But again, if there's anything that we missed, feel free to reach out to me directly here. We can also continue using the Slido link, so if you wanted to just keep asking questions here. But for some people who had one-off questions where you wanted to talk to us specifically about your use case or your team's environment, I did make this little short link where you could schedule a call with us too and chat one-on-one.

But thank you, Ryan and Isabel, here for answering all these questions and joining us today. Thank you, Ryan, for the great demo as well. And thank you all for joining. Hopefully, I'll get to see some of you in person at the Posit Conference too, or even virtually if the conference is available virtually as well. Thanks, everybody. Thanks. Bye, everybody. Have a great rest of the day.