
End-to-End Data Science Workflow with the Posit Team Snowflake Native App
Posit Product Manager, Chetan Thapar, demonstrates how the Posit Team Native App for Snowflake delivers an end-to-end workflow—exploration, iteration, and deployment—in minutes, not weeks. Built directly inside the Snowflake security perimeter, the app gives data teams instant access to governed data, managed infrastructure, and familiar tools like Posit Workbench, Connect, and Shiny for Python. Watch how AI tools like DataBot and Positron Assistant accelerate EDA, streamline coding, and help developers build an interactive, LLM-powered dashboard with ease. With one-click deployment and automatic Snowflake governance applied to every user, this demo shows what modern data science looks like when speed, security, and productivity work together. Learn more about the Posit partnership with Snowflake: https://posit.co/use-cases/snowflake/ Get the Posit Team Native App: https://pos.it/Team-Native-App-Snowflake
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
The goal of data science is to turn data into value fast. But in traditional enterprise data science, there is friction between the agility data scientists need and the complexity of the enterprise, which leads to a start-stop-wait pattern that kills momentum. This could be waiting for procurement, security reviews, infrastructure, or deployment, and perhaps always being stuck on outdated tools.
The Posit Team Native App is designed to solve this. Firstly, it provides instant setup, so you get running in minutes, not weeks. Second, it removes the ops headaches. It's a fully managed service, so data scientists get the latest and greatest features with automatic upgrades, and platform teams don't have to worry about managing complex infrastructure or software. It is secure by default by running inside the Snowflake security perimeter, and it inherits and extends Snowflake's data governance. Finally, it delivers improved productivity with AI tools and a seamless development-to-deployment workflow.
Demo overview
So let's see Posit Team in action. We'll do a demo today, and our job to be done is to deliver an interactive LLM-powered dashboard for our business stakeholders. We'll follow five steps, accessing the platform and data, exploring the data, iterating on our data product, deploying the artifact, and interacting with it as a business user. Let's get started.
Accessing the platform and data
The first step for me is to access the platform. Fortunately, with the Posit Team Native App, my start is almost instant. I can install the application from the Snowflake Marketplace, and once it's installed, I can activate Workbench, which is my development environment, and Connect, which is my deployment environment, all within the same Posit Team Native App.
So let's start with the development side and log into Workbench. I'm going to start a new session, and I can choose any of these managed IDEs. For this demo, I'm going to use Positron, which is our next-gen polyglot IDE optimized for data science, where both Python and R are first-class citizens. One thing to note here is my session credentials: these are all of the Snowflake roles I have access to, and here I'm logging in with the SolEng role into my Positron IDE.
One thing to note as this is coming up is that Workbench is automatically and securely inheriting my credentials, my OAuth token, and my role. We can see this if I go into the terminal and look at my environment variables: you can see the SolEng role as well as SNOWFLAKE_HOME, which is where my OAuth token is stored. What this means is that when I run code such as this, connecting to the Lending Club database, it should just work. I don't have to expose any of my credentials.
We can see here that the connection to Snowflake has been created, and the Lending Club database is an Ibis expression. Again, notice what I did not do here: I didn't write any boilerplate connection code, manage API keys, or expose any of my secrets. Workbench managed all of that, and I was able to connect with just one line of code.
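The pattern described above can be sketched in plain Python. This is a hypothetical illustration, not Workbench's actual mechanism: the environment variable names and the token file location below are assumptions, and the point is simply that connection code reads everything from the session environment rather than embedding secrets.

```python
import os

# Illustrative sketch (assumed variable names, not Workbench's real ones):
# the role and OAuth token location come from the session environment,
# so connection code never hard-codes a secret.
def snowflake_connection_params(database: str) -> dict:
    return {
        "role": os.environ.get("SNOWFLAKE_ROLE", "SOLENG"),
        "authenticator": "oauth",
        # the token is read from a file under $SNOWFLAKE_HOME, not from code
        "token_path": os.path.join(
            os.environ.get("SNOWFLAKE_HOME", os.path.expanduser("~/.snowflake")),
            "token",
        ),
        "database": database,
    }

params = snowflake_connection_params("LENDING_CLUB")
```

With this shape, the one line the data scientist writes only names the database; everything sensitive is injected by the platform.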
Exploratory data analysis with DataBot
So now as I'm connected to this data, I would love to understand this more. This is where data scientists spend the majority of their time. And to accelerate this exploratory data analysis, we have introduced DataBot. DataBot is our AI agent that's designed to accelerate exploratory data analysis. So if you give DataBot a high-level instruction, it will write and execute code, analyze the output, and suggest next steps, keeping you in the loop the entire time.
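The instruction-to-suggestion loop described above can be sketched as a small function. This is a hedged illustration of the general plan-execute-observe pattern, not DataBot's actual implementation; every name below is made up for the sketch.

```python
# Illustrative sketch of an EDA agent loop (assumed structure, not
# DataBot's real code): generate code for an instruction, run it,
# analyze the output, and propose next steps for human review.
def eda_step(instruction, generate_code, execute, summarize):
    code = generate_code(instruction)        # write code for the instruction
    output = execute(code)                   # run it in the live session
    summary, next_steps = summarize(output)  # analyze results, suggest next moves
    return code, summary, next_steps         # the human stays in the loop

# Stubbed usage: a real tool plugs an LLM and a live session in here.
code, summary, next_steps = eda_step(
    "explore the lending club data",
    generate_code=lambda instr: "lending_club.count()",
    execute=lambda c: "2,250,000 rows",
    summarize=lambda out: (f"Found {out}", ["profile key columns"]),
)
```

The key design point is the last return value: the loop always hands control back to the data scientist before the next iteration.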
So I'm going to start by asking it to explore the Lending Club database that we just connected to. Right off the bat, you can see that it understood that we are connected to Snowflake and that the Lending Club data is an Ibis expression. The reason it can do this is that DataBot works on top of Positron, so it has context on my session variables, the plots I generate, and the connections I have. Essentially, that means DataBot's output is contextually aware, and the EDA workflow is tailored to your environment.
Great, so DataBot did some initial analysis. It found there are about two and a quarter million loans in this database and provided some key characteristics. After that, it gives us a few hypotheses that might be worth exploring further. I can choose any one of these, or I can pick something I want to explore myself. Here I'm interested in the relationship between employment length and loan default rates, so let's provide that query to DataBot, and it's off to the races.
So as DataBot is doing this analysis, let's jump into our Snowflake account, go into our query history, and see what's happening. You can see these are the queries DataBot is generating on my behalf to understand the data. These are not simple SELECT * queries that download all of the data into your session memory; they're involved SQL queries, so the computation is pushed to the scalable Snowflake warehouses rather than done in memory. And we should be able to see this here: if you look at the data table, only about 50 to 60 rows have been imported from a database of about two and a half million rows.
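The pushdown idea above can be demonstrated with a toy stand-in for the warehouse. This sketch uses SQLite from the standard library purely as an illustration (the schema and values are made up): the aggregation runs inside the database engine, so only one row per group travels back to the session.

```python
import sqlite3

# Toy stand-in for a warehouse: a few fake loans with a defaulted flag.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (state TEXT, defaulted INTEGER)")
conn.executemany("INSERT INTO loans VALUES (?, ?)", [
    ("CA", 0), ("CA", 1), ("NY", 0), ("NY", 0), ("WA", 1),
])

# The GROUP BY runs inside the engine; the client receives only the
# per-state aggregates, not every underlying row.
rows = conn.execute(
    "SELECT state, AVG(defaulted) AS default_rate "
    "FROM loans GROUP BY state ORDER BY state"
).fetchall()
```

Against a real warehouse, the same principle means a two-and-a-half-million-row table comes back as a few dozen aggregate rows.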
So we see here that DataBot found some insights: some missing employment data and some interesting patterns. And this is interesting. It seems there is an inverse U-shaped curve in the relationship between employment length and default rate, and it provides some business implications. We can go ahead and dive deeper into any one of these.
But another thing I could do here is re-branch my analysis. This is a hypothesis I'm not so interested in exploring further, so I just go back and ask it to, for instance, analyze geographical patterns in loan performance. It starts this analysis as a different branch, and I can switch between branches and follow the one that's most relevant. It's almost like a git branch for analysis: see which one is most promising and pursue it. So let's go back to the geographical pattern analysis.
So DataBot has done some analysis of the regional patterns, found some economic correlations, and surfaced key insights around clustering and so on. Of the two branches of analysis, I feel the geographical pattern analysis is the more promising one. At this stage, I want to make sure it's reproducible, whether I share it with a peer or just keep it for posterity.
That's where I can use something like the /report command, which creates a reproducible Quarto file containing all of my code. Before it does that, it provides a report outline, so I can make sure the plan makes sense before the actual output is generated. I'm going to say yes, this outline looks fine, but you can certainly edit it if there are parts of the analysis you want to highlight. You can see the Quarto document has started: there is a callout that it was created by AI, and there is a place for human review along with the name, the role, and so on. This will now create a clean file that is fully reproducible.
Once this is done, it's available in our Explorer; you can see the geographical analysis .qmd file. Again, this is what DataBot is focused on: accelerating the data scientist's productivity with a code-first ethos, while ensuring the data scientist always stays in the loop and has reproducible artifacts. We're really excited about this innovation.
Building the Shiny for Python app
Now that my EDA process is complete, I can go to the next step. I've already used my insights to build this Shiny for Python app, which queries the Lending Club database. Before I go further in the workflow, I want to show a couple of important open source packages. The first is querychat, which lets me interact with my data using natural language and provides the UI scaffolding for that. The second is chatlas, which abstracts away the complexity of connecting to LLMs. Here you can see I'm connecting to Snowflake Cortex and using the Claude Sonnet model.
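The "abstracts away the LLM connection" idea can be sketched with a minimal class. This is an illustration of the pattern only; the class name, method name, and model string below are assumptions for the sketch, not chatlas's real API.

```python
# Minimal sketch of a provider-abstraction layer (assumed names, not
# chatlas's actual interface): swapping the backend changes only the
# constructor, not the calling code.
class CortexChat:
    def __init__(self, model: str):
        self.model = model

    def ask(self, prompt: str) -> str:
        # A real implementation would call Snowflake Cortex here;
        # this stub just echoes so the sketch is runnable.
        return f"[{self.model}] would answer: {prompt}"

chat = CortexChat(model="claude-sonnet")
reply = chat.ask("summarize West Coast loan performance")
```

The app code only ever sees `chat.ask(...)`, so the choice of LLM backend stays a one-line configuration detail.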
All right, let's see how this Shiny for Python app is doing. We'll run it within Positron. This is what querychat allows me to do: ask natural language questions of this dashboard. I can ask something like "show states only on the West Coast." At the back end, it leverages Cortex to do the text-to-SQL and then updates my dashboard, which pulls data from the Lending Club database automatically. I see the dashboard has been updated, and if I go into the state analysis, I see just those three states associated with the West Coast. So this is good.
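The text-to-SQL step can be illustrated with a toy mapping. In the real flow, the question goes to the LLM, which emits SQL against the live schema; the hard-coded rule and the `addr_state` column name below are assumptions made purely for this sketch.

```python
# Toy illustration of text-to-SQL (not how querychat/Cortex actually
# works): map one demo question to a filter query. The column name
# addr_state is an assumed part of the Lending Club schema.
def text_to_sql(question: str) -> str:
    if "West Coast" in question:
        return "SELECT * FROM lending_club WHERE addr_state IN ('CA', 'OR', 'WA')"
    return "SELECT * FROM lending_club"

sql = text_to_sql("show states only on the West Coast")
```

The generated SQL then drives the dashboard's data pull, which is why only the three West Coast states remain after the question is asked.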
Now, let's say a stakeholder asks me to add a simple table of default rate by state to the dashboard. We don't have that in the state analysis, so I can either write that code in my code base directly or use Positron Assistant. Positron Assistant is the LLM tool within the Positron IDE that helps with code generation, explanation, and debugging. I'll ask it to calculate the default rate by state and add it to the dashboard. Let's see how Assistant does.
Okay, Assistant has made some changes to our code. We can see it added some aggregation logic and indicated that this is also added to the UI. Let's see how our state analysis looks now. There is a default rate by state, and a visual has been added to show it. There are some details I could still refine, but I'm happy with it at this stage, so I'll keep the changes Assistant recommended. I think we're ready to deploy this application into production.
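The kind of aggregation logic described above can be sketched in a few lines. The column names (`addr_state`, `loan_status`) and the "Charged Off" status value are assumptions about the Lending Club schema, and in the real app this calculation would be pushed to Snowflake as SQL rather than run in Python.

```python
from collections import defaultdict

# Sketch of a default-rate-by-state aggregation (assumed column names
# and status values, not the app's actual generated code).
def default_rate_by_state(rows):
    totals, defaults = defaultdict(int), defaultdict(int)
    for row in rows:
        totals[row["addr_state"]] += 1
        if row["loan_status"] == "Charged Off":
            defaults[row["addr_state"]] += 1
    return {state: defaults[state] / totals[state] for state in totals}

rates = default_rate_by_state([
    {"addr_state": "CA", "loan_status": "Charged Off"},
    {"addr_state": "CA", "loan_status": "Fully Paid"},
    {"addr_state": "NY", "loan_status": "Fully Paid"},
])
```

A per-state dictionary like this is exactly the shape a dashboard table or bar chart of default rates would consume.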
One-click deployment
The actual process of deploying an application is dead simple in the Posit Team Native App. Because Workbench and Connect are in the same Native App, the networking and authentication challenges often found in connecting separate development and deployment environments are eliminated. All I need to do is go to my publisher and click Deploy. Connect handles managing all of my dependencies and ensuring my code runs successfully as a deployed artifact, and within a few seconds, we have already published to Connect.
So let me go back to Posit Team and open Posit Connect. Here we see the Shiny for Python app we were building, which I published just a few seconds ago.
Interacting as a business user and data governance
Now let's see how our users interact with this app. On the left, I'm logged in as a business manager. I'll ask our app: what is the average employment length of loans originating in New York? The Cortex-powered app understands the question, builds a SQL query, queries the data, and returns the answer: about six years, based on 175,000 loans. Perfect.
On the right, a junior analyst with a different Snowflake role asks the exact same question, and they get a different answer. The LLM tells them the data is masked. Why? Because their underlying Snowflake role has a data masking policy applied. So we wrote zero lines of security code in our app, and the Posit Team Native App inherited and extended Snowflake's data governance.
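The effect of a masking policy can be illustrated with a toy function. In Snowflake, the warehouse enforces the policy on the data itself, so no application code like this exists; the role names and mask string below are made up to show the behavior.

```python
# Toy illustration of role-based masking behavior (illustrative only;
# Snowflake enforces this in the warehouse, not in app code).
def apply_masking(value, role, allowed_roles=("BUSINESS_MANAGER",)):
    return value if role in allowed_roles else "***MASKED***"

manager_view = apply_masking(6.1, "BUSINESS_MANAGER")   # sees the real value
analyst_view = apply_masking(6.1, "JUNIOR_ANALYST")     # sees a masked value
```

Because the policy lives with the data, the same deployed app serves both users correctly with zero security code.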
Recap
To recap today: we got instant, secure access to Snowflake data with managed credentials on Posit Workbench. We went from raw data to an EDA report in minutes with DataBot while pushing the analysis to Snowflake. We used Positron Assistant to iterate instantly on our production data product. We saw a frictionless one-click deployment flow to Connect. And finally, we delivered a fully governed, secure data product that inherits Snowflake's data governance. This is the end-to-end data science workflow: no start-stop-wait pattern, just flow, all inside the Posit Team Native App on Snowflake.

