Fastest way to Convert Jupyter Notebooks into Analytics Reports! (using Quarto)

Transform messy Jupyter notebooks into polished HTML reports with Quarto. Learn how to create professional, client-ready reports that effectively communicate data science insights to stakeholders. Using a telecom customer churn analysis example, we'll cover:

- Converting notebooks to formatted HTML reports
- Customizing layouts and interactive elements
- Adding business context to technical analysis
- Publishing and sharing reports effectively

Perfect for data scientists and analysts who need to present technical work to business audiences.

Link to Code: https://github.com/KeithGalli/telecom-churn-analysis
Quarto Crash Course Video: https://youtu.be/_VKxTPWDhA4?si=kcbQ8M9p6HH5QE2w
Share your work with Posit Connect Cloud: https://pos.it/keith_qc

Video by @KeithGalli

Video timeline:

- 0:00 - Video Overview & Accessing Code/Data
- 1:46 - Rendering Jupyter Notebook as HTML Output
- 3:30 - Making quick improvements to our HTML Report (adjusting YAML parameters, hiding code/output cells)
- 8:15 - Understanding the Business Context & Insights from our Analysis and adding written details to report
- 14:30 - Improving page formatting (margins, body size, etc.)
- 17:30 - Adding business recommendations for our telecom client
- 21:00 - Further aesthetic improvements (larger font-size, organizing info into columns, using a qmd file, etc.)
- 30:17 - Publishing our HTML report using Posit Connect Cloud

#python #jupyter #quarto

Transcript

This transcript was generated automatically and may contain errors.

Hey, what's up everyone, and welcome back to another video. In this video we're going to solve a common data science and data analysis challenge: bridging the gap between the data science notebook, or wherever you write your code, and the presentations, slideshows, and reports that you often share with business stakeholders. We'll see how we can take a notebook that looks like this, with a bunch of code and graphs everywhere, which is a bit messy and hard to process on its own, and turn it into a much more human-readable, much more digestible HTML analytics report, as you can see here, with descriptions of what the different graphs mean and some recommendations for a telecom company.

The data set we'll be working with in this video comes from this Kaggle data set. It is telecom customer churn data: given a bunch of customers and information about them, like how much they're spending per month and how long they've been with the company, can we predict, or not even necessarily predict, but can we see why certain people are churning, meaning they're leaving our telecom company, while others are not? This data is available on Kaggle (I think it originally came from IBM), but you can also access it via the GitHub repo that is linked in the description.

So we'll see how we can take a messy notebook that has some analysis and insights and turn it into an html report using Quarto.

Rendering the notebook with Quarto

The first step is that we have this notebook, and I think we can see what happens if we render it with Quarto by default as an HTML file. You can access all these files via the GitHub link, so feel free to clone the repo. I'm going to make a copy of this notebook, just so I can see the changes we've made, paste it, and call it something like report.ipynb. Both the starting file, analysis.ipynb, and the ending file, the report we're going to be working on, are in the GitHub repo, so feel free to see the before and after.

I'm going to open up a terminal window, which I can do in VS Code, and render this .ipynb notebook as an HTML file. If you're not super familiar with Quarto, we posted a Quarto crash course recently. A link to that is in the description, so you can understand the basics of Quarto and a video like this will make more sense, but you can also watch this without any prior Quarto knowledge and get a feel for what you can do with Quarto. I'm going to go ahead and run quarto render report.ipynb. It's kind of cool that I can create files directly from these Jupyter notebooks. We see that a report.html was created, so if I look at my files, there is a report.html.

Okay, we have a file that looks like this, and it looks very similar to our notebook, so it's not super digestible yet. I think we can do some things to immediately make it more digestible for a potential client. Imagine we're a consulting company and this telecom company hired us to answer: why are our customers leaving? We want to get our analysis into a format where we can bring it back and say, this is why, and we're not quite there yet. We don't want them to see the code or a lot of the other aspects of this, so let's make some immediate changes that will make this easier to share with a client.

Hiding code and output with YAML parameters

The first thing you'll want to do is add some YAML parameters above the title. With any sort of Quarto document we add these parameters, and I think it's fine to do this in a markdown cell. I think you could also do it in a raw text cell within your Jupyter notebook. I'm going to set the format, and we'll start with html. I don't need to add a title right now, but one thing I do want is to set echo to false so the code doesn't show up. I'll rerun the render, refresh the file, and now we don't see any of our code, so that's a good starting point.
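As a rough illustration of where that front matter lives, here is a minimal sketch that prepends it to a notebook as a raw cell, working directly on the notebook's JSON structure. The tiny notebook dict is a hypothetical stand-in for report.ipynb, and putting echo under an execute key is an assumption about the exact layout used in the video.

```python
import json
import uuid

# Front matter as described above: HTML output with code hidden.
# Placing `echo` under `execute` is an assumption about the exact layout used.
FRONT_MATTER = """---
format: html
execute:
  echo: false
---"""


def with_front_matter(notebook: dict, front_matter: str = FRONT_MATTER) -> dict:
    """Return a copy of a notebook dict with a raw front-matter cell first."""
    raw_cell = {
        "cell_type": "raw",
        "id": uuid.uuid4().hex[:8],  # cell ids are expected from nbformat 4.5 on
        "metadata": {},
        "source": front_matter,
    }
    updated = dict(notebook)
    updated["cells"] = [raw_cell, *notebook.get("cells", [])]
    return updated


# Hypothetical minimal notebook, standing in for report.ipynb.
notebook = {
    "cells": [
        {"cell_type": "markdown", "id": "title", "metadata": {}, "source": "# Churn analysis"}
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}
updated = with_front_matter(notebook)
print(json.dumps(updated["cells"][0], indent=2))
```

In practice you would usually just type the front matter into the first cell by hand; the sketch only makes the structure explicit.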

However, there's still a bunch of random text on the screen, so let's try to hide some more details. For example, we probably don't need to see all this output. One cool thing you can do is use the hash-pipe (#|) comment syntax to specify options within your code cells. If I put #| output: false at the top of this cell and rerun, the info that currently shows at the top of our report is gone when I refresh, and we start with our table. That's a better option.
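A minimal sketch of what such a cell might look like. The option comment has to sit at the very top of the cell, and the variable and printout here are hypothetical placeholders for whatever exploratory output you want to keep out of the report.

```python
#| output: false
# The line above is a Quarto cell option: the code still runs at render time,
# but nothing this cell prints or displays is included in the report.

# Hypothetical exploratory check whose printout is noise for a client.
row_count = 3
print(f"rows loaded: {row_count}")
```

Because the option is just a comment, the cell behaves exactly as before when you run it interactively in Jupyter.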

I also don't need to show headers like "Import necessary libraries" and "Load data"; that's not very helpful here, so let's hide them from the report. If you know Quarto, you can use the fenced div syntax, so I can add a fenced div with the content-hidden class. There are a few built-ins like this that are useful to know: there is also content-visible, and you can add conditionals, like only showing these headers if the format is PDF. I add this, rerun the render, and when I refresh, the header disappears. So that's how to hide something like a header.

This is already looking better, but it's not perfect. There seem to be error messages down here, and I should probably hide that kind of output. I can do that by also setting warning to false. Again, we regenerate, refresh, and those messages disappear.

Another thing: maybe as part of this consulting work we're building a model to help predict churn, but that work is not quite ready to share yet. If I want to hide a whole group of sections and code, I can do it the same way we just hid that header. I go down to the model section and add the same fenced div above it, because I want to hide all of this content for now; maybe in the future we'll share another report with them. Then I add another cell at the very bottom, make it markdown, and close off the div. If we run this, the whole model-building section disappears when I refresh the page.
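The opening and closing markdown cells can be pictured with a short sketch that wraps a range of notebook cells in a content-hidden fenced div. The helper names and the three-cell outline are hypothetical; only the div syntax itself comes from Quarto.

```python
def markdown_cell(text: str) -> dict:
    """Build a minimal markdown cell (nbformat 4 style dict)."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def hide_cells(cells: list, start: int, end: int) -> list:
    """Wrap cells[start:end] in a Quarto .content-hidden fenced div."""
    opener = markdown_cell("::: {.content-hidden}")
    closer = markdown_cell(":::")
    return cells[:start] + [opener] + cells[start:end] + [closer] + cells[end:]


# Hypothetical notebook outline: the last two cells belong to the model section.
cells = [
    markdown_cell("## Data exploration"),
    markdown_cell("## Model building"),
    {"cell_type": "code", "metadata": {}, "source": "model.fit(X, y)",
     "execution_count": None, "outputs": []},
]
wrapped = hide_cells(cells, start=1, end=len(cells))
for cell in wrapped:
    print(cell["cell_type"], "|", cell["source"])
```

Everything between the two added cells is left untouched, so removing them later brings the section back into the report.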

Adding narrative and insights

Okay, so now we really just have our analysis, and graphs alone aren't necessarily the most helpful. It's a matter of showing the graphs and telling the story alongside them, so let's start telling that story. To tell the story we have to understand it, so a few things to note immediately: this gives us an exact number, 26.5 percent of our customers churned. That's roughly 2,000 customers, and let's say about 5,200 stayed with our company.

If we look at the distributions, we see that with tenure, meaning how long they've been with the company, the likelihood of churn is super high for brand-new customers, and it's still pretty high for the first few months in this histogram. So if you're a new customer, we want to pay more attention to you, because you're at the greatest risk of churning. Understanding that new customers are likely to churn is a helpful insight for our client.

We also see that if we look at the relative monthly charges, we get low churn, low churn, low churn when the price is cheap, but as we get into the higher price points, a higher percentage of the customers in that monthly charge bin are going to churn. That's another insight from this numeric data. And finally, we see that as total charges go up, and I would think of total charges as tenure times monthly charges, people who stick around and have spent a lot of money with our telecom company are less likely to churn. So we want to keep our customers happy in the early months, because once they've spent some money and know our service and our product, they're less likely to churn. That's another insight from all of this.
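To make "percentage of customers in that bin" concrete, here is a small sketch of the per-bin churn rate calculation. The records and the bin width are made-up toy values for illustration, not the Kaggle data, and the notebook in the repo may compute this differently.

```python
from collections import defaultdict

# Made-up toy records (monthly charge in dollars, churned?), NOT the Kaggle data.
customers = [
    (25.0, False), (29.5, False), (45.0, False), (48.0, True),
    (75.0, True), (79.0, False), (95.0, True), (99.0, True),
]

BIN_WIDTH = 50  # hypothetical bin size


def churn_rate_by_bin(records, bin_width):
    """Return {bin_start: share of customers in that bin who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for charge, did_churn in records:
        bin_start = int(charge // bin_width) * bin_width
        totals[bin_start] += 1
        churned[bin_start] += int(did_churn)
    return {b: churned[b] / totals[b] for b in sorted(totals)}


rates = churn_rate_by_bin(customers, BIN_WIDTH)
for bin_start, rate in rates.items():
    print(f"${bin_start}-{bin_start + BIN_WIDTH}: {rate:.0%} churned")
```

Dividing by the bin's own total, rather than by all customers, is what makes the bins comparable even when they hold very different numbers of people.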

So I'm doing things like this, actually getting the insights and thinking about them, and then we'll write some text that captures them. You can also look through these bar charts and see other insights on the more binary variables. Gender has very little effect. Senior citizens are more likely to churn, so maybe they're not understanding the technology as well, or maybe there are more hurdles. If someone has a partner or dependents, they're less likely to churn; maybe it's more important that they have our service, because more people in their family depend on it and they can't just pick up and move to a new provider. We see that fiber optic comes with a much higher likelihood of churn, and month-to-month contracts are a great indicator of churn, so we probably want to move as many people as possible to one-year and two-year contracts. Those are some more insights I'm just speaking out loud, but let's put all of this on the screen in our notebook.

Okay, so we have our report and I want to add some text to it. We were just talking about these plots, so I add a markdown cell above or below this graph and paste in some thoughts about these numeric features, which is what I just said. Then I do the same thing for the bar charts: add a markdown cell with some bullet points on what we're thinking. Those are the insights from the bar charts.

Cool, let's rerun this real quick and see what we have. One good thing to note: if you're generating these Quarto reports from a Jupyter notebook via the command line, you'll potentially want to add the --execute flag to the render command. This makes sure that all of your Jupyter notebook cells are run, so the report isn't just using the output that's already stored in the notebook; it actually reruns the code to produce the report. Okay, let's run that command.
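If you want to script the render step instead of typing it, a minimal sketch might look like the following. The helper name is hypothetical, and the render only runs when both the Quarto CLI and a report.ipynb file are actually present.

```python
import shutil
import subprocess
from pathlib import Path


def render_command(notebook: str, execute: bool = False) -> list:
    """Build the `quarto render` call for a notebook, optionally re-running cells."""
    command = ["quarto", "render", notebook]
    if execute:
        command.append("--execute")
    return command


command = render_command("report.ipynb", execute=True)
print(" ".join(command))

# Only run the render when both the Quarto CLI and the notebook are present.
if shutil.which("quarto") and Path("report.ipynb").exists():
    subprocess.run(command, check=True)
```

Leaving execute off reuses whatever outputs are saved in the notebook, which is faster but can silently show stale results.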

Okay, now let's look at our report. We should see some text pop up on the screen. We see that, good; we'll improve the formatting soon. Maybe we also add some text at the top, before this initial data exploration section, talking about the overall process. Right below the title I can add an overview of what we're doing here. Let's say our consulting company is called Alpha Consulting Co, and this is a high-level summary of what this report provides.

We add that to our report. Trying to think if there's anything else we want to do. One thing I'm seeing down here is that some text is showing up alongside our graphs. If you call plt.show() at the end of the cell with Matplotlib, that fixes this issue, so I'm going to add plt.show() in both of these cells and we should see that text disappear.

Let's run that command again. We should see this text disappear and an overview at the top of our report. Cool, we've got our high-level overview of the data.
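A small sketch of why that stray text appears, assuming Matplotlib is installed. The histogram values are toy numbers rather than the real tenure data.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.hist([1, 1, 2, 3, 5, 8, 13, 21], bins=4)  # toy tenure values, not the real data

# set_title returns a Text object. If this call were the last line of the cell,
# the notebook (and the rendered report) would print that object's repr.
title = ax.set_title("Tenure (months)")

# Ending the cell with plt.show() draws the figure without echoing a return value.
plt.show()
```

The same applies to other plotting calls that return objects, such as axis labels or legends, when they happen to be the last line of a cell.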

Layout and formatting improvements

Cool, this is getting better and better. One thing I'm seeing is that some of the page details are getting cut off. There are a few different solutions to this. One helpful page in the Quarto docs is under Output Formats, Page Layout: it gives us information on the layout of an HTML page and how much space we have in the body. To give myself more space for my graphs, I might make the page layout full and then specify a larger body width in our document.

To do that, we go into our parameters. We'll put these under the html format specifically, in case we want to add more types of documents later. I add page-layout: full, and then I customize the grid, which we see down at the bottom of the docs page. I want to change the body width to be a bit bigger than 800px, so I'll set body-width to something like 1200px. This takes a little bit of playing around to figure out.

Another minor detail: if you took this report.html file, made a copy of it in Finder, put it in your Downloads folder, and tried opening it, you'd see a file that looks like this, which is not the most appealing. So another thing I recommend adding is embed-resources: true. Instead of relying on extra files that handle the formatting and styling, everything is encapsulated in report.html, so if you move or share the file, whoever gets it will be able to use it immediately.

Maybe we also want to add a PDF version of this doc. That would probably require some more formatting, and I won't get into the details here, but it's a cool option to have. And maybe we also add a table of contents. We added a bunch of things, so let's see what our changes look like.

Okay, let's think about what we added: a table of contents, PDF output, embedded resources, and a wider page. Now we've got something that looks like this, taking up much more of the page. You can decide what's right for you, but I like that it has this flexibility. That's pretty good, we're getting there.
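To see how these options nest, here is a sketch that prints one plausible version of the front matter at this point in the walkthrough. The tiny YAML emitter is a hypothetical helper that only handles nested dicts of simple values, and the exact nesting (for example echo and warning under execute) is an assumption rather than a copy of the file in the repo.

```python
def to_yaml_lines(options: dict, indent: int = 0) -> list:
    """Render a nested dict of simple values as indented YAML lines."""
    lines = []
    pad = "  " * indent
    for key, value in options.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml_lines(value, indent + 1))
        elif isinstance(value, bool):
            lines.append(f"{pad}{key}: {str(value).lower()}")  # YAML uses true/false
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Options mentioned so far in the walkthrough; the nesting is an assumption.
options = {
    "format": {
        "html": {
            "page-layout": "full",
            "grid": {"body-width": "1200px"},
            "embed-resources": True,
            "toc": True,
        },
        "pdf": "default",
    },
    "execute": {"echo": False, "warning": False},
}

front_matter = "\n".join(["---", *to_yaml_lines(options), "---"])
print(front_matter)
```

Keeping the HTML-only settings under format, html means they do not leak into the PDF output, which has its own layout options.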

Maybe we add some recommendations at the bottom for what we recommend they do as a company, based on what we've already chatted about. We want to pay more attention to customers early in their tenure, and maybe add incentives to get more people off month-to-month contracts and onto one-year and two-year contracts. We see it in these graphs down here: these are the month-to-month contracts, with a high churn percentage even for customers who have been with the company for a while, whereas the churn percentages for one-year contracts are pretty low across the board, and they get even lower for two-year contracts. These are relative percentages within each bin. So maybe we offer some discounts or whatnot.

This graph isn't very helpful, so let's hide it too. But first let's add some recommendations to our notebook. You can do it before or after the model building; I'm going to do it before the model-building section, since that's all hidden anyway. We add a markdown cell for the recommendations, and I'm just going to paste these in to save some time.

Okay, so here are some recommendations, and here are some of our findings recapped. It's good to capture this all in text, especially if you're sending this as a standalone document rather than talking it through with a client in a meeting. It also mentions that we're going to keep working on our model, which is currently hidden from the report. That's good.

What was the other thing we wanted to do here? Oh, we wanted to hide this graph specifically; I don't think it was very useful. It's not showing right now for whatever reason, so let's try running all this code again.

Okay, I want to hide this, so again I can use output: false. Let's rerun; we should see this graph disappear and our recommendations at the bottom. Before I refresh: what I was trying to check with this plot was how much people are spending on the different contracts, but this dimension doesn't really help us, so it's not a very meaningful visual. Basically what we see is that monthly charges vary whether customers are on month-to-month, one-year, or two-year contracts. There's not a significant difference in what they're paying, so as it stands they're not saving a significant amount of money on one-year or two-year contracts, which is something we should probably recommend the company offer. But I don't need to show that graph.

Yeah, so here are the recommendations. Cool. I think the biggest thing we want to do now is package this text and data in a better format. One thing we could do is add a bigger font size. Maybe I make the font size 18 pixels; I don't know if this is going to be too big, so let's see what it looks like real quick.

I don't know if the font size changed anything. Instead of pixels I might have to use pt, so let me run that one more time. Okay, our text is bigger now; whether the text should be bigger is up to you. I think we should also organize this a bit better: put things side by side in columns to fill the space. This is a really tall graph, so we could put it side by side with this text.

Converting to Quarto markdown and adding columns

So let's go ahead and do that. One thing I encourage you to check out: it's nice to work in a Jupyter notebook sometimes, but you can also very easily convert your .ipynb notebook file into a Quarto markdown file. If I run quarto convert report.ipynb, we now have an equivalent Quarto markdown (.qmd) file. That's very helpful, and I'm going to do some of these steps there, because I think it's a little easier to add columns and whatnot in the Quarto markdown file.
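The conversion works in both directions, which a short sketch can make explicit. Both helper names are hypothetical; the sketch only builds the command and the expected output name, and does not call Quarto itself.

```python
from pathlib import Path


def convert_command(source: str) -> list:
    """Build the `quarto convert` call for a .ipynb or .qmd file."""
    suffix = Path(source).suffix
    if suffix not in {".ipynb", ".qmd"}:
        raise ValueError(f"expected a .ipynb or .qmd file, got {source!r}")
    return ["quarto", "convert", source]


def converted_name(source: str) -> str:
    """Name of the file the conversion is expected to produce next to the source."""
    path = Path(source)
    target = ".qmd" if path.suffix == ".ipynb" else ".ipynb"
    return str(path.with_suffix(target))


for source in ("report.ipynb", "report.qmd"):
    print(" ".join(convert_command(source)), "->", converted_name(source))
```

Since each conversion overwrites the file with the other extension, it is worth deciding which of the two files is the one you edit by hand.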

For example, we wanted two columns, so I can add layout-ncol=2 right here, and then the churn counts chart should be right next to the pie chart. Another nice thing about using the .qmd file is that I can quickly render it with the Preview button and see it in this little window.

Okay, so those are now nicely next to each other. Maybe we make this a level-two header, and make "Numeric features" right here a level-three header.

Next, I want this graph to be side by side with my text. To do that, I move the text to the left side. I can also separate it from the previous section with three asterisks, which creates a horizontal line between them. Then we add a different type of layout here: a layout attribute where about 30 percent of the space is taken up by the text and 70 percent by the chart.

I think this should work out of the box; let's see. I'll probably have to open this up full screen to actually see it. Okay, we see a nice line in between here, but the side-by-side layout is not working right now. Not sure why; I might have specified something a little bit off.

I think maybe I have to surround the value in quotes. Okay, that's not quite working how we wanted. Basically it's getting confused because it's taking this as a column, then this as a column, and then all of this as a column. To make it more explicit, I can make an outer fenced div using five colons, then an inner fenced div for my first column, close that off, and then make a second inner div for the second column and close it off right before the outer fenced div closes.
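The nested structure is easier to see written out. This sketch generates the Quarto markdown for a two-column block; the function name, the div ids, and the image file name are hypothetical placeholders, and in the real .qmd file the right-hand column would hold the plotting code cell instead of an image.

```python
def two_columns(left: str, right: str, widths=(30, 70)) -> str:
    """Return Quarto markdown placing `left` and `right` side by side."""
    left_width, right_width = widths
    lines = [
        f'::::: {{layout="[{left_width},{right_width}]"}}',  # outer div, quoted layout
        "",
        "::: {#left-column}",   # first column (hypothetical id)
        left,
        ":::",
        "",
        "::: {#right-column}",  # second column (hypothetical id)
        right,
        ":::",
        "",
        ":::::",                # closes the outer div
    ]
    return "\n".join(lines)


markdown = two_columns(
    "New customers are at the greatest risk of churning.",
    "![](tenure_histogram.png)",  # hypothetical placeholder for the chart
)
print(markdown)
```

Using more colons on the outer div is only for readability; what matters is that each column is its own block, so Quarto does not have to guess where one column ends and the next begins.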

It's already looking better, at least in this preview. Refresh; cool, that looks pretty good. We could also caption this as categorical features, so I'll add a little more notation.

Categorical features, cool. Maybe I want to add some text here too, with some more formatting: this takes up about 70 percent of the space and this takes up 30, with a few details on the fiber optic information.

We have these charts, and maybe I want to put them all next to each other, so I can use something like layout-ncol=3. I can also go to the Article Layout page in the docs and see all the different options. Maybe I want these to take up the full screen; I can add the column-screen class, or column-screen-inset, so I'll do that real quick. Cool.

Now what we're going to see... oh, I've got to close this off, but actually it's already basically working. Cool. Maybe I add a line in between these; you can decide if this is right or not, maybe it's too squished. I want some more sections, so I add a line here to separate things a bit more, and then add some text, or maybe I put it above. Maybe I give this one a section header, "Churn by contract type", and give this one a header like "Why might internet service matter?"

Okay, then I add some text here specifying what we're seeing with the different types of contracts. Run this, refresh our page, and see that. Cool. Recommendations, awesome. This looks pretty good.

Adding interactivity and publishing

One minor thing: if you want to make this data a little more explorable, you can add an interactive table. That's done with the itables library, which is in the requirements.txt. I import itables and then call itables.show on, say, the first 50 rows.

Now if we look at the report, we have a bit more interactivity when looking at our data, so if you gave this to a client they could play around with it a bit.

This all looks pretty good. The last thing: you could either send them this report.html, or maybe it makes sense to host it somewhere. One easy option is to push this to a GitHub repo and sign up for Posit Connect Cloud, where you can publish a Quarto doc. We could take the repo we're working in, pick the master branch, and publish, so I need to push this real quick.

There are two ways I would recommend publishing. You can either publish the .qmd file, or you can add your report.html to the repo, making sure it was rendered with embed-resources: true, and specify that as the file to be served. To do either of these, I look at my changes and run git add report.qmd, maybe add the changes to my analysis report notebook, and maybe add report.html.

One thing to keep in mind: once you've converted the .ipynb notebook to a .qmd file, you have two files in parallel, and you'll want to figure out what to do with them. You could keep both, or decide which format you want to work in. I commit this as "finished code from video" and run git push origin master.

Now if I go to Posit Connect Cloud, I see the option to specify a primary file. Our file is called report.qmd or report.html; either will work just fine. I'll go with the HTML version and publish this.

Now we have our report online on a web page. We can grab the link right here, copy it, and share it with a client, and that's going to be much easier to work with. You still have your same code and your same notebook that you can play around with and tweak, but now we have a nice, more human-readable way to display our insights and findings. Pretty cool stuff, and pretty quick to do.

This can definitely be very helpful to incorporate into your workflow. One last thing I should mention is that we can also convert in the other direction. If I, let's say, deleted my report notebook, I could very easily convert my report.qmd file back to an .ipynb file. So you can go to and from these formats very easily. It might change a few things about how the content is formatted, but it can be very helpful.

All right, that is converting Jupyter notebooks to reports in Python using Quarto. Hopefully you enjoyed this tutorial. If you did, make sure to throw it a thumbs up and subscribe to the channel if you haven't already. We'll be back with another Quarto and Python video very soon. Until next time, everyone, peace out.