Alan Carlson | Robust, modular dashboards that minimize tech debt | RStudio
Robust, modular dashboards that minimize tech debt
Presented by Alan Carlson, Snap Finance
Abstract
Dashboards can be complex but building them shouldn’t be! We’ve built a wrapper for developing production level dashboards that streamlines onboarding new developers and standardizes the initial infrastructure to mitigate tech debt. Now you and your team can spend more time developing insights and less time trying to spin up shiny code with {graveler}.
Speaker Bio
As the Tech Lead for the BI (Business Intelligence) team, Alan's primary focus at Snap is researching, creating, and maintaining methods that help the rest of Snap’s BI Team in their work. From dashboards to visualizations to R code in general, he has built multiple packages and bookdowns that make BI easier to train and to use within the RStudio environment.
Helpful Links:
Blog Post: https://www.rstudio.com/blog/make-robust-modular-dashboards-with-golem-and-graveler/
Graveler package: https://github.com/ghcarlalan/graveler
Environment variables: https://docs.rstudio.com/connect/user/content-settings/#content-vars
Git-backed publishing: https://docs.rstudio.com/connect/user/git-backed/
If you'd like to join events live: colorado.rstudio.com/rsc/community-events
Question about style guides:
Tidyverse Style Guide: https://style.tidyverse.org/
Efficient R Programming book that Colin Gillespie wrote: https://csgillespie.github.io/efficientR/
Questions about RStudio Team:
⬢ RStudio Connect: https://www.rstudio.com/products/connect/
⬢ Chat with RStudio about RStudio Team: rstd.io/chat-with-rstudio
Transcript
This transcript was generated automatically and may contain errors.
Thank you so much for joining us today. Welcome to the RStudio Enterprise Community Meetup. This is our finance meetup today. I'm Rachel, I'm calling in from our Boston office. I'm actually in the RStudio office today. I'm joined by my co-host, Dylan Lackey from our finance team as well.
Dylan, feel free to say hello if you want. Yes, yes, hello everyone. Dylan Lackey here based out of the US in Austin, Texas. Been in open source software for many, many years, helping folks specialize in making that jump from open source to more of a robust enterprise offering. Actually, Rachel is my predecessor, so everything I know I've learned from Rachel. So happy to be here and happy to help out.
If you just joined now too, feel free to introduce yourselves through the chat window and maybe say hello or where you're calling in from.
I do just want to mention that if you want to turn on live transcription for the meetup, you can do so as well, just in the Zoom bar below if you press the more button.
But to go through a brief agenda for today, we'll just have some short introductions and then Alan will lead us through their journey towards a reproducible development workflow using a shiny dashboard framework. And then we'll have lots of time for questions and open discussion too. And just a reminder to everyone that the meetup is recorded, so it will be shared to the RStudio YouTube after.
But for anyone who is joining for the first time, this is a friendly and open meetup environment for teams to be able to share the work that you're doing within your organizations, teach lessons learned and network with each other, but really just to allow us all to learn from each other. We really want to create spaces where everybody can participate and we can hear from everyone.
But with that, thank you all again so much for joining us today. I would love to turn it over to our speaker, Alan Carlson. So Alan is tech lead for the business intelligence team at Snap Finance.
Introduction and background
Thank you, Rachel. Yeah, thanks everyone for joining today. So I'm gonna talk about building robust, modular dashboards with the Golem and Graveler packages.
So as far as what we're gonna cover today, just a quick introduction about myself. So like Rachel mentioned, I work for Snap Finance, they're a personal lending firm for customers across the nation. And I'm the tech lead on the business intelligence team. I've been a developer for four and a half years and then a tech lead for the past two. And when I started on the team, there was only me and another developer. So we've come a long way from just two of us to now, I believe eight of us on the team.
My academic background is in business. So I have my undergrad and master's in finance. And I'm telling you that because just as a disclaimer, since all of my programming knowledge is job-based, I don't have a deep understanding of the finer points of programming. So if any of those kinds of questions come up, I'll do my best to answer them, but I understand that's not my exact area of expertise.
Business challenges
So like I mentioned, two years ago, there was just two developers, right? It was me and another guy. And we had just started working with Shiny at that point. Up until then, markdowns were sufficient, data pulls were sufficient, but the business was growing and they wanted more interactivity and granular looks into their data.
And part of this was we had legacy code that was being transferred to our department. So the data science team had built around six dashboards that were pretty complex, right? They have six to eight different tabs on a sidebar. They all do kind of different UI layouts and backends. And what we noticed was that these six dashboards were all coded slightly differently, right? So the look and feel is pretty similar, but some of them had pure API backends, some of them used SQL instead. They had similarly named functions that served entirely different purposes on the different dashboards. Some used selectizeInput versus pickerInput.
And so the challenge was, you spend all this time focusing on a dashboard, but when you go to the next dashboard, you can only port like 40% of that knowledge with the new intricacies in the new dashboard.
And the other challenge at the same time we were facing is we were about to hire four new developers that had various backgrounds and skillsets, but they all had minimal experience with R and Shiny specifically, right?
And so the questions that we were trying to answer immediately was how do we standardize our workflow so we can minimize future tech debt with our dashboards that we build? And secondarily, how can I get new hires to understand what we're doing and start building dashboards very quickly, right?
The Golem package
And so, when I was researching that problem, we learned about the Golem package. And currently the Golem package has a very fleshed out GitHub repo. It has a published book that you can either purchase or view online. But when I looked at it two years ago, it was just this little bookdown with five or so sections that had bare bones concepts, and they were still fleshing out the process.
But the idea of Golem has stayed the same, right? They're trying to build modular dashboards with a package-based framework. And before we go into what that means, I just wanna cover those concepts really quick with a couple of slides.
So all Shiny apps have the same thing in common, right? They use a UI to create a layout and then a server to connect data to that layout. And so when you build Shiny dashboards, there's basically three ways you do so, right? The first method is the all-in-one approach, right? So you can make a very simple dashboard with an app.R file, all your code's in one place, and it makes simple dashboards.
When you add more complexity, you kind of split the UI and server into their own files because that separates their purposes cleanly, right? The UI has all the layouts in it, the server file has all, like, the code backend, and so you know that each of those is referring to its very specific parts.
The third method is called modularization. And what this does is, let's say you wanna build a dashboard that serves like four different parts of the business, right? Sales, product, marketing, and like some executive view. If you try and put all of that into one server file, that's going to be a very large server file, which isn't very helpful. So modularization sticks all these sections into their own UI and server for each individual dashboard piece.
Additionally, this avoids namespace conflicts that you can sometimes run into with the other methods. But the key point here is that when you modularize your dashboards, the UI and server become self-contained functions that your dashboard will eventually call when it builds its dashboard.
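The module pattern described above might look something like the following minimal sketch. The names `mod_sales_ui` and `mod_sales_server` and the "sales" section are made up for illustration, and the sketch uses `moduleServer()` from Shiny 1.5 or later rather than anything specific to Golem or Graveler.

```r
library(shiny)

# UI half of a hypothetical "sales" module. NS(id) prefixes every input and
# output id, which is what keeps modules from colliding with each other.
mod_sales_ui <- function(id) {
  ns <- NS(id)
  tagList(
    selectInput(ns("region"), "Region", choices = c("East", "West")),
    textOutput(ns("summary"))
  )
}

# Server half of the same module, a self-contained function the app calls.
mod_sales_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    output$summary <- renderText(paste("Showing sales for", input$region))
  })
}

# The dashboard itself only wires the pieces together.
ui <- fluidPage(mod_sales_ui("sales"))
server <- function(input, output, session) {
  mod_sales_server("sales")
}

# shinyApp(ui, server)  # uncomment to launch the app interactively
```

Because the module is just a pair of functions, a second section (say, marketing) would be another pair of functions with a different id, and neither would need to know about the other.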
R packages, if you've never built an R package, you've certainly installed one. And what a package actually is, is it's a bundle of files, datasets, unit tests, and individual functions, right? Whether you're using dplyr or ggplot2, if you look into their GitHub repos, all of their functions are tiny little discrete R scripts that are bundled together as a package, right?
And so there's that phrase again, self-contained functions. And so what Golem does, what it's trying to do basically, is it looks at the commonalities between modularization and package frameworks. In other words, Golem says, well, if all my modules are just functions already and a package is just a bundle of functions, why can't I just make my dashboard a package and run it and deploy it that way? And the advantage to that is packages are very standardized, right? There are specific folders and directories and locations where files go within an R package.
Because of that though, Golem has no shortage of creation options, right? So if you just install Golem and try to run with their workflow, right, there's tons of different deployment types you can do, either RPubs, RStudio Connect, Docker, I believe they even do like GitHub Pages at some point. There's different unit tests you can create either at like the module or the app level. Again, various ways to import files and different configurations for your actual app.
And so when I was testing this out, I'm trying to answer those two questions, right? Does it solve those two challenges we're facing? Standardization, absolutely, right? Packages, like I mentioned, they have to be built a certain way in order to work, right? They have to have an inst folder, a man folder, NAMESPACE, DESCRIPTION, all that is standard across like every single R package that's ever built.
But because of all these complex options, it doesn't do a good job of communicating to new developers or people that don't have a lot of experience with R and R Shiny. And that's by design. If you look at the first page of the Golem book, it actually says, you know, if you're reading this, you have some intermediate-to-advanced understanding of how Shiny works on the backend. And this is trying to explore more possibilities there.
Introducing Graveler
So we built Graveler to kind of address that complexity. We take away all those options, we strip them down to the bare minimum and say, here's these defaults, let's run with those defaults and you'll be good to go, right?
If you're curious about the name, so I'm a huge Pokemon nerd and Golem itself is, well, I don't believe the Golem package was named after the Pokemon Golem, but if I'm taking this, you know, big complex package and I'm making it simpler, the simpler version of Golem in the Pokemon world is called Graveler, so that's the name there. And I'm pleased to announce that Graveler is available on GitHub currently.
You can install it with devtools::install_github("ghcarlalan/graveler"). It's in a bunch of links in the blog post and whatever. So you can install it for yourself there.
Live demo: building a Graveler dashboard
So I've already installed the package, of course. When you install it, you know, you might run into some library dependencies. This is my first like real like public package I've tried to do. So if you run into any of that, you know, feel free to raise the issue on GitHub or, you know, let me know at some point and I can try and work on getting those imports correctly.
So the first step, you know, you'll make a new project either with the button up here, or of course File > New Project. You can create, you know, a new directory and you might be thinking, well, do I go new project, new package, Shiny application? None of them, because Graveler will now have its own little wizard type down here at the bottom. There is also the Golem one, right? Because if you're using Graveler, you have to use Golem as a backend.
Okay, so you'll have just a few simple options here. Directory name, I'll just make this, you know, RStudio Meetup. I'll just make it as part of my desktop for now. The package name, usually you want those to be, you know, pretty short and sweet. And over here, you have your display title. So in the top left of your dashboard, that's what the user will see as the dashboard title. All right, so I'll just call this Meetup.
Here's the default Graveler layout, right? You have this directory that we just made. Has all the necessary files that you need to build a package. And it has the three helper files to get you started going, right?
So before we get into the first file here, the 01_dev, if you're unfamiliar with packages, this DESCRIPTION file is where your dependencies live, right? So whenever you build a package, you have to import other packages that it might rely on, right? So let's say you're trying to build some custom package that uses the dplyr summarize function, right? You have to import the dplyr package in order for your package to work, right?
So instead of adding them one by one, this 01_dev file will add dependencies to your DESCRIPTION and to your package itself. So this is just a vector of the packages that the bare-bones dashboard needs to work: dashboardthemes, which obviously colors everything, devtools just in case, dplyr, lubridate, golem, pins, shiny, so on and so forth.
So you run the vector over to my environment here, then it just runs a simple for loop and attaches those packages, right? You'll notice your console gets very colorful, right? Because it will add each package to the Imports field in DESCRIPTION, right? So if I swing back around to the DESCRIPTION file, ta-da, it has all of my imports ready to go.
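To make the "loop over a vector of dependencies" step concrete, here is a rough base-R sketch of what adding packages to the Imports field amounts to. It is an assumption-laden stand-in, not the code in Graveler's 01_dev file: `add_import` is a hypothetical helper, and the DESCRIPTION file is a throwaway one created in a temporary directory.

```r
# Create a throwaway package directory with a bare DESCRIPTION file.
pkg_dir <- tempfile("meetup")
dir.create(pkg_dir)
desc_path <- file.path(pkg_dir, "DESCRIPTION")
write.dcf(data.frame(Package = "meetup", Version = "0.0.1"), desc_path)

# Hypothetical helper: add one package to the Imports field, skipping
# duplicates and keeping the list sorted.
add_import <- function(pkg, path) {
  desc <- read.dcf(path)
  if (!"Imports" %in% colnames(desc)) {
    desc <- cbind(desc, Imports = "")
  }
  current <- trimws(strsplit(desc[1, "Imports"], ",")[[1]])
  current <- current[nzchar(current)]
  desc[1, "Imports"] <- paste(sort(unique(c(current, pkg))), collapse = ", ")
  write.dcf(desc, path)
  invisible(path)
}

# The vector-plus-for-loop pattern from the talk, with a shortened list.
deps <- c("dplyr", "lubridate", "golem", "pins", "shiny")
for (pkg in deps) {
  add_import(pkg, desc_path)
}

# Show the resulting DESCRIPTION file.
cat(readLines(desc_path), sep = "\n")
```

In a real package project, a helper such as `usethis::use_package()` does this job (and checks more things along the way); the sketch only shows the effect on the file.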
The second part is we need to add an app.R file for publishing. So RStudio Connect, when it publishes a dashboard, it looks for an app.R file in order to actually tell it that that file is a dashboard, right? If I were to just upload this directory as it exists, RStudio Connect would throw an error and be like, I don't know what this is. This might be a package. I don't publish packages. That's not what I'm here for. Instead, create like an actual file I can understand, right?
And you'll notice my comment that says, be sure to enter one in the console prompt. That's because Golem, in order to talk to the rest of the dashboard backend with its configurations, it has to make its own little YAML configuration file. And so all you need to do is just hit yes, or type one, excuse me. And then you can read through all this, and it basically says, here's your app.R file. Here's this Golem config YAML file that we have. You never need to edit those. It's just an automated file creation for backend deployment.
Then I won't run this line of code. It takes a few minutes, but here at Snap, we've integrated with RStudio Connect's Git-backed content feature. And so what this does, it allows you to write what's called a manifest.json file, which basically takes a snapshot of all your packages so that RStudio Connect can actually connect to your GitHub repo, find that package snapshot, and then publish from that snapshot. It's been super useful because instead of worrying about people individually clicking the publish button, trying to overwrite stuff, you can all work from the same GitHub repo, same GitHub file, and the dashboards will update accordingly.
So then after that's done, you can navigate to the run_dev file because your dashboard's already ready. This is the second file that opened. It just navigates to that file. So this is the file that you actually use to run your dashboard locally, right?
As far as what all these lines do, setting prod to true prevents console messages from printing. So when it's false, it allows you to kind of debug and develop more. Usually I just leave it at false by default. I never really change it to true.
The next two lines are important because it will detach all your libraries and remove your environment, right? And that's important because it needs to clean your environment so that when you run your dashboard locally, you know that it's going to work exactly with what's in the dashboard itself, right? When I first started working with Shiny, I'd run into an issue where it works locally, but not when I publish, because for whatever reason, I had some local variable that was just never in the dashboard. Like none of that logic was present. So it works locally because it's trying to read my local environment. So we wipe the environment to make sure that the dashboard has exactly what it needs to run.
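The "works locally but not when I publish" problem can be reproduced in a few lines of plain R. This is only an illustration of why the environment gets wiped; `cutoff` and `flag_large` are made-up names, not anything from the dashboard or from Golem.

```r
# A leftover object from interactive work; nothing in the app defines it.
cutoff <- 100

# App code that silently relies on that leftover object.
flag_large <- function(x) {
  x > cutoff
}

# Runs fine, but only because `cutoff` happens to exist in this session.
works_before <- tryCatch({
  flag_large(150)
  TRUE
}, error = function(e) FALSE)

# Removing the stray object mimics a clean session, such as a fresh server.
rm(cutoff)

# The same call now fails with "object 'cutoff' not found".
works_after <- tryCatch({
  flag_large(150)
  TRUE
}, error = function(e) FALSE)

cat("before wipe:", works_before, " after wipe:", works_after, "\n")
```

Clearing the environment before every local run surfaces this kind of hidden dependency on your own machine instead of after deployment.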
Then finally, again, if you're familiar with package development, this is the same thing as the package build, install and restart function down here. So this creates a namespace, which is where all of your exports and imports come from. And then it loads like the meetup package. And in actuality, you know, once you run this line, your package now exists in your local environment, right?
And that's it. Your dashboard's already built. It has, you know, the display title that we made. It has a working sidebar example. It has a collapsible sidebar over here, a fun little Graveler logo that you can change, the colors you can change as well. This took a while as I was explaining it, but in reality, you can run all those lines in less than a minute, right? And you have the entirety of the framework, so all you need to do is just add modules to it, right?
Adding modules
Which is a good segue into, you know, how do I create modules and actually put content into this dashboard?
So if we go back to the 01_dev file, there was one line that we skipped at the bottom, which was adding modules. So with Graveler, there's what's called a level up function. And all you do is you can paste that in your console. You don't really want to uncomment this line because this is supposed to be run just once, right? So I just copy the commented line and put it in my console and then we'll call this, I don't know, summary or foo or, you know, whatever you want to call your module. I recommend keeping the names short because it is going to add some prefixes to this name.
So when you run the level up function, it's also going to create two files and open them for you: mod_name, or mod_summary in this case, and then mod_summary_fct_display.
This module is the module code, right? This is one section of your dashboard that you can make. So you might be thinking, oh, great, I'll put in a box real quick.
You think, okay, cool. Now I can just rerun my dashboard, right? Everything will work. That is optimistic thinking because it will not work because your module has yet to actually connect to your actual dashboard layout, right? And so, you know, there's no new sidebar, there's no box over here because you built the module, but you haven't connected it to anything.
So at the bottom of every module, there are these three lines that you copy into their respective files. So you have your body.R, your app_server.R, and your sidebar.R. So open all those right now. All R files are located in the R folder itself.
So you simply just copy all these in order. It tells you where to add them. So I commented it out here, adding my module here. For app server, oops. Also a comment, you know, connect to a pin board if you use that part of RStudio Connect. And then finally, you have your sidebar item.
So technically you don't need the example there because the example obviously can't connect to anything. It's just to show that a sidebar is working. So I usually just delete this line or you can comment it out, either or.
So this we'll call a summary and the tab name automatically connects to the module itself. And then you can use any of the icon libraries, either Font Awesome, Glyphicon, or whatever.
You also don't need to save your files. I just did, but as a force of habit, which is good, I suppose. But if you have unsaved files, whenever you run the dev file, it will automatically save everything in your package before it tries to build again. So pay attention to app server and body up here. I run these lines of code, it saves them. And we'll incorporate those changes there. So there we go. Summary, text box, hooray.
And now you can simply add, you know, as much UI and server functionality as you want, right? That's outside the purpose of this meeting because this is a backend, not a shiny front end enhancer.
The FCT display file and functions
Okay, so now that we have your module, you're probably asking, well, okay, if we have the module working, what is this file? What is this FCT display file? So in Golem, their philosophy is, you want to abstract away as much as you can from even your module code, right? So for example, you know, you can have a bunch of different charts or tables in this, but the moment your code starts to repeat itself, you want to put that inside of a function to make this as clean and compact as possible.
And so it's a hard concept to explain just verbally. So excuse me, I'm going to switch to the actual Graveler package because inside of it, there is an example dashboard that's fleshed out called Geodude.
So this is the default example. You can also run this yourself with, I believe, Graveler's Geodude example function, with no parameters in it.
But so in this first tab, you know, you notice that there are four graphs here, right? Representing Anscombe's quartet, with a couple of ggplotlys and whatnot. So how did I build this, right? Did I write the ggplot code four different times? No, because, you know, let's say someone tells you, I want to change those dots from a horrendous lime green to an actual human visible color.
So in my module, here's my UI, right? I have just a two by two grid of charts. And you'll notice in my plotly renders, I have my own custom function here. So I have my dataset and it pipes into a function. And that's what this FCT display file is for. It's to put all of your functions into one place so that if you need to edit parts of your dashboard, you only need to edit like one piece of the code, right? Instead of everywhere it's duplicated.
So in this case, you know, let's say I want to change it back to orange. Now I can save this, rerun the dashboard and it should work just fine. And there you go. All four graphs update with the new orange color. This applies to any, you know, ggplot option or data table option or whatever functionality you have in your data.
The other advantage to adding functions to your FCT file is everyone's favorite documentation, right? Dashboards and their functionalities at this level can become incredibly complex. And so it's always good to have functionality or excuse me, functions and documentation for those functions so that yourself or other people reading your dashboard code six months down the road will understand, you know, what this code is actually trying to do.
And the other benefit is, you know, if you recall, if we run these lines, I should be able to export from my package and look at that. There's my custom function. Guess what happens when I put the question mark in front of it? It pulls up the actual help documentation, built from the Roxygen comments in the code. So all of this workflow with Golem and whatnot is trying to help you standardize your package development, standardize your dashboard development, and actually create documentation for packages down the road.
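A function of the kind described above could be sketched as follows. This is not the code from the Geodude example: `plot_pair` is a hypothetical name, the sketch uses plain ggplot2 on R's built-in `anscombe` data, and the Roxygen comments only turn into a help page once the function lives in a package that is documented and loaded.

```r
library(ggplot2)

#' Scatter plot for one pair of Anscombe columns
#'
#' @param data A data frame containing the columns named in `x` and `y`.
#' @param x,y Column names, given as strings.
#' @param point_colour Colour applied to every point.
#' @return A ggplot object.
#' @export
plot_pair <- function(data, x, y, point_colour = "orange") {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point(colour = point_colour, size = 3) +
    labs(x = x, y = y)
}

# The built-in `anscombe` data has columns x1..x4 and y1..y4, so the four
# charts are four calls to the same function.
plots <- lapply(1:4, function(i) {
  plot_pair(anscombe, paste0("x", i), paste0("y", i))
})
```

Changing the default `point_colour` in that one place changes all four charts, which is the point of keeping repeated plotting code in a single function.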
Deployment and customization
Finally, before we get into a little more customization, there is the final file that we had open, which was this 02_deploy. So what 02_deploy addresses is that when you publish your dashboards on RStudio Connect, they break for good reason because they need to have different environment variables to connect either to like database credentials or if you're using pins, you need to have your API key in there somewhere.
And in RStudio Connect, by design, you can't like set these beforehand, right? So you have to publish and then add them after the fact. So what this does is it does it programmatically, right? You can go in and add your environment variables by yourself, by hand, by adding the name and the value. What this does is it uses the connectapi library, which I believe is still in development mode. So it's not a CRAN package yet, but you can load this. You can load your custom URLs. And then you can connect to the server, then you find the ID for that piece of content you're deploying. You can paste that whenever, and then you can add the environment variable right here, right? So in this example, I'm finding whatever I just published. I get the environment for that published content and I add a new variable called key. And then I set my own, right? Set it equal to whatever my API key is personally. So that saves us a lot of time with deploying stuff in RStudio Connect.
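On the dashboard side, code that depends on such a variable might read it like this. It is a minimal sketch under stated assumptions: `get_required_env` is a hypothetical helper and `MEETUP_DEMO_KEY` is a made-up variable name, set in code here only so the example runs on its own.

```r
# Hypothetical helper: read a required environment variable, or stop with a
# clear message instead of failing later with a confusing error.
get_required_env <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable '", name, "' is not set.", call. = FALSE)
  }
  value
}

# Simulate the variable for this session. On a server it would be configured
# on the published content rather than in code.
Sys.setenv(MEETUP_DEMO_KEY = "not-a-real-key")

api_key <- get_required_env("MEETUP_DEMO_KEY")
```

Failing early with the variable's name in the message makes it obvious which setting is missing when a freshly published dashboard breaks.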
But I imagine you're probably asking yourself at some point, well, this is great, but I don't want this like brown and gray dashboard, right? I have very cool purple or blue or yellow business, right? I want to brand my own dashboards. So if you go into this theme file right here at the very bottom, that's where you can just set your colors. So there's three colors you can do, primary, accent, and secondary.
You can omit these entirely and just kind of, you know, customize everything line by line. I don't feel like you need to do that, but be my guest. So if I just change these colors to, I don't even know, so many hex codes. We'll go that red and black. So if I run my dashboard again, it will update with these colors. So my primary is this dark red, this accent is this bright red, and then there's this like black bar on the left for this sidebar option here.
As well as the logo: let's say you have your own custom logo to use. All of the external assets are in the inst/app/www folder. So you just add your logo here and replace the header value with the new file path. Here's where that file path is. So I would just replace this with, you know, your logo, SVG, PNG, JPEG, whatever.
You can also attach URLs. So whenever we publish content, the title and the logo will link back to our content page. So instead of like hitting the back button or anything like that, they can just like right click the title, open a new tab back to the content page. By default, it just goes to RStudio's products page. So you can change those however you want.
Outcomes and potential enhancements
But with that, I believe that's it for like the Graveler dashboard workflow itself. Again, I'm not gonna go into, you know, creating UI elements or pin calls or anything like that. Cause that's, you know, you can research, there's plenty of better documentation than I could ever make out there for that.
So potential enhancements, you know, if some of you've worked with this kind of, you know, dashboarding or customization before, you might have thought of these ideas already. So as far as, you know, with any code, there's always improvements to be made. Here's what I've thought of for the Graveler package immediately. So as of like Shiny 1.5, there's a new method of calling modules. So instead of the actual callModule function, there's, I think, moduleServer is what it's called. It has a simpler syntax and is easier to understand. So it'd be cool if I could convert it to that.
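The difference between the two module styles mentioned here can be sketched side by side. The module names are hypothetical and this is not Graveler's generated code; both `callModule()` and `moduleServer()` are real Shiny functions, with `moduleServer()` available from Shiny 1.5.

```r
library(shiny)

mod_summary_ui <- function(id) {
  ns <- NS(id)
  tagList(
    textInput(ns("name"), "Name"),
    textOutput(ns("greeting"))
  )
}

# Older style: a plain server function that the app wires up with
# callModule(mod_summary_server_old, "summary").
mod_summary_server_old <- function(input, output, session) {
  output$greeting <- renderText(paste("Hello,", input$name))
}

# Shiny >= 1.5 style: the module takes its own id and wraps moduleServer(),
# so the app just calls mod_summary_server("summary").
mod_summary_server <- function(id) {
  moduleServer(id, function(input, output, session) {
    output$greeting <- renderText(paste("Hello,", input$name))
  })
}
```

With the newer style the UI and server functions are called the same way, each with just the module id, which is part of why the syntax is easier to teach.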
Obviously with the dashboardthemes package, you have those colors and that customization, but bslib has been integrated with Shiny already. And so you can use that more intuitively and create cleaner colors and layouts. Again, at the time when I built this, none of these things existed two years ago.
And then finally, you know, packages live on testing, right? The more complex your dashboard is, the more points of breakage you're creating itself. And so if we can somehow, you know, connect unit tests to be put on these like dashboard packages, that would be great. It's just one of those areas that, again, I don't really have a fundamental understanding of unit testing. And so I have trouble automating something I don't really know the ins and outs of. But in general, you'll spend less time debugging all of your parameterized code.
But with that, you know, that's all I had to talk about like from a high level. So, I mean, feel free to ask questions about like the package itself, you know, the backend, how it's doing its things. But in general, Graveler allows like our new developers to, you know, start building dashboards right away. We have about like a two week onboarding period with, you know, like intermediate SQL and R training and then in-house training as well. And after the end of two weeks, we give them a week to build their first Graveler dashboard, that workflow. And so, I mean, in the turnaround of 15 business days, you have people building, you know, some of the best, you know, shiny apps. Well, I guess not the best shiny apps you can make. Robust, shiny apps very, very quickly.
So, huge shout out to RStudio as well. I mean, it itself was a game changer for us, but the ability to like integrate with GitHub and Package Manager and, you know, being able to install our custom packages like Graveler to use and publish, our workflow has gotten a lot more simple and a lot more standardized.
Q&A
Thank you so much, Alan. I always say we're all clapping, even though you can't hear us. If we can use our clapping emojis here on Zoom. But thank you so much for your presentation. And I also want to put a shout out for a great blog post that Alan recently wrote as well, which walks through this too and includes a lot of great links too.
One was, have you worked with the Golem team at all on the package? Ah, no, I have not. I believe I met one of the engineers, one of the authors, at RStudio Conference a couple of years ago. I just wanted to get like his thoughts on like what Golem was, but it was just a very small conversation and no, I haven't worked with them at all. That'd be cool though.
Another was, are there any plans to include any simple GUIs in the package either to link modules to other files and or to diagrammatically show modules and how they fit? Yes. So we've thought about that in the past. Definitely there's room for module templates. So, I mean, you can, you know, just like I wrote the box with hooray, you can just put this as a default. And so it can make kind of like these little template layouts. It's just not something I've had time to work on in the past, but that is an idea we've had.
As for the diagrams, I don't really know how to do that from like the package programmatic perspective. I've done it personally, just making like, you know, like a readme file for a project. I'll make just a quick little diagram with either DiagrammeR or, you know, various flow chart packages. So if there's a way you can do it programmatically, that would be awesome. I just, I always forget how to do it, but that's what I have for that question.
And another is, what's your opinion on Shiny Mobile? Ah, yes, so we looked into Shiny Mobile once upon a time and as cool as it was, it was just such a new framework that we had already pretty much doubled down on like Golem and Graveler and this method of building. So we didn't find a ton of value back then in trying to also learn Shiny Mobile. That's slightly changed because with the growth of our business, we have more focus on like our external sales team that's out in the country. And so they primarily work on mobile devices. And so that's something we're currently trying to revisit, but it's still in the early stages of planning. So I don't really have much more info on that.