Resources

When R Met Python: A Meet Cute on Posit Connect (Blake Abbenante, Suffolk Construction)

Speaker(s): Blake Abbenante

Abstract: Data teams often leverage multiple programming languages, driven by a multitude of reasons, be it task-specific requirements or personal preference. In this talk, I'll share how we enabled our developer base to build in the language of their choice, and how we leverage Posit Connect to unify them. By exposing core functionality as APIs using R’s plumber and Python’s FastAPI packages, and building parity in internal packages for both languages, we created a shared toolkit that streamlines workflows and fosters collaboration. Join me to explore our journey in breaking down language barriers and empowering data innovation.

posit::conf(2025)

Subscribe to posit::conf updates: https://posit.co/about/subscription-management/

Transcript

This transcript was generated automatically and may contain errors.

Thank you. So, I'm going to go ahead and give my number one takeaway before I even get started, and that is, if you choose to speak at a conference next year and have to be on stage, do not forget a belt, because I'm having a really tough time at the moment.

Anyways, let's get started. So, I'm Blake. I'm Director of Analytics and Data Science at Suffolk Construction in Boston, and today we're going to talk a little bit about app development and polyglot environments. So, I want to start with a story or situation that might be familiar to you. Say you're at work in a meeting or at lunch or in between meetings and you hear some people chatting about a challenge that they're facing that can't quite be solved with the data tools that are currently in place. Maybe it requires some on-the-fly data manipulation that Tableau can't quite handle, or the data's too unwieldy to handle in Excel, whatever it is, but you think, you know, I can build an app to solve that.

So, you do, and it does. So, your app's a home run. You know, the business loves it. Adoption grows. It's really spreading by word of mouth, you know, real grassroots success story. Way more popular than you could possibly imagine when you started. So, you knock out a few easy wins, you know, tackle some low-hanging fruit, thinking that will kind of calm the noise, but instead, demand continues to snowball, right? New feature requests, requests for other integrations, kind of a nonstop stream of UI tweaks or enhancements. It's all coming faster than you can handle.

And now, your once kind of simple, beautiful app has turned into a hot mess, right? You have kind of esoteric code paths that only you can follow, a litany of like half-baked, half-implemented features, and still way too many demands for you to handle. So, now, you know, you're burnt out, your users are frustrated, and kind of progress has grinded to a halt. And the worst part is, you know, it didn't really feel catastrophic along the way. It was probably quite the opposite, maybe even like a little euphoric. You know, you had really engaged users. The feedback you were getting was positive. They were just kind of little tweaks along the way until suddenly, you're at a boiling point, right?

What started as a win has now become kind of unsustainable. So, that's what we want to focus on today, kind of a real-world example of that that we ran into at Suffolk and how we try to catch and combat those scenarios.

So, I think when most people, or at least historically, maybe not in this session, talk about polyglot environments, they think most about developer choice and flexibility, but that's really only half of the equation. I think what often gets overlooked is how tools like Workbench and especially Connect can really be leveraged to help you deploy apps at scale. And the sweet spot would be marrying smart app design with the right polyglot strategy to move from, you know, a MVP to a fully-featured deployed product. So, that's what we'll unpack a little bit today in the approach that we take at Suffolk.

The real-world problem: Primavera P6

So, let's talk about the real-world example that I mentioned. This is Primavera P6. It's kind of the go-to de facto standard for scheduling data in the construction industry. Primavera was acquired by Oracle in the early 2000s, and honestly, I don't know if they've touched it since then. So, there's a few, you know, it presents a few challenges. One is it's rather expensive, so we have limited licenses. Two, at least in the Suffolk instance, it does not run in the browser, so, you know, all the joys of enterprise software management, it also only runs on Windows. And three, even if you were able to get a license and have it installed, it is really not the most user-friendly or intuitive tool.

So, all this, you know, combines together to create a problem for people in our company to interact with schedule data in any kind of meaningful way. It's mostly disseminated through, you know, PDF screenshots or Excel dumps. So, this is a situation, you know, I thought we definitely have the data, so we could probably build something to help out here. So, that's what we did. We built a very rudimentary Gantt chart. Then we had to add some filtering to it. Then the first big pivot where people were saying, well, instead of just reading schedules, can we create schedules, which necessitated, you know, a bunch of drop-downs and sliders for inputs to get data about the schedules, and then once we were generating schedules, we needed summary tabs for it, and then this all became overwhelming, so people said, can we just chat with the interface instead of having to use all these widgets we asked for?

And all the while, we're still, you know, iterating over the visual style of our app. And what started kind of as a very simplistic Shiny app suddenly became something much, much more than that. And this kind of, like, is on top of what we talked about. So, like, this is, we're obviously not using any kind of modules or components in this. It's just, like, one more thing we kept adding.

So it's important to remember here as well that, you know, our team is not a team of software engineers. We're data analysts and data scientists, and while we try to adhere to best practices, you know, most of the team's experience with things like GitHub are as solo practitioners, so when you have people trying to work together, you know, invariably we end up with problems like this, which, you know, even seeing them now raises the hair on the back of my neck.

Reframing the problem

So that's kind of the conundrum we're stuck with, right? Like, how do we enable app development to scale with business demand? And the way that we try to approach it is, you know, reframe the problem. So it's no longer this one-off app that we built that we're chasing our tail around, trying to keep up with demand, but take a step back and think of it as a product we want to grow and sustain and stick around for the business for a while.

So that reframing requires three things, right? So smart app design, a plan for moving deliberately from a proof of concept to an MVP to our fully featured product. Technical separation, leveraging APIs and internal packages that make our work independent and language agnostic. And team scalability. So how do we create space for multiple developers or teams to work in parallel?

All right. So that kind of reframing allows us to go from that monstrosity of a monolithic single file app that you see, that we saw before, to something that would scale in both functionality and complexity, all with, like, individual discrete domains that can kind of be worked independently.

Smart app design

So let's talk about how these pillars help us accomplish that. First, we'll talk about app design. So some of you might be familiar with this sketch from Henrik Kniberg about the principles of MVPs, minimum viable product, and the idea being that we want to move away from kind of Big Bang delivery, you know, iterative development that does not deliver anything to the client until the very end, to a much more iterative and incremental approach, really tightly coupled around the core functionality of the needs of the user. That leaves us room to scale.

And as well-intentioned as we are with this approach, sometimes, you know, you build a proof of concept and you end up with something like this, which is maybe somewhere in between, right? We have a happy customer, but we've built, like, a tool that has really maybe limited us in how we can scale. And I often fall victim to this, and I think it's okay for us to say that, you know, while it's great if our proof of concept scales into our MVP, it's not a necessity. They serve two different purposes, right?

So our proof of concept is about proving viability and getting kind of buy-in from our customers. So a lot of times, that's scrappy, you're cutting corners, you just want to get something out there. It's not necessarily adhering to best practices. And MVP is about building a foundation that will scale and support, you know, more complex functionality in the future. So again, while it's great if it does scale, I think, you know, open communications with your business stakeholders to say that we're going to reset after we've gotten buy-in is a critical part.

APIs and internal packages

So if design is about planning for scale, the next question is, how do we make that scaling possible? And that's where APIs and packages come into play. So, you know, I'm sure you're all familiar with, there's a bunch of different variants of it, but first time you write code, you write it. Second time, you copy it. Third time, turn it into a function. So APIs and packages are where we implement that, right? So it's our place to write code once and make it trustworthy and usable throughout all our environments. And internal packages are kind of the helper function or where we kind of house our helper functions that developers use day in and day out, again and again. So our approach at Suffolk is to maintain packages in both R and Python that have feature parity across both of them. And that ensures that, you know, developers in each of their languages have the tools available to them that they need.

So kind of a real world example of this, you know, we use Databricks as a data warehouse. If you've ever seen, like, the Posit Connect cookbook on implementation across, you know, the Python version versus the R version, the mechanisms are similar, but the implementation is quite different. So, you know, if I was a Python developer and I went to go to R, at best I might be a little confused. At worst, I might just shut it down and go back to Python. So our thought was, you know, was alluded to, like, while there is a lot of similarities, can we make them even more similar? So we wrappered those both in functions internally so they're exactly the same across both. And that proved to be really successful.
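A minimal sketch of that wrapping idea, shown in Python. Every name here is an assumption made for illustration: `warehouse_settings`, the `WAREHOUSE_*` environment variables, and the default catalog are hypothetical and not the actual internal package. The only point is that an R twin could expose the same function name, arguments, and defaults, so the call reads the same in both languages.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WarehouseSettings:
    """Connection details the internal package resolves for the caller."""
    host: str
    http_path: str
    catalog: str


def warehouse_settings(
    catalog: str = "analytics",  # hypothetical opinionated default
    env: Optional[Mapping[str, str]] = None,
) -> WarehouseSettings:
    """Resolve warehouse settings from environment variables.

    An R function with the same name and arguments would perform the same
    lookup, so developers see one interface regardless of language.
    """
    env = os.environ if env is None else env
    required = ("WAREHOUSE_HOST", "WAREHOUSE_HTTP_PATH")  # hypothetical names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return WarehouseSettings(
        host=env["WAREHOUSE_HOST"],
        http_path=env["WAREHOUSE_HTTP_PATH"],
        catalog=catalog,
    )
```

Passing `env` explicitly keeps the function easy to exercise without touching the real environment; in day-to-day use the caller would simply write `warehouse_settings()`.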

So we took that, looked across, like, the ecosystem for kind of other similar examples and started wrappering some other packages as well. So, you know, responses from vetiver, reading, writing, setting permissions on pins, and most recently, chatlas and ellmer, the interface. The other additional benefit of this is because these are internal packages, we can make very opinionated decisions about the functions we expose and the values that we set in them. So, you know, model choices, where we write pins, those kind of things, we can remove away and make sure we have a standard implementation. But the other ancillary benefit we've seen to this is that, you know, this dual package strategy has really kind of enabled our aspiring polyglot developers to be more comfortable trying their less familiar language because it's one less bit of cognitive load that they have to worry about.

So if packages are kind of our helper functions inside of an app, APIs are our super functions that make that functionality available anywhere. As a best practice, you've probably seen this before, we've implemented this call twice, so as a good developer, maybe I move it to a function, and kind of the next step we take is the thought of, like, okay, does this function meet two criteria? One, is there a value in it besides in this app itself? And do we envision a roadmap of it that makes it — that there's additional complexity that, you know, has the chance to grow, probably would benefit from individual developer focus.

And the idea is that, you know, it's the same input, same outputs, but it's callable from any environment, whether it's R or Python or even somewhere beyond. The other benefit of it is that at that point, then, you no longer need to be as concerned about the underlying mechanisms once it's removed from the app. You just kind of need to know the schema and the end point and hopefully functionality continues to work.

So this is a real example that we did again where we had, you know, a call built into the app. We extracted it to an API end point deployed on Connect. It was originally an R model deployed with Vetiver and plumber. It eventually migrated to a Python model deployed with Vetiver and FastAPI, and then eventually the complexity of it was a FastAPI that not only used Vetiver but other business logic embedded in it. But the key feature is that the apps themselves that were using it would never have to change, right? Like the end point stayed the same.
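The "endpoint stays the same" point can be sketched from the app's side. This is an illustration only: the URL, the `{"x": ...}` request body, and the `{"prediction": ...}` reply are assumed shapes, not the real service's schema, and the two stand-in backends are toys that play the role of an original and a migrated implementation.

```python
import json
from typing import Callable, Dict
from urllib import request

# Hypothetical endpoint; a real one would be the content URL on Connect.
PREDICT_ENDPOINT = "https://connect.example.invalid/model/predict"

# A transport takes a URL and a request body and returns the raw reply body.
Transport = Callable[[str, bytes], bytes]


def http_post(url: str, body: bytes) -> bytes:
    """Send a JSON POST with the standard library and return the raw reply."""
    req = request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=10) as resp:
        return resp.read()


def predict(features: Dict[str, float], transport: Transport = http_post) -> float:
    """App-side call: it knows the endpoint and the schema, nothing else."""
    body = json.dumps(features).encode("utf-8")
    reply = json.loads(transport(PREDICT_ENDPOINT, body))
    return float(reply["prediction"])


def first_backend(url: str, body: bytes) -> bytes:
    """Toy stand-in for the original implementation behind the endpoint."""
    features = json.loads(body)
    return json.dumps({"prediction": 2.0 * features["x"]}).encode("utf-8")


def second_backend(url: str, body: bytes) -> bytes:
    """Toy stand-in for a later implementation with extra logic added."""
    features = json.loads(body)
    return json.dumps({"prediction": 2.0 * features["x"] + 1.0}).encode("utf-8")


# The calling code is identical no matter which implementation answers.
for backend in (first_backend, second_backend):
    print(predict({"x": 3.0}, transport=backend))
```

Injecting the transport is a choice made here so the sketch runs without a network; in an app the default `http_post` would be used and only the service behind the URL would change.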

Team scalability and contracts

So we can have smart design and technical separation, but without a way to work in parallel, we'd still be bottlenecked. So how we unblock teams isn't really magic. It's agreement, and, you know, we think of agreement in terms of kind of like a contract, not a literal actual contract. People aren't signing papers, but kind of like more akin to like a data contract. So, you know, an agreement of the inputs and outputs of a particular function, and that way if I'm, you know, developing services and you're developing the UI, we both know what to expect.

So for us, APIs are where those boundaries are defined. They're the agreement that says, you know, you hand me this, I will return that to you, and it doesn't matter if the API is like half-baked, hard-coded at the start with, or in various stages of development. The important part is we have an agreement so everyone can continue to move forward on their own. And in that sense, every kind of API becomes its own mini-MVP, which, you know, again, as long as we can agree on the schema, the payload, and the endpoints, it can continue to iterate over time, and those other downstream teams can wire up to it knowing that it will evolve without breaking.
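One way to picture such a contract, as a sketch rather than the team's actual schema: the request and response types below are hypothetical, and the stub is the "hard-coded at the start" stage. UI code written against the types keeps working when the stub is later replaced by a real implementation that returns the same shape.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ScheduleRequest:
    """What the UI agrees to send (hypothetical fields)."""
    project_id: str
    start_date: str  # ISO 8601 date, e.g. "2025-01-01"


@dataclass(frozen=True)
class Activity:
    name: str
    duration_days: int


@dataclass(frozen=True)
class ScheduleResponse:
    """What the service agrees to return (hypothetical fields)."""
    project_id: str
    activities: List[Activity]


def generate_schedule_stub(req: ScheduleRequest) -> ScheduleResponse:
    """Hard-coded first version: honours the contract, does no real work."""
    return ScheduleResponse(
        project_id=req.project_id,
        activities=[Activity(name="Placeholder activity", duration_days=1)],
    )


def render_rows(resp: ScheduleResponse) -> List[str]:
    """UI-side code written against the contract, not the implementation."""
    return [f"{a.name}: {a.duration_days}d" for a in resp.activities]


rows = render_rows(generate_schedule_stub(ScheduleRequest("P-1", "2025-01-01")))
print(rows)
```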

So once we have those contracts in place, that's kind of where scalability can kick in, and we no longer have a single developer as a bottleneck. So we can have back-end teams developing APIs, we can have package teams continuing to polish our internal functions, UI teams continuing to add functionality and features to our Shiny apps, and they can all move forward together as a whole. And the important thing is also those contracts don't enforce any particular language, right? Like the agreement is the boundary, not the implementation. So if I'm comfortable working in R, I can develop in R, there's a package in Python, I can freely use Python, the right tool can be used for the right job without creating chaos.

It's also not just about language. It's all about, you know, process as well. So, you know, if we have a team thick in the middle of app development, maybe we'll use Sprint or Agile. If it's a more mature API, maybe they use Kanban to manage their backlog. If I have a smaller piece, maybe I use something even more lightweight. The important part is, like, we do not overstandardize. We allow teams the flexibility to deliver by what makes sense for that team.

So once we have this all in place, it's not just about scaling one app with one team. We're creating an ecosystem that allows for multiple apps, multiple processes, you know, multiple stacks to kind of all work in parallel and evolve at the same time. To me, that's really what scalability is about. There's some aspect of speed to it, but it's really about creating an environment that many approaches can move forward together in harmony.

Discipline and a product-first mindset

So, you know, it sounds really easy. You put it on slides, but in practice, it takes a lot of discipline. And that discipline really is dependent on a few things. So to accomplish that, you know, we need clear communication around these contracts, all of which can be facilitated by constant communication and transparency across these teams and developers who are working on the different pieces.

And then finally, I think this is all enabled by this idea of shifting from a code-first mindset to a product-first mindset. And the way I think of that is, you know, historically the way I've worked is I have this idea, I open up RStudio, and I just start writing with a hope that I'm going to get to what I want to get to at some point. And this is kind of shifting that, right? So even in that mindset, I'm still doing design, but it's like design at every step along the way. What we want to do is shift all of that in the aggregate to up front so we can all kind of decide on what the best plan is for implementation of this product as a whole.

Okay. I will say, you know, in addition to the change in just the process, obviously there are some technical considerations to think about. You know, moving everything out of app to APIs does have some performance or response time concerns to think about. Other things like versioning feature flags, you know, those are all important details to make this work at scale that have to be considered.

But, you know, we started today with a story that you've probably lived to some degree yourself. You know, a quick app that snowballs into something unsustainable. We talk about the way out of that trap being for us to design deliberately, abstract smartly, and scale responsibly. You know, that means treating our proof of concepts as experiments and our MVPs as foundations. It means moving our helper functions into these internal packages and our core functionality into these super functions as APIs. And it means giving our teams the space and structure to work in parallel without chaos. And languages like R and Python and platforms and tools like, you know, Connect and Workbench are instrumental in making and helping us scale. But it's really the discipline about these contracts, communications, and a product-first mindset that makes scaling stick. And hopefully we'll turn our, you know, initial four-act horror story into kind of a feel-good rom-com that we all want to be part of. That's it. Thank you.

Q&A

Okay. We have, well, there's two questions related to testing. I'll paraphrase the one. It seems like a pretty complex app. So what was your testing process like? And the second one was after migrating functionality from an R package to an API, how do you adjust your testing slash validation strategy?

Yeah. And they're maybe both somewhat related. So a lot of the components, you know, which we didn't get in time to talk about, we build kind of harness apps to test individual functionalities. So we can have an app that is strictly focused on testing APIs and response times. We also have a lot of automated testing that we do. So test plans for everything. So when we move from, like, the package, whether it's package or API, we generally can migrate a lot of those tests and they still can run in an automated fashion. So, you know, there's automated things we want to check for, like, kind of unit testing, but then it is either this harness app kind of framework that allows us to test kind of user functionality.
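A rough sketch of how tests can carry over when logic moves from a package to an API, under stated assumptions: the cases, `package_impl`, and `api_impl` are all made up for illustration, and `api_impl` only simulates a remote call. The same cases run against either callable, and the harness records elapsed time per call as a crude response-time check.

```python
import time
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]
Case = Tuple[Features, float]  # (input, expected output)

# Hypothetical shared test cases, written once.
CASES: List[Case] = [({"x": 0.0}, 0.0), ({"x": 2.5}, 5.0)]


def package_impl(features: Features) -> float:
    """Toy stand-in for a function that lives in an internal package."""
    return 2.0 * features["x"]


def api_impl(features: Features) -> float:
    """Toy stand-in for the same logic behind an API.

    A real version would make an HTTP request; here the call is simulated
    so the sketch runs without a network.
    """
    return package_impl(features)


def run_cases(
    impl: Callable[[Features], float],
    cases: List[Case] = CASES,
) -> List[Tuple[bool, float]]:
    """Run every case against impl; return (passed, seconds) per case."""
    results = []
    for features, expected in cases:
        started = time.perf_counter()
        actual = impl(features)
        elapsed = time.perf_counter() - started
        results.append((actual == expected, elapsed))
    return results


for name, impl in (("package", package_impl), ("api", api_impl)):
    outcome = run_cases(impl)
    print(name, all(passed for passed, _ in outcome))
```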

Cool. And then was there any pressure to move the app out of an open source — out of open source software tools and or give it to, like, a pro graphics design group or something like that?

Not yet. So we actually thought we were maybe going to go that way, and maybe we still will. So there hasn't been yet. And honestly, like — so no. We're still very early on in the life cycle, so we don't really know where it's going to land up. But not yet. Not today. Thanks, Blake. Thank you.