Resources

AskRADS: An AI Recommendation Agent for Maximizing ROI of Data Science Collaborations (Regis James)

AskRADS: An AI Recommendation Agent for Maximizing the ROI of Data Science Collaborations

Speaker(s): Regis A. James

Abstract: Blockers to crucial data-driven decisions can often be a challenge. To address this, we established RADS, the Regeneron Analysts and Data Scientists, as a Community of Practice for exchanging strategies on eliminating these obstacles. RADS has grown to nearly 500 members, creating a new challenge: avoiding redundancy and helping non-RADS colleagues find the right experts. To solve this, we developed AskRADS, an AI agent on Posit Connect that provides recommendations based on discussions, experts, and relevant resources. It uses R, Shiny for Python, FastAPI, LangGraph, Neo4j GraphRAG, and MySQL. This talk will cover its architecture, AI search solutions, and optimization techniques.

posit::conf(2025)


Transcript

This transcript was generated automatically and may contain errors.

So, I am an Associate Director of Data Science and AI at Regeneron Pharmaceuticals, and today I'm going to be talking about the AI Agent platform that I built, and more importantly, the thought processes that went into my building it, because I think it actually could be really useful in a limitless array of domains that you all work in.

And so the platform that I built is called AskRADS, which is an AI Recommendation Agent for maximizing the ROI, or return on investment, of data science collaborations.

So I've spent my career designing and deploying enterprise-scale AI systems and unblocking a range of critical data-driven decisions that exist in science and also in business with many of these systems. And I've also been able to bring people together over the last nine years that I've been at Regeneron. I built a 500-person data science and AI community of practice, which I call RADS, Regeneron Analysts and Data Scientists, and together we've worked to figure out how to eliminate blockers to decisions via these collaborations across the RADS network.

So today we're going to be talking about the work that I did with some of my colleagues, Maryam Ivari and Eric Prager. So there's an elephant in the room that all of us are aware of, which is the fact that institutions are demanding AI yesterday. Everyone wants it, and a lot of the time the people demanding it might not fully know exactly what it is, but they know they want it. The problem, though, is that building dependable AI takes a lot of time, and today I'm going to walk through an end-to-end anecdote of how we did it and how you can as well.

Identifying impactful AI use cases

I'm going to talk about what actually makes an artificial intelligence use case impactful. In order to do that, we have to recognize what achievable opportunities could exist for artificial intelligence, and one of the ways that you can really maximize the output you get for your finite input is to start with pain points, then identify a vision for a possible future, and then build that vision, instead of doing it the other way around and walking around with a hammer looking for a nail.

So what exactly do I mean by this approach? Identifying KPIs. KPI usually means key performance indicator, but you can also kind of steal the P and use it to mean pain point indicator as well. So always start with the problem and not with the solution. As an example, here's a sign indicating that there is a potential issue when you're driving down the street: you could end up falling off the edge of a cliff. That's the type of concern you want to eliminate.

So there are a wide range of blockers to decisions that pop up all the time in our work, and these blockers can exist due to issues with complexity in the data, or volume in the data, or both. It's even better, at least when we're thinking about an AI solution, if some of these blockers and pain points are recurring, because if you solve a recurring problem with an AI solution, that solution's value multiplies every time the problem would have recurred, because now it's no longer an issue.

Vision telling

And so this is kind of where I get into the idea of vision telling, which is a term that came to me when I was having a conversation with Tareef Kawaf a couple months ago, and it essentially means imagining a future world without these pain points. A little bit more detail about this: vision telling is defining your future state, whereas storytelling, a term we've all heard, is describing the past. And thanks to Isabel Zimmerman for doing these doodles here.

So a story would be, let's see, maybe bowl-cut Bonnie, named for her haircut, one day noticing this beautiful plain, and then suddenly there's some sort of explosion that blows the plain apart, and now Bonnie sees that there is a gap. And vision telling is Bonnie seeing that gap and thinking, what if we didn't have to have these two parts separated? What if there were a bridge that could cross this gap? Then we could still go from side to side and traverse what used to be a plain with no issue.

So vision telling is describing a possible future where the blockers and the bottlenecks that you or your colleagues may be facing in your day-to-day work are eliminated. And it really helps to clarify what the right problem is, because if you do it in the opposite way, you're like, okay, I'm going to AI this, I'm going to AI that, but what actually are you AIing? If you start with the problem instead, then once you've identified what that problem is, you can ask what would be necessary to make that specific problem go away. And it also helps to crystallize measurable returns on investment that align with stakeholder value.

So when you work to make your AI vision a reality, you can work to close or bridge that gap, and it becomes possible to identify which AI strategies it would take to implement the vision. But as you do this, you have to make sure that you balance ambition with practicality, trustworthiness, and measurability, in terms of acceptable performance thresholds, transparency and explainability, governance, and adoption readiness. That way you can actually build this bridge to close the gap, and you're building only the achievable solutions in terms of which data, people, process, and technology you actually have access to.

The RADS use case

So I'm going to bring that back to our AI use case, but this entire time as I'm talking, think about your own use cases as well. Before, we actually had challenges due to the success of RADS, the data science community of practice, Regeneron Analysts and Data Scientists. There can be burning houses, fires that happen all over a company or whatever sort of institution you may work at, and you want to be able to put those fires out, and we've got a bunch of people in the group. But the problem is the expert practitioners might be worried about reinventing the wheel, because they don't know if other people have figured out specific ways of putting out those types of fires, and they can also be irritated by redundant requests from colleagues who aren't even speaking to them in an efficient way.

So a lot of times colleagues can come to you and say, hey, can you make a thing that does this, and then you have to spend an hour, or weeks, figuring out what they're actually asking for. So that's a problem. And the non-expert collaborators aren't necessarily sure who to approach or how to communicate with them. Different languages are being spoken all the time. We have expertise in this community, at this conference, in data science and AI, but just because we have the expertise doesn't mean other people understand what we do, and we can end up talking past each other all the time. And the non-experts may have no idea how to identify or explain the problems; imagine they don't know the word for fire, but they're asking for help putting out a fire, and you have to spend a long time figuring out what they're actually saying. So without the smart tools that could be built, you end up with redundancy and difficulty finding the right expert.

Now think about this in terms of what we have inside of the data science community of practice, and you can do it for whichever types of resources you have access to. We can potentially leverage the RADS library to identify the right computational approach, which can help avoid reinventing the wheel, find the right expert, translate between experts and non-experts, explain pain points effectively, and structure collaboration requests effectively.

So imagine being able to magically take people at a given institution, or ours at Regeneron, from the state of I don't know what to do, wave some sort of magic wand, some AI wand, and take them to I know what to do, where the decisions become a lot clearer to them. This could be done by recognizing the relevant people and/or their approaches in the RADS library, which I will get to in a bit, but that's a huge component that we have, the RADS library; explaining the backgrounds of the matched experts and their requesters; identifying the true underlying blockers to progress; and requesting expert help the right way, and with respect, which is important as well.

Building the magic wand

So in order to close that gap, so now we've kind of thought through the vision, it would be great if we had a magic wand that could do this. Well what ingredients would we need for this magic wand? Well if we think about our shopping list at a grocery store, for example, how do we build that magic wand? What would it require? Well it would need data, people, process, and technology. And we actually do have this in our RADS grocery store. We have the data, the human beings who are experts, like you guys, the data science and AI community of practice, and actually 500 people are in the group and have joined since I created it a little over nine years ago.

And then over time I've recorded 50 episodes of these meetings, so I have data about the titles, the abstracts, the keywords, the presentation files, the text transcripts, and also introductions for these people, where they all talk about what they're good at and what their background is. So there's a lot of data that is already available. And we also use the Posit ecosystem in the cloud, with single sign-on and governance controls already implemented. So we can, and we did, use it to build an AI platform.

So what did I use to actually build it? Now I went through the vision to imagine a magic wand that could fix everything, and then I thought about, well, what ingredients would I need to go to the grocery store to build that magic wand? And now that I have those ingredients, thinking about, now what do I need to do to stitch everything together? And this aspect comes after those first steps.

So, tool selection principles. Integration with the Posit ecosystem makes it really easy to deploy things and keep everything connected. So using Python and R and Shiny; AI agent orchestration, which I'll touch on in a bit, and Rebecca did a great job earlier, but you can use things like the ellmer package and the chatlas package and LangGraph; then semantic and structural reasoning using GraphRAG and Neo4j; and then performance and scalability to make sure that results are retrieved pretty quickly, so there's caching in the Postgres database and the use of FastAPI. And many of these things are works in progress, because you can do an initial iteration and then always improve different components over time, especially as a function of feedback or looking at performance when people use them.
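
To make that caching-plus-FastAPI piece a bit more concrete, here is a minimal sketch in Python of what a cached question endpoint in front of an agent could look like. The route name, the in-memory dictionary standing in for a Postgres cache table, and the run_askrads_agent placeholder are illustrative assumptions, not the actual AskRADS code.

```python
# Hedged sketch: a FastAPI endpoint that checks a cache before invoking the
# agent, so repeated questions are served quickly. The cache dict stands in
# for a Postgres-backed table; run_askrads_agent is a hypothetical placeholder.
import hashlib

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
cache: dict[str, str] = {}


class Question(BaseModel):
    text: str


def run_askrads_agent(question: str) -> str:
    # Placeholder for the LangGraph / GraphRAG agent call.
    return f"Recommended experts and resources for: {question}"


@app.post("/ask")
def ask(q: Question) -> dict[str, str]:
    key = hashlib.sha256(q.text.strip().lower().encode()).hexdigest()
    if key not in cache:  # only call the agent on a cache miss
        cache[key] = run_askrads_agent(q.text)
    return {"answer": cache[key]}
```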

What is an AI agent?

So I've been talking about agents a lot, and we've all heard the word a lot in 2025. So what exactly is an agent? Well, it's not just a large language model. It's actually an orchestrator of planning, so it enables reasoning workflows instead of just outputs. It is an orchestrator of the use of tools: it calls tools, which are just functions. It gives recommendations: it still just outputs text, but that text contains recommendations of what the surrounding program should execute. And then it also has memory, both short-term context across interactions within a given chat thread, and long-term memory in vector or other types of databases. And the key benefit is that agents can be tuned and trained and guided without actually changing the model weights. You can fine-tune models, but it's not always necessary.
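
To illustrate that orchestrator idea in code, here is a small, library-agnostic sketch of an agent loop in Python. The model only ever emits text; the surrounding loop decides whether that text is a tool request to execute or a final recommendation, and it keeps short-term memory across the steps. The tool, the fake_llm stand-in, and the message format are made up purely for illustration.

```python
# Hedged sketch of "agent as orchestrator": the loop plans, calls tools
# (plain functions), stores results in short-term memory, and stops when the
# model returns an answer instead of a tool request. All names are illustrative.
import json


def search_rads_library(query: str) -> str:
    """Illustrative tool: look up past RADS episodes (stubbed out)."""
    return f"Episodes matching '{query}': ep12, ep37"


TOOLS = {"search_rads_library": search_rads_library}


def fake_llm(messages: list[dict]) -> str:
    """Stand-in for a real LLM call: first asks for a tool, then answers."""
    if not any(m["role"] == "tool" for m in messages):
        return json.dumps({"tool": "search_rads_library",
                           "args": {"query": "machine learning"}})
    return "Recommendation: watch ep12 and ep37, then contact the presenters."


def run_agent(question: str) -> str:
    memory = [{"role": "user", "content": question}]  # short-term memory
    for _ in range(5):                                # bounded planning loop
        reply = fake_llm(memory)
        try:
            call = json.loads(reply)                  # model asked for a tool
        except json.JSONDecodeError:
            return reply                              # model gave a final answer
        result = TOOLS[call["tool"]](**call["args"])
        memory.append({"role": "tool", "content": result})
    return "No answer within the step budget."


print(run_agent("Who should I talk to about machine learning?"))
```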

And so why exactly was an AI agent the right approach? Well, it's because the data is unstructured: stored paragraphs, not tables. "Apples can be red and peppers can be green" is unstructured, and that's the kind of thing that's in the transcripts and titles and so on, whereas structured data is more of a table. And that's perfect for parsing with large language models. Then there's semantic meaning. You can do exact keyword searches, but that's suboptimal because it's not really how people usually talk. Take apple versus red fruit from a tree: red fruit from a tree has a semantic meaning that is very close to apple, and LLMs, again, are optimized for this. But then you have the issue of zero-shot insufficiency, where it's unlikely to get it right the first time you ask a question, given the big data set that we have. So you can iterate on it using conversational, chat-style iteration, which facilitates getting to the right recommendation.
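
Here is a toy sketch of that semantic-meaning point: scoring documents against a query by embedding similarity instead of keyword overlap. The embed function and its hard-coded vectors are fabricated placeholders for a real embedding model.

```python
# Hedged sketch: "red fruit from a tree" shares no keywords with "apple" but
# lands close to it in embedding space. The vectors below are invented for
# illustration; a real system would call an embedding model instead.
import numpy as np


def embed(text: str) -> np.ndarray:
    fake_vectors = {
        "apple": [0.90, 0.10, 0.05],
        "red fruit from a tree": [0.85, 0.15, 0.05],
        "shiny dashboard tutorial": [0.05, 0.20, 0.95],
    }
    return np.array(fake_vectors[text])


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


query = "red fruit from a tree"
for doc in ["apple", "shiny dashboard tutorial"]:
    print(doc, round(cosine(embed(query), embed(doc)), 3))
# "apple" scores far higher despite having zero keyword overlap with the query.
```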

The broken wand problem

Which brings us to the broken wand problem. What exactly do I mean by the broken wand? Well, LLM, large language model, variability can be both a blessing and a curse. In terms of the blessing, it can handle a wide range of differently worded questions that mean the same thing. It's got semantic power; we all know this at this point. It just basically magically understands unstructured data. But the curse is that there are hallucinations: it can say things confidently even though it doesn't really know what it's talking about or may not have the data. And then there's inconsistency and unpredictability in the reasoning. And then there's the chain of thought, the connecting of the dots: this is what I extracted, and based on this I should do this, and this outputs some answer, and based on that I can reason. That chain can get off track and give incorrect answers because of some of the earlier issues. Which means it can end up looking like a broken wand instead of a perfectly functioning magic wand.

Which brings up the question: how do we fix the wand? There are a range of approaches you can use to fix the wand, to control the magic, essentially. Some of them include constraining the outputs, so controlling the tool calling, the templates, and validation. Using GraphRAG, graph-based retrieval-augmented generation, for grounded consistency: dynamically having it generate the SQL or Cypher or whatever other programmatic extraction, and then executing that, so it doesn't have to make things up. And then HITL, human in the loop. I've built interfaces before where, kind of like with Cursor and other tools, you can provide it with rules that it must operate according to, and human beings can go in and edit those. Then every time you ask it a question, it hits those rules and uses them in the context thread to make sure it does things accurately.
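
As a hedged sketch of that grounded-consistency idea, here is roughly what generating Cypher, constraining it, and executing it against Neo4j could look like in Python. The connection details, the graph schema (Person, Episode, PRESENTED), and the generate_cypher placeholder are all assumptions for illustration, not the AskRADS implementation.

```python
# Hedged sketch of GraphRAG-style grounding: the model proposes a Cypher query,
# a simple guard constrains it to read-only MATCH statements, and the answer is
# built from the rows the graph actually returns rather than model memory.
from neo4j import GraphDatabase


def generate_cypher(question: str) -> str:
    # Placeholder for an LLM call that translates the question into Cypher.
    return (
        "MATCH (p:Person)-[:PRESENTED]->(e:Episode) "
        "WHERE toLower(e.title) CONTAINS 'machine learning' "
        "RETURN p.name AS expert, e.title AS episode LIMIT 5"
    )


def grounded_answer(question: str) -> list[dict]:
    cypher = generate_cypher(question).strip()
    if not cypher.upper().startswith("MATCH"):  # crude read-only constraint
        raise ValueError("Only read-only MATCH queries are allowed.")
    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))  # assumed details
    with driver.session() as session:
        rows = [record.data() for record in session.run(cypher)]
    driver.close()
    return rows  # facts come from the graph, not from the model's memory
```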

And so one thing that I've learned, especially in 2025, and it's only going to get better as time progresses, is that a lot of these models are really good at this point. And the problem isn't necessarily that it is the model. It's more the agent orchestration. Which brings me to another concept I've realized, which is train the agent, not the model.

AskRADS in action

So the result of this is this interface where you can ask it a question. You can say, hey, who's the person I need to talk to about machine learning? I'm not exactly sure of the right way to do this. Or not necessarily who's the person, but what's a thing that someone has done before in a situation similar to the situation that I'm facing right now? And it can extract things from the transcripts from our previously recorded episodes and then make a recommendation and then suggest that maybe you could talk to that person if you need to do any follow-up.

So as a result of AskRADS, expert practitioners now can find and use existing solutions rather than reinventing them. And colleagues can approach those experts with novel and well-formulated requests, which is really important. And then non-expert collaborators are now more certain about who to approach and how to communicate with them, and they're provided with clear, actionable formulations of their problems. So that's all I've got. I went through it a little bit quickly, but I'm open to questions if there's time.

Q&A

[Audience question, partially captured: tracking usage and the quality of the responses the AI is giving.]

So those are a couple of different kinds of questions. I know that Posit has a tool called Chronicle, where you can look at the usage of things that have been deployed to Posit Connect over time. A couple years ago, I built something that could look at all of the usage records as well, and I joined it onto the departmental tree from Workday. But that's in terms of external operational usage. In terms of internal things, it is possible, though I have not built it, to store the questions and the responses, since these are company-internal, and then improve on that; you can do analytics to extract ontological representations of what things are talked about. So I haven't done that, but it's absolutely possible, and it might make sense to do it, of course, as long as you have consent from all the parties involved.
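
For what that internal logging could look like, here is a minimal sketch of storing question/answer pairs for later analytics, using sqlite3 purely for illustration; as noted in the answer, this piece has not actually been built in AskRADS.

```python
# Hedged sketch: log each question/answer pair (with consent) so usage and
# response quality can be analyzed later. Table and column names are invented.
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect("askrads_log.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS qa_log (
           asked_at TEXT, user_id TEXT, question TEXT, answer TEXT
       )"""
)


def log_interaction(user_id: str, question: str, answer: str) -> None:
    conn.execute(
        "INSERT INTO qa_log VALUES (?, ?, ?, ?)",
        (datetime.now(timezone.utc).isoformat(), user_id, question, answer),
    )
    conn.commit()


log_interaction("u123", "Who knows about survival analysis?", "Try episode 24.")
```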