The Reluctant Admin: motorsports, data science, & IT | Brijesh Chejerla | Data Science Hangout

Transcript

This transcript was generated automatically and may contain errors.

Hey there, welcome to the Posit Data Science Hangout. I'm Libby Heeren, and this is a recording of our weekly community call that happens every Thursday at 12pm US Eastern Time. If you are not joining us live, you miss out on the amazing chat that's going on. So find the link in the description where you can add our call to your calendar and come hang out with the most supportive, friendly, and funny data community you'll ever experience.

I would love to introduce our featured leader today, Brijesh Chejerla at Florida Blue. Brijesh, could you introduce yourself? Tell us a little bit about you and something you like to do for fun.

Hey, everyone. I'm Brijesh Chejerla. I am a data scientist at Blue Cross Blue Shield of Florida. I have a PhD in computer science. And by way of introduction, I'm an enthusiast in many things. Data science happens to be one of the conduits to help prosper my enthusiasm in those many things. Machine learning is at my heart, it's at my core. And I chose to be a data scientist way back in 2016, because after I graduated from my university with a PhD, I was like, what do I do with this?

And I was thinking about where do I go next? Because I was pretty particular about what I would do and which field I would be in. And then I realized being a data scientist, you are a plug and play model. You can be in different fields. You could be in sports, you can be in medicine, you could be in healthcare, you could be in construction, you could be whatever. And then they all kind of fit together. You do very similar stuff. But at the core of that, it's still the machine learning.

So that's kind of the reason why I chose to be a data scientist. And yeah, I'm here. I graduated from being a data scientist to a machine learning engineer, I would say. As for fun, I love to watch football, soccer for the American audience. And I spend a lot of time, I would say, going through the analysis and all of those things, you know. So that's kind of my fun activity for the most part. Generally, I love to listen to podcasts. Most of these podcasts these days are football related for me, because I'm in that space right now. But then otherwise, so to speak, I just love to converse with people. You know, that's a fun activity for me.

Develop new perspectives, because once you think like an architect, once you do system design, once you get into data engineering, your perspective about data science will change.

Tools, the Posit platform, and being a reluctant admin

Our tech stack for data science machine learning at Florida Blue is Posit Workbench, Posit Connect. So, we develop in Posit Workbench, we deploy in Posit Connect. I was a reluctant admin to begin with because I didn't know anything about it, honestly speaking, but then I'm glad I got into it because being an admin changes the way you look at how you want to deploy stuff, how you want to make your models available. So, that helps you architect your solutions better. That helps you become a better solutions architect for that matter.

Now, I think we had around 150 developers and about 800 to 1,000 consumers of the content. That's the license that we have for it. And about 150 active developers and increasing. The last I checked was 150. So, you know, we kind of used to work in silos, and we would get emails back and forth saying, okay, this is not working, that is not working. So, what we did was we just created a Teams channel and said, if you have any questions, post it here. And if somebody else also knows the answer to that, go ahead.

So, we then have a set of packages that are internally developed to our own needs. And those are specifically used on the Posit platform, you know, to connect to databases, to deploy, and to do all of these things. So, that's kind of how it started. And then it's at full maturity right now. Our applications are basically on-prem, and we are now looking to move into cloud.

If I had a DevOps team, I as a developer would just want to wake up in the morning, make my coffee, sit, just turn on my computer and log in and then just get away, you know? So, when I was the admin, that's kind of what I wanted to provide to our users so that they have the least amount of friction to do the stuff that they wanted to do. But at the same time, that was a lot of work for me outside of my data science day-to-day work.

So, if you ever want to get into admin, you have to learn Linux. You have to learn how that works. You have to understand how to solution some things from a security perspective, which is, again, a different, difficult thing depending on the field that you're in. So, in healthcare, security is paramount, right? So, you have to be extremely specific. Users, especially data scientists and analysts, aren't necessarily software developers. So, they don't go through the SDLC process all the time. So, you'll have to guide them through those things. So, that got me into the security. That got me into being a cyber secure software developer hack. And so, that's kind of how my journey grew.

Yeah, it seems like people want to write a book called The Reluctant Admin as well. But I was wondering, how did you so successfully transfer that IT admin ownership over, especially with so many people being reliant on the tools? Like, what was that transition like?

So, the transition period was not easy, honestly, if I'm being very honest. Posit admin is a very niche-specific admin role. It's not your general systems admin. It's not your general Hadoop admin. It's not your general Linux admin. You need to have the knowledge of all three of them, and you need to know how Posit Workbench works. You need to know how the config files are set up. You need to know how Posit is set up or is expected to be set up. You need to know all of those things, and then you need to know the connect side of things as well.

And being a Posit admin also comes with, hey, I have these R-related questions. Can you help me? I have these Python-related questions. Can you help me? So, I'm not an R person. I've never tried to be an R person. I would just say I'm literate at R. So, any of my R questions, I kind of diverted to my teammates who are experts in R. Any of the Python-related questions, I handle. So, it's a very niche kind of role where you have to be extremely good at many things at the same time.

So, the new admins who have come in, they came from not this kind of background. They weren't R developers. They weren't Python developers. So, it took us a while. So, we kind of developed an internal mechanism where it made it easier for them to kind of know what sort of issues come up. So, what we did was we have Git issues and we created a dashboard, you know, a Git board. And then, basically, if you're a new admin, you can search for an issue and that issue will have been resolved somewhere. And then, you know, if, let's say, I didn't exist in this company tomorrow, you can actually look at how that issue was resolved and then go about resolving that issue. That's kind of how we designed it and set it up.

AI's impact on data science roles

With AI, this question is pertinent to a lot of us. How do you think AI is going to impact data science roles? And then, you know, do you use AI as a productivity tool in your day-to-day? And if so, how?

Yeah. Very pertinent question. So, yes, AI is going to impact not just data science. AI is going to impact even software development for that matter. AI is going to impact to a reasonable degree now architecture in the future. Probably, you know, that's where we're going to go. I think the only safe stream right now is data engineering. So, if you are a good data engineer and, you know, you're worth your salt, then you're safe for now is what I would say.

I'm pretty sure most of us have seen that, you know, you throw a CSV, let's say, at the model and you ask it questions, it's going to generate X, Y, and Z. If you are a Pandas user, you know that Pandas AI is a package that you can actually load to just ask it questions, you know. So, it just depends on how it's being used. California is a different bubble in itself altogether. So, outside of California, all of the other companies are still trying to get up to speed with this. In many companies like ours, there are a lot of, you know, compliance and regulatory issues that we have to overcome before we start doing all of these things because of where the data sits and where it goes and what you can ask and what is quote-unquote touched by AI.

So, for now, it's okay, but then very soon, there's going to come a place where many of the jobs are going to be redundant because you can then do more with less. Now, but that also means that you're not necessarily going to get more productive. It means that you're going to have to work on a lot of things because the assumption is that, hey, you don't now have to write the code that you otherwise previously would have to. So, those assumptions are changing. I am seeing that in multiple organizations, you know, the whole idea of trying… For instance, we have an AI coding agent, like we have Windsurf, you know, as part of the organization. So, many people use that. I set up a few things so that I use it for code review, you know. It helps me review the code much faster because with all of the other things that I do, I won't have enough time to actually do enough, like, detailed code review.

So, I ask it to do X, Y, and Z, and it comes back with whatever. But then I still rely on my own know-how to actually go and look at the logic that's actually written for the business. It's not just about the correctness. It's also about, is the business logic correct? Are you reading the right kind of columns? Does it make… The transformation that you do, is it clinically relevant? All of those things, I don't think AI can do yet, you know.

In order for you to hedge yourself against that risk, I would still say, be fundamentally thorough in your machine learning, deep learning, statistics, mathematics, whatever you have it, you know, and then coding is kind of out of our hands. You know, you tell people I'm a better coder than a coding agent right now. It's a hard thing to prove these days. But you can always still say that I'm a much better data scientist, or I'm a much better machine learning engineer than a coding agent or, you know, the AI can do right now. And you would still get away with that, and people will trust you.


System design and thinking like an architect

It's not just about data science models that you build. So we also build up the applications in our team, right? So that brings in itself its own level of system design that you'll have to think about. You'll have to understand what works, what doesn't work. That's another part where AI kind of falls short, because you are the one who has to think about it. There's practically no way for the AI to be cognizant of all of the existing stuff that's going on in your organization and how things are set up and all of those things. So your system design is basically your architecture at the end of the day. Any architect's fundamental thing is system design.

When you architect your solution, it's not just about which algorithm I use or how I input the data. It's about how many users are going to use it. How am I going to connect it to different things? How am I going to fetch from this database at whatever pace? Is it going to be given to me in a stream? Is it going to be given to me in a batch? Is it going to be given to me in a JSON format? Is it just going to be a row by row fetch? All of those things you'll have to think about when you build your model, because your model outputs are going to then be dependent on the speed at which your system integrates. So system design is definitely something that's a must for most machine learning engineers.
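To make the row-by-row versus batch question concrete, here is a minimal sketch in Python using the standard library's `sqlite3` module. The `claims` table, its columns, and the helper names are hypothetical and only serve the illustration; they are not from the speaker's setup, and a production database driver may behave differently.

```python
import sqlite3


def make_demo_db(n_rows=10):
    """Build a throwaway in-memory database (hypothetical 'claims' table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE claims (id INTEGER PRIMARY KEY, amount REAL)")
    conn.executemany(
        "INSERT INTO claims (id, amount) VALUES (?, ?)",
        [(i, float(i) * 10.0) for i in range(1, n_rows + 1)],
    )
    conn.commit()
    return conn


def fetch_row_by_row(conn):
    """Yield one row at a time: low memory use, one scoring call per row."""
    cur = conn.execute("SELECT id, amount FROM claims ORDER BY id")
    for row in cur:
        yield row


def fetch_in_batches(conn, batch_size):
    """Yield lists of rows: fewer, larger hand-offs to the model."""
    cur = conn.execute("SELECT id, amount FROM claims ORDER BY id")
    while True:
        batch = cur.fetchmany(batch_size)
        if not batch:
            break
        yield batch


conn = make_demo_db(10)
rows = list(fetch_row_by_row(conn))
batches = list(fetch_in_batches(conn, 4))
```

Both functions return the same data; what changes is how often the downstream model is called and how much it receives per call, which is the kind of integration detail the paragraph above is pointing at.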

Becoming a data expert and growing your career

I'm a little young in my career as a data scientist. So I wanted to ask, it seems like you've switched a lot of careers where data is vastly different. The systems that you're using are vastly different. And as data scientists, we're expected to know, be the experts on our systems and data. How do you quickly become an expert on the, when you walk into a new job, how do you quickly become the data expert?

I don't think there's any such thing as quickly becoming a data expert. So the way I grew into my role is, if this is a funnel, I'd say here is where data scientists usually operate. And I'd say, stop just being here. This is a local minimum for you. Just don't be here. Which is why I say, if you think like an architect, system design comes into the picture. You have to know where your data exists, what your data pipeline is. So depending on whichever company that you work for, try and get to know the data pipeline.

And then you'll have to start thinking about, if I have to collect more features for whatever data science work that I'm doing, can I extract features only from this relational database or from NoSQL, or can I also go above and beyond and say, you know what, from this particular WAV file, I can actually transcribe this document, this audio file, and then do some NLP on top of that. That is kind of how you have to think, because you should never stop yourself right here. You also have to think like a data engineer. You have to think about, okay, but if I have to connect these two different databases, how do I go about connecting them? Am I writing the most optimal SQL?
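As a rough illustration of combining a relational source with a text-derived feature, here is a small Python sketch using `sqlite3`. The `members` and `call_transcripts` tables, the column names, and the `mentions_bill` feature are all hypothetical, and the sketch assumes the audio has already been transcribed to text by some other step that is not shown.

```python
import sqlite3

# Hypothetical schema: one table of members, one of already-transcribed calls.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE members (member_id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE call_transcripts (
        call_id INTEGER PRIMARY KEY,
        member_id INTEGER,
        transcript TEXT
    );
    """
)
conn.executemany("INSERT INTO members VALUES (?, ?)", [(1, 34), (2, 58), (3, 41)])
conn.executemany(
    "INSERT INTO call_transcripts VALUES (?, ?, ?)",
    [
        (10, 1, "I have a question about my bill"),
        (11, 1, "calling again about the same bill"),
        (12, 2, "please update my address"),
    ],
)


def build_features(conn):
    """Join both tables once, then derive a simple text feature per member."""
    # LEFT JOIN keeps members who never called; aggregation happens in SQL
    # so Python only sees one row per member.
    query = """
        SELECT m.member_id,
               m.age,
               COUNT(c.call_id) AS n_calls,
               COALESCE(GROUP_CONCAT(c.transcript, ' '), '') AS all_text
        FROM members AS m
        LEFT JOIN call_transcripts AS c ON c.member_id = m.member_id
        GROUP BY m.member_id, m.age
        ORDER BY m.member_id
    """
    features = []
    for member_id, age, n_calls, all_text in conn.execute(query):
        words = all_text.lower().split()
        features.append(
            {
                "member_id": member_id,
                "age": age,
                "n_calls": n_calls,
                "mentions_bill": int("bill" in words),
            }
        )
    return features


features = build_features(conn)
```

The keyword check stands in for whatever NLP step would really be used; the point is only that the join and aggregation are done in a single query rather than one query per member.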

And then most importantly, I would say you'll have to think like the business or like your end user. If you were a recipient of the analysis that you were given, what would you want or what else do you want? How deep would you want to go? And so that's kind of how you have to switch your thinking. So your data science, with all of the machine learning and the statistics background that you must have learned in school, or that you're learning on the job, plus being an architect or a system designer, and then being a data engineer, is fundamental to being good across the machine learning stack.

So like I said, go and look for data that you probably don't even think you need for right now. Just look at the data, like see what's available. And then you'll know that, okay, when the time comes, oh, I have this data over here, would that be useful? Or there is data over here, which is somewhat translatable to the feature that I'm looking for. Let me see and add that. That's kind of how I would go about it.

GitHub portfolios and job seeking

I've been told that I need to have a portfolio and I need to have done a bunch of projects on GitHub to be a stronger job-seeking candidate. How true is that?

It is true to a certain extent, especially if you're in a tie break situation. That's my first thing that I look for is, do they have a GitHub profile? And what kind of work they've done? I'd like to see what kind of changes they've made over time, the comments that they do and stuff like that. But at the same time, it's more relevant if you are very early in your career.

I just say, start on the side, building out projects that you think you're interested in, not to show somebody else that you can do X, Y, and Z. But if it's a project that you think that you're genuinely interested in, automatically you will go very deep into that. And when you talk about that in your job interview, you will then tell them how you thought through the problem, how you thought through the solutions, how you thought through the system design, if there is a system design to that, how you thought through the scale of the problem, how you thought through the lack of data. All of those things will automatically happen. So even if it's just one project, the depth of the project is sufficient.

The way I look at interviewees is how they are able to think, especially in this day and age when you have coding agents and the need to be an exceptionally good coder is kind of reduced. It's all about the way you think about a solution and the way you come up with the solutions. Have you been able to put out a dashboard of sorts? And all of those things do matter.

Career advice

Like I said, when Logan was asking, don't put yourself in a local minimum. If that's what you want to do and you're happy in that space, that's fine. There's nothing wrong with that. But if you are a curious person, if you want to grow up the ladder, don't just put yourself in a local minimum.

I think the first thing that you need to develop is perspective as a data scientist or a machine learning engineer or whatever. It's about developing new perspectives. You will have to spend your time and actively curate what perspectives that you develop and how that translates to what you do in your job. I very consciously built myself into thinking like a data scientist. I was a researcher, so I used to think like a researcher. And then I quickly realized that just thinking like that wouldn't help because there are deadlines to reach. There are so many other things, not every job, not every problem is a research problem. So you'll start having to think about what is the simplest way I can get to the solution.

If there is a simple way, do I even need to go and think about a more complex machine learning solution? Take it to the business. If the business is happy with that, okay, so be it. But then you kind of get a buy-in because I think a lot of people's issues are we are not really given enough time to implement X, Y, and Z. Initially, you will never be given enough time, but then you get a buy-in after producing some results and then tell them, hey, usually this is not a good thing for X, Y, and Z reasons.

First and foremost, be kind. That's the first thing that you should do. Be kind to people. And as a data scientist or a machine learning engineer, you have to train yourself to be unbiased. What that means to say is you have to stop thinking of this works. I know I have so much experience, blah, blah, blah. Put all of that aside when you get into a meeting with someone. Hear them out. Don't talk over them and hear them out and understand what they're trying to say.

Many times when we go and sit with the business, we try and get the requirements. The getting the requirements part is more like the business says these nine things are my requirements. Am I delivering on that? No. There's a lot in between the lines. So try and understand what the business really wants. Sometimes the business doesn't really know what they want. So try and talk to them about it.

If you're interested, be a mentor. Becoming a mentor or becoming a teacher changes the perspective altogether in the way you actually go about your own job. Because then there's a difference between you understanding something and you explaining something. And then like Feynman or Einstein, either of them said, if you're not able to explain something adequately, you haven't understood it well enough. So being a mentor to somebody who's a junior or doesn't have as much experience as you always helps. And they bring in new perspectives that you probably never would have thought of.

And I think this is extremely important for data scientists, develop rapport with people and get them to invite you into meetings that you have no business being in. Just sit there and listen to what they are discussing, especially if it is business related, because you will then understand why business is asking for something rather than what they're specifically asking for. So just go there. I used to do this. I used to just say, Hey, can I just join here? And then I used to say, can I be a fly on the wall? And they'd be like, okay, CC Brijesh. And I just go and listen. And I developed so many perspectives. That's kind of how I developed my know-how of some of the domain knowledge. Otherwise you won't get that domain knowledge. You will then hear more problems in the business, and then you think, Oh, you know what? I can actually solve for this. That gives you some more acceptance and buy-in as well in the company that you are working for.
