
From Physics PhD to MLOps builder - Julia Silge - The Data Scientist Show #087
Julia Silge is an engineering manager at Posit PBC, formerly known as RStudio, where she leads a team of developers building open source software for machine learning and MLOps. Before Posit, she finished a PhD in astrophysics, worked for several years in the nonprofit space, and was a data scientist at Stack Overflow, where some of her most public work involved the annual developer survey. We talked about MLOps tools, challenges in survey data, text analysis, and balancing her interests in data science and engineering.

Subscribe to Daliana's newsletter on www.dalianaliu.com for more on data science and career.

Daliana's Twitter: https://twitter.com/DalianaLiu
Daliana's LinkedIn: https://www.linkedin.com/in/dalianaliu/
Julia's LinkedIn: https://www.linkedin.com/in/juliasilge/
Julia's Website: https://juliasilge.com/

00:00:00 Introduction
00:00:51 Getting into data science
00:04:45 Transition from data science to engineering manager
00:13:59 Common challenges in tool development
00:17:33 Challenges with survey data
00:26:42 Engineering skills for data scientists
00:28:54 Balancing roles
00:34:44 Developing skills in exploratory data analysis (EDA)
00:39:14 Python vs. R for data analysis
00:44:35 Exciting aspects in career and personal life
image: thumbnail.jpg
Transcript
This transcript was generated automatically and may contain errors.
Hello everyone, welcome to The Data Scientist Show. Today we have Julia Silge. Julia is a data scientist and engineering manager at Posit PBC, formerly known as RStudio, where she leads a team of developers building fluent, cohesive, open-source software for machine learning and MLOps. Before Posit, she finished a PhD in astrophysics, worked for several years in the non-profit space and was a data scientist at Stack Overflow, where some of her most public work involved the annual developer survey.
Today we'll talk about MLOps tools, challenges in survey data, text analysis, and balancing her interests in data science and engineering. If you like the show, subscribe to the channel, leave a comment and give me a five-star review. Welcome to the show, Julia. Thank you for having me. I'm really glad to be here.
From physics to data science
So Julia, how did you get into data science? Great, like you mentioned, my academic background is in physics and astronomy, and I was working in research and was realizing that the academic world was not gonna be a fit for me long-term. My path at that point was a bit circuitous. I worked for an ed-tech startup. I actually was a stay-at-home mom for a few years, but eventually I started to see some people who were similar to me in background, with this physics and astronomy training, making a transition into data science.
I talked to them and I was like, wait a minute. So there's a job out there where what you do is make plots, talk to people about analytical results, analyze data, and that's your actual job? That was my favorite part of when I was doing astronomy. So I thought this is gonna be a pretty good fit for me, and I decided to try to make this transition.
I, at the time, was a little bit underemployed. I actually had been laid off just a few months before this, and I was employed as a contractor, and I took about six months and took basically every MOOC, every massive open online course, that exists to really learn some of the modern data science languages. I had a pretty strong programming background from the work in physics and astronomy, but I had never taken any formal stats course. I didn't know modern data science languages like Python, SQL, R. I didn't have machine learning experience, because when I came up through physics and astronomy, it was not as common for people to use those kinds of machine learning tools.
I took a bit of time to do a lot of self-study, and then as part of that self-study to try to make this transition, I started writing a blog. And my vision for this blog was that it would be projects that I could talk to people about during job interviews, because my resume looked a little weird, and I thought I really need to have evidence that I can do this job. It turned out that blogging opened huge doors for me, both in terms of jobs, but also with open source collaboration. I met people that I eventually wrote books with. A big part of my transition was doing this public-facing kind of work.
I did get a job. So the first job that I got was in the nonprofit space, which I think is a really interesting way for people who are transitioning in, because often nonprofits have quite a lot of data, but they don't have as many resources for how to best use that data, how to best take advantage of that. Of course, salaries are not high in the nonprofit space, but it was really a great place for me to get that first data science title, that first job where that was my title, and to demonstrate that I could do this in a real org. So that's a little bit of how I got to that first kind of data science role. After that, I worked as a data science practitioner, like with the title data scientist, for several years at different kinds of orgs, moved from the nonprofit space into tech itself, and then moved to Posit about four years ago now.
Transitioning to engineering manager and MLOps
So when my title was data scientist, I noticed that I would spend about 80% of my time doing data analysis, and I would spend about 20% of my time building tools. This could be an internal tool at the organization where I was, like an internal package or library, something to make the other people in my org more effective. It also was open source tools. I contributed to open source software, starting pretty early in my transition into data science, and I noticed that I really am motivated by thinking about the systems people use to get their work done, the tools, the really practical nitty gritty of how people do their jobs. I was always interested in that kind of systems thinking: how do we go about making people more effective when they're doing text analysis, when they're building a machine learning model?
I was happy then, because I got to do that 80-20 split of data analysis and tool building. When I was at a transition point looking for my next thing, I was really excited to see this job at Posit, but I knew it would be a change, because basically that ratio flipped. So now I would say I spend 80% of my time focused on tools for people who are doing data science, and maybe 20% of my time actually doing data analysis itself.
One thing I notice about myself is that I really do like to be involved in both. I think if I ever had a job that was all one or all the other, I would not like that as much. It's been interesting to step back from a role where my primary focus is doing data analysis or data science into a role where my primary focus is how can we build effective, fluent tools for people to be able to do the kind of tasks that they do. So I like doing a little bit of both. When I joined Posit, I started out just focusing on tools for model development, like building machine learning models, and then transitioned into a leadership role and building tools for machine learning operations, what people call MLOps.
So it's that process of: you've already trained a model, now what do you do? Like, congrats, you trained a model. What are you gonna do now? How do you deploy that model? How do you version that model? How do you maintain that model in the long run and know when it's time to retrain it? That's the kind of work I do now, and it's a really interesting space. Like I said, I love thinking about the really practical parts of how people do their jobs, and I get to think about that a lot now.
Vetiver: the MLOps tool
So when I joined the company I'm at now, it was called RStudio. And a couple years ago, the company rebranded. So the new name of it is Posit, Posit PBC. We're a public benefit corporation. And part of that rebranding was to clarify that we're not a company that just makes tools for R. We're a data science company. The two main languages that we support are Python and R, but also some Julia and Observable. Basically, tools that people use to deal with data.
So the project that I work on now is called Vetiver. And if you are a perfume person or a candle person, you may have heard that word, vetiver. It's a stabilizing ingredient in perfumery. In a perfume, you might have lots of volatile fragrances, but vetiver is a stabilizing ingredient. And so the metaphor here is that Vetiver, the project, is for stabilizing your models. Your models might be these more volatile things, but Vetiver helps you have confidence in the reliability of the model.
So it's a project for Python and for R. And what it focuses on is: you have a model trained, and what do you need to do to maintain it in production, reliably and efficiently? There are three main tasks in the approach. The first is versioning your model. So that's, oh, I trained it this week and I got this result, but I trained it last week and I got these results. How do I keep those model versions really organized, in something like a model registry? So versioning is the first one. The second one is deploying. So that's the process of getting the model out of the computational environment where you developed it. Maybe think of that as your laptop, or maybe you developed it in a server kind of environment, but you developed it somewhere, and then you need to lift it out with all the computational pieces that it needs and put it somewhere else, so that it can be in a production environment where it's integrated into the infrastructure of your org.
One way that you can do that is with Docker. There are other ways as well that Vetiver supports, but that might be a good way to think of it. So instead of it's on my laptop, you're like, ah, it's in a Docker container and it's ready to make predictions, because we have captured all the software requirements. So versioning is the first piece, think of a model registry. Deploying is the second piece, think about getting it off your laptop, maybe into a Docker container. And then the last piece is model monitoring. So the model is in production. It's serving predictions in whatever way is appropriate for your organization. But you need to know: is the model performing as you expect it to? Is it starting to degrade? Are you having that sort of model drift that you need to address? So that third piece is the monitoring and maintenance piece.
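The monitoring piece Julia describes can be sketched in plain Python. This is only an illustration of the idea (Vetiver itself provides real tooling for it); the window size and alert threshold are made-up values:

```python
from collections import deque

def rolling_accuracy_monitor(window_size=100, alert_threshold=0.85):
    """Track accuracy over the most recent predictions and flag
    when it drops below a threshold -- a crude drift signal."""
    recent = deque(maxlen=window_size)  # only the latest outcomes count

    def record(prediction, actual):
        recent.append(prediction == actual)
        accuracy = sum(recent) / len(recent)
        # Only alert once we have a full window of evidence.
        alert = len(recent) == window_size and accuracy < alert_threshold
        return accuracy, alert

    return record
```

Each time the production model serves a prediction and the true outcome later arrives, you would call `record(prediction, actual)` and act when `alert` fires, for example by scheduling retraining.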
MLOps challenges and user research
So before we started deciding exactly what would be the shape of the MLOps tool we would build, I did a round of user interviews. I went to both people who were customers of Posit, so these are people who use our professional products, and also people in the open source ecosystem, people who maybe are users of scikit-learn and our tidymodels, people who are not our customers. I designed an interview so that we could really understand: what do people like about the tools they're trying to use now? What are the deficits that exist?
Yeah, one thing that was just hammered home to me, both through my own interviews and through the reading that I did, is that in many organizations, the people who are responsible for this are the people who are also developing the model. So in many places, if you're someone who develops a model, and by that I mean that process of hyperparameter tuning, evaluation, how do I train the model, you are often also the person responsible for operationalizing that model. But many of the tools that are out there for MLOps are really built with a software engineer user in mind, and don't really acknowledge the iterative, exploratory nature of what data science and machine learning are like.
The paper, we can add it to your show notes, I think that was one of the big takeaways from this interview study that was done of people who do MLOps. A big difference between what you might call regular software engineering and the process of MLOps is that MLOps is more iterative and interactive in nature. And so you have to build tools for that kind of person, for that kind of user persona.
Just as a really concrete example: if you have an app, just a software app that does not have machine learning in it, and someone talks about the performance of it, they're probably talking about things like latency, how fast does the app run, how much RAM does the app use. But when you're talking about a machine learning app, an API that serves a model or something like that, you do have to think about all those things, latency, how much memory it uses, how fast it returns predictions, but you also have to think about performance in terms of the statistical characteristics of the model.
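The two meanings of "performance" here can be shown side by side in a small sketch. The `predict` function is a hypothetical stand-in for a real model, not anything from Vetiver:

```python
import time

def predict(features):
    # Stand-in for a real model: a trivial threshold rule on the
    # sum of the input features (purely illustrative).
    return 1 if sum(features) > 1.0 else 0

def evaluate(batch, labels):
    """Report engineering performance (mean latency per prediction,
    in milliseconds) alongside statistical performance (accuracy)."""
    start = time.perf_counter()
    preds = [predict(x) for x in batch]
    latency_ms = (time.perf_counter() - start) * 1000 / len(batch)
    accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
    return latency_ms, accuracy
```

A regular app would stop at `latency_ms`; a machine learning app also has to watch `accuracy` (or whatever statistical metric fits the model) over time.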
Survey data challenges at Stack Overflow
Yeah, working on the Stack Overflow developer survey was honestly quite stressful at the time, partly because it was fairly public. People would pay quite a bit of attention to it, and it was enormous. In the years that I worked on the survey, it would have on the order of 70,000 to 90,000 responses, and it's an online opt-in survey. In this way, it's different from surveys that you might hear about from government statistical organizations, where they want a sample that is reflective of the population of that country, and they work very hard so that they know how to map from the responses they get to the population.
Now, there's no census out there of developers. No one knows actually how many there are, or what their demographic characteristics are like, and when you have an online opt-in survey, one of the aspects that I found most challenging was how to accurately communicate about the nature of those results. This is something that Stack Overflow got a fair amount of criticism for over the years, and some of that I think was quite fair. A big piece of this would be gender representation in that survey.
I don't remember the exact numbers, but the first year that I worked on that survey, about 5% of the respondents were women and gender minorities, very low. And if you are a woman in tech and you see that, you think, this survey is not representing me. So one of my priorities when I worked on the survey was, first of all, we invested in trying to increase that proportion. We would go to groups like PyLadies, groups where you would find women in tech, and say, hey, can you share this with your members, like in your Slack, because we want to hear from women and gender minorities in tech. And we did, we doubled the proportion of respondents. But the thing is, we don't actually know what the real answer is.
So we tried to prioritize understanding differences. Yeah, I think that makes sense.
Another interesting, more technical challenge, less of a people problem and more of a technical problem, is that at the time at Stack Overflow, the data science ecosystem was all R and Python. And at the time, Stack Overflow had no Python, no R in production anywhere. So I was doing the analysis with these data science tools, but we needed to get the results to production, to the website, and there was no way for me to use R or Python in that. Stack Overflow is a pretty interesting website. It is a super high performance, really fast website that has been carefully built with these really fast technologies. So Python's too slow, R's too slow. It's not a good fit.
What I found was that the best way for me to deliver those results was to deliver them in an API. In R, you can use Plumber; in Python, you can use FastAPI to deliver results from a computer over here that has R or Python available. And then you can publish an API, you might think of it as a survey microservice maybe, and the high performance website can slurp up the results when it needs to, update on whatever time period is appropriate, and get those results in.
I did that actually with not only numbers, but also with plots. I would serve SVG plots in an API to get slurped up into the website. When people think about data science tools and skills, you don't think as much about, can I serve my results in an API? But someone who has the data chops and is also able to back that up with some level of engineering chops, sure, I can publish an API for you, where do you want it? That is something the survey particularly showed me: how much that can scale your impact in an organization.
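The "survey microservice" pattern Julia describes, one endpoint for numbers and one for an SVG plot, can be sketched with just Python's standard library. In practice she used Plumber (R) or FastAPI (Python); the endpoint paths, survey numbers, and SVG here are all invented for illustration:

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Made-up stand-ins for real analysis results.
RESULTS = {"respondents": 90000, "top_language": "JavaScript"}
SVG_PLOT = ('<svg xmlns="http://www.w3.org/2000/svg" width="100" height="50">'
            '<rect width="90" height="20"/></svg>')

class SurveyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/results":
            body, ctype = json.dumps(RESULTS).encode(), "application/json"
        elif self.path == "/plot":
            body, ctype = SVG_PLOT.encode(), "image/svg+xml"
        else:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", ctype)
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep request logging quiet

def serve(port=0):
    """Start the survey microservice in a background thread; port=0
    picks any free port. Returns the server object."""
    server = HTTPServer(("127.0.0.1", port), SurveyHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The high-performance website would then fetch `/results` and `/plot` on whatever refresh schedule is appropriate, without ever running R or Python itself.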
Engineering skills for data scientists
So I think that there are some skills that are table stakes, and those are things like, in your language of choice, being very skilled and effective at managing the nitty gritty of installation, being someone who effectively manages the software dependencies of your project. Maybe that seems even like a funny thing to say, but I run into people who really struggle with this piece, who really struggle with understanding dependency management of projects and how to handle that. And it is a whole skill. So that one I think is table stakes: being competent at that, right? That's table stakes.
Then I think the things that will really push you forward, if you can learn some combination of these: one of them is being someone who not only can manage dependencies, but who has packaging skill. So you can build a Python package, you can build an R package, to package up code and be able to run it somewhere else, so that you can then install it. Not only can you manage it, you can be the creator of it. Another really big piece is being someone who understands REST APIs, like I just mentioned. And being someone who maybe has some of the DevOps skills, like you understand Docker, when it is appropriate to use Docker, and how to use it. So these are skills that are not specifically about data, right? They're not specifically about statistics or machine learning or data analysis, but they're about how you operationalize the data analysis that you're doing.
Data storyteller vs. tool builder identity
Oh, that's a really tough question for me. That's a very interesting question, because my day job pays me for the tool builder piece. That is my primary professional focus, and that's what I'm paid for, I guess. That's what's in my job description, the tool builder piece. But how I think about myself is still on the practitioner side: oh, I'm someone who does data science. You introduced me as a data scientist and engineering manager, but I'm actually not a data scientist at my company. We have data scientists, and I'm not one of them.
I do, no, I think that's true. And I think it affects how I decide to build tools. The fact that that is my primary identity really informs the kinds of tools I wanna build, what I care about in those tools, how much I care about documentation and communicating about those tools effectively. So that is a super tough question, but I think it is most honest to say that my primary professional identity is still on the practitioner side.
Text analysis and the blog
One of the things that I love the most is text as data. It's interesting in the era of LLMs to have so much more of the data science ecosystem thinking about text as data, but I love analyzing text, both from the very exploratory data analysis stage through to training models. And I am interested in how these biggest models are being trained and used. I will say, though, I spend most of my time a little bit lower in the hierarchy, a little bit more towards exploratory work, visualization, and not those hard-to-interpret giant models, but models that are more transparent, that are easier to interpret and know what they mean.
One of my recent blog posts was about Taylor Swift's lyrics: treating the lyrics of Taylor Swift as a text data set and building a model with that data. The particular model is an unsupervised model called a topic model. Topic models treat documents as mixtures of topics, and each topic as a mixture of words. I feel like it's a really underrated tool for when you have a corpus of language, open-ended survey questions, there are many ways you can apply this. But in this fun example, I looked at the lyrics of Taylor Swift, treated the different songs as different documents, and then tried to understand how they are related to each other. If I remember correctly, some of the results were that the early albums of Taylor Swift have a lot in common in terms of the lyric content, if you look at the words that are there, like words about romance. Reputation is really unique, it is lyrically unique compared to her other works, which makes sense. And then Folklore and Evermore have a lot in common; they're two of the albums that are the closest in terms of their lyric content, which definitely also makes sense.
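A topic model of the kind described here can be sketched with scikit-learn's `LatentDirichletAllocation` (Julia's own post used R tooling; this is just one way to do the same thing, and the toy "songs" below are invented, not actual lyrics):

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus: each "song" is one document (invented words, not real lyrics).
songs = [
    "love heart romance kiss love",
    "heart love kiss romance dream",
    "rain night alone shadow night",
    "shadow night rain alone dark",
]

# Documents become word-count vectors...
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(songs)

# ...and LDA models each document as a mixture of topics,
# each topic as a mixture of words.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)  # rows: songs, columns: topic weights

# Songs with similar vocabularies end up with similar topic mixtures.
for song, mix in zip(songs, doc_topics):
    print(f"{mix.round(2)}  {song[:25]}")
```

Comparing the topic-mixture rows is what lets you say which "albums" are lyrically close to each other and which stand apart.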
I mean, those techniques can be applied in maybe more mundane or common ways, but it's really fun to look at something like Taylor Swift lyrics and do that. I actually did that the week after I went to see the movie. I went to see the concert film and it was like a little fun Taylor Swift moment for me.
I think a lot of people don't know what kind of project they should start with. You can just pick something that interests you. Yeah, absolutely. And I think actually the most fun part lies in the exploratory analysis. There is a lot of low-hanging fruit. If you want to get started on your public portfolio projects, you don't have to go into a very complicated machine learning model. Start with something exploratory, analyzing, I don't know, your favorite food, or, I think there's public data about Yelp stars and things like that. And to add onto that, I think those are incredibly valuable skills. If you're someone who's very quick and good at EDA, those people are incredible. Those people are very impactful in the organizations where they are, for sure.
Developing EDA instincts and presenting results
Yeah, this is a very interesting question, because it is exploratory in nature. It is kind of hard to say, what is the right thing to do? And when do you know you've done a good job? When do you know you're done? It will never be done. You always have another question. One of my favorite books that approaches this process is an R-specific book, but I think at least those first couple chapters would really benefit anyone, even if they use Python or Julia or another language. The book is called R for Data Science, by Hadley Wickham. The first couple chapters talk about the iterative nature of it: you import data and then you do this process of visualize, summarize, model. And this process tends to loop back on itself multiple times: oh, I see this result, let me go and make another plot, let me do another summarization, let me build another maybe simple model to learn what I can about this data set.
I was a huge user of R Markdown at the time, and what I would use today would be Quarto. Quarto is sort of a next generation iteration of what R Markdown is, and Quarto works for Python, R, Observable, Julia. The superpower that gave me was the ability to quickly generate reproducible reports. I would write what would now be a Quarto doc, text, code, making the plots, making the results, and then I would render the whole thing to either a PDF or Google Docs, or if I was doing a presentation, I would make slides, and I could do it reproducibly. The superpower it gave me is that I was never copying and pasting stuff or showing someone outdated results. I had really reproducible practices.
The other tool that was like a superpower for me was Shiny. Shiny is a tool for making interactive apps, for Python and for R. Being someone who knew how to use Shiny let me make, picture, an interactive data app that a stakeholder could then come and explore. Shiny made me so effective. And it's funny, at the time I did not work for my current company, but part of the reason I wanted to come here is because it was the company that made these tools. I don't personally work on Quarto or Shiny, I work on Vetiver, like I said, and I know coming from me now it's like, sure, you're gonna say that, but literally those were my superpowers when I was a data science practitioner. Being able to generate reproducible reports really quickly, being able to make interactive apps really quickly, made people view me as incredibly productive and helpful to the org.
Python vs. R
Yeah, so I think that the right answer is the one that you are more productive in. If you are someone who is more productive in Python, then that is the right tool for you. It depends on your own skills and what is a better fit for you. I do also think that sometimes the tooling is better in one than the other. A real strength of R is that for every kind of statistical model that exists, there's an R package for it. So if you have specific statistical needs around the kind of models that you build, often R is used in situations where you have fairly sophisticated statistical needs. You can't just brute force something; you need to be quite careful. That's why it's used a lot in clinical trials, why it's used a lot in research with human subjects, because you're looking for subtle effects and you need to make sure your stats are rock solid.
There are, of course, amazing tools in Python as well. One that I often will turn to Python for, just as an example, is spaCy. Like we talked about, I love text data, and spaCy is a project and package for natural language processing in Python that is amazing. I love it. The people who built it did such a good job. I'm such a fan. And it is just for Python, so that is something I would turn to Python for, to use that tool.
So I think the two things to think about are: which one are you better at? For example, I am just way better with the tidyverse than I am with pandas or Polars. I'm just way faster. It matches how my brain thinks. So if I need to do data munging or EDA, I'm gonna use the tidyverse, because I'm so effective with it. So there's one piece that's: what are you effective with? Use that. And then there's the other piece of, for the specific task you're doing, where are the best tools? I will find myself picking up different tools depending on what it is that I want to do.
I do think people, depending on companies and organizations, sometimes will get really locked in to one: we are an all-Python shop, we are an all-R kind of shop. So sometimes that also is part of the reality, like, oh, at this company, they've made a decision that all of our data science needs to be in Python. That's sometimes a constraint that comes in: where you are working, maybe you don't have complete freedom to decide what tool you are going to use.
What I feel like I've seen lately is people acknowledging more of these kinds of realities. I am not someone who is interested or willing to engage in one is good and one is bad. They're different; the trade-offs are different. And there are other really interesting projects out there too, which we may all be wanting to use in five years. Some people who use Julia love Julia, right? There's really cool stuff out there. It's been really interesting to see Mojo come out recently, this new, really fast kind of language. So I am very professionally associated with R, but it is for sure not the only language I use. And I think it's good to realize that once you have learned one programming language, it is not that hard to learn another one.
The hardest things, at least in my opinion, when you move into a different language, are usually not the syntax. It is usually: what is that language's packaging ecosystem like, and how is it different from the one you're used to? So I feel like that's the hardest piece, wrapping your mind around a different set of constraints in terms of how packages are distributed and dependencies are managed.
What's coming up
One thing in my professional life I'm really excited about this year is that I'm keynoting at SciPy. This July I'm giving a keynote at SciPy, and one of the themes at SciPy this year is scientific computing across different languages. So I'm really excited to get to go and talk about my own experiences building projects across different languages, about things that are less about a specific language and more about scientific computing generally. What are the things that come up that are different? When you're working with a Python user, when you're working with an R user, what are the things that actually do have to be different? So I'm super excited about giving that keynote. I'm super honored that they asked me.
So that is one thing. And then another thing, this is also conference related: my company is having its conference in August, and one of the people on my team, Isabel, and I are teaching a workshop on using Vetiver. It'll be the second time we've taught this workshop. It's so fun to sit down with people in a room and help them get from a place where they have not done something before, like deployed a model, and get them to that first deployment. It's just really exciting. So I'm looking forward to that as well. Thank you, Julia, for coming to the show today. Thank you so much for having me. It was a pleasure to chat with you.

