Open Source Property Assessment: Tidymodels to Allocate $16B in Property Taxes

Transcript

This transcript was generated automatically and may contain errors.

We are so excited to be here. I'm Nicole Jardine. I'm the Chief Data Officer for the Cook County Assessor's Office. Soon I'll introduce Dan, our Director of Data Science.

And right off the bat, does anyone here currently live in Chicago, or has anyone ever lived in the Chicagoland area? Oh wow, okay, so about 10% of you, okay. We're excited to answer questions. We love to talk about this topic. We have time at the end to talk about anything related to Chicago home ownership.

Today Dan and I are going to talk about something that affects equity, housing affordability, and generational wealth for a couple million people here in Chicago and the surrounding suburbs. And that thing is property taxes.

Property tax bills here in Illinois depend in part on the government's assessment of how much your property is currently worth. Just to set the stage a little bit, in Chicago in 2015, the average home was assessed as if it were worth roughly $237K, and the average homeowner paid about $3,600 in property taxes.

The tax divide problem

But starting in 2017, researchers at the University of Chicago and journalists at the Chicago Tribune, ProPublica, and the New York Times identified a problem, and that problem was about property taxes, and specifically about the way that assessors throughout the country were assessing property values.

The problem was identified using something called a ratio study. The assessor's job in Illinois is to have assessments that follow the market. Every couple of years, the assessor assesses property values and adjusts them so that they're roughly in line with where the real estate market has actually gone. In a ratio study, you take the properties that have actually sold, which is roughly 5% of homes here in any given year, and you compare the estimated value of each property to its actual sale price, that is, the estimated value divided by the sale price.

This is a New York Times graphic to which we've made some light modifications. That ratio should be roughly 1. And what the New York Times and Chicago Tribune and others have found is that in the past, an average home was right around that 1 mark. That's about where it should be.

But what about other homes? It's really important to think about homes on either side of the distribution, not just the average homes. So here's an example of a modest home on the south side of Chicago. Let's say in 2015, it sold for about $50K. On the other end of that distribution, we have a nice home on the North Shore that maybe sold for more like a million dollars. And the fundamental question is, what's that assessment ratio? Is it going to be close to 1?

So what that would mean, just to unpack that ratio a little more, if that 50K home actually was assessed at 75K, that would mean a ratio of 1.5. And if that home that sold for a million dollars was assessed as if it was worth 800K, that would mean a ratio of 0.8.
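To make the arithmetic concrete, here is a minimal Python sketch of the ratio calculation for the two example homes above. The function name `assessment_ratio` is made up for illustration and is not taken from the office's code.

```python
def assessment_ratio(assessed_value: float, sale_price: float) -> float:
    """Return the assessed value divided by the actual sale price."""
    if sale_price <= 0:
        raise ValueError("sale_price must be positive")
    return assessed_value / sale_price


# The two example homes from the talk.
modest_home = assessment_ratio(assessed_value=75_000, sale_price=50_000)
north_shore_home = assessment_ratio(assessed_value=800_000, sale_price=1_000_000)

print(modest_home)       # 1.5, assessed above its sale price
print(north_shore_home)  # 0.8, assessed below its sale price
```

A ratio above 1 means the home was assessed for more than it sold for, and a ratio below 1 means it was assessed for less.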

Here's what the New York Times actually published. There was a higher ratio of assessments to sale price for the homes with lower sale values. So the homes on the more affordable side of the price spectrum were historically, according to this reporting, over-assessed. That can lead to over-taxation. And on the flip side, the higher-priced homes were being under-assessed, leading to under-taxation. That's the problem.
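One way to see this pattern in data is to sort sold homes by price, split them into tiers, and compare the median ratio in each tier. The sketch below does that in Python. The sales figures are invented for illustration, and the helper `median_ratio_by_tier` is a hypothetical name; this is not the method used in the reporting or by the office.

```python
from statistics import median

# Hypothetical (sale_price, assessed_value) pairs, invented for illustration.
sales = [
    (50_000, 75_000),
    (80_000, 104_000),
    (250_000, 250_000),
    (300_000, 294_000),
    (900_000, 765_000),
    (1_000_000, 800_000),
]


def median_ratio_by_tier(sales, n_tiers=3):
    """Sort sales by price, split them into tiers, and return each tier's median ratio."""
    if len(sales) < n_tiers:
        raise ValueError("need at least one sale per tier")
    ordered = sorted(sales)  # tuples sort by sale price first
    size = len(ordered) // n_tiers
    ratios = []
    for i in range(n_tiers):
        start = i * size
        # The last tier absorbs any leftover sales.
        end = (i + 1) * size if i < n_tiers - 1 else len(ordered)
        tier = ordered[start:end]
        ratios.append(median(assessed / price for price, assessed in tier))
    return ratios


low, middle, high = median_ratio_by_tier(sales)
print([round(r, 3) for r in (low, middle, high)])

# Ratios that fall as price rises are the regressive pattern described above.
is_regressive = low > middle > high
```

With data like this, the low-priced tier sits well above 1 and the high-priced tier sits below it, which is the shape of the curve described in the talk.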

The Chicago Tribune referred to this as the tax divide. And that tax divide is why we're here. So again, the issue, according to the Tribune, as they stated it, is Cook County failed to value homes accurately for years. And the result was a property tax system that harmed the poor and helped the rich. The problem lies with the fundamentally flawed way that the County Assessor's Office valued property.

So what's the cause? It could actually be a number of interesting causes. Could be issues with the characteristics data that the Assessor's Office uses to estimate property values. It could be issues with the way other offices actually adjust assessed values. But we're here today because we want to talk about one of those issues, and one of those is about the modeling.

So recall that the Assessor's Office uses sales, again the roughly 5% of homes that sell in a given year, to estimate property values for everyone. In the past, this was done with regression modeling. This is actually a snippet of that code. These were closed-source, OLS-based models written in SPSS, and, you know, we're talking 50,000 lines of manual fixed effects, property-specific overrides, and more.

After the Tax Divide was published, voters elected a new County Assessor, Fritz Kaegi, and Fritz started the Data Department. The Data Department's first Chief Data Officer was Rob Ross. I'm the current Chief Data Officer, and Dan and William and I have been on this journey together for about four years. And we're here because tech people and academics recognized that this is a data and modeling problem, and that maybe we could actually help solve it. So our goal was to improve the model and transparency, and right now I'm super excited to hand it off to Dan, who's going to tell us about the model.

And I would encourage anyone on the fence to try joining us because data skills are very desperately needed inside of state and local governments.

And if you are really a glutton for punishment, you can join us in the assessment industry because this problem that we described about Chicago, this is nationwide. These are other graphs from that same New York Times article from other cities. And you can see that their ratio curves look equally bad.

So if you're interested in learning more, I will take questions, and Nicole and I will stick around after the talk to chat. I also have some rare LightGBM hex stickers, if anyone wants them. So thank you, and I will be happy to take questions.

Q&A

Big thank you, Dan and Nicole. And we have time for questions. And we have a lot of questions.

So, how much were you inspired by private companies like Zillow and Redfin, who also do assessment? Or where did you go to figure out how to do this?

Oh, that is a great question. I mean, significantly inspired. They're building very similar predictive models. They're not predicting exactly the same thing. We are predicting the universe of all properties. Zillow is sort of predicting the universe of properties that are going to sell. And so they're different markets, not necessarily perfectly analogous. I think that our job is actually harder, personally, because, yes, we just have to cover a much wider swath of properties. Their blog is cool, though. Their blog is very cool. And they have a neat model.

So when switching from the old SPSS model to the new tidymodels one, did you discover that certain variables had been given too much or too little importance in the past?

No, I don't think so. I think generally the things that you would expect to be important to property values stayed about the same in terms of their importance. It is exactly the things I said before. It is the size of the property, its relative age, and where it is. Those are sort of the big determinants of property value almost anywhere.

One more question. Do you use any model explainability methods, for example, to help with conversation with public stakeholders?

We do, although we have not made a lot of those results public, and we are sort of working on that. I'm happy to answer questions on that afterwards. Thank you so much.

Open Source Property Assessment: Tidymodels to Allocate $16B in Property Taxes - posit::conf(2023)

Featured software

tidymodels