Resources

Use Your Data Skills for Good (Sharon Machlis) | posit::conf(2025)

Use Your Data Skills for Good: Ideas for Community Service

Speaker(s): Sharon Machlis

Abstract: Community service doesn't have to mean finding a group where your skills match their needs. You can also see a data-related need and jump in to fill it. Are there interesting things to do in your town, but no one place to find them? You could create a searchable, auto-updating local events calendar. Do you live someplace where local election results aren't published the way you'd like people to see them? If data is publicly available, you could analyze and publish it yourself. You'll leave this session with some ideas on how your data skills can help your community -- even if you're not ready to commit to formal volunteering with an outside organization.

Materials - https://github.com/smach/positconf_2025

posit::conf(2025)

Subscribe to posit::conf updates: https://posit.co/about/subscription-management/

Transcript

This transcript was generated automatically and may contain errors.

Can being a little annoyed when trying to access local data end up being something good? Hi, I'm Sharon Machlis, longtime tech journalist, now retired mostly, here to hopefully inspire at least some of you on how your data skills can help your local community. And some good news for the introverts among us. No meetings or long-term commitments required. Instead, it's just see something, code something. It's combining the fun of solving coding challenges with the satisfaction of doing good. What's not to love?

So, you'll notice some common themes among my projects. These are what I like to do, help people search and answer questions, and make the web a little easier to use. Maybe you'd like to do something else, but let's get started.

Searching local history newsletters

So, there was a local community group that had a treasure trove of local history in decades' worth of PDF newsletters, but no way for people to really, like, access that. So, they did finally post them online, but like this. PDF files in a public Google Drive, not really the friendliest of user interfaces, but with data skills, we can turn something like that into something like this. A Shiny app with full-text searching of the newsletter text. And since it's 2025, we can even add a chatbot, where people can ask natural language questions of the newsletters, and I always like to include links to the source documents that are used to answer the questions, so people can check and make sure that the LLM didn't just make something up.

So, this is the tech stack I used for this. For the PDF to Markdown text, I used LlamaParse. It's a commercial service, but they have a generous free tier, and I like the results, but there are other open-source options available. Lots of R packages, thanks to all the authors of these. ragnar and ellmer for the AI, and Shiny, and shinychat, and lots of others. And of course, it's 2025, so I did use LLMs to help me write some of the code.

City meeting agendas and election data

I did something similar with our city's meeting agendas and minutes. These already are full-text searchable, but by individual group. So, if you want to ask a question like, what's happening with rail trails around the city, you'd have to look in each individual group's documents. A little bit annoying. So, with data skills, we can make something like this. Incredibly non-beautiful, but usable. And in fact, one of our city councilors told me he actually uses this more than the official city website, because it searches across groups. And of course, yes, it's this year. I'm trying to add a chatbot also with the source documents.
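A minimal sketch of the cross-group search idea described above. It is not the speaker's implementation, which is a Shiny app built with R packages; Python's standard library stands in here purely for illustration. The `Document` fields, the sample records, and the `example.org` URLs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Document:
    group: str  # hypothetical: the board or committee that published it
    title: str
    url: str    # link back to the source so readers can verify
    text: str


def search_all_groups(documents, query):
    """Return documents from any group whose text contains every query word."""
    words = query.lower().split()
    return [
        doc for doc in documents
        if all(word in doc.text.lower() for word in words)
    ]


# Hypothetical sample records; real ones would come from the published agendas.
documents = [
    Document("City Council", "Agenda A", "https://example.org/council/a",
             "Discussion of rail trail funding and paving schedule."),
    Document("Planning Board", "Minutes B", "https://example.org/planning/b",
             "Site plan review near the proposed rail trail extension."),
    Document("School Committee", "Agenda C", "https://example.org/schools/c",
             "Budget hearing for the upcoming school year."),
]

matches = search_all_groups(documents, "rail trail")
for doc in matches:
    print(f"{doc.group}: {doc.title} ({doc.url})")
```

Keeping the source URL on every record is what lets a search result, or later a chatbot answer, link back to the original document.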

Another interesting source of local data can be local election data. Now, if you live in a large city, chances are there's a newspaper that already does a great job with this data, but if you live in a small community, maybe not. This is how my city posts its election results. Yes, this is the PDF, and this is the format they use. And you'll see this in local media just basically turned into a regular table, but with data skills, we could turn that into a map showing where candidates were strongest and weakest. A searchable table, maybe with a few dataviz elements, maybe even a heat map. So, the tech stack I used for this, Quarto for publishing, and again, lots of R packages, thanks to all the great authors of those packages.
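As a rough sketch of the analysis step behind a "strongest and weakest" map, the snippet below computes each candidate's share of the vote per precinct. The speaker's own work uses R packages and Quarto; this Python version only illustrates the arithmetic. The precinct names, candidate names, and vote counts are hypothetical placeholders, not real results.

```python
from collections import defaultdict

# Hypothetical rows (precinct, candidate, votes); real values would be
# extracted from the city's published results PDF.
rows = [
    ("Precinct 1", "Candidate A", 120),
    ("Precinct 1", "Candidate B", 80),
    ("Precinct 2", "Candidate A", 50),
    ("Precinct 2", "Candidate B", 150),
]


def vote_shares(rows):
    """Return {candidate: {precinct: share of that precinct's votes}}."""
    totals = defaultdict(int)
    for precinct, _, votes in rows:
        totals[precinct] += votes
    shares = defaultdict(dict)
    for precinct, candidate, votes in rows:
        shares[candidate][precinct] = votes / totals[precinct]
    return shares


def strongest_and_weakest(shares, candidate):
    """Return the precincts where a candidate's vote share is highest and lowest."""
    by_precinct = shares[candidate]
    strongest = max(by_precinct, key=by_precinct.get)
    weakest = min(by_precinct, key=by_precinct.get)
    return strongest, weakest


shares = vote_shares(rows)
for candidate in sorted(shares):
    best, worst = strongest_and_weakest(shares, candidate)
    print(f"{candidate}: strongest in {best}, weakest in {worst}")
```

Comparing shares rather than raw counts keeps large precincts from dominating the picture, which matters when the result is drawn as a map or heat map.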

Local events, recycling, and farmers markets

Maybe your city has a lot of fun things to do, but no one place where you can search and sort them. Can you web scrape? You could make something like this. Again, not beautiful, but usable, searchable. Tech stack for this: I'm using rvest for the web scraping, and since I'm reading an RSS feed, also tidyRSS, reactable for the table display, Quarto for publishing, and I'm using a Linux cron job to keep it up to date, but GitHub Actions would work as well, which are free and you don't need your own server.
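A minimal sketch of the "read an RSS feed, then sort the events" step, using only Python's standard library in place of the R packages named above. The feed content and `example.org` links are hypothetical, and treating `pubDate` as the event date is an assumption, since real feeds may carry the event date in a different field.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed content; a scheduled job would download the real feed instead.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example events</title>
    <item>
      <title>Library book sale</title>
      <link>https://example.org/events/book-sale</link>
      <pubDate>Sat, 04 Oct 2025 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Outdoor concert</title>
      <link>https://example.org/events/concert</link>
      <pubDate>Fri, 03 Oct 2025 18:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def parse_events(rss_text):
    """Parse RSS items into dicts and return them sorted by date, earliest first."""
    root = ET.fromstring(rss_text)
    events = []
    for item in root.iter("item"):
        events.append({
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            # Assumption: pubDate holds the event date in RFC 822 format.
            "when": parsedate_to_datetime(item.findtext("pubDate")),
        })
    return sorted(events, key=lambda event: event["when"])


events = parse_events(SAMPLE_RSS)
for event in events:
    print(event["when"].date(), event["title"], event["link"])
```

Run on a schedule, a script along these lines would regenerate the sorted event list each time, and a publishing step could then turn it into a searchable table.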

Especially in the US, recycling programs can be kind of complicated. Can I recycle this? How do I recycle that? You can feed the whole thing into a chatbot and let people ask natural language questions. This actually can take questions in other languages, since I live in an immigrant community. Again, always including the source documents, especially when we're doing something, an unofficial app for official data, to make sure people can check the data.

Farmers markets, who's going to be there? What can I buy there? Maybe there's a list of vendors and what they sell. You can make a chatbot and let people do that. So I hope this gave you some ideas about how your data skills can help your community. Thanks a lot. Have a great conf.