Resources

Vicki Boykis | Your public garden | RStudio

Vicki will discuss how, as people who can write code and analyze data, we have a lot of input and power over what our digital and work worlds look like, and therefore can act as agents of change and repair. About Vicki: Vicki Boykis is a machine learning engineer at Automattic, the company behind WordPress.com. She works mostly in Python, R, Spark, and SQL, and really enjoys building end-to-end data products. Outside of work she publishes the Normcore Tech newsletter (https://vicki.substack.com) and blogs at https://veekaybee.github.io/. In her "spare time", she blogs, reads, and writes terrible joke tweets about data.

Transcript

This transcript was generated automatically and may contain errors.

Hi, I'm Vicki Boykis. I'm a machine learning engineer at Automattic, the company behind WordPress.com. But today I don't want to talk about machine learning or data. What I do want to talk about is building a digital garden. And I want to explain why this is important by talking a little bit about what my daily routine looks like in the longest year of our lives, 2020. And I'm hoping that, as a data scientist, it extrapolates and generalizes to what everybody's environment looks like and what we can do about some of that.

So my day usually starts around three o'clock in the morning. That's when my youngest son wakes up. He's very little. So I usually have to put him back to bed and wait on him to settle down. And while I'm doing that, I do my favorite activity, which is doom scrolling through Twitter. Nothing's really changed on Twitter since the last time I read it, maybe three, four hours ago. But it always makes me feel worse coming away from it than I do going into it. And so I'm filled with this endless anxiety of news items that have happened.

So then he settles back down. I go back to sleep. But in the meantime, I'm woken up around four o'clock in the morning by a text message from my state, which is an alert to be careful of COVID, for the hospital restrictions to keep masking and to be aware that the rates in our area are on the rise. This is kind of helpful information, but not really because it also makes me feel more anxious than relieved. And it brings me more questions than answers, really. For example, when's the next time I'll get to see my elderly relatives? Will schools stay open for the next couple weeks or so? How are we doing as a country? And when will it get better? And of course, none of us can answer that. So I'm left feeling even more anxious.

Of course, by the time all that's over, I go to the numbers. I look at the COVID Tracking Project, which is a really wonderful project put together by a group that tracks totals across each state. It doesn't make me feel any better either, even though it's very good information, because the COVID rates are on the rise, the growth is exponential, and it doesn't seem to be ending anytime soon.

So by the time that's all over, it's usually around seven o'clock in the morning, which is when my older child, my daughter, wakes up. And she also has a lot of pings and needs. Usually hers are more like, what do hippopotamuses eat? And what are the phases of the moon? So I help her get ready and get dressed, and then we go downstairs to breakfast. Breakfast, I use the term very loosely, because as a family with small kids, it's just pretty much chaos. So during that time, I get some time to catch up on what's been going on at work.

I usually don't have 99 Slack notifications, but since we do work in a distributed, remote, and asynchronous way, there's usually someone working all around the world, even when I'm asleep. So there's definitely writing and reading to catch up on. So when that's all over, I go to my favorite part of Twitter to catch up on, which is data Twitter. And it seems like overnight, there have been five million more data analyses published, 10 million more new deep learning frameworks, and 15 million more data controversies. And it's a lot to keep up with, and I feel like I never can. So it all adds to the compounding anxiety of the things that I have to keep up with in 2020.

Radio 2020 and the noise of digital life

So after being pinged all morning and exposed to the stressful world of 2020 during COVID and the amplification of that world in my digital media, I start hearing not the individual noise, but the steady stream of Radio KFKD. Radio KFKD is a concept Anne Lamott talks about in her wonderful book about the creative process, Bird by Bird. Lamott says that KFKD plays for all creative people. She says there are two speakers. The left is the noise that you hear when you think you're better than everybody else, and the right is the noise when you think you're worse than everybody else. And I count data professionals within creative people too, because we have a lot of technical work that we have to do, and a lot of very judgment-based work that is more of an art form than a science.

Today we're living in Radio KFKD, but over top of that is the pandemic. So it's what I like to call Radio 2020. The right speaker is everything we've mentioned: the noise of other people's perfection on social media, the noise of controversy, of amplified fears, of heightened everything, a platform that is constantly on, and a black mirror that reflects our worst fears and beliefs. The left speaker is all the news we're constantly processing about living in uncertain times, and over top of all that is the never-ending static of the pandemic that won't end: fears for us and our loved ones, constant restrictions, and, when we need connection the most, isolation.

Being isolated is hard enough as a human. Being isolated and bombarded by the radio of 2020, the noise of clickbait doomscrolling, constantly being surrounded by information that is largely negative, is impossible. So what can we do about all of this? Well, we can't really change our physical space. We can't change the state of the world right now. We can't change the people that rely on us. We can't change the human interactions that we have. What we can change is the digital world that we operate in. And so I want to talk a little bit about what the aspects of that digital world are, and how we can change it.

Why is Radio 2020 so good at derailing us? And why is it constantly creating noise for us? There are three reasons. The first is that online it's noisy, and this has been the case since even before 2020. At a conference that I attended and spoke at, Skeptak, several years ago, one of the speakers that really stood out to me was Natasha Dow Schüll, who is a sociologist and professor at New York University. She wrote a book called Addiction by Design that talks about all the different ways that our online universes today are a lot like casinos. The summary of her research is that much of the online world, particularly social media companies, much like casinos, is designed to draw our attention away in small bursts. As a result, to get our attention and retain us on the platform, sites often do things like send us a combination of mobile alerts, notifications, and emails.

Sometimes this engagement is good, like if we're discovering interesting blog posts, or donating to creators, or watching original content. But oftentimes it can simply be noise. The online noise interferes with our ability to concentrate on longer-term projects and engage in longer-term thinking. And the problem is that there's always new noise and always new content online.

A second problem with this content is that a lot of it is very superlative. What do I mean when I say superlative? Usually you only get the very worst or the very best, because that's what actually filters through the noise and creates engagement. So as an example, if someone is posting their vacation pictures online, although not so much these days, you'll usually only see the best pictures of their hotel room. You won't actually see how they were late for their flight, or had problems at the airport, or had transportation issues. You only get the highlights. And the same is true of the worst news stories, of 2020 especially, because those are the ones that are optimized to get us to click. So we'll only see the very, very worst, and we won't see what's progressing more in the median and in the long term.

The third problem is that being online is worse in the context of 2020. The enormous amount of additional pressure 2020 has put on both our physical and digital lives has pulled us apart as humans, which means that we can't meet with anybody and we're left to face the crisis, in the context of the current state of the internet, on our own.

Building your own digital garden

So we have this crisis that's occurring both in the physical world, with everything that's been happening during the pandemic, and in the digital world, with a crisis for our attention as well. How do we fight this? One of the best ways that I've found personally is to be able to control it a very small bit at a time, and the best way to do that is to build your own digital garden.

Anne Lamott in Bird by Bird says that the best thing to do is to take it piece by piece, and the title of the book comes from a book report that her brother had to write. Thirty years ago, she writes, my older brother, who was 10 years old at the time, was trying to get a report on birds written that he'd had three months to write. It was due the next day. We were out at our family cabin in Bolinas, and he was at the kitchen table, close to tears, surrounded by binder paper and pencils and unopened books on birds, immobilized by the task at hand. Then my father sat down beside him, put his arm around my brother's shoulder, and said, bird by bird, buddy, just take it bird by bird.

The idea is that we can't change everything, but even changing a little tiny bit of our environment is better than changing nothing. So what's a way that you can fix the world and change it bird by bird? The best way to do that is by planting a garden.

The Secret Garden is a book by Frances Hodgson Burnett, who lived in America and England in the 19th century. The story is about a girl named Mary, whose parents died in a cholera epidemic in colonial India. It's kind of sad, but her parents never saw her much anyway. She was cared for by servants, and as a result, she had grown spoiled, selfish, and extremely lonely. She is sent to England, to her uncle's estate on the moors. The moors are a beautiful, lonely, and foreboding place. And just like the moors, the house where Mary stays is dark and cold and, as is often the case in British literature, full of secrets. It's revealed that Mary's aunt, who died, had a secret garden at the house that was then locked up. When Mary comes to the garden, it's neglected. She begins to tend to it, and in the process of caring for it and watching things grow, she forgets to feel sorry for herself and her parents. The process of caring for it brings her back to life.

So right now, we're all a little bit like Mary, at the start of a garden that's fallow. We're lost. We're not quite sure what to do. So there's some things that we can do, and next I want to talk about things that people have done, and then end with some suggestions for maybe smaller things that you can do to get started.

Ways people have cultivated their digital gardens

So how do we fix this? How do we create something that's for our public garden and makes the internet better at the same time? I'll talk about a couple of different ways that people in the community have done this, and that I've done this, and then offer some more lightweight suggestions if these seem overwhelming to you.

The first and best way, in my opinion, is to write something from your perspective. David Robinson, who writes a lot about statistics in R, has a really good blog post about this. He said that if you tell somebody something three times, then you should write a blog post about it. And he says that it's important to write a blog for a data science career portfolio. I think it's important not just for that, but because it allows you to create something that you're in control of, on your own platform, that doesn't change, that counteracts the way that social media works, and that you can come back to again and again and have it always be the same.

You can write about anything: analytics, your opinion on certain trends, certain packages you've used. Topics like data analysis and the tidyverse have been written about hundreds of thousands of times before, but the important thing is that it's coming from your perspective. Everybody comes to the story with a completely different perspective, a completely different set of life skills, and history, and work skills, and the internet wants to hear about them.

Two really great recent examples of this from R bloggers are a recent article about pipes, about the pipe tool, and "A Spoonful of Hugo: How much Hugo do I need to know?" Neither of these is a very long or complex topic, but they break down what you need to know in very specific ways, and they're talking about it from the author's experience. So these are both fantastic.

I've done this in a couple of different ways. One of the places I'm active is my technical blog, where I write about machine learning, and data, and metrics. My main goal on my blog is to write about things that I don't really see anywhere else in the blogosphere, or to write about them from my perspective. And the other thing is that in the process of writing more technical posts, you get more insight into what's going on and a better understanding as well. So some things I've written about are getting machine learning to production, the Spark API using R, and, in a very serious examination, breaking down the iOS Screen Time report. None of these are breakthrough topics. In particular, getting machine learning to production is a topic that has been covered over and over again. But this is my take on the topic. These are things that are interesting and important to me, and now I have them to refer back to in my blog.

Another place I've been writing is my newsletter, Normcore Tech, where I look at machine learning and data, but more from a human-centered perspective. I started it, again, because I didn't see anyone covering these topics in quite the same way. Sure, we all talk about Google, and Facebook, and social media every week, but there were things that I wanted to understand better. Here are some recent things I've talked about: how a lot of the products that we use are built by consensus, by an average of 10,000 people in San Francisco, and what impact that has on our online world; what happens when the internet stops in a country, as it did in Belarus in the middle of 2020, and how the internet actually gets from country to country; and how Google Drive, which is ubiquitous in most of the companies I've been at, is actually production, because we store really sensitive data in there, and when it goes down, it can be a problem.

Another interesting thing that happens when you write about things that are of interest to you is you start getting people who are also drawn to these topics. So when I first started my newsletter, I didn't get any emails, but then slowly but surely, I got more people talking to me about these topics from their perspective as well. So it's a wonderful way of building a community, even if you didn't initially intend it to be.

The second way, if you want to go a little bit of a more technical route, is to empower people to understand data. Data literacy these days is really, really important with how confusing and clickbait-oriented the internet has become. And anyone who helps make sense of that data and make it trustworthy is a superhero. A fantastic example of this is the work of Julia Silge, who has created books and courses, writes on her own blog about topics like Tidy Tuesday, and showcases videos as well. She does a fantastic job of explaining and making it accessible, and has that content on her own site to own.

Another wonderful example is the R-Ladies Sydney course created by Lisa Williams, who was inspired by her own frustrations of getting started with R. The course goes through getting started, packaging, and R Markdown, and it's another great way to work through what you're interested in, as well as share it with other people.

Another person who has done a wonderful job empowering people, from the Python community this time, is Ines Montani, who works in the natural language processing space on the Python package spaCy. Not only does she have clear explanations of how to use the package, she open-sourced a whole library that allows you to create your own course.

Finally, cultivating your own garden can simply be about being human and creating connections. One of my favorite examples of this is the illustrative work of Allison Horst, who creates illustrations both for her own talks and for use in the R community. And they're just beautiful, they're fun, they're interesting, and they're creative ways to think about a topic that hasn't really been covered in the same way before.

One of the recurring themes of Normcore Tech, my newsletter, has been that even though we all sit at machines and write code or munge data for a great deal of the day, we ultimately crave human connection, as we do during this pandemic and isolation. We don't have a way to hand off physical things, especially now. Cultivating our own garden that's not subject to the radio frequencies of Radio KFKD is a wonderful way to do that.

Smaller ways to get started

But what if you're overwhelmed by the idea of getting started and don't know how? There's a lot of really small ideas that you can draw from as well, that don't necessarily involve creating a course or making videos or creating your own blog. And so I want to cover some of those things as well. If you're feeling intimidated by these very large scale, broad ranging projects, there's much smaller ways to get started. And in fact, one of the ways that I got started was by creating a blog and drawing memes, something like 10 years ago. And I've been hooked on content creation ever since.

So there are some much smaller ways that you can approach things. One of the first things you can do is just make a commitment to write 200 words once a month. The most important thing is that you're writing in a space where you feel comfortable writing and that is not an impediment. As an employee of Automattic, I can recommend WordPress.com. But it doesn't really have to be that. Just any place where the technology is not impeding you from making small progress on a bunch of words every day.

The second thing that you can do is share a piece of code. If the idea of blogging seems overwhelming, sharing code and talking about code can be a stepping stone to cultivating an online place of your own. It's extremely easy to write Markdown posts in GitHub. And an added bonus is learning an industry tool that's very widely used. Something that I started, which I copied from Simon Willison, is a directory called TIL, today I learned, full of random code snippets and pieces. The benefit of this is that you come back to it again and again. I know that's true for me. And you won't lose it. And you have it to share with others as well. In a world of incomplete Stack Overflow answers, think of this as your Stack Overflow that's filled with your specific questions and answers for your needs.
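As a rough illustration of the TIL idea, here is a minimal Python sketch that builds a Markdown index for such a directory. The layout it assumes (one folder per topic, one `.md` file per note) and the function names `first_heading` and `build_index` are hypothetical choices made for this example, not something described in the talk.

```python
from pathlib import Path


def first_heading(path: Path) -> str:
    """Return the text of the first Markdown heading in a note, or the file stem."""
    for line in path.read_text(encoding="utf-8").splitlines():
        if line.startswith("#"):
            return line.lstrip("#").strip()
    return path.stem


def build_index(root: Path) -> str:
    """Build a Markdown index that groups notes by topic folder.

    Assumed (hypothetical) layout: root/<topic>/<note>.md
    Hidden folders such as .git are skipped.
    """
    lines = ["# TIL", ""]
    topics = sorted(
        p for p in root.iterdir() if p.is_dir() and not p.name.startswith(".")
    )
    for topic in topics:
        notes = sorted(topic.glob("*.md"))
        if not notes:
            continue  # skip folders that hold no notes
        lines.append(f"## {topic.name}")
        for note in notes:
            # Link each note by its first heading, relative to the index file
            lines.append(f"- [{first_heading(note)}]({topic.name}/{note.name})")
        lines.append("")
    return "\n".join(lines)
```

Calling `print(build_index(Path(".")))` from the top of the directory would print an index that could be pasted into a README, so the notes stay easy to come back to.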

You can also make a comic. It might not seem like it, but memes, comics, TikToks, and the like serve as a really important part of content creation. And I should note, my Twitter feed is chock full of them. Just because something isn't serious doesn't mean it doesn't have educational value or importance as content. I'm thinking specifically of the wonderful zines that Julia Evans puts out, which cover things like learning Linux and Git, but in a fun way. You don't have to create an entire zine. Even a fun meme about R or statistics, or a drawing like Allison Horst's, is a great way to share what you know and feel like you have some sort of control over the creative process.

You can also do something like make a bot, which is what I did. If you like and have the opportunity to write code in your spare time, building a bot based off of an API is a fun way to learn about how the web works and to clean up your little piece of it. A few years ago, I built something called Soviet ArtBot, which, as it says on the tin, scrapes WikiArt for socialist realist images and tweets them out once a day. I learned a lot during this process, but what I love about it is that I get to control my online environment through code. I'm creating and changing what people see on their timelines. Art is something I always appreciate on my Twitter feed. And as an additional bonus, it gave me the time to learn about Twitter APIs and cloud computing paradigms.
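To make the shape of a once-a-day bot concrete, here is a small Python sketch. It is not the code behind the bot described in the talk: the `Artwork` type and the `fetch` and `post` callables are hypothetical stand-ins for whatever image source and posting API a real bot would use, and no real WikiArt or Twitter calls are shown.

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable


@dataclass(frozen=True)
class Artwork:
    # Hypothetical record; a real source would define its own fields
    title: str
    artist: str
    image_url: str


def pick_daily(artworks: Iterable[Artwork], day: date) -> Artwork:
    """Pick one artwork for a calendar day; re-running on that day repeats the pick."""
    pool = sorted(artworks, key=lambda a: a.image_url)  # stable order before choosing
    if not pool:
        raise ValueError("no artworks to choose from")
    rng = random.Random(day.toordinal())  # seed with the date so the choice is repeatable
    return rng.choice(pool)


def format_post(art: Artwork) -> str:
    """Render the text of a post: title and artist, then the image link."""
    return f"{art.title} by {art.artist}\n{art.image_url}"


def run_once(
    fetch: Callable[[], Iterable[Artwork]],
    post: Callable[[str], None],
    day: date,
) -> str:
    """One bot run: fetch candidates, pick today's artwork, and post it.

    `fetch` and `post` are injected so that the scraping and the posting
    (both assumptions here) can be swapped out or faked in tests.
    """
    text = format_post(pick_daily(fetch(), day))
    post(text)
    return text
```

A scheduler such as cron or a cloud function timer would call `run_once` once a day. Keeping the fetching and posting behind plain callables means the selection logic can be tried locally without touching any external service.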

The three things that I just talked about are very tangible ways, but there are also very intangible ways to do so. The first one is sending appreciation, either an email or a note on GitHub or a comment on someone's blog. It can be to the author of your favorite R package, someone you're working with, or just someone you've seen do cool stuff online. People love getting appreciation, and in an increasingly negative and polarized world, that kind of feedback is really hard to come by. You might even get into a conversation with the author about something and create something together, which has happened to me before. What I love about this, especially with the email aspect, is that it's private, but it's still a way to send really great positive energy out into the world.

And finally, start your own community. One of the reasons that we feel overwhelmed is that big communities are intimidating. I've often talked about this theory I have called good things don't scale, where one of the reasons we get stressed out about social media is that humans are not meant to carry on more relationships than Dunbar's number, 150 people. We simply can't hold them in our heads, and so we lose context with all the conversations we have. The best way to fight that noise is to create a small community that can, again, be in any medium, as long as it's a medium that you and the other people in the community keep up with. For example, if you know you hate text messages, use Signal or WhatsApp. The ideal community for this is under 10 to 15 people, and these people should be fairly active. A community of like-minded people who support each other is the key to fighting the bigness and the noise of the internet, because the process of caring for a garden is an important one, and it's a long-standing process, and it's made up of a very large, broad range of things.

Every time you create something positive, you're doing a small amount of work outshining all of the negative work in the world and on the internet today. You're taking away one piece of noise, one ping from your life. There are many reasons to do it, including the ones I talked about at the beginning of the talk, but mostly, for me personally, it's a wonderful way to manage my own 2020 anxiety and change the world bird by bird, because when you're tending to a garden, a thistle cannot grow in that place, as Frances Hodgson Burnett writes, and maybe that's just as much of a reason to create a garden as any other.

It's my hope that Radio 2021 is going to be tuned to a much, much better frequency than Radio 2020. Thank you so much for your time, and I hope to see your digital gardens soon.