Resources

JooYoung Seo | Accessible Data Science Beyond Visual Models | RStudio

Full title: Accessible Data Science Beyond Visual Models: Non-Visual Interactions with R and RStudio Packages

Data science is full of vision-dominant practices, and most data scientists rely heavily on visual models. However, data science itself should require insight and computational thinking beyond what is just seen by the eyes. JooYoung Seo, a blind data scientist who worked on RStudio's accessibility projects over the summer of 2020, will talk about his experience with some non-visual techniques for interacting with data. If you would like to know more about various ways of making data science accessible via R, and about new accessibility features introduced in the RStudio IDE and Shiny, his demonstration without sight will be thought-provoking.

About JooYoung: JooYoung Seo is a Ph.D. candidate in the Learning, Design, and Technology program at the Pennsylvania State University, and an internationally certified accessibility professional whose research and development focuses on accessible computing for all. As an RStudio double-certified data science instructor (i.e., Tidyverse + Shiny) who is blind, he is committed to making the data science ecosystem more accessible to people with and without dis/abilities using R. To this end, he has been actively contributing to R open-source projects including Shiny, RMarkdown, bookdown, and distill for accessibility, and interned on the RStudio IDE and Shiny teams as an accessibility engineer in summer 2020.

Transcript

This transcript was generated automatically and may contain errors.

Hello, I'm JooYoung Seo, a doctoral student in the Learning, Design, and Technology program at Penn State, and I was one of the interns at RStudio in 2020. I've been using R without sight, because I'm blind.

Yeah, I understand that you might have a big question mark in mind. How could a blind person do data science, or even use a computer? In this talk, I will answer the fundamental questions that you might have by demonstrating some strategies that I've used, and by introducing some accessibility improvements made to the RStudio IDE and Shiny that I was involved in during my internship.

How a blind person uses a computer

Before talking about R and data science, let me address this question first. How do I use a computer? Well, I need to ask you back: how do you use your computer? In other words, how do you interact with a computer?

We could break down the way we interact with a computer into input and output. That is, how we pass information to the computer, and how we get information from the computer. What if we removed the visual aspect from these two modes of interaction?

Undoubtedly, using a mouse would become almost impossible, or even useless. I doubt anyone could point at and double-click the RStudio icon on their desktop in a stable fashion when blindfolded, but you can still use a keyboard as long as your muscle memory remembers the keyboard layout. What about output? We need some workaround as an alternative to a visual monitor. How about using sound output instead?

I use a computer via keyboard-only input and sound and tactile output, just like this. Isn't it too fast for you? No worries, let me slow it down for you.

Data science beyond vision

Now that you understand how I interact with a computer, let's talk about data science. Data science involves a cycle of interaction with data, such as importing, tidying, transforming, visualizing, and modeling. And it feels like data science is highly vision-dominant. Yes, it could be.

But data science is more than what is seen. If data science were restricted to the visual domain only, that would mean that people who cannot use their sense of vision couldn't do data science at all. Very fortunately, however, that is not true, because we have R, which enables blind people to see data beyond visual models.

I believe R is one of the most accessible data science environments because of the following benefits. First, the command line interface. As opposed to a point-and-click graphical user interface, which requires mouse input, R is a command line environment. This means that you can do everything within a terminal using the keyboard only. This is a real benefit for keyboard users like me.

Second, the beautiful interplay between reproducibility and accessibility. R Markdown is a plain-text document format for reproducible reports that can be written with any text editor and rendered into multiple formats, including HTML, Word, PDF, EPUB, and more. These outputs, in many cases, are readily accessible to assistive technologies. The HTML output, especially, is super accessible.

Last but not least, there are a number of accessible packages out there, thanks to its open-source nature.

RStudio accessibility improvements

There are two integrated development environments supporting R programming that are either accessible or semi-accessible. One is the RStudio IDE, version 1.3 or later. The other is Visual Studio Code with some extensions for R.

It was such an honor for me to be part of the RStudio accessibility project for my summer 2020 internship, under the great double mentorship of Gary Ritchie on the RStudio team and Winston Chang on the Shiny team. Since RStudio has just started its accessibility support, there is still room for further improvement, but we can at least recommend that keyboard or screen reader users try out RStudio Server to benefit from its accessibility enhancements.

As RStudio Server requires a Linux system, Windows users need a virtualization technology called Windows Subsystem for Linux, and we have published the technical details for this. Just check it out.

Here are some accessibility options currently supported in RStudio Server version 1.3 or higher. We have added screen reader support, animation reduction, Tab key focus control, etc. Previously, it was very challenging for keyboard users to move their focus around, but now you can move your focus by pressing the Tab key from the menu bar all the way down to the workbench area. You can send any inquiries or suggestions directly to accessibility@rstudio.com.

Accessible R packages

Now it's time to introduce some accessible and useful R packages that I've used. I use gt when interacting with data frames and other tidy data tables, and I use sonify for two-dimensional scatter plots or line charts. The BrailleR package is very useful for understanding histograms, bar plots, and box plots. Shiny used to be not very accessible, but it's getting accessible, so of course I use it for interactive data science work. R Markdown is such an accessible Swiss army knife because you can do almost everything possible in R and turn it into accessible outputs.

All right, let me walk you through each of them, starting with how I interact with data frames. I assume View is one of the most widely used basic functions among R users for getting a better sense of a data frame's structure. Unfortunately, the default data viewer called from the utils package is completely inaccessible. This means that a screen reader does not read anything for you from the default pane. RStudio has replaced this default one with its own enhanced data viewer, which is accessible. However, that's not a universal solution for those who do not use the RStudio IDE.

My simple go-to is gt, developed by Rich and Joe. gt stands for grammar of tables, and it has a lot of good features that help you create nice-looking tables. But I use its core function, gt::gt(), in lieu of the data viewer. Why do I use this? Because it generates and opens an HTML table, which is fully accessible to screen reader and keyboard users.
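A minimal sketch of that workflow, assuming the gt package is installed. The mtcars data set is only an illustration and is not taken from the talk:

```r
# install.packages("gt")  # run once if gt is not installed

# Build an HTML table from the first rows of a built-in data frame.
tbl <- gt::gt(head(mtcars))

# Printing a gt table in an interactive session opens it as HTML
# (in the RStudio Viewer or the default web browser), where a
# screen reader can navigate it cell by cell.
tbl
```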

Data sonification for scatter plots and histograms

Next, I'll talk about how I interpret a two-dimensional scatter plot. On the left-hand side, we have a simple scatter plot with x and y axes. How would you represent this visualization in a way that blind people could also digest? How about using sound? For example, we could represent values on the x-axis using stereo panning from left to right, and we could map y-axis values to different pitches. Data points that are higher on the graph have a higher pitch. Data points that are lower on the graph have a correspondingly lower pitch.

It makes sense, doesn't it? This technique of representing data using sound is called sonification. I use data sonification as an alternative to data visualization. You can also try this out with the sonify package developed by Stefan Siegert and Robin Williams.
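The mapping described here can be sketched in base R. This is an illustration only: the `rescale` helper, the sample data, and the 440 to 880 Hz pitch range are assumptions made for the example, not details taken from the talk or from the internals of sonify.

```r
# Linearly rescale a numeric vector onto the interval [lo, hi].
# (Hypothetical helper for this illustration.)
rescale <- function(v, lo, hi) {
  rng <- range(v)
  lo + (v - rng[1]) / (rng[2] - rng[1]) * (hi - lo)
}

x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 3, 8, 6)

# x position -> stereo pan (-1 = full left, +1 = full right)
pan <- rescale(x, -1, 1)

# y value -> pitch in Hz (here one octave, 440 to 880 Hz)
freq <- rescale(y, 440, 880)

# Each row is one data point with its sound parameters.
data.frame(x, y, pan, freq)

# To actually hear the data, the sonify package can be used
# (needs the package and a working audio device):
# sonify::sonify(x, y)
```

The highest point on the plot (y = 8) receives the highest pitch, and points further to the right are panned further to the right.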

Next, I will show you how I interact with a histogram. Here is a visual histogram. How could we make it accessible to people who are blind? Yes, of course, we can use data sonification again by mapping the x and y axes to sounds. But I would like to introduce another way of representing data this time: using text description.

The BrailleR package, developed by Jonathan Godfrey, has a function that translates some R graphs into an alternative text description. You can call this function with VI(), which stands for vision impairment. This package is currently available on GitHub, and you can install it via the remotes::install_github() function.
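A minimal sketch of that call, assuming BrailleR is installed. The random data is only an illustration, and the GitHub repository path in the comment is an assumption that should be checked against the package's own page:

```r
# One-time installation (pick one):
# install.packages("BrailleR")
# remotes::install_github("ajrgodfrey/BrailleR")  # assumed repository path

library(BrailleR)

set.seed(1)
x <- rnorm(100)

# Draw the histogram and keep the object that hist() returns,
# then ask BrailleR for a text description of it.
h <- hist(x)
VI(h)
```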

For box plot interaction, we can apply the same method to this visualization, just like this. I know we got quite a long description this time, but it's cool that we can have auto-generated alt text for some basic R graphs. If you are an R developer, please help Jonathan Godfrey on GitHub to support more types of graphs.

Accessible math with R Markdown

Okay, I want to turn to accessible math content interaction. I kept mentioning that R Markdown contributes to accessibility. One of its most pleasant features is accessible math. For HTML output, R Markdown uses the MathJax JS library by default for LaTeX math expressions. And this is rendered into accessible MathML (Mathematical Markup Language) content that is fully accessible with modern screen readers.
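As a small illustration, the following sketch writes a one-formula R Markdown file and renders it to HTML. The file contents and the formula are made up for the example, and the rendering step only runs if the rmarkdown package and pandoc are available:

```r
# Write a tiny R Markdown document containing a LaTeX formula.
rmd <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Accessible math\"",
  "output: html_document",
  "---",
  "",
  "The sample mean is $\\bar{x} = \\frac{1}{n}\\sum_{i=1}^{n} x_i$."
), rmd)

# Render to HTML. The html_document format loads MathJax by default
# to display the formula in the browser.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd, quiet = TRUE)
  print(out)  # path to the rendered .html file
}
```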

Shiny accessibility improvements

I'm going to talk a little bit about Shiny. I'm super excited to announce that Shiny is getting accessible. With the amazing support of Winston, Carson, Barret, and other Shiny team members, I worked on making Shiny accessible as my internship activity at RStudio. As a result, you will see many accessibility improvements in the next Shiny release, version 1.6 or higher. If you want to test the development version, it's available on GitHub, and you can install it through the remotes package.

Some significant improvements are as follows. First, we've added PayPal's Bootstrap accessibility plugin under the hood. So alert, tooltip, popover, modal dialog, dropdown, tab panel, collapse, and carousel elements are made keyboard accessible.

Next, there are many significant improvements made to Shiny widgets. Select input, especially selectize input, and file input are now fully keyboard accessible. Font Awesome and Glyphicon icons now produce appropriate labels based on icon names for screen readers. Radio buttons and checkbox group input are properly grouped together for assistive technologies. Date input and date range input are properly labeled. And now dynamic content within all the outputs and input updates is automatically announced to screen readers and refreshable braille displays.

There are also some great enhancements for semantic accessibility. We can pass alt text to the renderPlot() function to specify an alternative text description for screen readers. It could be static text; "Plot object" is given by default. Or you can even use a reactive function for the alt argument to create dynamic text, which is useful when you want reactive alt text that responds to user input.
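A minimal sketch of both styles of alt text, assuming Shiny 1.6 or later. The input IDs, output IDs, and plots are made up for the example:

```r
library(shiny)  # the alt argument needs shiny 1.6 or later

ui <- fluidPage(
  sliderInput("n", "Number of observations", min = 10, max = 500, value = 100),
  plotOutput("static_plot"),
  plotOutput("dynamic_plot")
)

server <- function(input, output, session) {
  # Static alt text: a fixed character string.
  output$static_plot <- renderPlot(
    hist(faithful$waiting, main = "Waiting time"),
    alt = "Histogram of waiting times between Old Faithful eruptions."
  )

  # Dynamic alt text: a reactive expression that follows the slider.
  output$dynamic_plot <- renderPlot(
    hist(rnorm(input$n), main = "Random sample"),
    alt = reactive(paste("Histogram of", input$n, "random normal values."))
  )
}

app <- shinyApp(ui, server)
# Run interactively with: runApp(app)
```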

And you can explicitly set the language code used within your Shiny app by passing the lang parameter to the fluidPage() function. This helps search engine parsers and assistive tech better identify the document language. Last but not least, semantic landmarks have been applied to the main panel as well as the sidebar panel so that screen readers can quickly navigate through Shiny apps in a logical manner.
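A minimal sketch of a page definition that sets the language, assuming Shiny 1.6 or later. The title, input, and output names are made up for the example:

```r
library(shiny)  # the lang argument needs shiny 1.6 or later

ui <- fluidPage(
  lang = "en",  # document language code, e.g. "en", "ko", "fr"
  titlePanel("Old Faithful"),
  sidebarLayout(
    # Per the talk, the sidebar and main panels carry semantic
    # landmarks that screen readers can jump between.
    sidebarPanel(
      sliderInput("bins", "Number of bins", min = 5, max = 50, value = 30)
    ),
    mainPanel(
      plotOutput("hist")
    )
  )
)
```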

Closing thoughts

It's time to close my talk. I have touched upon some fundamental questions that you might have. How could a blind person use a computer? And how could a blind person do data science? As my presentation title suggests, we can make data science accessible beyond visual models because data science requires insight, not sight. And all of this is possible with R.

According to the World Health Organization's 2018 report, globally, more than 1.3 billion people have some degree of visual impairment, and 36 million of them are blind. I'm just one of them. Why don't we invite more people with diverse abilities to use R and to enjoy data sciencing? Thank you.