Brooke Watson | R at the ACLU: Joining tables to reunite families | RStudio (2019)
Last year, more than 2,500 immigrant children were separated from their families while in government custody. Information about their status is scattered across several government agencies, and throughout the national class-action lawsuit “Ms. L vs ICE,” the Analytics team of the ACLU has been using R to join, deduplicate, validate, and analyze it. Using specifics of this case, this talk will address common challenges arising from human-generated data in spreadsheets. With generalizable examples, I will discuss data tidying, standardization, deduplication, and validation using the tidyverse, janitor, assertthat, and other packages. Finally, I will share best practices for requesting useful data from non-quantitative subject matter experts.
About the Author
Brooke Watson
I am a Data Scientist at the ACLU, where I use code and statistics to support civil rights litigation and advocacy. Previously, I worked in public health and disease research, most recently as a Research Scientist with the EcoHealth Alliance. I completed my Master’s degree in epidemiology at the London School of Hygiene and Tropical Medicine and swam for Tennessee’s Lady Vols as an undergrad.