ZJ | Easy larger-than-RAM data manipulation with {disk.frame} | RStudio
Learn how to handle 100GBs of data with ease using {disk.frame} - the larger-than-RAM-data manipulation package.
R loads data in its entirety into RAM. However, RAM is a precious resource and often do run out. That's why most R user would have run into the "cannot allocate vector of size xxB." error at some point.
However, the need to handle larger-than-RAM data doesn't go away just because RAM isn't large enough. So many useRs turn to big data tools like Spark for the task. In this talk, I will make the case that {disk.frame} is sufficient and often preferable for manipulating larger-than-RAM data that fit on disk. I will show how you can apply familiar {dplyr}-verbs to manipulate larger-than-RAM data with {disk.frame}.
About ZJ:
ZJ is a machine learning developer based in Melbourne, Australia. He regularly contributes to open source projects. He has more than 10 years of experience in banking before joining the tech sector. In his free time, he enjoys playing Go/Baduk/Weiqi
dplyr
rstudio
RStudio
Data Science
Machine Learning
Python
Stats
Tidyverse
Data Visualization
Data Viz
Ggplot
Technology
Coding
Connect
Server Pro
Shiny
Rmarkdown
Package Manager
CRAN
Interoperability
Serious Data Science
Dplyr
Ggplot2
Tibble
Readr
Stringr
Tidyr
Purrr
Github
Data Wrangling
Tidy Data
Odbc
Rayshader
Plumber
Blogdown
Gt
Lazy Evaluation
Tidymodels
Statistics
Debugging
Programming Education
Forcats
Rstats
Open Source
Oss
Reticulate
ZJ
Disk.frame