We are pleased to announce that forcats 0.4.0 is now on CRAN. The forcats package provides a suite of useful tools that solve common problems with factors in R. This version benefited from the hard work of contributors new and old at our first tidyverse dev day. For a complete set of changes, please see the release notes.

To install the latest version, run:

install.packages("forcats")

As always, attach the package with:

library(forcats)

New functions

fct_cross() creates a new factor containing the combined levels from two or more input factors, similar to base::interaction().

fruit <- factor(c("apple", "kiwi", "apple", "apple"))
colour <- factor(c("green", "green", "red", "green"))
fct_cross(fruit, colour)
#> [1] apple:green kiwi:green  apple:red   apple:green
#> Levels: apple:green apple:red kiwi:green
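As a rough sketch of how this differs from base::interaction(), the snippet below uses the sep and keep_empty arguments of fct_cross(). It counts levels instead of printing them, on the assumption that level order may vary between forcats versions.

```r
library(forcats)

fruit <- factor(c("apple", "kiwi", "apple", "apple"))
colour <- factor(c("green", "green", "red", "green"))

# Default: only the combinations that occur in the data become levels
observed <- fct_cross(fruit, colour)
nlevels(observed)

# keep_empty = TRUE also keeps unobserved combinations such as kiwi:red
all_combos <- fct_cross(fruit, colour, keep_empty = TRUE)
nlevels(all_combos)

# sep changes the separator used to build the new labels
underscored <- fct_cross(fruit, colour, sep = "_")
levels(underscored)

# base::interaction() keeps every combination unless drop = TRUE
nlevels(interaction(fruit, colour))
```

In short, the default behaviour of fct_cross() resembles interaction(..., drop = TRUE), while keep_empty = TRUE resembles the base default.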

fct_lump_min() preserves levels that appear at least min times and lumps the rest into "Other". It can also be used with the weights argument, w, in which case the weighted counts are compared against min.

x <- factor(letters[rpois(50, 3)])
fct_lump_min(x, min = 10)
#>  [1] Other b     Other b     Other Other Other b     Other Other b    
#> [12] Other Other Other b     Other b     Other Other b     b     Other
#> [23] Other Other b     b     Other Other Other Other Other b     Other
#> [34] Other Other b     Other Other Other Other Other Other Other Other
#> [45] Other b     Other b    
#> Levels: b Other
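Because the example above depends on random draws, a small deterministic sketch may make the rule easier to follow. The factor and the weights here are made up for illustration.

```r
library(forcats)

# Counts: a = 4, b = 2, c = 1, d = 1
x <- factor(c("a", "a", "a", "a", "b", "b", "c", "d"))

# Levels seen fewer than 2 times (c and d) are lumped into "Other"
unweighted <- fct_lump_min(x, min = 2)
levels(unweighted)

# With weights, the per-level sum of w is compared against min:
# a = 4, b = 2, c = 5, d = 1, so b and d are lumped when min = 3
w <- c(1, 1, 1, 1, 1, 1, 5, 1)
weighted <- fct_lump_min(x, min = 3, w = w)
levels(weighted)
```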

fct_match() tests for the presence of levels in a factor. It is a safer alternative to %in% because it throws an error when any of the requested levels is not present in the factor, instead of silently returning FALSE.

table(fct_match(gss_cat$marital, c("Married", "Divorced")))
#> 
#> FALSE  TRUE 
#>  7983 13500
table(gss_cat$marital %in% c("Maried", "Davorced"))
#> 
#> FALSE 
#> 21483
table(fct_match(gss_cat$marital, c("Maried", "Davorced")))
#> Error: Levels not present in factor: "Maried", "Davorced"
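Since fct_match() returns a logical vector, it can also be used for subsetting. The following is a minimal sketch on a made-up factor; wrapping the call in tryCatch() is just one possible way to handle the error.

```r
library(forcats)

f <- factor(c("a", "b", "c", "a"))

# The logical result can index the factor directly
keep <- fct_match(f, c("a", "b"))
f[keep]

# A level that does not exist raises an error, which can be caught if needed
msg <- tryCatch(
  fct_match(f, "z"),
  error = function(e) conditionMessage(e)
)
msg
```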

Other improvements

  • fct_relevel() can now relevel factors using a function that is passed the current levels.

    f <- factor(c("a", "b", "c", "d"), levels = c("b", "c", "d", "a"))
    fct_relevel(f, sort)
    #> [1] a b c d
    #> Levels: a b c d
    fct_relevel(f, rev)
    #> [1] a b c d
    #> Levels: a d c b
    
  • as_factor() now has a numeric method that orders levels numerically, unlike the other methods, which default to order of appearance.

    y <- c("1.1", "11", "2.2", "22")
    as_factor(y)
    #> [1] 1.1 11  2.2 22 
    #> Levels: 1.1 11 2.2 22
    z <- as.numeric(y)
    as_factor(z)
    #> [1] 1.1 11  2.2 22 
    #> Levels: 1.1 2.2 11 22
    
  • fct_inseq() reorders levels by the numeric value of their labels, when possible.
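As an illustration of fct_inseq(), here is a small sketch on a made-up factor whose labels are numbers stored as strings. By default factor() sorts such labels alphabetically, which is rarely the intended order.

```r
library(forcats)

f <- factor(c("10", "2", "1", "20"))

# factor() sorts character labels alphabetically
levels(f)

# fct_inseq() reorders the levels by their numeric value instead
g <- fct_inseq(f)
levels(g)
```

The values of the factor are unchanged; only the order of the levels differs.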

Thanks to Emily Robinson, forcats also has a new introductory vignette.

Acknowledgements

We’re grateful to the 35 people who contributed to this release: @ahaque-utd, @AmeliaMN, @ashiklom, @batpigandme, @billdenney, @brianwdavis, @corybrunson, @dalewsteele, @ewenharrison, @grayskripko, @gtm19, @hack-r, @hadley, @huftis, @isteves, @jimhester, @jonocarroll, @jrosen48, @jthomasmock, @kbodwin, @mdjeric, @orchid00, @richierocks, @robinsones, @rosedu1, @RoyalTS, @russHyde, @Ryo-N7, @s-fleck, @seaaan, @spedygiorgio, @tslumley, @xuhuizhang, @zhiiiyang, and @zx8754.