R/crossover.R
crossover.Rd
crossover() combines the functionality of dplyr::across() with over() by iterating simultaneously over (i) a set of columns (.xcols) and (ii) a vector or list (.y). crossover() always applies the functions in .fns in a nested way to a combination of both inputs. There are, however, two different ways in which the functions in .fns are applied.
When .y is a vector or list, each function in .fns is applied to all pairwise combinations between columns in .xcols and elements in .y (this resembles the behavior of over2x() and across2x()).
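The pairing described above can be sketched in base R. This is only an illustration of which combinations are formed and in which order, not how crossover() is implemented; the data frame df, the vector y and the function fn are made up for this example.

```r
# Base R sketch of the pairwise semantics (not dplyover's implementation).
df <- data.frame(a = 1:3, b = 4:6)   # hypothetical input columns
y  <- c(10, 100)                     # stands in for `.y`
fn <- function(x, y) x * y           # stands in for one function in `.fns`

out <- list()
for (col in names(df)) {             # columns vary slowest
  for (val in y) {                   # elements of `.y` vary fastest
    out[[paste0(col, "_", val)]] <- fn(df[[col]], val)
  }
}
res <- as.data.frame(out)

names(res)
#> [1] "a_10"  "a_100" "b_10"  "b_100"
```

Each column is paired with every element of y, which gives one output column per combination.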
crossover() has one trick up its sleeve, which sets it apart from the other functions in the <over-across family>: its second input (.y) can be a function. This changes the original behavior slightly: first, the function in .y is applied to all columns in .xcols to generate an input object which will be used as .y in the function calls in .fns. In this case each function is applied to all pairs between (i) columns in .xcols and (ii) the output elements that they generated through the function that was originally supplied to .y. Note that the underlying data must not be grouped if a function is supplied to .y. For examples see the example section below.
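To make the difference concrete, here is a rough base R sketch of the case where .y is a function. It is an assumption-laden illustration rather than the package's code: df, gen and fn are hypothetical, and unique() stands in for whatever function is passed to .y.

```r
# Base R sketch: `.y` as a function (not dplyover's implementation).
df <- data.frame(
  type = c("new", "existing", "new"),        # hypothetical data
  tier = c("basic", "basic", "premium"),
  stringsAsFactors = FALSE
)
gen <- unique                                # stands in for the function given to `.y`
fn  <- function(x, y) as.integer(x == y)     # stands in for one function in `.fns`

out <- list()
for (col in names(df)) {
  # Each column is paired only with the values it generated itself.
  for (val in gen(df[[col]])) {
    out[[paste(col, val, sep = "_")]] <- fn(df[[col]], val)
  }
}
res <- as.data.frame(out)

names(res)
#> [1] "type_new"      "type_existing" "tier_basic"    "tier_premium"
```

Unlike the vector or list case, type is never combined with values that came from tier, so the number of output columns depends on the data.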
crossover(
  .xcols = dplyr::everything(),
  .y,
  .fns,
  ...,
  .names = NULL,
  .names_fn = NULL
)
Argument | Description |
---|---|
.xcols | < |
.y | An atomic vector or list to apply functions to. If a function is supplied, the following values are possible: Note that additional arguments can only be specified with an anonymous function, a purrr-style lambda or with a pre-filled custom function. |
.fns | Functions to apply to each column in Possible values are: Note that |
... | Additional arguments for the function calls in |
.names | A glue specification that describes how to name the output columns. This can use: The default ( Note that, depending on the nature of the underlying object in This standard behavior (interpretation of Alternatively, a character vector of length equal to the number of columns to be created can be supplied to |
.names_fn | Optionally, a function that is applied after the glue specification in |
crossover() returns a tibble with one column for each combination of columns in .xcols, elements in .y and functions in .fns.
If a function is supplied as .y argument, crossover() returns a tibble with one column for each pair of output elements of .y and the column in .xcols that generated the output, combined with each function in .fns.
For the basic functionality please refer to the examples in over() and dplyr::across().
If .y is a vector or list, crossover() loops every combination between columns in .xcols and elements in .y over the functions in .fns. This is helpful in cases where we want to create a batch of similar variables with only slight changes in the arguments of the calling function. Lagged variables are a good example. Below we create five lagged variables for each of 'Sepal.Length' and 'Sepal.Width'. To create nice names we use a named list as argument in .fns and specify the glue syntax in .names.
iris %>%
  transmute(
    crossover(starts_with("sepal"),
              1:5,
              list(lag = ~ lag(.x, .y)),
              .names = "{xcol}_{fn}{y}")) %>%
  glimpse
#> Rows: 150
#> Columns: 10
#> $ Sepal.Length_lag1 <dbl> NA, 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4~
#> $ Sepal.Length_lag2 <dbl> NA, NA, 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9,~
#> $ Sepal.Length_lag3 <dbl> NA, NA, NA, 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, ~
#> $ Sepal.Length_lag4 <dbl> NA, NA, NA, NA, 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4~
#> $ Sepal.Length_lag5 <dbl> NA, NA, NA, NA, NA, 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.~
#> $ Sepal.Width_lag1  <dbl> NA, 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7~
#> $ Sepal.Width_lag2  <dbl> NA, NA, 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1,~
#> $ Sepal.Width_lag3  <dbl> NA, NA, NA, 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, ~
#> $ Sepal.Width_lag4  <dbl> NA, NA, NA, NA, 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2~
#> $ Sepal.Width_lag5  <dbl> NA, NA, NA, NA, NA, 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.~
The .y argument of crossover() can take a function instead of a list or vector. In the example below we select the columns 'type', 'product' and 'csat' in .xcols. We supply the function dist_values() to .y, which is a cleaner variant of base R's unique(). This generates all distinct values for all three selected variables. Now, the function in .fns, ~ if_else(.y == .x, 1, 0), is applied to each pair of a distinct value in .y and the column in .xcols that generated this value. This basically creates a dummy variable for each value of each variable. Since some of the values contain whitespace characters, we can use the .names_fn argument to supply a third function that cleans the output names by replacing spaces with underscores and converting all characters to lower case with tolower().
csat %>%
  transmute(
    crossover(.xcols = c(type, product, csat),
              .y = dist_values,
              .fns = ~ if_else(.y == .x, 1, 0),
              .names_fn = ~ gsub("\\s", "_", .x) %>% tolower(.)
    )) %>%
  glimpse
#> Rows: 150
#> Columns: 11
#> $ type_new              <dbl> 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0,~
#> $ type_existing         <dbl> 1, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 1,~
#> $ type_reactivate       <dbl> 0, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0,~
#> $ product_basic         <dbl> 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1,~
#> $ product_advanced      <dbl> 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0,~
#> $ product_premium       <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0,~
#> $ csat_very_unsatisfied <dbl> 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,~
#> $ csat_unsatisfied      <dbl> 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,~
#> $ csat_neutral          <dbl> 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1,~
#> $ csat_satisfied        <dbl> 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0,~
#> $ csat_very_satisfied   <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,~
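The name-cleaning step passed to .names_fn can also be tried in isolation. The sketch below wraps the same gsub() and tolower() calls in an ordinary function; clean_names is a hypothetical helper name and the input strings are made up for illustration.

```r
# Stand-alone version of the name-cleaning step (hypothetical helper name).
clean_names <- function(x) {
  x <- gsub("\\s", "_", x)   # replace each whitespace character with "_"
  tolower(x)                 # convert everything to lower case
}

# Made-up raw column names, as they might look before cleaning.
clean_names(c("csat_Very unsatisfied", "product_advanced"))
#> [1] "csat_very_unsatisfied" "product_advanced"
```

Names that contain neither whitespace nor upper-case letters pass through unchanged.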
Other members of the <over-across function family>.