library("broadcast")
# balanced acasting ====
x <- cbind(id = rep(1:3, each = 2), grp = rep(1:2, 3), val = rnorm(6))
print(x)
## id grp val
## [1,] 1 1 -0.8838725
## [2,] 1 2 1.7595730
## [3,] 2 1 -0.6746730
## [4,] 2 2 -0.6945246
## [5,] 3 1 0.3042856
## [6,] 3 2 0.1985162
grp <- as.factor(x[, 2])
levels(grp) <- c("a", "b")
margin <- 1L
acast(x, margin, grp)
## , , a
##
## id grp val
## [1,] 1 1 -0.8838725
## [2,] 2 1 -0.6746730
## [3,] 3 1 0.3042856
##
## , , b
##
## id grp val
## [1,] 1 2 1.7595730
## [2,] 2 2 -0.6945246
## [3,] 3 2 0.1985162
# unbalanced acasting ====
x <- cbind(id = c(rep(1:3, each = 2), 1), grp = c(rep(1:2, 3), 2), val = rnorm(7))
print(x)
## id grp val
## [1,] 1 1 1.0941311
## [2,] 1 2 0.6151570
## [3,] 2 1 1.1824790
## [4,] 2 2 -1.9042928
## [5,] 3 1 1.5088913
## [6,] 3 2 1.0015748
## [7,] 1 2 -0.9346742
grp <- as.factor(x[, 2])
levels(grp) <- c("a", "b")
margin <- 1L
acast(x, margin, grp, fill = TRUE)
## , , a
##
## id grp val
## [1,] 1 1 1.094131
## [2,] 2 1 1.182479
## [3,] 3 1 1.508891
## [4,] NA NA NA
##
## , , b
##
## id grp val
## [1,] 1 2 0.6151570
## [2,] 2 2 -1.9042928
## [3,] 3 2 1.0015748
## [4,] 1 2 -0.9346742
# unbalanced acasting with raw array ====
x <- cbind(id = c(rep(1:3, each = 2), 1), grp = c(rep(1:2, 3), 2), val = sample(1:7))
x <- as_raw(x)
print(x)
## id grp val
## [1,] 01 01 01
## [2,] 01 02 04
## [3,] 02 01 03
## [4,] 02 02 05
## [5,] 03 01 02
## [6,] 03 02 06
## [7,] 01 02 07
grp <- x[, 2] |> as.integer() |> as.factor()
levels(grp) <- c("a", "b")
margin <- 1L
(fill_val <- as.raw(255))
## [1] ff
acast(x, margin, grp, fill = TRUE, fill_val = fill_val)
## , , a
##
## id grp val
## [1,] 01 01 01
## [2,] 02 01 03
## [3,] 03 01 02
## [4,] ff ff ff
##
## , , b
##
## id grp val
## [1,] 01 02 04
## [2,] 02 02 05
## [3,] 03 02 06
## [4,] 01 02 07
acast
Simple and Fast Casting/Pivoting of an Array
Description
The acast() function spreads subsets of an array margin over a new dimension.
Roughly speaking, acast() can be thought of as the "array" analogue of data.table::dcast().
But note two important differences:
- acast() works on arrays instead of data.tables.
- acast() casts into a completely new dimension (namely ndim(x) + 1), instead of casting into new columns.
Usage
acast(x, ...)
## Default S3 method:
acast(x, margin, grp, fill = FALSE, fill_val, ...)
Arguments
x
    an atomic or recursive array.
...
    further arguments passed to or from methods.
margin
    a scalar integer, specifying the margin to cast from.
grp
    a factor, where length(grp) == dim(x)[margin], with at least 2 unique values, specifying which indices of dim(x)[margin] belong to which group. Each group will be cast onto a separate index of dimension ndim(x) + 1. Unused levels of grp will be dropped. Any NA values or levels found in grp will result in an error.
fill
    Boolean. When the factor grp is unbalanced (i.e. has unequally sized groups), the result is an array in which some slices have missing values that need to be filled. If fill = TRUE, an unbalanced grp is allowed, and the missing values are filled with fill_val. If fill = FALSE (the default), an unbalanced grp is not allowed and produces an error.
fill_val
    scalar of the same type as x, used to fill the missing values when fill = TRUE.
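Whether fill = TRUE is needed depends only on the group sizes, which can be inspected with base R before calling acast(). The following is a minimal sketch; the variable names counts and is_balanced are illustrative and not part of the package.

```r
# Group labels for 7 rows; group "b" occurs once more than group "a".
grp <- factor(c("a", "b", "a", "b", "a", "b", "b"))

counts <- tabulate(grp)                      # number of observations per level
names(counts) <- levels(grp)
is_balanced <- length(unique(counts)) == 1L  # TRUE only if all groups have equal size

counts       # a = 3, b = 4
is_balanced  # FALSE, so acast() would need fill = TRUE for this grp
```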
Details
For the sake of illustration, consider a matrix x and a grouping factor grp.
Let the integer scalar k represent a group in grp, such that k \(\in\) 1:nlevels(grp).
Then the code
out <- acast(x, margin = 1, grp = grp)
essentially performs the following for every group k:
- copy-paste the subset x[grp == k, ] to the subset out[, , k].
Please see the examples section to get a good idea of how this function casts an array.
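The per-group copy described above can be sketched in base R. This is only an illustration of the idea for a balanced matrix and margin = 1; it is not the package's implementation, and the data values are made up for the example.

```r
# Balanced input: 3 ids, each observed once in group 1 and once in group 2.
x <- cbind(id = rep(1:3, each = 2), grp = rep(1:2, 3),
           val = c(10, 20, 30, 40, 50, 60))
grp <- factor(x[, "grp"], labels = c("a", "b"))

n_max <- max(tabulate(grp))  # size of the largest group
out <- array(
  NA_real_,
  dim = c(n_max, ncol(x), nlevels(grp)),
  dimnames = list(NULL, colnames(x), levels(grp))
)

for (k in seq_len(nlevels(grp))) {
  rows <- which(as.integer(grp) == k)     # rows of x belonging to group k
  out[seq_along(rows), , k] <- x[rows, ]  # copy them into slice k
}

out[, , "a"]  # the three rows of x that belong to group "a"
```

Pre-filling out with NA mirrors what fill = TRUE would do for an unbalanced grp: slices of shorter groups keep the fill value in their trailing rows.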
Value
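An array with ndim(x) + 1 dimensions. As the examples show, dimension margin has length max(tabulate(grp)), the other dimensions of x keep their lengths, and the new last dimension has one index per level of grp that is in use.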
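Judging from the example output on this page (a 7 x 3 input with groups of size 3 and 4 yields a 4 x 3 x 2 result), the output shape can be predicted in base R. The helper expected_acast_dim() below is hypothetical and not part of the package; it only encodes the shapes seen in the examples.

```r
# Hypothetical helper (not part of the broadcast package): predicts the shape
# of acast(x, margin, grp, fill = TRUE) based on the examples on this page.
expected_acast_dim <- function(x, margin, grp) {
  grp <- droplevels(grp)           # unused levels are dropped
  d <- dim(x)
  d[margin] <- max(tabulate(grp))  # the largest group sets the margin length
  c(d, nlevels(grp))               # one index per group on the new dimension
}

x <- matrix(0, nrow = 7, ncol = 3)
grp <- factor(c("a", "b", "a", "b", "a", "b", "b"))
expected_acast_dim(x, margin = 1L, grp = grp)  # 4 3 2
```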
Back transformation
From the cast array,
out <- acast(x, margin, grp),
one can get the original x back by using
back <- asplit(out, ndim(out)) |> bind_array(along = margin).
Note, however, the following about the back-transformed array back:
- back will be ordered by grp along dimension margin;
- if the levels of grp did not have equal frequencies, then dim(back)[margin] > dim(x)[margin], and back will have more missing values than x.
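Both caveats can be seen in a small base R sketch. Two assumptions are made so that it runs without the package: out is a hand-built stand-in for the result of an unbalanced acast() call with fill = TRUE, and base rbind() takes the place of bind_array(along = margin) for margin = 1.

```r
# Stand-in for a cast array: 2 rows, 2 columns, 2 groups.
# Group "a" had only one observation, so its second row is fill (NA).
out <- array(
  c(1, NA, 10, NA,  # group "a": columns id, val
    1, 2, 20, 40),  # group "b": columns id, val
  dim = c(2, 2, 2),
  dimnames = list(NULL, c("id", "val"), c("a", "b"))
)

# Split along the last dimension, then stack the slices along margin 1.
slices <- asplit(out, length(dim(out)))
back <- do.call(rbind, slices)

dim(back)                 # 4 2: one row more than the 3 original observations
sum(is.na(back[, "id"]))  # 1: the fill row of group "a" is carried along
```

The rows of back appear group by group (all of "a", then all of "b"), which is why the original row order of x is not recovered.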