Binding Explained

 

This page explains some details on broadcasting that are specific to the bind_array() function.

 

The bind_array() function comes with the ndim2bc argument (an abbreviation of “maximum number of dimensions to broadcast”), which lets users specify the maximum number of dimensions that may be broadcast while binding arrays. This way, users won’t get unpleasant surprises.

Note that the dimension specified in along is never broadcast: the arrays are concatenated along that dimension, so their extents there are free to differ.

 

Consider the following arrays:


x <- array(1:20, c(4, 5))
y <- array(1:5*10, c(1, 5))
print(x)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    5    9   13   17
#> [2,]    2    6   10   14   18
#> [3,]    3    7   11   15   19
#> [4,]    4    8   12   16   20
print(y)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]   10   20   30   40   50

Binding them together with abind() won’t work:

abind::abind(x, y, along = 2)
Error in abind::abind(x, y, along = 2) : 
  arg 'X2' has dims=1, 5; but need dims=4, X

To bind x and y together along columns, y needs its single row to be recycled (broadcasted) 4 times.
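As a point of comparison, the recycling described above can be done by hand in base R. This is only a sketch of the idea, not how bind_array() works internally; the names y_recycled and manual are made up for illustration.

```r
# Base R sketch: recycle y's single row by hand, then bind along columns.
x <- array(1:20, c(4, 5))
y <- array(1:5 * 10, c(1, 5))

# Repeat row 1 of y once per row of x; drop = FALSE keeps the matrix shape.
y_recycled <- y[rep(1L, nrow(x)), , drop = FALSE]

# Both inputs now have 4 rows, so an ordinary column bind succeeds.
manual <- cbind(x, y_recycled)
dim(manual)  # 4 rows, 10 columns
```

The manual route materialises a full copy of the recycled array before binding, and it has to be rewritten for every combination of dimensions.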

bind_array() does this automatically and efficiently:

bind_array(list(x, y), 2L)
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    5    9   13   17   10   20   30   40    50
#> [2,]    2    6   10   14   18   10   20   30   40    50
#> [3,]    3    7   11   15   19   10   20   30   40    50
#> [4,]    4    8   12   16   20   10   20   30   40    50

 

But what if broadcasting is explicitly not desired? What if one actually wants this to produce an error, like abind()? Fret not, for that’s what the ndim2bc argument is for. Setting it to 0 will disable broadcasting altogether:

bind_array(list(x, y), 2L, ndim2bc = 0)
Error in bind_array(list(x, y), 2L, ndim2bc = 0) : 
  maximum number of dimensions to be broadcasted (1) exceeds `ndim2bc` (0)

 

Let’s replace x with a 3-dimensional array:

x <- array(1:20, c(4, 5, 3))
print(x)
#> , , 1
#> 
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    5    9   13   17
#> [2,]    2    6   10   14   18
#> [3,]    3    7   11   15   19
#> [4,]    4    8   12   16   20
#> 
#> , , 2
#> 
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    5    9   13   17
#> [2,]    2    6   10   14   18
#> [3,]    3    7   11   15   19
#> [4,]    4    8   12   16   20
#> 
#> , , 3
#> 
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    5    9   13   17
#> [2,]    2    6   10   14   18
#> [3,]    3    7   11   15   19
#> [4,]    4    8   12   16   20

Suppose we don’t want to broadcast more than 1 dimension at a time, for fear of accidentally broadcasting too many dimensions (for example, when binding arrays in a loop without knowing a priori how many dimensions the arrays have).

One can then set ndim2bc = 1L to ensure that no more than one dimension is broadcast when binding arrays.

Binding x with y along columns would now require y to be broadcast along both its first dimension (1 row to 4) and its third dimension (1 layer to 3). That is two dimensions, so bind_array() produces an error, protecting the user from unintended broadcasting:

bind_array(list(x, y), 2L, ndim2bc = 1L)
Error in bind_array(list(x, y), 2L, ndim2bc = 1L) : 
  maximum number of dimensions to be broadcasted (2) exceeds `ndim2bc` (1)
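
The counts reported in the two error messages above (1 for the matrix case, 2 for the 3-dimensional case) can be reproduced by hand. The sketch below is a base R illustration under simplifying assumptions: the helper count_bc_dims() is hypothetical and not part of any package, it assumes every input has a dim attribute, and it does not check that mismatched extents are actually 1. It is not a description of how bind_array() is implemented.

```r
# Hypothetical helper: for each array, count the dimensions other than
# `along` whose extent differs from the largest extent among all inputs,
# then return the largest such count.
count_bc_dims <- function(arrays, along) {
  dims <- lapply(arrays, dim)
  n <- max(lengths(dims))
  # Pad shorter dimension vectors with trailing 1s so all have length n.
  padded <- lapply(dims, function(d) c(d, rep(1L, n - length(d))))
  dim_mat <- do.call(rbind, padded)        # one row per array
  target <- apply(dim_mat, 2, max)         # extent each dimension must reach
  needs_bc <- dim_mat != matrix(target, nrow(dim_mat), n, byrow = TRUE)
  needs_bc[, along] <- FALSE               # the binding dimension never counts
  max(rowSums(needs_bc))
}

y <- array(1:5 * 10, c(1, 5))
n_matrix <- count_bc_dims(list(array(1:20, c(4, 5)), y), along = 2L)
n_3d     <- count_bc_dims(list(array(1:20, c(4, 5, 3)), y), along = 2L)
```

Here n_matrix is 1 and n_3d is 2, matching the numbers in the error messages, so a check of this kind could be used to inspect inputs before binding them in a loop.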