Quickstart Guide
1 Prerequisites
First, a basic understanding of R is important.
A very, very basic refresher for R can be found in the Getting Started in R: Tinyverse Edition document.
2 Installation
To install ‘broadcast’ from GitHub, one may run the following code in R:
remotes::install_github("https://github.com/tony-aw/broadcast")
3 Broadcasting
3.1 Introduction
In the context of operations involving 2 (or more) arrays, “broadcasting” refers to recycling array dimensions without allocating additional memory, which is considerably faster and more memory-efficient than R’s regular dimensions replication mechanism.
3.2 Rules
To paraphrase NumPy’s own documentation, one can summarise how broadcasting behaves using 2 rules.
Rule 1: if the input arrays for an operation do not have the same number of dimensions, a 1 will be repeatedly appended to the end of the dimensions of the smaller array, until the arrays have the same number of dimensions.
Rule 2: arrays with a size of 1 for a particular dimension act as if they had the size of the array with the largest size for that dimension. This is done by virtually recycling said dimension without making copies.
After application of these 2 broadcasting rules, the sizes of the input arrays must match.
Please read the Broadcasting explained page for a more complete explanation of what “broadcasting” is and how it works.
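The two rules above can be sketched in a few lines of base R. The helper below is purely illustrative: bc_dims() is a hypothetical name, not a function from ‘broadcast’, and it only predicts the dimensions of a broadcasted result rather than performing any operation.

```r
# bc_dims() is a hypothetical helper for illustration, not a 'broadcast' function.
bc_dims <- function(dx, dy) {
  n <- max(length(dx), length(dy))
  # Rule 1: append 1s to the end of the shorter dimension vector.
  dx <- c(dx, rep(1L, n - length(dx)))
  dy <- c(dy, rep(1L, n - length(dy)))
  # The sizes must now be equal, or one of them must be 1.
  ok <- dx == dy | dx == 1L | dy == 1L
  if (!all(ok)) stop("arrays are not broadcastable")
  # Rule 2: a size of 1 is stretched to the size of the other array.
  ifelse(dx == 1L, dy, dx)
}

dims_a <- bc_dims(c(4L, 5L), c(1L, 5L))      # 4 5
dims_b <- bc_dims(c(4L, 1L), c(1L, 5L, 3L))  # 4 5 3
```

Dimension vectors that differ in a position where neither size is 1, such as c(4, 5) and c(3, 5), make the helper stop with an error.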
3.3 Example
Consider the matrices x and y:
x <- array(1:20, c(4, 5))
y <- array(1:5 * 100, c(1, 5))
print(x)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 5 9 13 17
#> [2,] 2 6 10 14 18
#> [3,] 3 7 11 15 19
#> [4,] 4 8 12 16 20
print(y)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 100 200 300 400 500
Suppose one wishes to compute the element-wise addition of these 2 arrays.
This won’t work in base R:
x + y
Error in x + y : non-conformable arrays
You could do the following…
x + y[rep(1L, 4L),]
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 101 205 309 413 517
#> [2,] 102 206 310 414 518
#> [3,] 103 207 311 415 519
#> [4,] 104 208 312 416 520
… but this becomes an issue when x and/or y become very large: the above operation replicates (copies) y several times, which costs memory and reduces speed, and the code does not scale easily to arrays with different dimensions.
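To make the cost of explicit replication concrete, here is a small base R sketch. The array sizes are arbitrary choices for illustration; the exact byte counts depend on the R build, so none are quoted here.

```r
x <- array(runif(1000 * 500), c(1000, 500))
y <- array(runif(500), c(1, 500))

# Explicit replication materialises a full 1000 x 500 copy of y
# before the addition can even start.
y_rep <- y[rep(1L, nrow(x)), , drop = FALSE]

dim(y_rep)
object.size(y)      # size of the original single row
object.size(y_rep)  # size of the replicated copy
```

The replicated copy is as large as x itself, even though it carries no more information than the single row of y.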
The ‘broadcast’ package performs “broadcasting”, which can do the above, but faster, without unnecessary copies, and scalable to arrays of any size (up to 16 dimensions).
Like so:
bc.num(x, y, "+")
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 101 205 309 413 517
#> [2,] 102 206 310 414 518
#> [3,] 103 207 311 415 519
#> [4,] 104 208 312 416 520
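Rule 2 applies to both inputs at once, so a 4 x 1 array and a 1 x 5 array should broadcast to a 4 x 5 result. The sketch below assumes the ‘broadcast’ package is installed (it is skipped otherwise) and compares the result against base R’s outer(), which computes the same table of sums.

```r
# Assumes the 'broadcast' package is installed; the block is skipped otherwise.
if (requireNamespace("broadcast", quietly = TRUE)) {
  a <- array(1:4, c(4, 1))        # 4 x 1: one column
  b <- array(1:5 * 100, c(1, 5))  # 1 x 5: one row
  res <- broadcast::bc.num(a, b, "+")
  # Base R reference for the same element-wise sums.
  ref <- outer(1:4, 1:5 * 100, "+")
  print(dim(res))
  print(all.equal(res, ref, check.attributes = FALSE))
}
```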
4 Binary operations
‘broadcast’ supports a wide range of binary element-wise broadcasted operations.
This includes arithmetic, relational, logical, string, and bit-wise operations.
The bc.* functions, like bc.num() and bc.b(), provide great control over these operations.
For example:
x <- array(1:20, c(4, 5))
y <- array(1:5 * 100, c(1, 5))
print(x)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 5 9 13 17
#> [2,] 2 6 10 14 18
#> [3,] 3 7 11 15 19
#> [4,] 4 8 12 16 20
print(y)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 100 200 300 400 500
bc.num(x, y, "+")
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 101 205 309 413 517
#> [2,] 102 206 310 414 518
#> [3,] 103 207 311 415 519
#> [4,] 104 208 312 416 520
5 Overloaded Operators
Sometimes broadcasting is needed in a large mathematical expression involving multiple variables, where operator precedence matters, as in an expression like this:
x + y / z
Using the bc.* functions for that, while possible, may be inconvenient. It may be more convenient to use the base operators directly, whilst still keeping the broadcasting property.
To this end, the ‘broadcast’ package provides the broadcaster class, which comes with its own method dispatch for the base operators.
See the following example:
x <- array(1:20, c(4, 5))
y <- array(1:5 * 100, c(1, 5))
z <- array(20:1, c(4, 5))

broadcaster(x) <- TRUE
broadcaster(y) <- TRUE
broadcaster(z) <- TRUE
x + y / z
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 6.000000 17.50000 34.00000 63.00000 142.0000
#> [2,] 7.263158 19.33333 37.27273 71.14286 184.6667
#> [3,] 8.555556 21.28571 41.00000 81.66667 269.0000
#> [4,] 9.882353 23.38462 45.33333 96.00000 520.0000
#> broadcaster
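For comparison, the same expression can be written with nested bc.* calls, which shows why the operator form is easier to read. This is a sketch under two assumptions: that ‘broadcast’ is installed (the nested call is skipped otherwise), and that bc.num() accepts "/" in the same way it accepts "+". The base R reference uses explicit replication of y.

```r
x <- array(1:20, c(4, 5))
y <- array(1:5 * 100, c(1, 5))
z <- array(20:1, c(4, 5))

# Base R reference: replicate the single row of y explicitly.
ref <- x + y[rep(1L, 4L), ] / z

# Assumption: bc.num() supports "/" as an operator, like "+".
if (requireNamespace("broadcast", quietly = TRUE)) {
  # The division must be nested inside the addition to respect precedence.
  nested <- broadcast::bc.num(x, broadcast::bc.num(y, z, "/"), "+")
  print(all.equal(nested, ref, check.attributes = FALSE))
}
```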
6 Array binding
The battle-tested abind() function is often used to bind arrays along any arbitrary dimension.
‘broadcast’ provides an alternative to this, namely the bind_array() function, which allows for broadcasting (obviously), and is also notably faster and more memory efficient than abind().
Consider the following arrays:
x <- array(1:20, c(4, 5))
y <- array(1:5*10, c(1, 5))
print(x)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 5 9 13 17
#> [2,] 2 6 10 14 18
#> [3,] 3 7 11 15 19
#> [4,] 4 8 12 16 20
print(y)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 10 20 30 40 50
Binding them together with abind() won’t work:
abind::abind(x, y, along = 2)
Error in abind::abind(x, y, along = 2) :
  arg 'X2' has dims=1, 5; but need dims=4, X
To bind x and y together along columns, y needs its single row to be recycled (broadcasted) 4 times.
This can be done in a highly efficient way using bind_array(), like so:
bind_array(list(x, y), 2L)
#> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,] 1 5 9 13 17 10 20 30 40 50
#> [2,] 2 6 10 14 18 10 20 30 40 50
#> [3,] 3 7 11 15 19 10 20 30 40 50
#> [4,] 4 8 12 16 20 10 20 30 40 50
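For reference, the same 4 x 10 result can be built in base R by replicating y by hand before binding. This sketch only illustrates what the broadcasted bind computes; unlike bind_array(), it creates the replicated copy explicitly.

```r
x <- array(1:20, c(4, 5))
y <- array(1:5 * 10, c(1, 5))

# Recycle the single row of y to match the 4 rows of x, then bind columns.
y_rep <- y[rep(1L, nrow(x)), , drop = FALSE]
res <- cbind(x, y_rep)

dim(res)  # 4 10
```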
7 Broadcasted General functions
‘broadcast’ provides the bcapply() function, which is a broadcasted apply-like function that applies a function between 2 arrays with broadcasting.
‘broadcast’ also provides the bc_ifelse() function, which is a broadcasted version of ifelse().
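To illustrate what a broadcasted ifelse() means, the following base R sketch emulates it by expanding the inputs by hand. It does not call bc_ifelse() itself, whose exact arguments are not shown in this guide; the variable names are illustrative.

```r
test <- array(1:20, c(4, 5)) > 10  # 4 x 5 logical condition
yes  <- array(1:5 * 100, c(1, 5))  # 1 x 5: one value per column
no   <- array(-(1:4), c(4, 1))     # 4 x 1: one value per row

# Expand both alternatives to the full 4 x 5 shape, then select element-wise.
yes_full <- yes[rep(1L, 4L), , drop = FALSE]
no_full  <- no[, rep(1L, 5L), drop = FALSE]
res <- ifelse(test, yes_full, no_full)
```

A broadcasted version would be expected to give the same values without first materialising yes_full and no_full.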
8 Other notable functions
The acast() function casts an array into a new dimension.
Roughly speaking, it is somewhat analogous to data.table::dcast(), except that acast() works on arrays (instead of data.tables) and casts into an entirely new dimension (instead of into more columns).
‘broadcast’ offers type-casting functions. Unlike base R’s type-casting functions (as.logical(), as.integer(), etc.), the type-casting functions from ‘broadcast’ preserve names and dimensions.
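The base R behaviour being contrasted here can be seen in a short sketch. It uses only base R: as.integer() drops the dimensions and names, whereas assigning storage.mode() is one base R way to change the type while keeping them. The type-casting functions from ‘broadcast’ are not called here.

```r
m <- matrix(c(1, 2, 3, 4), nrow = 2,
            dimnames = list(c("a", "b"), c("u", "v")))

# as.integer() strips all attributes, so the matrix becomes a plain vector.
flat <- as.integer(m)
dim(flat)   # NULL

# Changing the storage mode converts the type but keeps dim and dimnames.
kept <- m
storage.mode(kept) <- "integer"
dim(kept)   # 2 2
```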