x <- array(1:20, c(4, 5))
y <- array(1:5 * 100, c(1, 5))
print(x)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 1 5 9 13 17
#> [2,] 2 6 10 14 18
#> [3,] 3 7 11 15 19
#> [4,] 4 8 12 16 20
print(y)
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 100 200 300 400 500
‘Numpy’-like Broadcasted Operations for Atomic and Recursive Arrays with Minimal Dependencies in ‘R’
Introduction
Overview
‘broadcast’ is a relatively small package that, as the name suggests, performs “broadcasting” (similar to broadcasting in the ‘Numpy’ module for ‘Python’).
In the context of operations involving 2 (or more) arrays, “broadcasting” refers to recycling array dimensions without allocating additional memory, which is considerably faster and more memory-efficient than R’s regular dimension replication mechanism.
Please read the Broadcasting explained page for a more complete explanation of what “broadcasting” is.
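As a rough illustration of the recycling rule, the sketch below computes the dimensions a broadcasted operation would return for two inputs. It assumes the ‘Numpy’-style rule that each pair of dimension sizes must either match or contain a 1, and it only handles inputs with the same number of dimensions. bc_dims() is a hypothetical helper written for this illustration; it is not part of ‘broadcast’.

```r
# Hypothetical helper (NOT part of 'broadcast'): dimensions of a broadcasted result.
# Assumption: both inputs have the same number of dimensions.
bc_dims <- function(dx, dy) {
  stopifnot(length(dx) == length(dy))
  # Each pair of sizes must be equal, or one of the two must be 1.
  ok <- dx == dy | dx == 1L | dy == 1L
  if (!all(ok)) stop("arrays are not broadcastable")
  # A size-1 dimension is recycled up to the size of its counterpart.
  pmax(dx, dy)
}

bc_dims(c(4L, 5L), c(1L, 5L))  # a 4x5 and a 1x5 array give a 4x5 result
bc_dims(c(4L, 1L), c(1L, 5L))  # a 4x1 and a 1x5 array also give a 4x5 result
```

A pair such as c(4L, 5L) and c(3L, 5L) raises an error, because 4 and 3 neither match nor contain a 1.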
At its core, the ‘broadcast’ package provides 3 functionalities, all 3 related to “broadcasting”:
- Functions for broadcasted element-wise operations between any 2 arrays. They support a large set of relational-, arithmetic-, Boolean-, and string operations.
- The bind_array() function for binding arrays along any arbitrary dimension. Similar to the fantastic abind::abind() function, but with a few key differences:
  - bind_array() is faster and more memory efficient;
  - bind_array() supports broadcasting;
  - bind_array() supports both atomic and recursive arrays (abind() only supports atomic arrays).
- ‘broadcast’ provides several generic functions for broadcasting, namely bcapply() (broadcasted apply-like function) and bc_ifelse() (broadcasted version of ifelse()).
Additionally, ‘broadcast’ includes the acast() function, for casting/pivoting an array into a new dimension. Roughly analogous to data.table::dcast(), but for arrays.
Why use ‘broadcast’
Efficiency
Broadcasting dimensions is faster and more memory efficient than replicating dimensions.
Efficient programs use less energy and fewer resources, and are thus better for the environment.
Benchmarks can be found on the website.
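To make the memory argument concrete, the following base R sketch (it does not use ‘broadcast’) compares the size of a 1-row matrix with the row-replicated copy that base R needs before an element-wise operation will work. The variable names and the choice of 1000 rows are arbitrary and only serve the illustration.

```r
# Base R only: cost of replicating a dimension instead of broadcasting it.
n <- 1000L
y <- array(as.double(1:5 * 100), c(1L, 5L))

# Explicit replication: every row of y_rep is a physical copy of y's single row.
y_rep <- y[rep(1L, n), , drop = FALSE]

dim(y_rep)          # n rows, 5 columns
object.size(y)      # size of the original 1-row matrix
object.size(y_rep)  # size of the replicated copy, which grows with n
```

The replicated copy exists only to make the dimensions match; broadcasting is meant to avoid allocating it at all.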
Convenience
Have you ever been bothered by any of the following while programming in R:
- Receiving the “non-conformable arrays” error message in a simple array operation, when it intuitively should work?
- Receiving the “cannot allocate vector of size…” error message because R unnecessarily allocated too much memory in array operations?
- abind::abind() being too slow, or ruining the structure of recursive arrays?
- that there is no built-in way to cast or pivot arrays?
- that certain ‘Numpy’ operations have no equivalent operation in R?
If you answered “YES” to any of the above, ‘broadcast’ may be the R-package for you.
Minimal Dependencies
Besides linking to ‘Rcpp’, ‘broadcast’ does not depend on, vendor, link to, include, or otherwise use any external libraries; ‘broadcast’ was essentially made from scratch and can be installed out-of-the-box.
Not using external libraries brings a number of advantages:
- Avoid dependency hell: Every dependency that is added to a software package increases the likelihood of something breaking (AKA “dependency hell”). ‘broadcast’ avoids this.
- Avoid wasting resources for translations: Using libraries from other languages, such as ‘xtensor’ (‘C++’) or ‘Numpy’ (‘Python’), means that - at some point - one needs to convert between the structure of R and that of the other language, and vice-versa, which wastes precious time, memory, and power. ‘broadcast’ requires no such translations of structures, and is therefore much less wasteful.
- Ensure consistent behaviour: Using libraries from other languages also means one cannot always guarantee consistent behaviour for some operations. For example: both ‘Numpy’ and ‘xtensor’ have only limited support for missing values, whereas R supports missing values for both atomic and recursive array/vector types (except type of ‘Raw’). Since ‘broadcast’ does not rely on external libraries, it can ensure behaviour that is consistent with the rest of R.
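For readers unfamiliar with the terms used above, this base R sketch (it does not use ‘broadcast’) shows a missing value in an atomic array and in a recursive array, i.e. a list with dimensions. The object names a and r are arbitrary.

```r
# Base R only: missing values in atomic and recursive arrays.
a <- array(c(1, NA, 3, 4), c(2L, 2L))           # atomic (double) array with one NA
r <- array(list(1L, NA, "a", NULL), c(2L, 2L))  # recursive array: each cell holds any object

a + 1             # the NA propagates through arithmetic
is.na(a)          # TRUE only at position [2, 1]
is.list(r)        # TRUE: a recursive array is a list with a dim attribute
is.na(r[[2, 1]])  # TRUE: the cell at [2, 1] holds NA
```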
Tested
The ‘broadcast’ package is frequently checked using a large suite of unit tests via the tinytest package. These tests have a coverage of over 90%, so the chance of a function from this package breaking completely is relatively low.
‘broadcast’ is still a relatively new package, however, so (small) bugs are still very much possible. I encourage users who find bugs to report them promptly to the issues tab on the GitHub page, and I will fix them as soon as time permits.
Quick Example
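Consider the matrices x and y printed at the top of this page: x is a 4-by-5 matrix holding 1 to 20, and y is a 1-by-5 matrix holding 100, 200, 300, 400, 500.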
Suppose one wishes to compute the element-wise addition of these 2 arrays.
This won’t work in base R:
x + y
Error in x + y : non-conformable arrays
You could do the following…
x + y[rep(1L, 4L),]
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 101 205 309 413 517
#> [2,] 102 206 310 414 518
#> [3,] 103 207 311 415 519
#> [4,] 104 208 312 416 520
… but this becomes an issue when x and/or y become very large, as the above operation involves replicating/copying y several times, which costs memory and reduces speed. The code is also not easily scalable to arrays with different dimensions.
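To see why the replication approach does not scale well, consider what a general version of it has to do. The base R sketch below expands every size-1 dimension by explicit copying, for any number of dimensions. expand_dims() is a hypothetical helper written for this illustration (it is not part of ‘broadcast’), and it assumes both arrays have the same number of dimensions.

```r
# Hypothetical helper (NOT part of 'broadcast'): expand size-1 dimensions of `a`
# to the `target` dimensions by explicit copying.
expand_dims <- function(a, target) {
  d <- dim(a)
  stopifnot(length(d) == length(target), all(d == target | d == 1L))
  # Per dimension: repeat index 1 if the dimension has size 1, else keep all indices.
  idx <- lapply(seq_along(d), function(i) {
    if (d[i] == 1L) rep(1L, target[i]) else seq_len(d[i])
  })
  # Equivalent to a[idx[[1]], idx[[2]], ..., drop = FALSE].
  do.call(`[`, c(list(a), idx, list(drop = FALSE)))
}

x <- array(1:20, c(4, 5))
y <- array(1:5 * 100, c(1, 5))
res <- x + expand_dims(y, dim(x))  # same values as x + y[rep(1L, 4L), ]
```

Even with such a helper, a full copy of the expanded array is still allocated before the addition runs.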
The ‘broadcast’ package performs “broadcasting”, which can do the above, but faster, without unnecessary copies, and scalable to arrays of any size (up to 16 dimensions).
Like so:
bc.num(x, y, "+")
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 101 205 309 413 517
#> [2,] 102 206 310 414 518
#> [3,] 103 207 311 415 519
#> [4,] 104 208 312 416 520
or like so:
broadcaster(x) <- TRUE
broadcaster(y) <- TRUE
x + y
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] 101 205 309 413 517
#> [2,] 102 206 310 414 518
#> [3,] 103 207 311 415 519
#> [4,] 104 208 312 416 520
#> broadcaster
Status
‘broadcast’ is fully functional, but still experimental.
If you have any suggestions or feedback on the package, its documentation, or even the benchmarks, I encourage you to let me know (either as an Issue or a Discussion).
I’m eager to read your input!
Documentation
The documentation on the ‘broadcast’ website is divided into 3 main parts:
- Guides and Vignettes: contains the topic-oriented guides in the form of a few Vignettes.
- Reference Manual: contains the function-oriented reference manual.
- About: contains the Acknowledgements, Change logs and License file. Here you’ll also find some information regarding the relationship between ‘broadcast’ and other packages/modules.