Don't Repeat Yourself • tinycodet

library(tinycodet)
#> Run `?tinycodet::tinycodet` to open the introduction help page of 'tinycodet'.

The transform_if function

“Don’t Repeat Yourself”, sometimes abbreviated as “DRY”, is the coding principle that you should try to reduce repeated patterns in your code (within reason).

Consider the following code:

object <- matrix(c(-9:8, NA, NA) , ncol=2)
y <- 0
z <- 1000
ifelse(
  is.na(object>y), -z,
  ifelse(
    object>y,  log(object), object^2
  )
)
#> Warning in log(object): NaNs produced
#>       [,1]          [,2]
#>  [1,]   81     0.0000000
#>  [2,]   64     0.6931472
#>  [3,]   49     1.0986123
#>  [4,]   36     1.3862944
#>  [5,]   25     1.6094379
#>  [6,]   16     1.7917595
#>  [7,]    9     1.9459101
#>  [8,]    4     2.0794415
#>  [9,]    1 -1000.0000000
#> [10,]    0 -1000.0000000

Here a conditional subset of object is transformed, and the condition itself is computed from object. Consequently, object is written 4 times! This quickly becomes cumbersome. Notice also that the above code gives an unnecessary warning: ifelse() evaluates log(object) on the entire object, including the negative values, before picking out the elements it actually needs.

The tinycodet package therefore adds the transform_if(x, cond, yes, no, other) function, which will “dry” this up. Here, argument cond must be given a function that returns a logical vector. For every value where cond(x) is TRUE, function yes(x) is run; for every value where cond(x) is FALSE, function no(x) is run; and for every value where cond(x) is NA, function other(x) is run. Because a function-based approach is used instead of directly supplying vectors, unnecessary warnings and annoying errors are avoided (unlike the above code).

The above code can now be rewritten in a more compact manner that is less prone to warnings and errors:

object |> transform_if(\(x)x>y, log, \(x)x^2, \(x) -z)
#>       [,1]          [,2]
#>  [1,]   81     0.0000000
#>  [2,]   64     0.6931472
#>  [3,]   49     1.0986123
#>  [4,]   36     1.3862944
#>  [5,]   25     1.6094379
#>  [6,]   16     1.7917595
#>  [7,]    9     1.9459101
#>  [8,]    4     2.0794415
#>  [9,]    1 -1000.0000000
#> [10,]    0 -1000.0000000

Instead of supplying a function for cond, one can also directly supply a logical vector to argument cond. Moreover, when the transformed value is an atomic scalar, you don’t really need a function; you can just fill in the scalar (vectors are not allowed though, as that would lead to the same unnecessary warnings or even annoying errors that occur with ifelse()).

One can therefore also rewrite the original code (without warnings or errors, and more compactly) as:

object |> transform_if(object > y, log, \(x)x^2, -z)
#>       [,1]          [,2]
#>  [1,]   81     0.0000000
#>  [2,]   64     0.6931472
#>  [3,]   49     1.0986123
#>  [4,]   36     1.3862944
#>  [5,]   25     1.6094379
#>  [6,]   16     1.7917595
#>  [7,]    9     1.9459101
#>  [8,]    4     2.0794415
#>  [9,]    1 -1000.0000000
#> [10,]    0 -1000.0000000
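
To make the behaviour described in this section concrete, here is a minimal base-R sketch of the same idea. The helper name transform_if_sketch is hypothetical and this is not the tinycodet implementation; it only assumes the semantics stated above (a function or logical vector for cond, and a function or scalar for yes, no and other). Because each function only sees the elements it applies to, log() is never called on negative values and no warning is produced.

```r
# Hypothetical helper illustrating the described semantics in base R.
# NOT the actual tinycodet implementation.
transform_if_sketch <- function(x, cond, yes, no, other) {
  # cond may be a function of x or a ready-made logical vector
  test <- if (is.function(cond)) cond(x) else cond
  stopifnot(is.logical(test), length(test) == length(x))

  # apply a function to the selected values, or use a scalar as-is
  apply_or_fill <- function(f, vals) if (is.function(f)) f(vals) else f

  out <- x                        # keeps dimensions and other attributes
  is_yes   <- which(test)         # which() ignores NA
  is_no    <- which(!test)
  is_other <- which(is.na(test))

  out[is_yes]   <- apply_or_fill(yes,   x[is_yes])
  out[is_no]    <- apply_or_fill(no,    x[is_no])
  out[is_other] <- apply_or_fill(other, x[is_other])
  out
}

object <- matrix(c(-9:8, NA, NA), ncol = 2)
y <- 0
z <- 1000
res <- transform_if_sketch(object, \(x) x > y, log, \(x) x^2, -z)
print(res)
```

The sketch repeats object only once at the call site, which is the point of the function-based design.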

Atomic type casting with names and dimensions preserved

Atomic type casting in R is generally performed using the functions as.logical(), as.integer(), as.double(), as.character(), as.complex(), and as.raw().

These functions have the annoying property that they strip attributes. If you wish to convert a variable x whilst keeping the names and dimensions, you must first save the attributes of x before conversion, convert x, and then re-assign the attributes. ‘tinycodet’ adds functions that can do this for you, saving repetitive code:

as_bool(): same as as.logical(), but with names & dimensions preserved.
as_int(): same as as.integer(), but with names & dimensions preserved.
as_dbl(): same as as.double() (i.e. convert to real numbers), but with names & dimensions preserved.
as_chr(): same as as.character(), but with names & dimensions preserved.
as_cplx(): same as as.complex(), but with names & dimensions preserved.
as_raw(): same as as.raw(), but with names & dimensions preserved.
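
For comparison, the manual save, convert and restore routine that these functions spare you might look roughly like this in base R. This is only an illustrative sketch with made-up data; it does not show how tinycodet implements the functions internally.

```r
# A small named matrix of doubles (illustrative data)
x <- matrix(c(1.7, -2.2, 0, 3.9), ncol = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

# as.integer() on its own returns a plain vector: dim and dimnames are gone
plain <- as.integer(x)
print(attributes(plain))

# Manual approach: save the attributes, convert, then restore them
saved_dim      <- dim(x)
saved_dimnames <- dimnames(x)
saved_names    <- names(x)

out <- as.integer(x)
dim(out)      <- saved_dim
dimnames(out) <- saved_dimnames
if (!is.null(saved_names)) names(out) <- saved_names  # only if x had names

print(out)
```

Three lines of bookkeeping surround a single conversion, and they would have to be repeated for every conversion.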

Examples:

x <- matrix(rnorm(30), ncol = 5)
colnames(x) <- month.name[1:5]
rownames(x) <- month.abb[1:6]
names(x) <- c(letters[1:20], LETTERS[1:10])
print(x)
#>          January   February       March      April         May
#> Jan -1.400043517 -1.8218177  2.06502490  0.5429963  1.88850493
#> Feb  0.255317055 -0.2473253 -1.63098940 -0.9140748 -0.09744510
#> Mar -2.437263611 -0.2441996  0.51242695  0.4681544 -0.93584735
#> Apr -0.005571287 -0.2827054 -1.86301149  0.3629513 -0.01595031
#> May  0.621552721 -0.5536994 -0.52201251 -1.3045435 -0.82678895
#> Jun  1.148411606  0.6289820 -0.05260191  0.7377763 -1.51239965
#> attr(,"names")
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"

as_bool(x)
#>     January February March April  May
#> Jan    TRUE     TRUE  TRUE  TRUE TRUE
#> Feb    TRUE     TRUE  TRUE  TRUE TRUE
#> Mar    TRUE     TRUE  TRUE  TRUE TRUE
#> Apr    TRUE     TRUE  TRUE  TRUE TRUE
#> May    TRUE     TRUE  TRUE  TRUE TRUE
#> Jun    TRUE     TRUE  TRUE  TRUE TRUE
#> attr(,"names")
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_int(x)
#>     January February March April May
#> Jan      -1       -1     2     0   1
#> Feb       0        0    -1     0   0
#> Mar      -2        0     0     0   0
#> Apr       0        0    -1     0   0
#> May       0        0     0    -1   0
#> Jun       1        0     0     0  -1
#> attr(,"names")
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_dbl(x)
#>          January   February       March      April         May
#> Jan -1.400043517 -1.8218177  2.06502490  0.5429963  1.88850493
#> Feb  0.255317055 -0.2473253 -1.63098940 -0.9140748 -0.09744510
#> Mar -2.437263611 -0.2441996  0.51242695  0.4681544 -0.93584735
#> Apr -0.005571287 -0.2827054 -1.86301149  0.3629513 -0.01595031
#> May  0.621552721 -0.5536994 -0.52201251 -1.3045435 -0.82678895
#> Jun  1.148411606  0.6289820 -0.05260191  0.7377763 -1.51239965
#> attr(,"names")
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_chr(x)
#>     January                February             March                
#> Jan "-1.40004351672175"    "-1.82181766097663"  "2.06502489535922"   
#> Feb "0.25531705484526"     "-0.247325302073524" "-1.63098940208223"  
#> Mar "-2.43726361121953"    "-0.244199606778383" "0.512426949851805"  
#> Apr "-0.00557128674616073" "-0.282705448814465" "-1.86301149206833"  
#> May "0.621552721415214"    "-0.553699383688721" "-0.522012514745454" 
#> Jun "1.14841160602606"     "0.628982042036008"  "-0.0526019099538795"
#>     April                May                  
#> Jan "0.54299634266114"   "1.88850492923455"   
#> Feb "-0.914074827259928" "-0.0974451044082059"
#> Mar "0.468154420450533"  "-0.935847353500678" 
#> Apr "0.362951255864986"  "-0.0159503112505377"
#> May "-1.30454354503478"  "-0.826788953733443" 
#> Jun "0.737776321255047"  "-1.5123996512628"   
#> attr(,"names")
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_cplx(x)
#>             January      February          March         April            May
#> Jan -1.400043517+0i -1.8218177+0i  2.06502490+0i  0.5429963+0i  1.88850493+0i
#> Feb  0.255317055+0i -0.2473253+0i -1.63098940+0i -0.9140748+0i -0.09744510+0i
#> Mar -2.437263611+0i -0.2441996+0i  0.51242695+0i  0.4681544+0i -0.93584735+0i
#> Apr -0.005571287+0i -0.2827054+0i -1.86301149+0i  0.3629513+0i -0.01595031+0i
#> May  0.621552721+0i -0.5536994+0i -0.52201251+0i -1.3045435+0i -0.82678895+0i
#> Jun  1.148411606+0i  0.6289820+0i -0.05260191+0i  0.7377763+0i -1.51239965+0i
#> attr(,"names")
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_raw(x)
#> Warning in as_raw(x): out-of-range values treated as 0 in coercion to raw
#>     January February March April May
#> Jan      00       00    02    00  01
#> Feb      00       00    00    00  00
#> Mar      00       00    00    00  00
#> Apr      00       00    00    00  00
#> May      00       00    00    00  00
#> Jun      01       00    00    00  00
#> attr(,"names")
#>  [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"

Subset if and unreal replacement

The tinycodet package adds 2 “subset_if” operators:

The x %[if]% cond operator selects the elements of vector/matrix/array x for which cond(x) returns TRUE.
The x %[!if]% cond operator selects the elements of vector/matrix/array x for which cond(x) returns FALSE.

For example:

object_with_very_long_name <- matrix(-10:9, ncol=2)
print(object_with_very_long_name)
#>       [,1] [,2]
#>  [1,]  -10    0
#>  [2,]   -9    1
#>  [3,]   -8    2
#>  [4,]   -7    3
#>  [5,]   -6    4
#>  [6,]   -5    5
#>  [7,]   -4    6
#>  [8,]   -3    7
#>  [9,]   -2    8
#> [10,]   -1    9
object_with_very_long_name %[if]% \(x)x %in% 1:10
#> [1] 1 2 3 4 5 6 7 8 9
object_with_very_long_name %[!if]% \(x)x %in% 1:10
#>  [1] -10  -9  -8  -7  -6  -5  -4  -3  -2  -1   0
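
Assuming the operators behave as described above, the base-R equivalents of these two calls would be the following. Note how the object name has to be written twice in each line, which is the repetition the operators remove. The short names x and cond are used here only for illustration.

```r
x <- matrix(-10:9, ncol = 2)
cond <- \(v) v %in% 1:10

# Elements for which cond(x) is TRUE
selected <- x[cond(x)]
print(selected)

# Elements for which cond(x) is FALSE
rejected <- x[!cond(x)]
print(rejected)
```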

Another operator added by tinycodet is x %unreal =% y, which replaces all NA, NaN, Inf and -Inf in x with the value given in y.

So x %unreal =% y is the same as x[is.na(x)|is.nan(x)|is.infinite(x)] <- y.
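
A small base-R run of the right-hand expression above, on made-up values, shows which elements are affected:

```r
x <- c(1.5, NA, NaN, Inf, -Inf, 4)
y <- 0

# Replace every NA, NaN, Inf and -Inf with y.
# (is.na() is already TRUE for NaN; is.nan() is kept to mirror the text.)
x[is.na(x) | is.nan(x) | is.infinite(x)] <- y
print(x)
```

Only the two ordinary numbers, 1.5 and 4, are left untouched.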

General in-place modifier

This R package includes a general in-place modifying infix operator.

Consider the following line of code:

mtcars$mpg[mtcars$cyl>6] <- mtcars$mpg[mtcars$cyl>6]^2

The same expression, mtcars$mpg[mtcars$cyl>6], is written twice, making this code rather long and cumbersome, even though we’re just squaring the expression.

This R package addresses this problem by implementing a general in-place (mathematical) modifier, through the x %:=% f operator.

With tinycodet one can now make this more compact (more “tiny”, if you will) as follows:

mtcars$mpg[mtcars$cyl>6] %:=% \(x)x^2
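
To give an idea of how such an operator can work, here is a minimal base-R sketch. The operator name %modify% and the small data frame are hypothetical, and the code is an assumption about one possible approach, not necessarily how tinycodet implements %:=%. It captures the left-hand expression unevaluated, applies the function to its current value, and then performs the assignment in the caller's environment.

```r
# Hypothetical operator, for illustration only.
`%modify%` <- function(x, f) {
  target <- substitute(x)   # the unevaluated left-hand expression
  new_value <- f(x)         # apply f to the current value of that expression
  # Build the call `target <- new_value` and evaluate it where the
  # operator was called, so the caller's object is the one that changes.
  eval(call("<-", target, new_value), envir = parent.frame())
  invisible(NULL)
}

df <- data.frame(mpg = c(10, 20, 30), cyl = c(4, 8, 8))
df$mpg[df$cyl > 6] %modify% \(x) x^2
print(df)
```

The subsetting expression is written once, and the transformation is passed as a function, just as in the %:=% call above.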