library(tinycodet)
#> Run `?tinycodet::tinycodet` to open the introduction help page of 'tinycodet'.
The transform_if function
“Don’t Repeat Yourself”, sometimes abbreviated as “DRY”, is the coding principle that you should try to reduce repeating patterns in your code (within reason).
Consider the following code:
object <- matrix(c(-9:8, NA, NA) , ncol=2)
y <- 0
z <- 1000
ifelse(
is.na(object>y), -z,
ifelse(
object>y, log(object), object^2
)
)
#> Warning in log(object): NaNs produced
#> [,1] [,2]
#> [1,] 81 0.0000000
#> [2,] 64 0.6931472
#> [3,] 49 1.0986123
#> [4,] 36 1.3862944
#> [5,] 25 1.6094379
#> [6,] 16 1.7917595
#> [7,] 9 1.9459101
#> [8,] 4 2.0794415
#> [9,] 1 -1000.0000000
#> [10,] 0 -1000.0000000
Here a conditional subset of object is transformed, where the condition is itself computed from object. Consequently, object is written 4 times! This can become cumbersome quickly. Notice also that the above code gives an unnecessary warning, because ifelse() evaluates log(object) over the entire object, including the negative values for which log() returns NaN.
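To see where the warning comes from, here is a minimal base-R sketch on a small vector. The names v, res and out are chosen for illustration only and do not come from the package.

```r
v <- c(-1, 1, 4)

# ifelse() receives log(v) as a complete vector, so log(-1) is computed
# (and warns "NaNs produced") even though that element is never selected.
res <- ifelse(v > 0, log(v), v^2)

# Subsetting first means log() only ever sees the positive values,
# so no warning is raised.
pos <- v > 0
out <- v
out[pos]  <- log(v[pos])
out[!pos] <- v[!pos]^2

all.equal(res, out)
```

Both approaches give the same values; only the second avoids evaluating log() on elements that are discarded anyway.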
The tinycodet package therefore adds the transform_if(x, cond, yes, no, other) function, which will “dry” this up. Here, argument cond must be given a function that returns a logical vector. For every value where cond(x) is TRUE, function yes(x) is run; for every value where cond(x) is FALSE, function no(x) is run; and for every value where cond(x) is NA, function other(x) is run. Because a function-based approach is used instead of directly supplying vectors, unnecessary warnings and annoying errors are avoided (unlike the above code).
The above code can now be re-written in a less warning/error prone and more compact manner as:
object |> transform_if(\(x)x>y, log, \(x)x^2, \(x) -z)
#> [,1] [,2]
#> [1,] 81 0.0000000
#> [2,] 64 0.6931472
#> [3,] 49 1.0986123
#> [4,] 36 1.3862944
#> [5,] 25 1.6094379
#> [6,] 16 1.7917595
#> [7,] 9 1.9459101
#> [8,] 4 2.0794415
#> [9,] 1 -1000.0000000
#> [10,] 0 -1000.0000000
Instead of supplying a function for cond, one can also directly supply a logical vector to argument cond.
Moreover, when the transformed value is an atomic scalar, you don’t really need a function; you can just fill in the scalar (vectors are not allowed though, as that will lead to the same unnecessary warnings or even annoying errors as occur with ifelse()).
One can thus also re-write the original code (without warnings/errors and more compactly) as:
object |> transform_if(object > y, log, \(x)x^2, -z)
#> [,1] [,2]
#> [1,] 81 0.0000000
#> [2,] 64 0.6931472
#> [3,] 49 1.0986123
#> [4,] 36 1.3862944
#> [5,] 25 1.6094379
#> [6,] 16 1.7917595
#> [7,] 9 1.9459101
#> [8,] 4 2.0794415
#> [9,] 1 -1000.0000000
#> [10,] 0 -1000.0000000
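The behaviour described in this section can be approximated in base R. The sketch below is not the implementation used by tinycodet; transform_if_sketch and its inner helper as_fun are hypothetical names, and the sketch only assumes the semantics stated above (a function or logical vector for cond, and a function or scalar for yes, no and other).

```r
# Hypothetical helper, NOT part of tinycodet: a base-R approximation
# of the behaviour described in the text.
transform_if_sketch <- function(x, cond, yes, no, other) {
  # cond may be a function of x or a ready-made logical vector
  test <- if (is.function(cond)) cond(x) else cond

  # Wrap a scalar in a function that repeats it for each selected element
  as_fun <- function(f) {
    if (is.function(f)) f else function(v) rep(f, length(v))
  }

  idx_true  <- which(test)         # which() drops NA
  idx_false <- which(!test)
  idx_na    <- which(is.na(test))

  out <- x                         # keeps dim and other attributes
  out[idx_true]  <- as_fun(yes)(x[idx_true])
  out[idx_false] <- as_fun(no)(x[idx_false])
  out[idx_na]    <- as_fun(other)(x[idx_na])
  out
}

object <- matrix(c(-9:8, NA, NA), ncol = 2)
result <- transform_if_sketch(object, function(x) x > 0, log, function(x) x^2, -1000)
result
```

Because each function only receives the elements it applies to, log() never sees a negative number and no warning is produced.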
Atomic type casting with names and dimensions preserved
Atomic type casting in R is generally performed using the functions as.logical(), as.integer(), as.double(), as.character().
These functions have the annoying property that they strip attributes. If you wish to convert a variable x whilst keeping the names and dimensions, you must first save the attributes of x before conversion, convert x, and then re-assign the attributes. ‘tinycodet’ adds functions that can do this for you, saving repetitive code:
- as_bool(): same as as.logical(), but with names & dimensions preserved.
- as_int(): same as as.integer(), but with names & dimensions preserved.
- as_dbl(): same as as.double() (i.e. convert to real numbers), but with names & dimensions preserved.
- as_chr(): same as as.character(), but with names & dimensions preserved.
- as_cplx(): same as as.complex(), but with names & dimensions preserved.
- as_raw(): same as as.raw(), but with names & dimensions preserved.
Examples:
x <- matrix(rnorm(30), ncol = 5)
colnames(x) <- month.name[1:5]
rownames(x) <- month.abb[1:6]
names(x) <- c(letters[1:20], LETTERS[1:10])
print(x)
#> January February March April May
#> Jan -1.400043517 -1.8218177 2.06502490 0.5429963 1.88850493
#> Feb 0.255317055 -0.2473253 -1.63098940 -0.9140748 -0.09744510
#> Mar -2.437263611 -0.2441996 0.51242695 0.4681544 -0.93584735
#> Apr -0.005571287 -0.2827054 -1.86301149 0.3629513 -0.01595031
#> May 0.621552721 -0.5536994 -0.52201251 -1.3045435 -0.82678895
#> Jun 1.148411606 0.6289820 -0.05260191 0.7377763 -1.51239965
#> attr(,"names")
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_bool(x)
#> January February March April May
#> Jan TRUE TRUE TRUE TRUE TRUE
#> Feb TRUE TRUE TRUE TRUE TRUE
#> Mar TRUE TRUE TRUE TRUE TRUE
#> Apr TRUE TRUE TRUE TRUE TRUE
#> May TRUE TRUE TRUE TRUE TRUE
#> Jun TRUE TRUE TRUE TRUE TRUE
#> attr(,"names")
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_int(x)
#> January February March April May
#> Jan -1 -1 2 0 1
#> Feb 0 0 -1 0 0
#> Mar -2 0 0 0 0
#> Apr 0 0 -1 0 0
#> May 0 0 0 -1 0
#> Jun 1 0 0 0 -1
#> attr(,"names")
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_dbl(x)
#> January February March April May
#> Jan -1.400043517 -1.8218177 2.06502490 0.5429963 1.88850493
#> Feb 0.255317055 -0.2473253 -1.63098940 -0.9140748 -0.09744510
#> Mar -2.437263611 -0.2441996 0.51242695 0.4681544 -0.93584735
#> Apr -0.005571287 -0.2827054 -1.86301149 0.3629513 -0.01595031
#> May 0.621552721 -0.5536994 -0.52201251 -1.3045435 -0.82678895
#> Jun 1.148411606 0.6289820 -0.05260191 0.7377763 -1.51239965
#> attr(,"names")
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_chr(x)
#> January February March
#> Jan "-1.40004351672175" "-1.82181766097663" "2.06502489535922"
#> Feb "0.25531705484526" "-0.247325302073524" "-1.63098940208223"
#> Mar "-2.43726361121953" "-0.244199606778383" "0.512426949851805"
#> Apr "-0.00557128674616073" "-0.282705448814465" "-1.86301149206833"
#> May "0.621552721415214" "-0.553699383688721" "-0.522012514745454"
#> Jun "1.14841160602606" "0.628982042036008" "-0.0526019099538795"
#> April May
#> Jan "0.54299634266114" "1.88850492923455"
#> Feb "-0.914074827259928" "-0.0974451044082059"
#> Mar "0.468154420450533" "-0.935847353500678"
#> Apr "0.362951255864986" "-0.0159503112505377"
#> May "-1.30454354503478" "-0.826788953733443"
#> Jun "0.737776321255047" "-1.5123996512628"
#> attr(,"names")
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_cplx(x)
#> January February March April May
#> Jan -1.400043517+0i -1.8218177+0i 2.06502490+0i 0.5429963+0i 1.88850493+0i
#> Feb 0.255317055+0i -0.2473253+0i -1.63098940+0i -0.9140748+0i -0.09744510+0i
#> Mar -2.437263611+0i -0.2441996+0i 0.51242695+0i 0.4681544+0i -0.93584735+0i
#> Apr -0.005571287+0i -0.2827054+0i -1.86301149+0i 0.3629513+0i -0.01595031+0i
#> May 0.621552721+0i -0.5536994+0i -0.52201251+0i -1.3045435+0i -0.82678895+0i
#> Jun 1.148411606+0i 0.6289820+0i -0.05260191+0i 0.7377763+0i -1.51239965+0i
#> attr(,"names")
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
as_raw(x)
#> Warning in as_raw(x): out-of-range values treated as 0 in coercion to raw
#> January February March April May
#> Jan 00 00 02 00 01
#> Feb 00 00 00 00 00
#> Mar 00 00 00 00 00
#> Apr 00 00 00 00 00
#> May 00 00 00 00 00
#> Jun 01 00 00 00 00
#> attr(,"names")
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s"
#> [20] "t" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"
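For comparison, the manual save, convert and re-assign routine that these functions replace might look as follows in base R. This is only a sketch of the repetitive pattern described at the start of this section, with illustrative variable names (m, keep, m_int); it is not how tinycodet implements the conversions.

```r
m <- matrix(c(1.7, -2.2, 0, 3.9), ncol = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))

# Plain conversion: the result is a bare integer vector,
# dim and dimnames are gone
stripped <- as.integer(m)
attributes(stripped)

# Manual routine: save the attributes, convert, then re-assign them
keep  <- attributes(m)
m_int <- as.integer(m)
attributes(m_int) <- keep
m_int
```

Three lines are needed for every conversion, which is the repetition the as_*() functions are meant to remove.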
Subset if and unreal replacement
The tinycodet package adds 2 “subset_if” operators:
- The x %[if]% cond operator selects elements from vector/matrix/array x, for which the result of cond(x) returns TRUE.
- The x %[!if]% cond operator selects elements from vector/matrix/array x, for which the result of cond(x) returns FALSE.
For example:
object_with_very_long_name <- matrix(-10:9, ncol=2)
print(object_with_very_long_name)
#> [,1] [,2]
#> [1,] -10 0
#> [2,] -9 1
#> [3,] -8 2
#> [4,] -7 3
#> [5,] -6 4
#> [6,] -5 5
#> [7,] -4 6
#> [8,] -3 7
#> [9,] -2 8
#> [10,] -1 9
object_with_very_long_name %[if]% \(x)x %in% 1:10
#> [1] 1 2 3 4 5 6 7 8 9
object_with_very_long_name %[!if]% \(x)x %in% 1:10
#> [1] -10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0
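Without these operators, the object name has to be repeated inside the brackets. A base-R sketch of the same two selections, assuming the operators behave as described above (m and cond are illustrative names):

```r
m <- matrix(-10:9, ncol = 2)
cond <- function(x) x %in% 1:10

selected <- m[cond(m)]    # counterpart of m %[if]% cond
rejected <- m[!cond(m)]   # counterpart of m %[!if]% cond

selected
rejected
```

With a long object name, writing it once instead of twice is where the operators save typing.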
Another operator added by tinycodet is x %unreal =% y, which replaces all NA, NaN, Inf and -Inf in x with the value given in y.
So x %unreal =% y is the same as x[is.na(x)|is.nan(x)|is.infinite(x)] <- y.
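The base-R form of that replacement can be tried on a small numeric vector. The values below are made up for illustration; the last line also checks that, for numeric input, the three-part condition selects the same elements as !is.finite().

```r
x <- c(1.5, NA, NaN, Inf, -Inf, 4)
y <- 0

# Elements that are NA, NaN, Inf or -Inf
unreal <- is.na(x) | is.nan(x) | is.infinite(x)

x_fixed <- x
x_fixed[unreal] <- y
x_fixed

# For numeric vectors this is the same selection as !is.finite(x)
identical(unreal, !is.finite(x))
```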
General in-place modifier
This R package includes a general in-place modifying infix operator.
Consider the following line of code:
mtcars$mpg[mtcars$cyl>6] <- mtcars$mpg[mtcars$cyl>6]^2
The same expression, mtcars$mpg[mtcars$cyl>6], is written twice, making this code rather long and cumbersome, even though we’re just squaring the expression.
This R package solves the above laid-out problem by implementing a general in-place (mathematical) modifier, through the x %:=% f operator.
With tinycodet one can now make this more compact (more “tiny”, if you will) as follows:
mtcars$mpg[mtcars$cyl>6] %:=% \(x)x^2
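To illustrate the idea behind such an operator, here is one way an in-place modifier could be written in base R. This is a sketch under my own assumptions, not the tinycodet implementation: the operator name %modify% is hypothetical, and it works by capturing the unevaluated left-hand side and evaluating an assignment to it in the calling environment.

```r
# Hypothetical operator, NOT part of tinycodet
`%modify%` <- function(x, f) {
  target <- substitute(x)   # the left-hand side as an unevaluated expression
  value  <- f(x)            # apply the function to its current value
  # Build and run `target <- value` where the operator was called
  eval(call("<-", target, value), envir = parent.frame())
  invisible(NULL)
}

df <- data.frame(a = c(1, 2, 3), b = c(5, 7, 9))

# Square column a where b > 6, writing the target expression only once
df$a[df$b > 6] %modify% function(v) v^2
df$a
```

The subsetting expression on the left appears once, and the function on the right describes the change to apply to it.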