This help page describes the main modification semantics
available in 'squarebrackets'.
Base R's default modification
For most users, R's default copy-on-modify semantics are fine.
The benefits of the indexing arguments from 'squarebrackets'
can be combined with the [<- operator through the idx() method.
The result of idx() can be used inside the regular square-brackets operators,
for example:
x <- array(...)
my_indices <- idx(x, s, d)
x[my_indices] <- value
y <- data.frame(...)
rows <- idx(y, 1:10, 1, inv = TRUE)
cols <- idx(y, c("a", "b"), 2)
y[rows, cols] <- value
This allows the user to benefit from the convenient index translations of 'squarebrackets',
whilst still using R's default copy-on-modify semantics
(instead of the semantics provided by 'squarebrackets').
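As a brief illustration of the copy-on-modify semantics referred to above, the following sketch uses only base R (the variable names are illustrative and not taken from 'squarebrackets'). Modifying an object through [<- leaves any other binding to the original values untouched:

```r
x <- c(a = 1, b = 2, c = 3)
y <- x            # y and x refer to the same values for now
y["b"] <- 20      # modifying y triggers a copy; x is left untouched

x_unchanged <- identical(x, c(a = 1, b = 2, c = 3))
y_changed   <- identical(y, c(a = 1, b = 20, c = 3))
```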
Explicit Copy
'squarebrackets' provides the sb_mod/sb2_mod method
to modify through a (shallow) copy.
This method returns the modified object.
For recursive objects, sb2_mod returns the original object,
where only the modified subsets are copied,
thus preventing unnecessary usage of memory.
Pass-by-Reference
'squarebrackets' provides the sb_set/sb2_set and slice_set methods
to modify by reference,
meaning no copy is made at all.
Pass-by-Reference is the fastest and most memory-efficient form of modification.
But it is also more involved than the other modification forms,
and requires more thought.
See squarebrackets_PassByReference for more information.
Replacement and Transformation in Atomic Objects
The rp argument is used to replace the values at the specified indices
with the values specified in rp.
Using the rp argument in the modification methods
corresponds to something like the following:
x[...] <- rp
The tf argument is used to transform the values at the specified indices
through transformation function tf.
Using the tf argument
corresponds to something like the following:
x[...] <- tf(x[...])
where tf is a function that returns an object of appropriate type and size
(so tf should not be a pass-by-reference function).
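The two correspondences above can be made concrete with a small base R sketch. It only mirrors the x[...] <- rp and x[...] <- tf(x[...]) patterns; it does not call any 'squarebrackets' method, and the vector and indices are illustrative:

```r
x <- c(1, 4, 9, 16, 25)
i <- c(2, 4)              # illustrative indices

# replacement, analogous to supplying rp
x_rp <- x
x_rp[i] <- c(-1, -2)      # one replacement value per selected element

# transformation, analogous to supplying tf
tf <- sqrt                # returns a vector of the same length as its input
x_tf <- x
x_tf[i] <- tf(x_tf[i])
```

Here tf returns a new object of the same size as the selected subset, which is what the transformation pattern requires.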
Replacement and Transformation in Lists
The rp and tf arguments work mostly in the same way for recursive objects,
but there are some slight differences.
Argument rp
'squarebrackets' demands that rp is always provided as a list
in the S3 methods for recursive vectors, matrices, and arrays (i.e. lists).
This is to prevent ambiguity
with respect to how the replacement is recycled or distributed over the specified indices
(see Footnote 1 below).
Argument tf
Most functions in (base) 'R' are vectorized for atomic objects, but not for lists
(see Footnote 2 below).
'squarebrackets' will therefore apply transformation function tf via lapply,
like so:
x[...] <- lapply(x[...], tf)
In the methods for recursive objects,
the tf argument is accompanied by the .lapply argument.
By default, .lapply = lapply.
The user may supply a custom lapply()-like function
in this argument to use instead.
For example, to perform parallel transformation,
the user may supply future.apply::future_lapply.
The supplied function must use exactly the same argument convention as lapply,
otherwise errors or unexpected behaviour may occur.
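The role of an lapply()-like function can be sketched in base R as follows. The helper transform_subset() and the wrapper verbose_lapply() are hypothetical names introduced purely for illustration; they are not part of 'squarebrackets' and do not reflect how the package implements its methods:

```r
# Hypothetical helper, for illustration only; not part of 'squarebrackets'.
transform_subset <- function(x, i, tf, .lapply = lapply) {
  x[i] <- .lapply(x[i], tf)   # apply tf to each selected list element
  x
}

x <- list(a = 1:3, b = letters[1:2], c = 4:6)

# default: plain lapply
res1 <- transform_subset(x, c("a", "c"), rev)

# custom lapply()-like function following the same (X, FUN, ...) convention
verbose_lapply <- function(X, FUN, ...) {
  message("transforming ", length(X), " element(s)")
  lapply(X, FUN, ...)
}
res2 <- transform_subset(x, c("a", "c"), rev, .lapply = verbose_lapply)
```

Because verbose_lapply() accepts X, FUN and ... in the same way as lapply(), it can be swapped in without changing the result.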
Replacement and Transformation in data.frame-like Objects
Replacement and transformation
in data.frame-like objects are a bit more flexible than in lists.
rp is not always required to be a list for data.frame-like objects,
only when appropriate
(for example, when replacing multiple columns, or when the column itself is a list).
When rp is given as a list,
it is unclassed and unnamed before being used to replace values.
This is to ensure consistency across all supported data.frame types.
Bear in mind that every column in a data.frame is like an element in a list;
so .lapply is used for transformations across multiple columns.
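The point that each column behaves like a list element can be illustrated in base R. This is a sketch of the underlying idea only, with an illustrative data.frame, and does not use the 'squarebrackets' methods themselves:

```r
y <- data.frame(
  a = 1:3,
  b = c(2.5, 3.5, 4.5),
  c = letters[1:3],
  stringsAsFactors = FALSE
)
cols <- c("a", "b")

# each column is a list element, so the transformation is applied per column
y[cols] <- lapply(y[cols], function(col) col * 10)

# replacing multiple columns at once takes a list with one element per column
y[cols] <- list(y$a + 1, y$b + 1)
```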
Recycling and Coercion
Recycling is not allowed in the modification methods.
So, for example, length(rp) must be equal to the length of the selected subset,
or equal to 1.
When using Pass-by-Reference semantics,
the user should be extra mindful of the auto-coercion rules.
See squarebrackets_coercion for details.
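The length rule stated above can be expressed as a small check. The function check_rp_length() is a hypothetical illustration written for this page, not the package's actual validation code; the base R lines at the end show the silent recycling that such a rule rules out:

```r
# Hypothetical check, for illustration only; not the package's actual code.
check_rp_length <- function(rp, n_selected) {
  length(rp) == n_selected || length(rp) == 1L
}

ok_full   <- check_rp_length(c(10, 20, 30), 3)  # TRUE: one value per element
ok_scalar <- check_rp_length(0, 3)              # TRUE: a single value
bad       <- check_rp_length(c(1, 2), 4)        # FALSE: would need recycling

# base R, by contrast, silently recycles the shorter replacement
z <- c(0, 0, 0, 0)
z[1:4] <- c(1, 2)   # z becomes 1 2 1 2
```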
Footnotes
Footnote 1
Consider the following replacement in base 'R':
x <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
x[1:2] <- 2:1
What will happen?
Will x[1] be list(1:2) and x[2] also be list(1:2)?
Or will x[1] be list(2) and x[2] be list(1)?
It turns out the latter will happen, but this is not obvious from the code.
To prevent such ambiguity in your code,
'squarebrackets' demands that rp is always provided as a list.
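The two readings discussed in this footnote can be compared directly in base R. The sketch below runs the replacement from the footnote and then a variant where the replacement is wrapped in a list (the names x2, distributed and repeated are illustrative):

```r
x <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
x[1:2] <- 2:1                  # atomic replacement is distributed elementwise
distributed <- identical(x[1:2], list(2L, 1L))

x2 <- list(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
x2[1:2] <- list(2:1)           # a length-1 list puts 2:1 in both positions
repeated <- identical(x2[1:2], list(2:1, 2:1))
```

Writing the replacement as a list makes the intended outcome explicit in the code.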
Footnote 2
Most functions in (base) 'R' are vectorized for atomic objects, but not for lists.
One of the reasons is the following:
In an atomic vector x of some type t,
every single element of x is a scalar of type t.
However, every element of some list x can be virtually anything:
an atomic object, another list,
an unevaluated expression, even dark magic like quote(expr =).
It is difficult to make a vectorized function for an object with so many unknowns.
Therefore, in the vast majority of the cases,
one needs to loop through the list elements.
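A short base R sketch of this difference, using sqrt() as an example of a function that is vectorized for atomic vectors but not defined for lists (the variable names are illustrative):

```r
x_atomic <- c(1, 4, 9)
x_list   <- list(1, 4, 9)

roots <- sqrt(x_atomic)        # vectorized over the atomic vector

# sqrt() is not defined for lists, so the direct call signals an error
direct <- tryCatch(sqrt(x_list), error = function(e) e)
failed <- inherits(direct, "error")

# looping over the elements works
looped <- lapply(x_list, sqrt)
```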