
There are several types of arguments that can be used in the generic methods of 'squarebrackets' to specify the indices to perform operations on:

  • i: to specify flat (i.e. dimensionless) indices.

  • s, d: to specify indices of arbitrary dimensions in any dimensional object supported by 'squarebrackets' (i.e. arrays and data.frame-like objects).

  • margin, slice: to specify indices of one particular dimension (for arrays and data.frame-like objects).
    Only used in the idx method.

  • obs, vars: to specify observations and/or variables specifically in data.frame-like objects.

For the fundamentals of indexing in 'squarebrackets', see squarebrackets_indx_fundamentals.
In this help page x refers to the object on which subset operations are performed.


Argument i

[class: atomic vector]
[class: derived atomic vector]
[class: recursive vector]
[class: atomic array]
[class: recursive array]

Any of the following can be specified for argument i:

  • NULL, which corresponds to a missing argument.

  • a vector of length 0, in which case no indices are selected for the operation (i.e. empty selection).

  • a numeric vector of strictly positive whole numbers giving indices.

  • a complex vector, as explained in squarebrackets_indx_fundamentals.

  • a logical vector, of the same length as x, giving the indices to select for the operation.

  • a character vector of index names.
    If an object has multiple indices with the given name, ALL the corresponding indices will be selected for the operation.

  • a function that takes as input x, and returns a logical vector, giving the element indices to select for the operation.
    For atomic objects, i is interpreted as i(x).
    For recursive objects, i is interpreted as lapply(x, i).

Using the i argument corresponds to doing something like the following:

 sb_x(x, i = i) # ==> x[i]   # if `x` is atomic
 sb2_x(x, i = i) # ==> x[i]  # if `x` is recursive

If i is a function, it corresponds to the following:

 sb_x(x, i = i) # ==> x[i(x)] # if `x` is atomic
 sb2_x(x, i = i) # ==> x[unlist(lapply(x, i))] # if `x` is recursive
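
The forms of i listed above can be mimicked with base R's [ operator. The following is only a rough sketch in base R, not a call to 'squarebrackets' itself; the objects x, lst and the predicate i are made up for illustration. Because base [ returns only the first match for a character index, the "ALL corresponding indices" behaviour is approximated here with %in%.

```r
# Illustrative data (hypothetical): a named atomic vector with a duplicated name
x <- c(a = 1, b = 5, a = 10, c = 20)

x[c(1, 3)]               # numeric: positions 1 and 3
x[x > 4]                 # logical vector of the same length as x
x[names(x) %in% "a"]     # character: every element named "a"
x[integer(0)]            # zero-length index: empty selection

# Function index on an atomic object: evaluated as i(x)
i <- function(v) v > 4
x[i(x)]

# Function index on a recursive object: applied to each element separately
lst <- list(a = 1:3, b = "text", c = 4.5)
is_num <- vapply(lst, is.numeric, logical(1))
lst[is_num]
```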

Argument Pair s, d

[class: atomic array]
[class: recursive array]
[class: data.frame-like]

The s, d argument pair, inspired by the abind::asub function from the 'abind' package, is the primary indexing argument for sub-set operations on dimensional objects.

The s argument specifies the subscripts (i.e. dimensional indices).
The d argument gives the dimensions to which s applies (i.e. d specifies the "non-missing" margins).

The d argument must be an integer vector.

s must be a list of length 1, or a list of the same length as d.
If s is a list of length 1, it is internally recycled to become the same length as d.

Each element of s can be any of the following:

  • a vector of length 0, in which case no indices are selected for the operation (i.e. empty selection).

  • a numeric vector of strictly positive whole numbers with indices of the specified dimension to select for the operation.

  • a complex vector, as explained in squarebrackets_indx_fundamentals.

  • a logical vector of the same length as the corresponding dimension size, giving the indices of the specified dimension to select for the operation.

  • a character vector giving the dimnames to select.
    If a dimension has multiple indices with the given name, ALL the corresponding indices will be selected for the operation.

Note the following:

  • As stated, d specifies which index margins are non-missing.
    If d is of length 0, it is taken as "all index margins are missing".

  • The default value for d is 1:ndim(x).

To keep the syntax short, the user can use the n function instead of list() to specify s.

EXAMPLES
Here are some examples for clarity, using an atomic array x of 3 dimensions:

  • sb_x(x, n(1:10, 1:5), c(1, 3))
    extracts the first 10 rows, all columns, and the first 5 layers, of array x.

  • sb_x(x, n(1:10), 2)
    extracts the first 10 columns of array x.

  • sb_x(x, n(1:10))
    extracts the first 10 rows, columns, and layers of array x.

  • sb_x(x, n(1:10), c(1, 3))
    extracts the first 10 rows, all columns, and the first 10 layers, of array x.

I.e.:


sb_x(x, n(1:10, 1:5), c(1, 3)) # ==> x[1:10, , 1:5, drop = FALSE]

sb_x(x, n(1:10), 2)               # ==> x[ , 1:10, , drop = FALSE]

sb_x(x, n(1:10))                  # ==> x[1:10, 1:10, 1:10, drop = FALSE]

sb_x(x, n(1:10), c(1, 3))         # ==> x[1:10, , 1:10, drop = FALSE]
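
To make the recycling and "non-missing margins" rules concrete, here is a minimal base R sketch of how an s, d pair could be expanded into a regular [ call. The helper subset_sd is hypothetical and is not part of 'squarebrackets'; it uses list() where the package would also accept n(), and it only handles the simple cases described above.

```r
# Hypothetical helper (not part of 'squarebrackets'): emulate the s, d pair
subset_sd <- function(x, s, d = seq_along(dim(x))) {
  if (!is.list(s)) s <- list(s)          # allow a bare atomic vector for s
  s <- rep_len(s, length(d))             # recycle a length-1 list to match d
  idx <- rep(list(TRUE), length(dim(x))) # TRUE selects everything in a margin
  idx[d] <- s                            # fill in the non-missing margins
  do.call(`[`, c(list(x), idx, list(drop = FALSE)))
}

x <- array(seq_len(12 * 12 * 12), dim = c(12, 12, 12))

dim(subset_sd(x, list(1:10, 1:5), c(1, 3)))  # 10 12 5
dim(subset_sd(x, list(1:10), 2))             # 12 10 12
dim(subset_sd(x, list(1:10)))                # 10 10 10
```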

NOTE
If length(d) is 1, s can also be given as an atomic vector (of any length), instead of a list of length 1.
Although it is allowed for s and d to both be atomic vectors of length 1, for the readability of your code it is highly recommended that s and d be explicitly named in your method call, in such a case.
I.e.:


sb_x(x, 1, 1) # BAD: this is not very readable

sb_x(x, s = 1, d = 1) # This is GOOD

For a brief explanation of the relationship between flat indices (i) and subscripts (s, d) in arrays, see sub2ind.
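
As a small worked example of that relationship, the sketch below converts one subscript of a 3-dimensional array to its flat index by hand, relying on R's column-major storage. It uses only base R (arrayInd for the reverse direction) rather than sub2ind, and the array is made up for illustration.

```r
x <- array(seq_len(4 * 3 * 2), dim = c(4, 3, 2))
dims <- dim(x)

# Subscript (row 2, column 3, layer 2) converted to a flat index by hand:
# each dimension is weighted by the product of the preceding dimension sizes
sub <- c(2, 3, 2)
flat <- 1 + sum((sub - 1) * c(1, cumprod(dims)[-length(dims)]))
flat                      # 22
x[flat] == x[2, 3, 2]     # TRUE

# Base R's arrayInd() goes the other way: flat index -> subscripts
arrayInd(flat, dims)
```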

Argument Pair margin, slice

[class: atomic array]
[class: recursive array]
[class: data.frame-like]

Relevant only for the idx method.
The margin argument specifies the dimension on which argument slice is used.
I.e. when margin = 1, slice selects rows;
when margin = 2, slice selects columns;
etc.

The slice argument can be any of the following:

  • a numeric vector of strictly positive whole numbers with dimension indices to select for the operation.

  • a complex vector, as explained in squarebrackets_indx_fundamentals.

  • a logical vector of the same length as the corresponding dimension size, giving the dimension indices to select for the operation.

  • a character vector of index names.
    If a dimension has multiple indices with the given name, ALL the corresponding indices will be selected for the operation.

A vector of length 0 can also be given for slice.
However, slice is only used in the idx method, and the result of idx is meant to be used inside the regular [ and [<- operators.
Thus the effect of a zero-length index specification depends on the rule-set of [.class(x) and [<-.class(x).
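
The idea of selecting along a single margin and then reusing the result inside [ and [<- can be sketched in base R with slice.index. This is an illustrative stand-in only: it assumes flat indices are an acceptable form for the result and does not claim to reproduce what idx actually returns.

```r
x <- matrix(1:12, nrow = 3, ncol = 4)

# Hypothetical stand-in for idx(x, slice = ..., margin = ...):
# flat indices of all elements whose position along `margin` is in `slice`
margin <- 2
slice <- c(1, 3)
ind <- which(slice.index(x, margin) %in% slice)

x[ind]              # elements of columns 1 and 3
x[ind] <- 0L        # the same indices can be reused in `[<-`
x
```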

Arguments obs, vars

[class: data.frame-like]

The obs argument specifies indices for observations (i.e. rows) in data.frame-like objects.
The vars argument specifies indices for variables (i.e. columns) in data.frame-like objects.
The obs and vars arguments are inspired by the subset and select arguments, respectively, of base R's subset.data.frame method. However, the obs and vars arguments do not use non-standard evaluation, so that 'squarebrackets' remains fully programmatically friendly.

The obs Argument
The obs argument can be any of the following:

  • NULL (default), corresponds to a missing argument.

  • a vector of length 0, in which case no indices are selected for the operation (i.e. empty selection).

  • a numeric vector of strictly positive whole numbers with row indices to select for the operation.

  • a complex vector, as explained in squarebrackets_indx_fundamentals.

  • a logical vector of the same length as the number of rows, giving the row indices to select for the operation.

  • a one-sided formula, with a single logical expression using the column names of the data.frame, giving the condition that observations (rows) must satisfy to be selected for the operation.

So, to perform an operation on the observations for which both height > 2 and sex != "female" hold, specify the following formula:

obs = ~ (height > 2) & (sex != "female")

If the formula is linked to an environment, any variables not found in the data set will be searched for in that environment.
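
The way such a formula can be resolved against a data set, with its environment as a fallback, can be sketched in base R as follows. This is an assumption about the general mechanism, shown with eval, and not a description of the package's internal code; the data frame and the variable min_height are invented for the example.

```r
# Made-up data for illustration
df <- data.frame(
  height = c(1.5, 2.1, 2.4, 1.9),
  sex    = c("male", "female", "male", "male")
)

obs <- ~ (height > 2) & (sex != "female")

# A one-sided formula stores its expression in element 2 and remembers
# the environment in which it was created
keep <- eval(obs[[2]], envir = df, enclos = environment(obs))
keep          # FALSE FALSE TRUE FALSE
df[keep, , drop = FALSE]

# A variable that is not a column is looked up in the formula's environment
min_height <- 1.8
obs2 <- ~ height > min_height
keep2 <- eval(obs2[[2]], envir = df, enclos = environment(obs2))
```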

The vars Argument
The vars argument can be any of the following:

  • NULL (default), corresponds to a missing argument.

  • a vector of length 0, in which case no indices are selected for the operation (i.e. empty selection).

  • a numeric vector of strictly positive whole numbers with column indices to select for the operation.

  • a complex vector, as explained in squarebrackets_indx_fundamentals.

  • a logical vector of the same length as the number of columns, giving the column indices to select for the operation.

  • a character vector giving the colnames to select.
    Note that 'squarebrackets' assumes data.frame-like objects have unique column names.

  • a function that returns a logical vector, giving the column indices to select for the operation.
    For example, to select all numeric variables, specify vars = is.numeric.

  • a two-sided formula, where each side consists of a single term, giving a range of names to select.
    For example, to select all variables between and including the variables "height" and "weight", specify the following:
    vars = height ~ weight.

EXAMPLE

So using the obs, vars arguments corresponds to doing something like the following:

 sb2_x(x, obs = obs, vars = vars) # ==> subset(x, ...obs..., ...vars...)
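
The function form and the two-sided formula form of vars are perhaps the least obvious. The base R sketch below shows one plausible way to read them; the data frame is invented, and the code illustrates the selection logic only, not the package's implementation.

```r
# Made-up data for illustration
df <- data.frame(
  id     = 1:3,
  height = c(1.6, 1.8, 1.7),
  age    = c(30L, 41L, 25L),
  weight = c(60, 82, 71),
  sex    = c("f", "m", "f")
)

# Function form, e.g. vars = is.numeric: apply the predicate to each column
num_cols <- vapply(df, is.numeric, logical(1))
names(df)[num_cols]                 # "id" "height" "age" "weight"

# Two-sided formula form, e.g. vars = height ~ weight: a range of names
vars <- height ~ weight
ends <- match(all.vars(vars), names(df))
names(df)[ends[1]:ends[2]]          # "height" "age" "weight"
```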

Argument inv

[all classes]

Relevant for the sb_mod/sb2_mod, sb_set/sb2_set, and idx methods.
By default, inv = FALSE, in which case the indices are interpreted as usual.
When inv = TRUE, the inverse of the indices is taken.
Consider, for example, an atomic matrix x;
using sb_mod(x, 1:2, 2L, tf = tf) corresponds to something like the following:


x[, 1:2] <- tf(x[, 1:2])
x

and using sb_mod(x, 1:2, 2L, inv = TRUE, tf = tf) corresponds to something like the following:


x[, -1:-2] <- tf(x[, -1:-2])
x

NOTE
The order in which the user gives indices when inv = TRUE generally does not matter.
The order of the indices as they appear in the original object x is maintained, just like in base 'R'.
Therefore, when replacing multiple values where the order of the replacement matters, it is better to keep inv = FALSE, which is the default.
For replacement with a single value or with a transformation function, inv = TRUE can be used without considering the ordering.
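
The ordering point can be checked directly in base R, where negative indices play the role of inverted indices. The vector below is made up for illustration.

```r
x <- c(10, 20, 30, 40, 50)

# Negative (inverted) indices ignore the order in which they are given
identical(x[-c(1, 2)], x[-c(2, 1)])   # TRUE

# Positive indices do not: the replacement values follow the index order
y1 <- x; y1[c(1, 2)] <- c(-1, -2)     # -1 -2 30 40 50
y2 <- x; y2[c(2, 1)] <- c(-1, -2)     # -2 -1 30 40 50

# With inverted indices the values always land in the object's own order
z <- x; z[-c(5, 4)] <- c(1, 2, 3)     # 1 2 3 40 50
```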

All Missing Indices

NULL in the indexing arguments corresponds to a missing argument.
For s, d, specifying d of length 0 also corresponds to all subscripts being missing.
Thus, for both sb_x/sb2_x and sb_wo/sb2_wo, leaving all indexing arguments missing or NULL corresponds to something like the following:


x[]

Similarly, for sb_mod/sb2_mod and sb_set/sb2_set, using missing or NULL indexing arguments corresponds to something like the following:


x[] <- rp # for replacement
x[] <- tf(x) # for transformation

The above is true even if inv = TRUE and/or red = TRUE.
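
In base R terms, the empty-bracket forms above replace or transform every element while keeping the object's dimensions and names. A short sketch, with a made-up matrix and a hypothetical transformation function tf:

```r
x <- matrix(1:6, nrow = 2, dimnames = list(c("a", "b"), NULL))

y <- x
y[] <- 0L          # replace every element; dim and dimnames are kept
dim(y)             # 2 3

tf <- function(v) v * 10L   # hypothetical transformation function
z <- x
z[] <- tf(z)       # transform every element
z["b", 3]          # 60
```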

Disallowed Combinations of Index Arguments

One cannot specify i and the other indexing arguments simultaneously; it's either i, or the other arguments.

One cannot specify row and filter simultaneously; it's either one or the other.
One cannot specify col and vars simultaneously; it's either one or the other.
One cannot specify the s, d pair and slice, margin pair simultaneously; it's either one pair or the other pair.
In the above cases it holds that if one set is specified, the other set is ignored.

Drop

Sub-setting with the generic methods from the 'squarebrackets' R-package using dimensional arguments (s, d, row, col, filter, vars) always uses drop = FALSE.
To drop potentially redundant (i.e. single level) dimensions, use the drop function, like so:

 sb_x(x, s, d) |> drop() # ==> x[..., drop = TRUE]
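
The effect of dropping afterwards can be seen with base R alone; the array here is made up for illustration.

```r
x <- array(1:24, dim = c(2, 3, 4))

# Sub-setting with drop = FALSE keeps the single-level first dimension
kept <- x[1, , , drop = FALSE]
dim(kept)          # 1 3 4

# drop() removes dimensions that have only one level
dropped <- drop(kept)
dim(dropped)       # 3 4
```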

References

Plate T, Heiberger R (2016). abind: Combine Multidimensional Arrays. R package version 1.4-5, https://CRAN.R-project.org/package=abind.