Compute Grouped Indices — idx_by • squarebrackets

Given:

a subset function f;
an object x with its margin m;
and a grouping factor grp;

the idx_by() function applies f to the indices of x along margin m, separately for each group in grp.
The result of idx_by() can be supplied to the indexing arguments (see squarebrackets_indx_args) to perform grouped subset operations.

Usage

idx_by(x, m, f, grp, parallel = FALSE, mc.cores = 1L)

Arguments

x: the object from which to compute the indices.
m: a single non-negative integer giving the margin for which to compute indices.
For flat indices or for non-dimensional objects, use m = 0L.
f: a subset function to be applied per group on indices.
If m == 0L, the indices are defined as setNames(1:length(x), names(x)).
If m > 0L, the indices are defined as setNames(1:dim(x)[m], dimnames(x)[[m]]).
The function must produce a character or integer vector as output.
For example, to subset the last element per group, specify:
f = last
grp: a factor giving the groups.
parallel, mc.cores: see BY.
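
The argument descriptions above can be illustrated with a minimal base-R sketch. It is only an approximation of the documented behaviour, not the package's actual implementation: idx_by_sketch and last_elem are hypothetical names introduced here, the parallel and mc.cores arguments are ignored, and seq_along()/seq_len() are used in place of the 1:length(x) form quoted above.

```r
# Hypothetical helper, NOT part of squarebrackets: a base-R approximation
# of the grouped-index computation described in the Arguments section.
idx_by_sketch <- function(x, m, f, grp) {
  # Build the index vector as documented for the `f` argument.
  indices <- if (m == 0L) {
    setNames(seq_along(x), names(x))
  } else {
    setNames(seq_len(dim(x)[m]), dimnames(x)[[m]])
  }
  # Apply `f` to the indices of each group, then flatten the result.
  unlist(lapply(split(indices, grp), f), use.names = FALSE)
}

a <- 1:20
grp <- factor(rep(letters[1:5], each = 4))
last_elem <- function(i) i[length(i)]  # stand-in for `last` (assumption)
idx_by_sketch(a, 0L, last_elem, grp)
#> [1]  4  8 12 16 20
```

Because f receives the indices rather than the values of x, its output can be passed straight to an indexing argument, which is what the examples below do with the real idx_by().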

Value

A vector of indices.

Examples



# vectors ====
(a <- 1:20)
#>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
(grp <- factor(rep(letters[1:5], each = 4)))
#>  [1] a a a a b b b b c c c c d d d d e e e e
#> Levels: a b c d e

# get the last element of `a` for each group in `grp`:
s <- list(idx_by(a, 0L, last, grp))
ss_x(cbind(a, grp), s, 1L)
#>       a grp
#> [1,]  4   1
#> [2,]  8   2
#> [3,] 12   3
#> [4,] 16   4
#> [5,] 20   5


# data.frame ====
x <- data.frame(
  a = sample(1:20),
  b = letters[1:20],
  group = factor(rep(letters[1:5], each = 4))
)
print(x)
#>     a b group
#> 1  19 a     a
#> 2  18 b     a
#> 3  14 c     a
#> 4   1 d     a
#> 5   5 e     b
#> 6   6 f     b
#> 7   8 g     b
#> 8   9 h     b
#> 9  10 i     c
#> 10 12 j     c
#> 11 16 k     c
#> 12  7 l     c
#> 13 11 m     d
#> 14 17 n     d
#> 15  3 o     d
#> 16 20 p     d
#> 17  2 q     e
#> 18 13 r     e
#> 19  4 s     e
#> 20 15 t     e
# get the first row for each group in data.frame `x`:
row <- idx_by(x, 1L, first, x$group)
ss2_x(x, row, 1L)
#>    a b group
#> 1 19 a     a
#> 2  5 e     b
#> 3 10 i     c
#> 4 11 m     d
#> 5  2 q     e
# get the first row for each group for which a > 10:
x2 <- ss2_x(x, obs = ~ a > 10)
row <- na.omit(idx_by(x2, 1, first, x2$group))
ss2_x(x2, row, 1L)
#>    a b group
#> 1 19 a     a
#> 2 12 j     c
#> 3 11 m     d
#> 4 13 r     e
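
The last example wraps idx_by() in na.omit(). One plausible reason, shown here with base R only, is that filtering can leave a factor level with no rows, and a "first element" function then has nothing to return for that group. The names keep, grp2 and first_elem are hypothetical, and first_elem is only a stand-in for first; whether first itself returns NA on empty input is an assumption.

```r
# Base-R illustration only; it does not call squarebrackets.
grp <- factor(c("a", "a", "b", "c"))
keep <- c(TRUE, TRUE, FALSE, TRUE)   # drops the only row of group "b"
grp2 <- grp[keep]                    # level "b" is kept, but is now empty
indices <- seq_along(grp2)

first_elem <- function(i) i[1L]      # stand-in for `first`; NA on empty input
res <- unlist(lapply(split(indices, grp2), first_elem), use.names = FALSE)
res
#> [1]  1 NA  3
as.vector(na.omit(res))              # drop the NA left by the empty group
#> [1] 1 3
```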