squarebrackets:
Subset Methods as Alternatives to the Square Brackets Operators for Programming.

'squarebrackets' provides subset methods (supporting both atomic and recursive S3 classes) that may be more convenient alternatives to the [ and [<- operators, whilst maintaining similar performance.
Some nice properties of these methods include, but are not limited to, the following.

  1. The [ and [<- operators use different rule-sets for different data.frame-like types (data.frames, data.tables, tibbles, tidytables, etc.).
    The 'squarebrackets' methods use the same rule-sets for the different data.frame-like types.

  2. Performing dimensional subset operations on an array using [ and [<- requires a priori knowledge of the number of dimensions the array has.
    The 'squarebrackets' methods work on arrays with any number of dimensions, without requiring such prior knowledge.

  3. When selecting names with the [ and [<- operators, only the first occurrence of each name is selected in case of duplicate names.
    The 'squarebrackets' methods always operate on all matching names in case of duplicates, not just the first.

  4. The [<- operator only supports copy-on-modify semantics for most classes.
    The 'squarebrackets' methods provide explicit pass-by-reference and pass-by-value semantics, whilst still respecting things like binding-locks and mutability rules.

  5. 'squarebrackets' supports index-less sub-set operations, which are more memory efficient (and better for the environment) for long vectors than sub-set operations using the [ and [<- operators.
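
As a rough illustration of points 3 and 4, the following sketch uses base R only. It calls no 'squarebrackets' functions, and the object names are made up for the example. It shows how [ treats duplicate names and how [<- modifies a copy:

```r
# Base R only; no 'squarebrackets' functions are called here.

# Point 3: with duplicate names, `[` returns only the first match.
x <- c(a = 1, b = 2, a = 3)
first_only <- x["a"]                  # just the first element named "a"
all_matches <- x[names(x) %in% "a"]   # manual workaround: every element named "a"

# Point 4: `[<-` follows copy-on-modify semantics.
y <- x
y[1] <- 100                           # y is modified; x keeps its original value
```

Here first_only has length 1, all_matches has length 2, and x[1] is still 1 after y has been modified.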

Goal

Among programming languages, 'R' has perhaps one of the most flexible and comprehensive sub-setting functionality, provided by the square brackets operators ([, [<-).
But in some situations the square brackets operators are less than optimally convenient.

The goal of the 'squarebrackets' package is not to replace the square brackets operators, but to provide alternative sub-setting methods and functions, to be used in situations where the square brackets operators are inconvenient.

Quick Start Guide

For the Quick Start Guide, see:
https://tony-aw.github.io/squarebrackets/articles/squarebrackets.html.

Overview Help Pages

Essentials
The essential documentation is split into the following help pages:

Arguments
The methods in 'squarebrackets' share a lot of common arguments.
The explanations for these common arguments are given in the following help pages:

Pass-By-Reference
The following help pages explain the pass-by-reference semantics provided by 'squarebrackets', and only need to be read when planning to use those semantics:

  • squarebrackets_PassByReference:
    Explains Pass-by-Reference semantics, and its important consequences.

  • squarebrackets_coercion:
    Explains the difference in coercion rules between modification through Pass-by-Reference semantics and modification through copy (i.e. pass-by-value).

Other
And finally, there is the squarebrackets_method_dispatch help page, which gives some small additional details regarding the S3 method dispatch used in 'squarebrackets'.

Helper Functions

A couple of convenience functions, and helper functions for creating ranges, sequences, and indices (often needed in sub-setting) are provided:

  • n: Nested version of c, and short-hand for list.

  • ndim: Get the number of dimensions of an object.

  • sub2coord, coord2ind: Convert subscripts (array indices) to coordinates, coordinates to flat indices, and vice-versa.

  • match_all: Find all matches of one vector in another, taking into account the order and any duplicate values of both vectors.

  • Computing indices:
    idx_r to compute an integer index range.
    idx_by to compute grouped indices.
    idx_ord_-functions to compute ordered indices.
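
To make the coordinate/flat-index conversion and the "all matches" idea more concrete, here is a base R sketch of the underlying arithmetic. It is only an analogue under stated assumptions: it does not show the implementation or the call signatures of sub2coord, coord2ind, or match_all, and all object names are illustrative.

```r
# Base R only; an analogue, not the 'squarebrackets' implementation.
x <- array(1:24, dim = c(2, 3, 4))

# Coordinates: one row per element, one column per dimension.
coords <- rbind(c(1, 1, 1), c(2, 3, 4), c(1, 2, 3))

# Coordinates -> flat indices, using column-major strides (here 1, 2, 6).
strides <- cumprod(c(1, dim(x)[-length(dim(x))]))
flat <- as.integer((coords - 1) %*% strides + 1)

# Flat indices -> coordinates, using base::arrayInd().
back <- arrayInd(flat, .dim = dim(x))

# base::match() reports only the first match of each value;
# collecting every match, in the order of `needles`, needs extra work.
haystack <- c("a", "b", "a", "c")
needles <- c("a", "c")
first_match <- match(needles, haystack)
every_match <- unlist(lapply(needles, function(v) which(haystack == v)))
```

With these inputs, x[flat] and x[coords] select the same elements, and every_match contains position 3, which first_match omits.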

Properties Details

The alternative sub-setting methods and functions provided by 'squarebrackets' have the following properties:

  • Programmatically friendly:

    • Unlike base [, performing subset operations on an array does not require knowing the number of dimensions of the array a priori.

    • Missing arguments can be filled with NULL, instead of using dark magic like base::quote(expr = ).

    • No non-standard evaluation.

    • Functions are pipe-friendly.

    • No (silent) vector recycling.

    • Extracting and removing subsets uses the same syntax.

  • Class consistent:

    • Sub-setting of multi-dimensional objects by specifying dimensions (i.e. rows, columns, ...) uses drop = FALSE.
      So matrix in, matrix out.

    • The methods deliver the same results for data.frames, data.tables, tibbles, and tidytables.
      No longer does one have to re-learn the different brackets-based sub-setting rules for different types of data.frame-like objects.
      Powered by the subclass agnostic 'C'-code from 'collapse' and 'data.table'.

  • Explicit copy semantics:

    • Sub-set operations that change an object's memory allocation always return a modified (partial) copy of the object.

    • For sub-set operations that just change values in-place (similar to the [<- and [[<- methods) the user can choose a method that modifies the object by reference, or choose a method that returns a (partial) copy.

  • Careful handling of names:

    • Sub-setting an object by index names returns ALL matches with the given names, not just the first.

    • Data.frame-like objects (see supported classes below) are forced to have unique column names.

    • Sub-setting arrays using x[indx1, indx2, etc.] will drop names(x).
      The methods from 'squarebrackets' will not drop names(x).

  • Concise function and argument names.

  • Performance & Energy aware:
    Despite the many checks performed, the functions are kept reasonably speedy, through the use of the 'Rcpp', 'collapse', and 'data.table' R-packages.
    The functions were also made to be as memory efficient as reasonably possible, to lower the carbon footprint of this package.
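
For context, several of the base R behaviours that the properties above are contrasted with can be reproduced in a few lines. This is a base R sketch only: slice_dim is a hypothetical helper written for this illustration, not a 'squarebrackets' function, and the other names are likewise made up.

```r
# Base R only; `slice_dim` is a hypothetical helper, not part of 'squarebrackets'.
x <- array(1:24, dim = c(2, 3, 4))

# Selecting along one dimension without hard-coding the number of commas
# requires building the call by hand, with one empty argument per dimension.
slice_dim <- function(x, i, d) {
  args <- rep(list(quote(expr = )), length(dim(x)))  # empty ("missing") arguments
  args[[d]] <- i
  do.call(`[`, c(list(x), args, list(drop = FALSE)))
}
sliced <- slice_dim(x, 2, 2)          # same as x[, 2, , drop = FALSE]

# Base `[` uses drop = TRUE by default, so a matrix can come back as a vector.
m <- matrix(1:6, nrow = 2)
dropped <- m[1, ]                     # plain vector, no dim attribute
kept <- m[1, , drop = FALSE]          # still a matrix

# Dimensional sub-setting discards the names attribute of an array.
names(m) <- letters[1:6]
sub_names <- names(m[1:2, 1:2])       # NULL

# `[<-` silently recycles a shorter replacement vector.
v <- 1:6
v[1:4] <- c(0L, 9L)                   # no warning: 4 is a multiple of 2
```

The slice_dim helper is the kind of boilerplate, including the quote(expr = ) trick, that one otherwise has to write whenever the number of dimensions is not known in advance.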

References

The badges shown in the documentation of this R-package were made using the services of: https://shields.io/

Author

Author, Maintainer: Tony Wilkes <tony_a_wilkes@outlook.com>