List Casting Explained

1 Introduction

Hierarchical data is surprisingly common, and is often represented by nested lists.

Broadcasted operations can be performed over dimensions, but not through nesting or hierarchies.
Therefore, it is useful to be able to cast nested lists into dimensional lists.
The ‘broadcast’ package provides the cast_hier2dim() function to cast a nested list into a dimensional list (also known as a recursive array).

Casting between nested and dimensional lists is not only useful for broadcasting, however.
Casting nested lists to dimensional lists has its own merits, as dimensional lists have some advantages over nested lists besides broadcasting, such as the following:

  • Performing sub-set operations on multiple recursive subsets (using the [[ and [[<- operators) requires a (potentially slow) loop, whereas multi-dimensional subsets (using operator forms like [..., ...] and [..., ...]<-) are vectorized and generally much faster.
  • Re-organizing dimensions of a recursive array is generally much easier, faster, and more straight-forward than re-organizing hierarchies of a nested list.
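To make the first point concrete, the sketch below uses only base R (no ‘broadcast’ functions) to build a small dimensional list by hand and subset it. The object name arr and its contents are made up for illustration.

```r
# A dimensional list is an ordinary array whose storage type is "list".
# `arr` and its contents are hypothetical, chosen only for illustration.
arr <- array(
  list(1:3, "a", TRUE, 4L, 2.5, letters),
  dim = c(3, 2),
  dimnames = list(c("r1", "r2", "r3"), c("c1", "c2"))
)

is.list(arr)  # TRUE: still a list, but with a dim attribute
dim(arr)      # 3 2

# One vectorized call selects an entire row of list elements ...
row1 <- arr["r1", ]

# ... and `[[` with array indices extracts a single element.
arr[["r2", "c2"]]  # 2.5
```

Here row1 is a plain list with one element per column, obtained without writing a loop.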

This vignette gives an overview of the functions ‘broadcast’ provides to cast between nested and dimensional lists.

 

2 Cast Hierarchical List to Dimensional List

2.1 Introduction

The cast_hier2dim() function casts a nested list into a dimensional list.
This section gently introduces the properties of this function through a series of examples, where each subsequent example builds on the previous one.
Familiarity with nested lists and dimensional lists (i.e. arrays of type list) is essential to follow these examples.

 

2.2 Example 1: Basics

For a first example, consider the following list:

x <- list(
  group1 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    ),
    class2 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  ),
  group2 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    ),
    class2 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  )
)

Before actually casting x into a dimensional list, one may want to know what its dimensions will be once it is cast.
The hier2dim() function shows exactly that:

hier2dim(x)
#>       
#> 3 2 2

It returns the dimensions c(3, 2, 2).

Let’s now cast x as a dimensional list:

x2 <- cast_hier2dim(x) # actually cast nested list into dimensional list
print(x2)
#> , , 1
#> 
#>      [,1]         [,2]        
#> [1,] numeric,10   numeric,10  
#> [2,] numeric,10   numeric,10  
#> [3,] character,10 character,10
#> 
#> , , 2
#> 
#>      [,1]         [,2]        
#> [1,] numeric,10   numeric,10  
#> [2,] numeric,10   numeric,10  
#> [3,] character,10 character,10

Using the default arguments, element x[[i]][[j]][[k]] corresponds to element x2[k, j, i] (for all i, j, and k).
This can be changed, as will be shown in a later example.
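The index correspondence can be checked by hand. The following base-R sketch fills a 3-dimensional list with an explicit loop so that the mapping is visible; it is only an illustration of the correspondence described above, not how cast_hier2dim() is implemented, and the small list x and the name manual are hypothetical.

```r
# Small regular nested list: 2 groups x 2 classes x 3 variables.
# (Hypothetical data, much smaller than the list used in the text.)
x <- list(
  g1 = list(c1 = list(h = 1, w = 2, s = "a"), c2 = list(h = 3, w = 4, s = "b")),
  g2 = list(c1 = list(h = 5, w = 6, s = "c"), c2 = list(h = 7, w = 8, s = "d"))
)

# Empty dimensional list with the innermost level as rows: c(3, 2, 2).
manual <- array(vector("list", 12), dim = c(3, 2, 2))

# Inside-out mapping: x[[i]][[j]][[k]] goes to manual[k, j, i].
for (i in 1:2) for (j in 1:2) for (k in 1:3) {
  manual[k, j, i] <- list(x[[i]][[j]][[k]])
}

manual[[1, 2, 1]]  # 3, i.e. x$g1$c2$h
manual[[3, 1, 2]]  # "c", i.e. x$g2$c1$s
```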

As shown in the results above, cast_hier2dim() does not preserve names by default.
The dimnames of x2 can easily be set using hiernames2dimnames():

dimnames(x2) <- hiernames2dimnames(x)
print(x2)
#> , , group1
#> 
#>        class1       class2      
#> height numeric,10   numeric,10  
#> weight numeric,10   numeric,10  
#> sex    character,10 character,10
#> 
#> , , group2
#> 
#>        class1       class2      
#> height numeric,10   numeric,10  
#> weight numeric,10   numeric,10  
#> sex    character,10 character,10

There, the names are now correct.

As shown above, R displays a dimensional list more compactly than a nested list.
Depending on the situation, this may be either desirable or undesirable.

One can print x2 less compactly without much effort by flattening it, using the cast_dim2flat() function.
We only need to see a portion of the list in detail, so let’s look at class1 from group1 in flattened form:

cast_dim2flat(x2[, 1, "group1", drop = FALSE])
#> $`['height', 'class1', 'group1']`
#>  [1] 170.4097 169.5869 170.5792 169.2318 169.3209 170.1329 170.1860 170.0604
#>  [9] 167.8982 171.1381
#> 
#> $`['weight', 'class1', 'group1']`
#>  [1] 77.48852 81.27759 81.93543 80.28457 80.44759 80.22265 79.70715 80.83743
#>  [9] 80.53305 79.62126
#> 
#> $`['sex', 'class1', 'group1']`
#>  [1] NA  NA  "M" "M" NA  "F" "M" NA  "F" NA
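The element names in the flattened output combine one dimname per dimension. As a rough base-R sketch of how such labels could be built by hand (an illustration only; cast_dim2flat() may construct them differently, and arr, grid, labels and flat are hypothetical names):

```r
# Small named dimensional list (hypothetical data).
arr <- array(
  as.list(1:4),
  dim = c(2, 2),
  dimnames = list(c("a", "b"), c("A", "B"))
)

# All dimname combinations; the first dimension varies fastest,
# which matches the order in which array elements are stored.
grid <- expand.grid(dimnames(arr), stringsAsFactors = FALSE)
labels <- sprintf("['%s', '%s']", grid[[1]], grid[[2]])

# Drop the dimensions and attach the combined labels as names.
flat <- arr
dim(flat) <- NULL
names(flat) <- labels

names(flat)
flat[["['b', 'A']"]]  # 2
```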

 

Dimensional lists can be easier to work with than hierarchical lists.
Consider, for example, printing the height of the first class of every group in a list - let’s compare how to do this in a nested list vs a dimensional list.

With a nested list, doing this takes a slow, messy for-loop:


for(i in seq_along(x)) {
  print(names(x)[i])
  x[[i]][[1]][["height"]] |> print() # slow for-loop, messy code
}
#> [1] "group1"
#>  [1] 170.4097 169.5869 170.5792 169.2318 169.3209 170.1329 170.1860 170.0604
#>  [9] 167.8982 171.1381
#> [1] "group2"
#>  [1] 168.8685 169.4471 169.7592 168.8014 169.7667 169.9114 170.0467 170.3272
#>  [9] 169.9100 168.5977

With a dimensional list, the very same thing can be done with sleek, vectorized code; no messy loop needed:


x2["height", 1L, ] |> print()
#> $group1
#>  [1] 170.4097 169.5869 170.5792 169.2318 169.3209 170.1329 170.1860 170.0604
#>  [9] 167.8982 171.1381
#> 
#> $group2
#>  [1] 168.8685 169.4471 169.7592 168.8014 169.7667 169.9114 170.0467 170.3272
#>  [9] 169.9100 168.5977

x2["height", 1L, , drop = FALSE] |> cast_dim2flat() # same but more informative
#> $`['height', 'class1', 'group1']`
#>  [1] 170.4097 169.5869 170.5792 169.2318 169.3209 170.1329 170.1860 170.0604
#>  [9] 167.8982 171.1381
#> 
#> $`['height', 'class1', 'group2']`
#>  [1] 168.8685 169.4471 169.7592 168.8014 169.7667 169.9114 170.0467 170.3272
#>  [9] 169.9100 168.5977

It is also easier to re-arrange dimensions - for example using aperm() - than it is to re-arrange hierarchies.
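As a small base-R illustration of that last point, aperm() permutes the dimensions of an array of type list just as it does for atomic arrays. The array below is made up for the example.

```r
# A 4 x 3 x 2 dimensional list (hypothetical data).
arr <- array(as.list(1:24), dim = c(4, 3, 2))

# Reverse the order of the dimensions in a single call.
flipped <- aperm(arr, c(3, 2, 1))

dim(flipped)  # 2 3 4

# Element [i, j, k] of arr is element [k, j, i] of flipped.
identical(arr[[4, 2, 1]], flipped[[1, 2, 4]])  # TRUE
```

Achieving the same re-ordering on a nested list would require rebuilding the list level by level.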

 

2.3 Example 2: Cast from outside to inside

In Example 1, the default arguments were used for cast_hier2dim().
One of these arguments is in2out, which defaults to TRUE.

Consider a nested list x with a depth of 3, and a dimensional list x2 with 3 dimensions, where the relationship between x and x2 can be expressed as x2 <- cast_hier2dim(x, ...).
Given this, the following can be stated about in2out:

  • If in2out = TRUE, which is the default and used in Example 1, element x[[i]][[j]][[k]] corresponds to element x2[k, j, i] (for all i, j, and k).
  • If in2out = FALSE, element x[[i]][[j]][[k]] corresponds to element x2[i, j, k] (for all i, j, and k).

The default of in2out = TRUE was chosen, because elements in subsequent rows are close to each other, while elements in subsequent layers (third dimension) are generally not close to each other, and the default of in2out = TRUE attempts to retain that behaviour.
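One way to read "close to each other" is in terms of R's column-major storage order; that reading is an assumption made here, not something the package states. Under that assumption, the base-R sketch below shows where elements of a c(3, 2, 2) array sit in the underlying vector (pos is a hypothetical helper array).

```r
# Each cell holds its own position in the underlying vector.
dims <- c(3, 2, 2)
pos <- array(seq_len(prod(dims)), dim = dims)

pos[1, 1, 1]  # 1
pos[2, 1, 1]  # 2: the next row is the very next position
pos[1, 2, 1]  # 4: the next column is one column length (3) further
pos[1, 1, 2]  # 7: the next layer is a whole 3 x 2 slice further
```

Mapping the innermost list level to rows therefore keeps sibling elements next to each other in the result.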

For this example, the same list will be used as in Example 1:

x <- list(
  group1 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    ),
    class2 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  ),
  group2 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    ),
    class2 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  )
)

Let’s once again cast this list to a dimensional list, but this time use in2out = FALSE:

hier2dim(x, in2out = FALSE) # check once again the dimensions
#>       
#> 2 2 3
x2 <- cast_hier2dim(x, in2out = FALSE, direction.names = 1)
print(x2)
#> , , height
#> 
#>        class1     class2    
#> group1 numeric,10 numeric,10
#> group2 numeric,10 numeric,10
#> 
#> , , weight
#> 
#>        class1     class2    
#> group1 numeric,10 numeric,10
#> group2 numeric,10 numeric,10
#> 
#> , , sex
#> 
#>        class1       class2      
#> group1 character,10 character,10
#> group2 character,10 character,10

x2 is the cast list. Since in2out = FALSE, element x[[i]][[j]][[k]] corresponds to element x2[i, j, k] (for all i, j, and k).
The argument direction.names = 1 was specified, which intelligently tries to deduce good dimnames for the result. So we don’t have to set the dimension names via hiernames2dimnames().

One can print x2 less compactly without much effort by flattening it, again using the cast_dim2flat() function.
We only need to see a portion of the list in detail, so let’s look at class1 from group1 in flattened form:

cast_dim2flat(x2["group1", 1, , drop = FALSE])
#> $`['group1', 'class1', 'height']`
#>  [1] 170.2561 169.2926 169.0605 170.5066 169.0408 171.3947 169.8313 169.5866
#>  [9] 169.7594 170.9919
#> 
#> $`['group1', 'class1', 'weight']`
#>  [1] 80.87561 79.75086 80.12591 81.15372 79.85092 80.61488 80.49711 80.10148
#>  [9] 82.60428 79.00926
#> 
#> $`['group1', 'class1', 'sex']`
#>  [1] "F" NA  NA  NA  "F" "M" "M" "F" "M" NA

 

2.4 Example 3: Padding

For Example 3, we take the same list as before, but remove x$group1$class2:

x <- list(
  group1 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  ),
  group2 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    ),
    class2 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  )
)

Let’s first use hier2dim() to check what dimensions it will get when cast:

hier2dim(x)
#>         padding         
#>       3       2       2

The dimensions are the same as in Example 1: c(3, 2, 2).
But notice the names of the output are different: the second element has the name “padding”; this indicates that some columns won’t have enough elements to completely fill the column, and so additional elements will be added as padding.

So let’s cast this list as dimensional:

x2 <- cast_hier2dim(x, direction.names = 1)
print(x2)
#> , , group1
#> 
#>        class1       class2
#> height numeric,10   NULL  
#> weight numeric,10   NULL  
#> sex    character,10 NULL  
#> 
#> , , group2
#> 
#>        class1       class2      
#> height numeric,10   numeric,10  
#> weight numeric,10   numeric,10  
#> sex    character,10 character,10

Subset x2[, 2, 1] is filled with NULL; this is the place where x$group1$class2 was in Example 1, but since that element no longer exists, the gap has to be filled with something.
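The idea of padding can be sketched in base R with a two-level ragged list. This is a hand-rolled illustration under simplifying assumptions (only two levels, NULL as the fill value); it is not the package's algorithm, and the names x_small, n_inner and out are hypothetical.

```r
# Ragged list: g1 has one element, g2 has two (hypothetical data).
x_small <- list(g1 = list(a = 1), g2 = list(a = 2, b = 3))

# The longest inner list determines the number of rows.
n_inner <- max(lengths(x_small))

# vector("list", n) is pre-filled with NULL, which acts as the padding.
out <- array(
  vector("list", n_inner * length(x_small)),
  dim = c(n_inner, length(x_small))
)

# Copy over the elements that exist; missing ones stay NULL.
for (i in seq_along(x_small)) {
  for (j in seq_along(x_small[[i]])) {
    out[j, i] <- list(x_small[[i]][[j]])
  }
}

is.null(out[[2, 1]])  # TRUE: g1 has no second element, so this cell is padding
out[[2, 2]]           # 3
```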

Sometimes, a different value than NULL is desired for padding.
So let’s replace the padding value with something really obvious, using the padding argument:

x2 <- cast_hier2dim(x, padding = list(~ "this is padding!"), direction.names = 1)
print(x2)
#> , , group1
#> 
#>        class1       class2             
#> height numeric,10   ~"this is padding!"
#> weight numeric,10   ~"this is padding!"
#> sex    character,10 ~"this is padding!"
#> 
#> , , group2
#> 
#>        class1       class2      
#> height numeric,10   numeric,10  
#> weight numeric,10   numeric,10  
#> sex    character,10 character,10

Once again, one can print or present x2 less compactly by flattening it:

cast_dim2flat(x2)
#> $`['height', 'class1', 'group1']`
#>  [1] 169.7480 168.8050 170.6286 170.2953 169.2970 170.7642 171.8230 170.6736
#>  [9] 170.0244 170.2502
#> 
#> $`['weight', 'class1', 'group1']`
#>  [1] 81.45773 79.16637 81.09699 78.98942 80.71325 80.60078 79.27912 79.58807
#>  [9] 79.41417 80.68828
#> 
#> $`['sex', 'class1', 'group1']`
#>  [1] NA  NA  "F" "F" "M" NA  "F" "F" "F" "F"
#> 
#> $`['height', 'class2', 'group1']`
#> ~"this is padding!"
#> 
#> $`['weight', 'class2', 'group1']`
#> ~"this is padding!"
#> 
#> $`['sex', 'class2', 'group1']`
#> ~"this is padding!"
#> 
#> $`['height', 'class1', 'group2']`
#>  [1] 170.7896 170.2148 169.1561 168.2601 169.0330 171.6365 170.9374 171.9694
#>  [9] 170.3675 170.2503
#> 
#> $`['weight', 'class1', 'group2']`
#>  [1] 82.65918 78.77814 81.60174 81.19392 78.87513 81.32251 80.30454 79.56475
#>  [9] 81.05508 77.99748
#> 
#> $`['sex', 'class1', 'group2']`
#>  [1] NA  "M" "M" "F" NA  "M" NA  NA  "M" NA 
#> 
#> $`['height', 'class2', 'group2']`
#>  [1] 169.7085 170.5905 168.6865 169.7705 169.1561 171.1776 169.5868 169.3961
#>  [9] 169.7559 170.6221
#> 
#> $`['weight', 'class2', 'group2']`
#>  [1] 80.27604 79.86020 80.11658 79.98731 80.38130 80.07964 80.25593 81.02540
#>  [9] 79.72967 80.12346
#> 
#> $`['sex', 'class2', 'group2']`
#>  [1] NA  "F" NA  NA  "M" NA  "M" "M" "M" "M"

 

2.5 Example 4: Comparing in2out with padding

In this example, the same nested list as in the previous example is used to demonstrate the difference between in2out = TRUE (which is the default) and in2out = FALSE.

Consider first the original list again:


x <- list(
  group1 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  ),
  group2 = list(
    class1 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    ),
    class2 = list(
      height = rnorm(10, 170),
      weight = rnorm(10, 80),
      sex = sample(c("M", "F", NA), 10, TRUE)
    )
  )
)

First, the list is cast as dimensional using the default of in2out = TRUE, with proper names assigned.
Then, the list is cast as dimensional using in2out = FALSE, again with proper names assigned.
For the sake of this example, we will set the dimnames manually via hiernames2dimnames().

x2 <- cast_hier2dim(x)
dimnames(x2) <- hiernames2dimnames(x)
print(x2)
#> , , group1
#> 
#>        class1       class2
#> height numeric,10   NULL  
#> weight numeric,10   NULL  
#> sex    character,10 NULL  
#> 
#> , , group2
#> 
#>        class1       class2      
#> height numeric,10   numeric,10  
#> weight numeric,10   numeric,10  
#> sex    character,10 character,10

x2 <- cast_hier2dim(x, in2out = FALSE)
dimnames(x2) <- hiernames2dimnames(x, in2out = FALSE)
print(x2)
#> , , height
#> 
#>        class1     class2    
#> group1 numeric,10 NULL      
#> group2 numeric,10 numeric,10
#> 
#> , , weight
#> 
#>        class1     class2    
#> group1 numeric,10 NULL      
#> group2 numeric,10 numeric,10
#> 
#> , , sex
#> 
#>        class1       class2      
#> group1 character,10 NULL        
#> group2 character,10 character,10

 

3 Cast Dimensional List to Hierarchical list

‘broadcast’ provides the cast_dim2hier() function:
cast_dim2hier() takes a dimensional list (i.e. an array of type list), and casts it to a nested list.

Consider the following recursive array as an example:


x <- array(c(as.list(1:11), ~hello, as.list(month.abb)), c(4:2))
dimnames(x) <- list(
  letters[1:4],
  LETTERS[1:3],
  c("group1", "group2")
)
print(x)
#> , , group1
#> 
#>   A B C     
#> a 1 5 9     
#> b 2 6 10    
#> c 3 7 11    
#> d 4 8 ~hello
#> 
#> , , group2
#> 
#>   A     B     C    
#> a "Jan" "May" "Sep"
#> b "Feb" "Jun" "Oct"
#> c "Mar" "Jul" "Nov"
#> d "Apr" "Aug" "Dec"

Like cast_hier2dim(), cast_dim2hier() also has the in2out argument, which (again) defaults to TRUE.
Let’s cast the above dimensional list to a nested list, and compare the results when using in2out = TRUE (shown first) versus in2out = FALSE (shown second):


x2 <- cast_dim2hier(
  x, distr.names = TRUE
)
lobstr::tree(x2)
#> <list>
#> ├─group1: <list>
#> │ ├─A: <list>
#> │ │ ├─a: 1
#> │ │ ├─b: 2
#> │ │ ├─c: 3
#> │ │ └─d: 4
#> │ ├─B: <list>
#> │ │ ├─a: 5
#> │ │ ├─b: 6
#> │ │ ├─c: 7
#> │ │ └─d: 8
#> │ └─C: <list>
#> │   ├─a: 9
#> │   ├─b: 10
#> │   ├─c: 11
#> │   └─d: S3<formula> ~hello
#> └─group2: <list>
#>   ├─A: <list>
#>   │ ├─a: "Jan"
#>   │ ├─b: "Feb"
#>   │ ├─c: "Mar"
#>   │ └─d: "Apr"
#>   ├─B: <list>
#>   │ ├─a: "May"
#>   │ ├─b: "Jun"
#>   │ ├─c: "Jul"
#>   │ └─d: "Aug"
#>   └─C: <list>
#>     ├─a: "Sep"
#>     ├─b: "Oct"
#>     ├─c: "Nov"
#>     └─d: "Dec"

x2 <- cast_dim2hier(
  x, in2out = FALSE, distr.names = TRUE
)
lobstr::tree(x2)
#> <list>
#> ├─a: <list>
#> │ ├─A: <list>
#> │ │ ├─group1: 1
#> │ │ └─group2: "Jan"
#> │ ├─B: <list>
#> │ │ ├─group1: 5
#> │ │ └─group2: "May"
#> │ └─C: <list>
#> │   ├─group1: 9
#> │   └─group2: "Sep"
#> ├─b: <list>
#> │ ├─A: <list>
#> │ │ ├─group1: 2
#> │ │ └─group2: "Feb"
#> │ ├─B: <list>
#> │ │ ├─group1: 6
#> │ │ └─group2: "Jun"
#> │ └─C: <list>
#> │   ├─group1: 10
#> │   └─group2: "Oct"
#> ├─c: <list>
#> │ ├─A: <list>
#> │ │ ├─group1: 3
#> │ │ └─group2: "Mar"
#> │ ├─B: <list>
#> │ │ ├─group1: 7
#> │ │ └─group2: "Jul"
#> │ └─C: <list>
#> │   ├─group1: 11
#> │   └─group2: "Nov"
#> └─d: <list>
#>   ├─A: <list>
#>   │ ├─group1: 4
#>   │ └─group2: "Apr"
#>   ├─B: <list>
#>   │ ├─group1: 8
#>   │ └─group2: "Aug"
#>   └─C: <list>
#>     ├─group1: S3<formula> ~hello
#>     └─group2: "Dec"

The added distr.names = TRUE argument will distribute the dimnames in a logical way over the nested elements.
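For intuition, the two-dimensional case can be imitated in base R. The sketch below turns a small list matrix into a nested list with the columns as the outer level and the rows as the inner level, which is analogous to the in2out = TRUE result above; it is a simplified illustration, not the implementation of cast_dim2hier(), and m and nested are hypothetical names.

```r
# A 3 x 2 list matrix with dimnames (hypothetical data).
m <- array(
  as.list(1:6),
  dim = c(3, 2),
  dimnames = list(c("a", "b", "c"), c("A", "B"))
)

# Outer level: one entry per column. Inner level: the rows of that column.
# Subsetting a single column drops the dimensions and keeps the row names.
nested <- lapply(colnames(m), function(j) m[, j])
names(nested) <- colnames(m)

names(nested$A)  # "a" "b" "c"
nested$B$c       # 6
```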

 

4 Simple Data Wrangling Example: Turning a list inside out

The cast functions can be used to turn a list inside out.

Let’s start with the following list:

x <- list(
  group1 = list(
    class1 = list(
      height = rnorm(5, 170) |> as.integer(),
      weight = rnorm(5, 80)  |> as.integer(),
      sex = sample(c("M", "F", NA), 5, TRUE)
    ),
    class2 = list(
      height = rnorm(5, 170)  |> as.integer(),
      weight = rnorm(5, 80) |> as.integer(),
      sex = sample(c("M", "F", NA), 5, TRUE)
    )
  ),
  group2 = list(
    class1 = list(
      height = rnorm(5, 170) |> as.integer(),
      weight = rnorm(5, 80) |> as.integer(),
      sex = sample(c("M", "F", NA), 5, TRUE)
    ),
    class2 = list(
      height = rnorm(5, 170) |> as.integer(),
      weight = rnorm(5, 80) |> as.integer(),
      sex = sample(c("M", "F", NA), 5, TRUE)
    )
  )
)

Turning this list inside out means manipulating this list such that height, weight and sex become the surface-level elements and the groups become the deepest levels.

This can be done quickly and easily with ‘broadcast’, by casting the nested list to dimensional with in2out = TRUE, and then casting the dimensional list back to nested using in2out = FALSE.

 

First, cast nested list to dimensional list:

x2 <- cast_hier2dim(x, direction.names = 1)
print(x2)
#> , , group1
#> 
#>        class1      class2     
#> height integer,5   integer,5  
#> weight integer,5   integer,5  
#> sex    character,5 character,5
#> 
#> , , group2
#> 
#>        class1      class2     
#> height integer,5   integer,5  
#> weight integer,5   integer,5  
#> sex    character,5 character,5

The default value for in2out is TRUE, so we don’t have to specify it here.
The direction.names argument in cast_hier2dim() will construct dimensional names for you, so you don’t have to set them manually using hiernames2dimnames().

Second, cast the newly created dimensional list back to a nested list, but this time use in2out = FALSE:



x3 <- cast_dim2hier(x2, in2out = FALSE, distr.names = TRUE)
lobstr::tree(x3)
#> <list>
#> ├─height: <list>
#> │ ├─class1: <list>
#> │ │ ├─group1<int [5]>: 169, 168, 169, 169, 169
#> │ │ └─group2<int [5]>: 169, 170, 170, 170, 171
#> │ └─class2: <list>
#> │   ├─group1<int [5]>: 169, 169, 169, 169, 169
#> │   └─group2<int [5]>: 169, 168, 169, 169, 169
#> ├─weight: <list>
#> │ ├─class1: <list>
#> │ │ ├─group1<int [5]>: 79, 80, 78, 81, 79
#> │ │ └─group2<int [5]>: 81, 79, 80, 81, 79
#> │ └─class2: <list>
#> │   ├─group1<int [5]>: 81, 78, 81, 80, 79
#> │   └─group2<int [5]>: 81, 79, 79, 79, 78
#> └─sex: <list>
#>   ├─class1: <list>
#>   │ ├─group1<chr [5]>: "F", "NA", "M", "NA", "F"
#>   │ └─group2<chr [5]>: "NA", "NA", "F", "M", "NA"
#>   └─class2: <list>
#>     ├─group1<chr [5]>: "NA", "M", "F", "F", "M"
#>     └─group2<chr [5]>: "M", "NA", "F", "F", "M"

We use lobstr::tree() to print the results more compactly;
as shown, the original nested list has now successfully been turned inside-out, with great ease.
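For comparison, here is a base-R sketch of the same kind of inversion on a much smaller, two-level list, written with nested lapply() calls. It assumes every group has the same inner names, and all object names in it are hypothetical. With more levels or with ragged lists, this manual approach quickly becomes harder to write and read.

```r
# Two groups, each holding the same two variables (hypothetical data).
x_small <- list(
  group1 = list(height = 170, weight = 80),
  group2 = list(height = 165, weight = 75)
)

# Names of the inner level; assumed identical across all groups.
inner <- names(x_small[[1]])

# For each inner name, collect that element from every group.
inside_out <- lapply(
  stats::setNames(inner, inner),
  function(k) lapply(x_small, `[[`, k)
)

names(inside_out)          # "height" "weight"
inside_out$weight$group2   # 75
```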