Conditional Formatting

Alex Lishinski

2024-01-29

As a part of the new improvements to the package, a feature that I’ve long wanted to add was conditional formatting. By conditional formatting, I mean formatting different parts of the plot differently, like perhaps changing the shape of only latent variable nodes, or changing the color of covariance edges separately from latent variable or regression edges. Well, I’m pleased to introduce this functionality into the package for version 0.8.0!!

Conditional formatting in lavaanPlot

Conditional formatting means applying different node or edge options to different groups of nodes or edges, respectively. In the context of structural equation models there are some natural groupings of nodes and edges. For nodes it is often helpful to distinguish latent from observed variables, and for edges it is often helpful to distinguish regression, latent, and covariance relationships.

The way that I have implemented conditional formatting in lavaanPlot involves a new helper function called formatting. In the old version of the package, for both the original lavaanPlot and the new lavaanPlot2, formatting options were specified through the node_options and edge_options arguments (through the graph_options argument too, but I won’t discuss that here because it doesn’t lend itself to conditional formatting). You would specify a named list of attribute-value pairs for these arguments to give your formatting choices, and these would apply to the whole graph. That is no longer the only option.

Now with the formatting helper function you can create sets of node and edge attributes for specific parts of the graph. You can do this for the natural groupings of nodes and edges mentioned above, and you can also do it for custom groupings of edges.

The formatting function

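The formatting function works as follows. You supply one list of formatting attributes for each portion of the graph, use the type argument to say whether these are node, edge, or custom formats, and pass the resulting list to lavaanPlot2 through node_options or edge_options.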

There are a few main scenarios to think about.
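As a quick orientation, the three call shapes used in the examples below can be sketched side by side. This assumes lavaanPlot 0.8.0 or later is installed. The attribute values are arbitrary placeholders, and the labels "A" and "B" are only illustrative.

```r
library(lavaanPlot)

# Nodes: one list per group, in the order latent, then observed.
node_fmt <- formatting(
  list(shape = "ellipse"),
  list(shape = "box"),
  type = "node"
)

# Edges: one list per group, in the order regression, latent, covariance.
edge_fmt <- formatting(
  list(color = "black"),
  list(color = "grey"),
  list(color = "blue"),
  type = "edge"
)

# Custom edges: one list per parameter label named in `groups`.
custom_fmt <- formatting(
  list(color = "red"),
  list(color = "green"),
  type = "custom", groups = c("A", "B")
)

# Inspect what one of the helpers returned.
str(node_fmt)
```

Each object is then passed to lavaanPlot2 through node_options or edge_options, as the examples below do.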

Here’s a good example latent variable model:

library(lavaan)
library(lavaanPlot)

HS.model <- ' visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
'

fit2 <- cfa(HS.model, data=HolzingerSwineford1939)
summary(fit2)
#> lavaan 0.6.15 ended normally after 35 iterations
#> 
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        21
#> 
#>   Number of observations                           301
#> 
#> Model Test User Model:
#>                                                       
#>   Test statistic                                85.306
#>   Degrees of freedom                                24
#>   P-value (Chi-square)                           0.000
#> 
#> Parameter Estimates:
#> 
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#> 
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   visual =~                                           
#>     x1                1.000                           
#>     x2                0.554    0.100    5.554    0.000
#>     x3                0.729    0.109    6.685    0.000
#>   textual =~                                          
#>     x4                1.000                           
#>     x5                1.113    0.065   17.014    0.000
#>     x6                0.926    0.055   16.703    0.000
#>   speed =~                                            
#>     x7                1.000                           
#>     x8                1.180    0.165    7.152    0.000
#>     x9                1.082    0.151    7.155    0.000
#> 
#> Covariances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   visual ~~                                           
#>     textual           0.408    0.074    5.552    0.000
#>     speed             0.262    0.056    4.660    0.000
#>   textual ~~                                          
#>     speed             0.173    0.049    3.518    0.000
#> 
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .x1                0.549    0.114    4.833    0.000
#>    .x2                1.134    0.102   11.146    0.000
#>    .x3                0.844    0.091    9.317    0.000
#>    .x4                0.371    0.048    7.779    0.000
#>    .x5                0.446    0.058    7.642    0.000
#>    .x6                0.356    0.043    8.277    0.000
#>    .x7                0.799    0.081    9.823    0.000
#>    .x8                0.488    0.074    6.573    0.000
#>    .x9                0.566    0.071    8.003    0.000
#>     visual            0.809    0.145    5.564    0.000
#>     textual           0.979    0.112    8.737    0.000
#>     speed             0.384    0.086    4.451    0.000

labels2 <- c(visual = "Visual Ability", textual = "Textual Ability", speed = "Speed Ability")
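To see which variables and paths of this model fall into each of the natural groupings, the fitted object can be queried with lavaan directly. The sketch below is only for orientation: it uses lavaan's lavNames and parTable to show how lavaan classifies the model, and it is not meant as a description of lavaanPlot's internals. The object names are illustrative.

```r
library(lavaan)

HS.model <- ' visual  =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed   =~ x7 + x8 + x9
'
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Node groupings: latent versus observed variables.
latent_nodes   <- lavNames(fit, type = "lv")
observed_nodes <- lavNames(fit, type = "ov")

# Edge groupings, identified by the operator in the parameter table:
# "=~" for latent (loading) paths, "~" for regressions,
# "~~" between two different variables for covariances.
pt <- parTable(fit)
cols <- c("lhs", "op", "rhs")
latent_edges  <- pt[pt$op == "=~", cols]
regress_edges <- pt[pt$op == "~", cols]
cov_edges     <- pt[pt$op == "~~" & pt$lhs != pt$rhs, cols]

print(latent_nodes)
print(observed_nodes)
print(cov_edges)
```

Because this is a pure CFA, regress_edges comes back empty; only the latent and covariance groups have edges to format.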

And here’s the old way of doing things without conditional formatting:

lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = list( fontname = "Helvetica"),
            edge_options = list(color = "grey"),
            stars = c("latent"),
            coef_labels = TRUE)

Conditional formatting for nodes

Now here’s an example where we’re separately customizing the formatting of the latent and observed variable nodes. We set type = "node" so that the formatting function knows we’re doing node formatting, and it then expects our formatting lists in a fixed order: latent variables first, observed variables second.

n_opts <- formatting(
  list(shape = "polygon", sides = "6", color = "orange"),  # latent variables
  list(shape = "polygon", sides = "8", color = "blue"),    # observed variables
  type = "node"
)

lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = n_opts,
            edge_options = list(color = "grey"),
            stars = c("latent"),
            coef_labels = TRUE)

Conditional formatting for edges

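Here’s an example that keeps the node formatting and adds differentiated edge formatting. With type = "edge", the lists are given in the order regression, latent, covariance. This model has no regression paths, so the first list has nothing to apply to here.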

e_opts <- formatting(
  list(color = "orange"),             # regression edges
  list(color = "red", penwidth = 6),  # latent edges
  list(color = "blue"),               # covariance edges
  type = "edge"
)

lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = n_opts,
            edge_options = e_opts,
            stars = c("latent"),
            coef_labels = TRUE)

Conditional formatting for custom sets of edges

And finally we can add another layer of edge formatting on top of the standard edge groupings used above, for edges we select ourselves, by specifying parameter labels in the model specification. To add a label, pre-multiply the variable name by a text string, as in A*x1.

HS.model <- ' visual  =~ A*x1 + x2 + x3
textual =~ x4 + x5 + B*x6
speed   =~ x7 + x8 + x9
'

fit2 <- cfa(HS.model, data=HolzingerSwineford1939)
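Before plotting, it can help to confirm that the labels were registered on the intended parameters. One way to check, sketched here with lavaan's parTable, is to list the rows of the parameter table whose label column is non-empty. The name labelled is only illustrative.

```r
library(lavaan)

HS.model <- ' visual  =~ A*x1 + x2 + x3
textual =~ x4 + x5 + B*x6
speed   =~ x7 + x8 + x9
'
fit2 <- cfa(HS.model, data = HolzingerSwineford1939)

# Keep only the parameters that carry a user-supplied label.
pt <- parTable(fit2)
labelled <- pt[pt$label != "", c("lhs", "op", "rhs", "label")]
print(labelled)
```

The rows shown are the edges that a custom group named after each label would pick out.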

Then we can specify one set of options per custom parameter label, using type = "custom" and passing the labels through the groups argument, and make a list of the two different sets of edge options.

c_opts <- formatting(
  list(color = "yellow", penwidth = 8),  # edge labelled "A"
  list(color = "blue", penwidth = 10),   # edge labelled "B"
  type = "custom", groups = c("A", "B")
)

lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = n_opts,
            edge_options = list(e_opts, c_opts),
            stars = c("latent"),
            coef_labels = TRUE)