One feature I’ve long wanted to add to the package is conditional formatting. By conditional formatting, I mean formatting different parts of the plot differently: for example, changing the shape of only the latent variable nodes, or changing the color of covariance edges separately from latent variable or regression edges. I’m pleased to introduce this functionality in version 0.8.0!
Conditional formatting means applying different node or edge options to different groups of nodes or edges, respectively. In the context of structural equation models there are some natural groupings of nodes and edges. For nodes it is often helpful to distinguish latent from observed variables, and for edges it is often helpful to distinguish regression, latent, and covariance relationships.
The way that I have implemented conditional formatting in lavaanPlot involves a new helper function called formatting. In older versions of the package, for both the original lavaanPlot function and the newer lavaanPlot2, formatting options were specified through the node_options and edge_options arguments (and through the graph_options argument too, but I won’t discuss that here because it doesn’t lend itself to conditional formatting). You would specify a named list of attribute-value pairs for each of these arguments, and those choices applied to the whole graph. That is no longer the only option.
Now, with the formatting helper function, you can create sets of node and edge attributes for specific parts of the graph. You can do this for the natural groupings of nodes and edges mentioned above, and you can also do it for custom groupings of edges.
formatting function
The formatting function works as follows. You supply one list of formatting attributes for each portion of the graph you want to style, and you pass the resulting object to the node_options or edge_options argument of lavaanPlot2.
There are a few main scenarios to think about.
Here’s a good example latent variable model:
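library(lavaan)
library(lavaanPlot)

# Three-factor CFA on the Holzinger-Swineford data that ships with lavaan
HS.model <- ' visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
'
fit2 <- cfa(HS.model, data = HolzingerSwineford1939)
summary(fit2)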
#> lavaan 0.6.15 ended normally after 35 iterations
#>
#>   Estimator                                         ML
#>   Optimization method                           NLMINB
#>   Number of model parameters                        21
#>
#>   Number of observations                           301
#>
#> Model Test User Model:
#>
#>   Test statistic                                85.306
#>   Degrees of freedom                                24
#>   P-value (Chi-square)                           0.000
#>
#> Parameter Estimates:
#>
#>   Standard errors                             Standard
#>   Information                                 Expected
#>   Information saturated (h1) model          Structured
#>
#> Latent Variables:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   visual =~
#>     x1                1.000
#>     x2                0.554    0.100    5.554    0.000
#>     x3                0.729    0.109    6.685    0.000
#>   textual =~
#>     x4                1.000
#>     x5                1.113    0.065   17.014    0.000
#>     x6                0.926    0.055   16.703    0.000
#>   speed =~
#>     x7                1.000
#>     x8                1.180    0.165    7.152    0.000
#>     x9                1.082    0.151    7.155    0.000
#>
#> Covariances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>   visual ~~
#>     textual           0.408    0.074    5.552    0.000
#>     speed             0.262    0.056    4.660    0.000
#>   textual ~~
#>     speed             0.173    0.049    3.518    0.000
#>
#> Variances:
#>                    Estimate  Std.Err  z-value  P(>|z|)
#>    .x1                0.549    0.114    4.833    0.000
#>    .x2                1.134    0.102   11.146    0.000
#>    .x3                0.844    0.091    9.317    0.000
#>    .x4                0.371    0.048    7.779    0.000
#>    .x5                0.446    0.058    7.642    0.000
#>    .x6                0.356    0.043    8.277    0.000
#>    .x7                0.799    0.081    9.823    0.000
#>    .x8                0.488    0.074    6.573    0.000
#>    .x9                0.566    0.071    8.003    0.000
#>     visual            0.809    0.145    5.564    0.000
#>     textual           0.979    0.112    8.737    0.000
#>     speed             0.384    0.086    4.451    0.000
# Display labels for the three latent variables
labels2 <- c(visual = "Visual Ability", textual = "Textual Ability", speed = "Speed Ability")
And here’s the old way of doing things without conditional formatting:
lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = list(fontname = "Helvetica"),
            edge_options = list(color = "grey"),
            stars = c("latent"),
            coef_labels = TRUE)
Now here’s an example where we separately customize the formatting of the latent and observed variable nodes. The formatting function creates our differentiated formatting: we set type = "node" to say that we’re doing node formatting, and it then takes our formatting lists in order, latent variables first, followed by observed variables.
# First list styles latent variable nodes, second list styles observed variable nodes
n_opts <- formatting(list(shape = "polygon", sides = "6", color = "orange"),
                     list(shape = "polygon", sides = "8", color = "blue"),
                     type = "node")
lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = n_opts,
            edge_options = list(color = "grey"),
            stars = c("latent"),
            coef_labels = TRUE)
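Here’s an example that keeps the node formatting and adds differentiated edge formatting for the regression, latent, and covariance groups, with the lists passed in that order. (This model has no regression paths, so the first list has no visible effect here.)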
# Lists style regression, latent, and covariance edges, in that order
e_opts <- formatting(list(color = "orange"),
                     list(color = "red", penwidth = 6),
                     list(color = "blue"),
                     type = "edge")
lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = n_opts,
            edge_options = e_opts,
            stars = c("latent"),
            coef_labels = TRUE)
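To make the positional convention in the two formatting calls above concrete, here is a minimal base R sketch of the idea. It is not lavaanPlot’s implementation: sketch_formatting and the group names it assigns are hypothetical, chosen only to show how unnamed lists could be matched to groups by position (latent then observed for nodes; regression, latent, covariance for edges).

```r
# Hypothetical helper, for illustration only; NOT lavaanPlot's formatting().
# It names each positional attribute list after the group it is meant for.
sketch_formatting <- function(..., type = c("node", "edge")) {
  type <- match.arg(type)
  opts <- list(...)
  # Assumed group names; the order mirrors the one described in the post
  groups <- switch(type,
    node = c("latent", "observed"),
    edge = c("regression", "latent", "covariance")
  )
  # One attribute list is expected per group
  stopifnot(length(opts) == length(groups))
  names(opts) <- groups
  opts
}

n_sketch <- sketch_formatting(
  list(shape = "polygon", sides = "6", color = "orange"),
  list(shape = "polygon", sides = "8", color = "blue"),
  type = "node"
)
str(n_sketch)
```

The practical consequence is that the order of the lists matters: swapping the first two lists in the node call would swap the latent and observed styles.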
And finally, we can add another layer of edge formatting on top of the standard edge groupings used above, applied to edges that we select ourselves by giving them parameter labels in the model specification. To add a label, pre-multiply the variable by the label text in the model syntax, as in A*x1.
HS.model <- ' visual =~ A*x1 + x2 + x3
textual =~ x4 + x5 + B*x6
speed =~ x7 + x8 + x9
'
fit2 <- cfa(HS.model, data=HolzingerSwineford1939)
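Before styling the labelled edges, it can help to confirm that the labels ended up on the parameters you intended. One way to check, assuming the lavaan package is installed and the model is refit as above, is to look at the label column of lavaan’s parameter table. The object names pt and labelled are only illustrative.

```r
library(lavaan)

HS.model <- ' visual =~ A*x1 + x2 + x3
textual =~ x4 + x5 + B*x6
speed =~ x7 + x8 + x9
'
fit2 <- cfa(HS.model, data = HolzingerSwineford1939)

# parTable() returns the model's parameter table; labelled parameters
# have a non-empty entry in the label column
pt <- parTable(fit2)
labelled <- pt[pt$label != "", c("lhs", "op", "rhs", "label")]
print(labelled)
```

The rows that come back are the edges that a custom group named after each label would be expected to target.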
Then we can specify a set of options for each custom parameter label, naming the labels in the groups argument, and pass both sets of edge options (e_opts and c_opts) together as a list.
# One list per custom label, matched to the labels given in groups
c_opts <- formatting(list(color = "yellow", penwidth = 8),
                     list(color = "blue", penwidth = 10),
                     type = "custom", groups = c("A", "B"))
lavaanPlot2(fit2, include = "covs", labels = labels2,
            graph_options = list(label = "my first graph with significance stars"),
            node_options = n_opts,
            edge_options = list(e_opts, c_opts),
            stars = c("latent"),
            coef_labels = TRUE)