In this example, we will show how to use lslx to conduct multi-group
factor analysis. The example uses the data set HolzingerSwineford1939
in the package lavaan. Hence, lavaan must be installed.

In the following specification, x1 - x9 are assumed to be measurements
of 3 latent factors: visual, textual, and speed.

model_mgfa <- "visual  :=> 1 * x1 + x2 + x3
               textual :=> 1 * x4 + x5 + x6
               speed   :=> 1 * x7 + x8 + x9"
The operator :=> means that the LHS latent factor is defined by the RHS
observed variables. In this model, visual is mainly measured by x1 - x3,
textual is mainly measured by x4 - x6, and speed is mainly measured by
x7 - x9. Loadings of x1, x4, and x7 are fixed at 1 for scale setting.
The above specification is valid for both groups. Details of model
syntax can be found in the section of Model Syntax via ?lslx.
lslx is written as an R6 class. Every time we conduct an analysis with
lslx, an lslx object must be initialized. The following code
initializes an lslx object named lslx_mgfa.
library(lslx)
lslx_mgfa <- lslx$new(model = model_mgfa,
                      data = lavaan::HolzingerSwineford1939,
                      group_variable = "school",
                      reference_group = "Pasteur")
An 'lslx' R6 class is initialized via 'data' argument.
Response Variables: x1 x2 x3 x4 x5 x6 x7 x8 x9
Latent Factors: visual textual speed
Groups: Grant-White Pasteur
Reference Group: Pasteur
NOTE: Because Pasteur is set as reference, coefficients in other groups actually represent increments from the reference.
Here, lslx is the object generator for lslx objects and new is the
built-in method of lslx to generate a new lslx object. The
initialization of lslx requires users to specify a model (argument
model) and a data set to be fitted (argument data). The data set must
contain all the observed variables specified in the given model.
Because a multi-group analysis is considered in this example, a
variable for group labeling (argument group_variable) must be
specified. In lslx, two types of parameterization can be used in
multi-group analysis. The first type is the same as in traditional
multi-group SEM, which treats model parameters in each group
separately. The second type sets one group as reference and treats
model parameters in other groups as increments with respect to the
reference. Under the second type of parameterization, group
heterogeneity can be efficiently explored if we treat the increments as
penalized parameters. In this example, Pasteur is set as reference
(argument reference_group). Hence, the parameters in Grant-White now
reflect differences from the reference.
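
The two parameterizations can be written compactly. The notation below is introduced only for illustration and is not taken from the lslx documentation: \theta_g denotes any model coefficient in group g, and \Delta denotes the increment of Grant-White relative to the reference group Pasteur.

```latex
% Type 1: each group has its own coefficient, estimated separately.
\[
\theta_{\text{Pasteur}}, \qquad \theta_{\text{Grant-White}}
\]
% Type 2: the non-reference group is expressed as reference plus increment.
\[
\theta_{\text{Grant-White}} = \theta_{\text{Pasteur}} + \Delta_{\text{Grant-White}}
\]
```

Under this reading, an increment of zero means that the coefficient is identical in both groups, which is why shrinking increments toward zero is a natural way to explore heterogeneity.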
After an lslx object is initialized, the heterogeneity of a multi-group
model can be quickly respecified by the $free_heterogeneity(),
$fix_heterogeneity(), and $penalize_heterogeneity() methods. The
following code sets the loadings x2<-visual, x3<-visual, x5<-textual,
x6<-textual, x8<-speed, and x9<-speed, together with the intercepts
x1<-1 through x9<-1, in Grant-White as penalized parameters. Because
Pasteur is set as reference, these parameters in Grant-White represent
differences from Pasteur.
lslx_mgfa$penalize_heterogeneity(block = c("y<-1", "y<-f"), group = "Grant-White")
The relation x1<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x2<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x3<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x4<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x5<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x6<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x7<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x8<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x9<-1 under Grant-White is set as PENALIZED with starting value = 0.
The relation x2<-visual under Grant-White is set as PENALIZED with starting value = 0.
The relation x3<-visual under Grant-White is set as PENALIZED with starting value = 0.
The relation x5<-textual under Grant-White is set as PENALIZED with starting value = 0.
The relation x6<-textual under Grant-White is set as PENALIZED with starting value = 0.
The relation x8<-speed under Grant-White is set as PENALIZED with starting value = 0.
The relation x9<-speed under Grant-White is set as PENALIZED with starting value = 0.
NOTE: Because Pasteur is set as reference, a relation under other group actually represents an increment.
NOTE: Please check whether the starting value for the increment represents a difference.
Since the homogeneity of latent factor means may not be a reasonable
assumption when examining measurement invariance, the following code
relaxes this assumption.
lslx_mgfa$free_block(block = "f<-1", group = "Grant-White")
The relation visual<-1 under Grant-White is set as FREE with starting value = 0.
The relation textual<-1 under Grant-White is set as FREE with starting value = 0.
The relation speed<-1 under Grant-White is set as FREE with starting value = 0.
NOTE: Because Pasteur is set as reference, a relation under other group actually represents an increment.
NOTE: Please check whether the starting value for the increment represents a difference.
To see more methods to modify a specified model, please check the
section of Set-Related Method via ?lslx.
After an lslx object is initialized, the method $fit_mcp() can be used
to fit the specified model to the given data with the minimax concave
penalty (MCP).
lslx_mgfa$fit_mcp()
CONGRATS: Algorithm converges under EVERY specified penalty level.
Specified Tolerance for Convergence: 0.001
Specified Maximal Number of Iterations: 100
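
For readers unfamiliar with MCP, the penalty in its standard textbook form is sketched below. That the symbols \lambda and \delta correspond to the "lambda" and "delta" reported by lslx is an assumption made here for illustration; the package documentation should be consulted for the exact definition it uses.

```latex
% Minimax concave penalty for a single coefficient \theta,
% with penalty level \lambda > 0 and convexity level \delta > 0.
\[
\rho(\theta;\lambda,\delta) =
\begin{cases}
\lambda\lvert\theta\rvert - \dfrac{\theta^{2}}{2\delta}, & \lvert\theta\rvert \le \lambda\delta,\\[6pt]
\dfrac{\lambda^{2}\delta}{2}, & \lvert\theta\rvert > \lambda\delta.
\end{cases}
\]
```

The two branches agree at \lvert\theta\rvert = \lambda\delta, where both equal \lambda^{2}\delta/2. Beyond that point the penalty is constant, so large coefficients are not shrunk further, while small ones are pushed toward zero.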
All fitting results are stored in the fitting field of lslx_mgfa.

Unlike traditional SEM analysis, lslx fits the model to the data under
all the penalty levels considered. To summarize the fitting result, a
selector to determine an optimal penalty level must be specified.
Available selectors can be found in the section of Penalty Level
Selection via ?lslx. The following code summarizes the fitting result
under the penalty level selected by Haughton’s Bayesian information
criterion (HBIC).
lslx_mgfa$summarize(selector = "hbic")
General Information
  number of observations                                 301
  number of complete observations                        301
  number of missing patterns                            none
  number of groups                                         2
  number of responses                                      9
  number of factors                                        3
  number of free coefficients                             48
  number of penalized coefficients                        15

Numerical Conditions
  selected lambda                                      0.134
  selected delta                                       3.063
  selected step                                         none
  objective value                                      0.485
  objective gradient absolute maximum                  0.001
  objective Hessian convexity                          0.187
  number of iterations                                11.000
  loss value                                           0.430
  number of non-zero coefficients                     50.000
  degrees of freedom                                  58.000
  robust degrees of freedom                           60.646
  scaling factor                                       1.046

Fit Indices
  root mean square error of approximation (rmsea)      0.090
  comparative fit index (cfi)                          0.919
  non-normed fit index (nnfi)                          0.900
  standardized root mean of residual (srmr)            0.085

Likelihood Ratio Test
                   statistic        df   p-value
  unadjusted         129.424    58.000     0.000
  mean-adjusted      123.777    58.000     0.000

Root Mean Square Error of Approximation Test
                    estimate     lower     upper
  unadjusted           0.090     0.065     0.115
  mean-adjusted        0.089     0.063     0.114

Coefficient Test (Group = "Pasteur", Std.Error = "sandwich")

Factor Loading (reference component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  x1<-visual        fixed     1.000         -       -       -      -      -
  x2<-visual        free      0.604     0.143   4.211   0.000  0.323  0.885
  x3<-visual        free      0.789     0.157   5.027   0.000  0.481  1.096
  x4<-textual       fixed     1.000         -       -       -      -      -
  x5<-textual       free      1.120     0.067  16.599   0.000  0.988  1.252
  x6<-textual       free      0.932     0.064  14.678   0.000  0.808  1.057
  x7<-speed         fixed     1.000         -       -       -      -      -
  x8<-speed         free      1.200     0.134   8.947   0.000  0.937  1.463
  x9<-speed         free      1.040     0.208   5.005   0.000  0.633  1.448

Covariance (reference component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  textual<->visual  free      0.406     0.135   3.017   0.003  0.142  0.671
  speed<->visual    free      0.169     0.066   2.565   0.010  0.040  0.298
  speed<->textual   free      0.173     0.060   2.899   0.004  0.056  0.290

Variance (reference component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  visual<->visual   free      0.801     0.230   3.489   0.000  0.351  1.252
  textual<->textual free      0.880     0.135   6.532   0.000  0.616  1.144
  speed<->speed     free      0.305     0.083   3.684   0.000  0.143  0.467
  x1<->x1           free      0.556     0.181   3.077   0.002  0.202  0.910
  x2<->x2           free      1.269     0.172   7.370   0.000  0.931  1.606
  x3<->x3           free      0.881     0.131   6.744   0.000  0.625  1.136
  x4<->x4           free      0.446     0.070   6.328   0.000  0.308  0.584
  x5<->x5           free      0.502     0.083   6.019   0.000  0.339  0.666
  x6<->x6           free      0.263     0.058   4.518   0.000  0.149  0.377
  x7<->x7           free      0.849     0.113   7.516   0.000  0.628  1.071
  x8<->x8           free      0.516     0.094   5.469   0.000  0.331  0.701
  x9<->x9           free      0.656     0.118   5.573   0.000  0.426  0.887

Intercept (reference component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  x1<-1             free      4.914     0.095  51.569   0.000  4.727  5.101
  x2<-1             free      6.087     0.080  75.899   0.000  5.930  6.245
  x3<-1             free      2.487     0.093  26.780   0.000  2.305  2.669
  x4<-1             free      2.778     0.087  31.915   0.000  2.608  2.949
  x5<-1             free      4.035     0.103  39.171   0.000  3.833  4.237
  x6<-1             free      1.926     0.075  25.776   0.000  1.779  2.072
  x7<-1             free      4.432     0.087  51.183   0.000  4.263  4.602
  x8<-1             free      5.569     0.074  75.578   0.000  5.425  5.714
  x9<-1             free      5.409     0.070  77.099   0.000  5.272  5.547

Coefficient Test (Group = "Grant-White", Std.Error = "sandwich")

Factor Loading (increment component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  x1<-visual        fixed     0.000         -       -       -      -      -
  x2<-visual        pen       0.000         -       -       -      -      -
  x3<-visual        pen       0.000         -       -       -      -      -
  x4<-textual       fixed     0.000         -       -       -      -      -
  x5<-textual       pen       0.000         -       -       -      -      -
  x6<-textual       pen       0.000         -       -       -      -      -
  x7<-speed         fixed     0.000         -       -       -      -      -
  x8<-speed         pen       0.000         -       -       -      -      -
  x9<-speed         pen       0.000         -       -       -      -      -

Covariance (increment component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  textual<->visual  free      0.020     0.144   0.136   0.892 -0.263  0.303
  speed<->visual    free      0.144     0.105   1.363   0.173 -0.063  0.351
  speed<->textual   free      0.050     0.108   0.461   0.645 -0.163  0.263

Variance (increment component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  visual<->visual   free     -0.085     0.198  -0.427   0.669 -0.473  0.304
  textual<->textual free     -0.011     0.167  -0.063   0.950 -0.337  0.316
  speed<->speed     free      0.170     0.094   1.801   0.072 -0.015  0.355
  x1<->x1           free      0.094     0.178   0.530   0.596 -0.254  0.442
  x2<->x2           free     -0.329     0.221  -1.490   0.136 -0.761  0.104
  x3<->x3           free     -0.277     0.138  -2.000   0.045 -0.548 -0.006
  x4<->x4           free     -0.103     0.094  -1.101   0.271 -0.286  0.080
  x5<->x5           free     -0.126     0.103  -1.220   0.223 -0.327  0.076
  x6<->x6           free      0.174     0.093   1.874   0.061 -0.008  0.356
  x7<->x7           free     -0.250     0.133  -1.886   0.059 -0.510  0.010
  x8<->x8           free     -0.109     0.142  -0.769   0.442 -0.387  0.169
  x9<->x9           free     -0.126     0.142  -0.884   0.377 -0.404  0.153

Intercept (increment component)
                    type   estimate std.error z-value P(>|z|)  lower  upper
  visual<-1         free      0.050     0.132   0.377   0.706 -0.209  0.309
  textual<-1        free      0.576     0.120   4.789   0.000  0.340  0.812
  speed<-1          free     -0.072     0.089  -0.807   0.419 -0.245  0.102
  x1<-1             pen       0.000         -       -       -      -      -
  x2<-1             pen       0.000         -       -       -      -      -
  x3<-1             pen      -0.531     0.117  -4.520   0.000 -0.761 -0.301
  x4<-1             pen       0.000         -       -       -      -      -
  x5<-1             pen       0.000         -       -       -      -      -
  x6<-1             pen       0.000         -       -       -      -      -
  x7<-1             pen      -0.440     0.108  -4.065   0.000 -0.652 -0.228
  x8<-1             pen       0.000         -       -       -      -      -
  x9<-1             pen       0.000         -       -       -      -      -
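
The counts reported under General Information and Numerical Conditions can be checked by hand. The accounting below is a sketch that assumes the degrees of freedom equal the number of sample moments (means, variances, and covariances of the 9 responses in each of the 2 groups) minus the number of non-zero coefficients. This assumption reproduces the reported figures but is not quoted from the lslx documentation.

```latex
% Sample moments: per group, 9 means plus 9(9+1)/2 = 45 variances and covariances.
\[
2 \times \left( 9 + \frac{9 \times 10}{2} \right) = 2 \times 54 = 108
\]
% Non-zero coefficients: 48 free plus the 2 penalized increments
% estimated as non-zero (x3<-1 and x7<-1 in Grant-White).
\[
48 + 2 = 50
\]
% Degrees of freedom, matching the reported value of 58.
\[
108 - 50 = 58
\]
```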

In this example, we can see that all of the loadings are invariant
across the two groups. However, the intercepts of x3 and x7 do not
appear to be invariant. The $summarize() method also shows the results
of significance tests for the coefficients. In lslx, the default
standard errors are calculated based on the sandwich formula whenever
raw data is available. This formula is generally valid even when the
model is misspecified and the data are not normal. However, it may not
be valid after selecting an optimal penalty level.
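
Because Grant-White is parameterized through increments, its group-specific intercepts are not printed directly and have to be recovered by adding each increment to the reference value. The worked arithmetic below uses only estimates from the summary tables; the symbol \nu for an intercept is notation introduced here for illustration.

```latex
% Grant-White intercept = Pasteur (reference) intercept + Grant-White increment.
\[
\nu_{x3}^{\text{Grant-White}} = 2.487 + (-0.531) = 1.956
\]
\[
\nu_{x7}^{\text{Grant-White}} = 4.432 + (-0.440) = 3.992
\]
% All other intercept increments are estimated as exactly zero, e.g. for x1:
\[
\nu_{x1}^{\text{Grant-White}} = 4.914 + 0.000 = 4.914
\]
```

The same addition applies to the freely estimated increments, such as the variances and covariances reported for Grant-White.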