Introduction

The package can be used to estimate latent variable count regression models in one or multiple groups. In its simplest form, it can estimate ordinary Poisson or negative binomial regression models with manifest covariates in one group (similar to the glm()-function from the stats package or the glm.nb()-function from the MASS package). In its most complex form, it can regress a count variable on multiple manifest and latent covariates within multiple groups. Let’s see, how it works!

Simple Poisson Regression Model

As said before, the simplest case that can be estimated with lavacrag is an ordinary Poisson regression model, regressing a count outcome Y on a manifest covariate Z with \[ \begin{align*} E(Y|Z) &= \mu_Y = \exp(\beta_0 + \beta_1 \cdot Z)\\ Y &\sim \mathcal{P}(\lambda = \mu_Y) \end{align*} \] In our example dataset, we can fit this model and compare it to the output of the glm()-function from the stats package:

# Usage of main function: countreg(y ~ z, data = d, family = "poisson")
m0 <- countreg(dv ~ z11, data = example01, family = "poisson")
#> Computing starting values...done. Took: 0 s
#> Fitting the model...done. Took: 0.1 s
#> Computing standard errors...done. Took: 0 s

m1 <- glm(dv ~ z11, data = example01, family = poisson())

summary(m0)
#> 
#> 
#> --------------------- Group 1 ----------------------- 
#> 
#> Regression:
#>       Estimate       SE   Est./SE   p-value
#> 1        2.759   0.0146       189         0
#> z11     -0.138   0.0081       -17         0
#> 
#> Means:
#>       Estimate       SE   Est./SE   p-value
#> z11       1.58   0.0418      37.8         0
#> 
#> Variances:
#>       Estimate       SE   Est./SE   p-value
#> z11       1.52   0.0729      20.9         0

summary(m1)
#> 
#> Call:
#> glm(formula = dv ~ z11, family = poisson(), data = example01)
#> 
#> Coefficients:
#>              Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  2.759062   0.014636  188.51   <2e-16 ***
#> z11         -0.137692   0.008095  -17.01   <2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 2144.8  on 870  degrees of freedom
#> Residual deviance: 1844.0  on 869  degrees of freedom
#> AIC: 5588.4
#> 
#> Number of Fisher Scoring iterations: 4

Negative Binomial Regression with Latent Covariate

In the next step, we add a latent covariate to the model. That is, we use the option lv to specify a list of latent variables giving the names of the latent variables and a character vector of indicators measuring the latent variable. We can use the name of the latent variable within the forml option. In addition, we change family to be “nbinom” in oder to estimate a negative binomial regression, that is, adding a dispersion parameter to the model:

m2 <- countreg(dv ~ eta1,
    lv = list(eta1 = c("z41", "z42", "z43")),
    data = example01,
    family = "nbinom"
)
#> Computing starting values...done. Took: 0.1 s
#> Fitting the model...done. Took: 0.9 s
#> Computing standard errors...done. Took: 0.5 s

summary(m2)
#> 
#> 
#> --------------------- Group 1 ----------------------- 
#> 
#> Regression:
#>        Estimate       SE   Est./SE    p-value
#> 1        2.6862   0.0238    113.05   0.00e+00
#> eta1    -0.0834   0.0119     -7.03   2.06e-12
#> 
#>              Estimate      SE   Est./SE   p-value
#> Dispersion       9.75   0.871      11.2         0
#> 
#> Means:
#>        Estimate       SE   Est./SE   p-value
#> eta1       1.62   0.0605      26.8         0
#> 
#> Variances:
#>        Estimate      SE   Est./SE   p-value
#> eta1       1.95   0.166      11.8         0
#> 
#> Measurement Model:
#> Intercepts:
#>           Estimate      SE   Est./SE    p-value
#> z41 ~ 1      0.000      NA        NA         NA
#> z42 ~ 1     -0.114   0.115    -0.987   0.323470
#> z43 ~ 1     -0.435   0.122    -3.581   0.000343
#> 
#> Loadings:
#>               Estimate       SE   Est./SE   p-value
#> eta1 =~ z41       1.00       NA        NA        NA
#> eta1 =~ z42       1.29   0.0571      22.6         0
#> eta1 =~ z43       1.35   0.0617      21.8         0
#> 
#> Residual Variances:
#>              Estimate       SE   Est./SE   p-value
#> z41 ~~ z41       1.46   0.0929     15.69         0
#> z42 ~~ z42       1.45   0.1231     11.80         0
#> z43 ~~ z43       1.27   0.1376      9.26         0

Multi-group Poisson Regression with Latent and Manifest Covariates

In this final model, we use a combination of manifest and latent covariates in the forml option, that is, one of the covariates is defined in the lv and the other is observed in the dataset. In addition, we specify a multi-group structural equation model using the group option.

m3 <- countreg(dv ~ eta1 + z11,
    lv = list(eta1 = c("z41", "z42", "z43")),
    group = "treat",
    data = example01,
    family = "poisson"
)
#> Computing starting values...done. Took: 1 s
#> Fitting the model...done. Took: 4.4 s
#> Computing standard errors...done. Took: 2.9 s

summary(m3)
#> 
#> 
#> --------------------- Group 1 ----------------------- 
#> 
#> Regression:
#>        Estimate       SE   Est./SE    p-value
#> 1         2.783   0.0276    100.74   0.00e+00
#> z11      -0.127   0.0126    -10.07   0.00e+00
#> eta1     -0.102   0.0155     -6.55   5.91e-11
#> 
#> Means:
#>        Estimate       SE   Est./SE   p-value
#> eta1       1.58   0.0794      20.0         0
#> z11        1.59   0.0627      25.4         0
#> 
#> Variances:
#>        Estimate      SE   Est./SE   p-value
#> eta1       1.88   0.196      9.59         0
#> z11        1.69   0.115     14.61         0
#> 
#> Covariances:
#>               Estimate       SE   Est./SE    p-value
#> eta1 ~~ z11      0.468   0.0988      4.74   2.11e-06
#> 
#> Measurement Model:
#> Intercepts:
#>           Estimate      SE   Est./SE    p-value
#> z41 ~ 1      0.000      NA        NA         NA
#> z42 ~ 1     -0.083   0.113    -0.736   0.461913
#> z43 ~ 1     -0.417   0.119    -3.497   0.000471
#> 
#> Loadings:
#>               Estimate       SE   Est./SE   p-value
#> eta1 =~ z41       1.00       NA        NA        NA
#> eta1 =~ z42       1.27   0.0556      22.9         0
#> eta1 =~ z43       1.34   0.0602      22.2         0
#> 
#> Residual Variances:
#>              Estimate      SE   Est./SE   p-value
#> z41 ~~ z41       1.52   0.133     11.38         0
#> z42 ~~ z42       1.47   0.159      9.20         0
#> z43 ~~ z43       1.48   0.167      8.87         0
#> 
#> 
#> --------------------- Group 2 ----------------------- 
#> 
#> Regression:
#>        Estimate       SE   Est./SE    p-value
#> 1         2.873   0.0233    123.47   0.000000
#> z11      -0.105   0.0124     -8.51   0.000000
#> eta1     -0.041   0.0118     -3.47   0.000528
#> 
#> Means:
#>        Estimate       SE   Est./SE   p-value
#> eta1       1.65   0.0737      22.3         0
#> z11        1.55   0.0546      28.4         0
#> 
#> Variances:
#>        Estimate       SE   Est./SE   p-value
#> eta1       2.12   0.2280      9.31         0
#> z11        1.37   0.0931     14.74         0
#> 
#> Covariances:
#>               Estimate       SE   Est./SE    p-value
#> eta1 ~~ z11      0.631   0.0988      6.39   1.71e-10
#> 
#> Measurement Model:
#> Intercepts:
#>           Estimate      SE   Est./SE    p-value
#> z41 ~ 1      0.000      NA        NA         NA
#> z42 ~ 1     -0.083   0.113    -0.736   0.461913
#> z43 ~ 1     -0.417   0.119    -3.497   0.000471
#> 
#> Loadings:
#>               Estimate       SE   Est./SE   p-value
#> eta1 =~ z41       1.00       NA        NA        NA
#> eta1 =~ z42       1.27   0.0556      22.9         0
#> eta1 =~ z43       1.34   0.0602      22.2         0
#> 
#> Residual Variances:
#>              Estimate      SE   Est./SE    p-value
#> z41 ~~ z41       1.34   0.118     11.37   0.00e+00
#> z42 ~~ z42       1.53   0.151     10.10   0.00e+00
#> z43 ~~ z43       1.12   0.174      6.43   1.24e-10