Introduction to rrMixture

Background

Consider the following multivariate linear regression model where we want to study a relationship between multiple responses and a set of predictors: \[ Y=XB+E \] where $Y$ is an $n \times q$ response matrix, $X$ is an $n \times p$ design matrix, $B$ is a $p \times q$ coefficient matrix, and $E$ is an $n \times q$ error matrix. To explore the relationship between $p$ predictors and $q$ responses, our interest is to estimate the coefficient matrix $B$. Traditionally, it can be estimated by \[ \hat{B}_L=\arg\min_{B} \|Y-XB\|^2_F=(X^\top X)^{-}X^\top Y \] where $\|\cdot\|_F$ denotes the Frobenius norm and $(\cdot)^{-}$ denotes a Moore-Penrose inverse. This approach is simple and fast, but it has some limitations:

It ignores the joint structure of the multivariate response.
It may not be feasible with high dimensionality.
It assumes a full-rank structure of the coefficient matrix.
It assumes the homogeneity of data, i.e., $n$ observations come from a single group.

To overcome the drawbacks, rrmix has been developed to fit reduced-rank mixture models, which simultaneously take into account the joint structure of the multivariate response, possibility of low-rank structure of coefficient matrix and heterogeneity of data.

Reduced-rank mixture models in multivariate regression

Suppose the $n$ observations belong to $K$ different subpopulations, implying the heterogeneity of data. The marginal distribution of $\mathbf{y}_i$ can be assumed to be \[\begin{equation}\label{mixmodel} \mathbf{y}_i|\mathbf{x}_i\sim \sum_{k=1}^K \pi_k N_q(B_k^\top\mathbf{x}_i,\Sigma_k) \end{equation}\] for $i=1,\cdots,n$ with the Gaussian assumption. Accordingly, the objective function to maximize is the log-likelihood of a $K$-component finite Gaussian mixture model, i.e., \[\begin{equation*} \mathcal{L}(\boldsymbol{\theta}) = \sum_{i=1}^n \log \left[ \sum_{k=1}^K \pi_k \phi_q(\mathbf{y}_i;B_k^\top\mathbf{x}_i,\Sigma_k) \right],~~~\mathbf{y}_i \in \mathbb{R}^q, \end{equation*}\] where $\boldsymbol{\theta}=(\pi_1,B_1,\Sigma_1,\cdots,\pi_K,B_K,\Sigma_K)^\top$ is a collection of the unknown parameters with $\sum_{k=1}^K \pi_k=1$ and $\phi_q(\cdot;\mu,\Sigma)$ denotes the density function of $N_q(\mu,\Sigma)$. Under a reduced rank regression framework, our objective function is the penalized log-likelihood which can be written as \[\begin{equation} \label{F} \mathcal{F}(\boldsymbol{\theta}) = \mathcal{L}(\boldsymbol{\theta}) - \mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}), \end{equation}\] where $\boldsymbol{\lambda}=(\lambda_1,\cdots,\lambda_K)^\top$ is the tuning parameter. For example, the rank penalty is \[\begin{equation} \label{rp} \mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \frac{1}{2}\sum_{k=1}^K \lambda_k^2 \text{rank}(B_k) \end{equation}\] and the adaptive nuclear norm penalty is \[\begin{equation} \label{annp} \mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \sum_{k=1}^K \lambda_k ||XB_k||_{*w} \end{equation}\] where $||\cdot||_{*w}=\sum_{i=1}^{p \wedge q} w_id_i(\cdot)$, $n \wedge q = \min(n,q)$, $w_i = d_i^{-\gamma}(X\hat{B}_L)$ following Zou (2006), $d_i(\cdot)$ denotes the $i$th largest singular value of a matrix, and $\gamma$ is a nonnegative constant. The rank-penalized and adaptive nuclear norm penalized estimation have originally been developed in Bunea et al. (2011) and Chen et al. (2013) respectively, for non-mixture cases, and adapted for an extension to mixture models in Kang et al. (2022+).

Introduction to `rrMixture`

rrMixture is an R package for fitting reduced-rank mixture models in multivariate regression via an iterative algorithm. It allows users to

find an optimal number of mixture components $K$ for a given data set;
estimate the parameter $\boldsymbol{\theta}=(\pi_1,B_1,\Sigma_1,\cdots,\pi_K,B_K,\Sigma_K)^\top$ which maximizes $\mathcal{F}(\boldsymbol{\theta})$;
find the best tuning parameter(s) $\boldsymbol{\lambda}=(\lambda_1,\cdots,\lambda_K)^\top$ and/or $\gamma$; and
determine the ranks of $K$ coefficient matrices.

For simplicity, we assume $\lambda_1=\cdots=\lambda_K=\lambda$ and try to find an optimal value of $\lambda$. As of version 0.1-2, available methods are

FR: full-ranked estimation with $\mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta})=0$.
RP: rank-penalized estimation with $\mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \frac{1}{2}\lambda^2\sum_{k=1}^K \text{rank}(B_k)$.
ANNP: adaptive nuclear norm penalized estimation with $\mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \lambda \sum_{k=1}^K ||XB_k||_{*w}$.

This document shows how to make use of rrMixture functionalities. rrMixture also provides a set of tools to summarize and visualize the estimation results.

Getting started: Installation

rrMixture is available from CRAN. Install the package and load the library:

install.packages("rrMixture")
library(rrMixture)

A real data example: `tuna` data

To describe the usage of rrMixture, we here consider a real data set which is available within the R package bayesm. It contains the volume of weekly sales (Move) for seven of the top 10 U.S. brands in the canned tuna product category for $n=338$ weeks between September 1989 and May 1997 along with with a measure of the display activity (Nsale) and the log price of each brand (Lprice). See Chevalier et al. (2003) for details. The goal is to study the effect of prices and promotional activities on sales for these 7 products; therefore, we have the following $X$ and $Y$ matrices and thus $n=338$, $q=7$, and $p=15$. Try ?tuna for details.

# Load and pre-process a data set
library(bayesm)
data(tuna)
tunaY <- log(tuna[, c("MOVE1", "MOVE2", "MOVE3", "MOVE4", "MOVE5", "MOVE6", "MOVE7")])
tunaX <- tuna[, c("NSALE1", "NSALE2", "NSALE3", "NSALE4", "NSALE5", "NSALE6", "NSALE7",
              "LPRICE1", "LPRICE2", "LPRICE3", "LPRICE4", "LPRICE5", "LPRICE6", "LPRICE7")]
tunaX <- cbind(intercept = 1, tunaX)

Parameter estimation using `rrmix()`

rrmix() function estimates $\boldsymbol{\theta}$ via an EM algorithm using either the full-ranked, rank penalized, or adaptive nuclear norm penalized mixture models when $K$, $\lambda$, and $\gamma$ are given; Later we will discuss another function for finding optimal values of these parameters. For rrmix(), we set several arguments such as:

K: number of mixture components.
X and Y: $X$ and $Y$ matrices of data.
est: estimation method to use, either "FR" (full-ranked), "RP" (rank penalized), or "ANNP" (adaptive nuclear norm penalized).
lambda and gamma: tuning parameter(s) to set. lambda is only used for "RP" and "ANNP", and gamma is only used for "ANNP".

In order to implement the EM algorithm, we need starting values of $\boldsymbol{\theta}$ which can be set by the following arguments:

n.init: number of repetitions of initialization to try. Note that the final estimates of $\boldsymbol{\theta}$ depends on the starting values, so we try to get starting values n.init times and use the one with the largest log-likelihood as the final starting values. In order to assign initial mixture membership to each observation, two methods are used: K-means and random clustering.
seed (optional): seed number for reproducibility of starting values.

An example with tuna data:

# Parameter estimation with `rrmix()` using the rank penalized method
tuna.mod <- rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP", lambda = 3,
                  seed = 100, n.init = 100)

We can find the estimated parameters, $\hat{\boldsymbol{\theta}}=(\hat\pi_1,\hat{B}_1,\hat\Sigma_1,\cdots,\hat\pi_K,\hat{B}_K,\hat\Sigma_K)^\top$, with the command tuna.mod$para.

# Estimated parameters
tuna.mod$para

## [[1]]
## [[1]]$pi
## [1] 0.8793335
## 
## [[1]]$B
##                  MOVE1       MOVE2       MOVE3       MOVE4       MOVE5        MOVE6       MOVE7
## intercept  9.697739196  9.58004404 10.25972900 11.48293812  9.57257811  5.286435408 16.88180893
## NSALE1     0.264758797 -0.07357711  0.08848321 -0.06317495 -0.03905764  0.168597790 -0.34082103
## NSALE2    -0.030891006  0.15074311 -0.06805328 -0.17671419 -0.04559106  0.136535925 -0.19511789
## NSALE3    -0.167928967 -0.03270118  0.12373708  0.00584671  0.09893268 -0.006719683 -0.13678998
## NSALE4    -0.046126451 -0.05398677  0.02909986 -0.06003936 -0.11198196 -0.030637909  0.01746416
## NSALE5     0.037234871  0.10131262 -0.03173598  0.12825966  0.13640425 -0.019036824 -0.30948403
## NSALE6    -0.105596185 -0.11328508 -0.06710156 -0.16378112 -0.10056351  0.452642652 -0.24536842
## NSALE7     0.009898625 -0.13428886 -0.02874585 -0.19134459 -0.02986234 -0.103503684  0.50989322
## LPRICE1   -3.620489328  1.10874628  0.43853735  1.44290680 -0.07191063  0.274192590  0.21678455
## LPRICE2    0.605598440 -5.23779722  0.05152277  1.04481042 -0.16159788  0.315346800 -0.26706440
## LPRICE3   -1.312250355 -1.65032984 -4.18091735 -0.72328165  1.07934056  0.019059245 -0.48474486
## LPRICE4    0.757267996  0.91868374 -0.03822025 -5.00462654 -0.39856090 -0.043581011  0.28513770
## LPRICE5    1.245436719  1.85257681  0.32397858  0.74684669 -3.89550148  0.998771021 -0.92707576
## LPRICE6   -0.356928595 -0.98490400 -0.23129587 -2.43450428 -0.79775535  0.889300225 -6.44645553
## LPRICE7    0.245307808 -0.32982140 -0.43982712 -0.16059763 -0.13203062 -0.441057708 -2.18630625
## 
## [[1]]$Sigma
##             MOVE1        MOVE2        MOVE3       MOVE4       MOVE5        MOVE6        MOVE7
## MOVE1  0.10027979  0.022850112  0.014598890  0.03336132 0.019478733 -0.012177185  0.011459331
## MOVE2  0.02285011  0.239894750  0.021559759  0.02377434 0.031907926 -0.009215861 -0.008205517
## MOVE3  0.01459889  0.021559759  0.042124926  0.03644332 0.009534536  0.006753641 -0.004059046
## MOVE4  0.03336132  0.023774343  0.036443316  0.23853304 0.017091292 -0.018254510 -0.011638096
## MOVE5  0.01947873  0.031907926  0.009534536  0.01709129 0.035661859  0.001658638  0.006083920
## MOVE6 -0.01217719 -0.009215861  0.006753641 -0.01825451 0.001658638  0.086088152 -0.026781412
## MOVE7  0.01145933 -0.008205517 -0.004059046 -0.01163810 0.006083920 -0.026781412  0.302357944
## 
## 
## [[2]]
## [[2]]$pi
## [1] 0.1206665
## 
## [[2]]$B
##                MOVE1      MOVE2       MOVE3        MOVE4       MOVE5        MOVE6        MOVE7
## intercept 12.1618998  6.8135406 -19.7848459 11.134598356  8.36894956 -25.31234680 18.074540867
## NSALE1    -1.3799401  0.1082732   0.7756834  0.192467305  0.02149564   0.48124630 -0.024485074
## NSALE2     0.9309726  0.2748817   0.4271569 -0.706356643  0.10207620   0.97518008 -0.730762472
## NSALE3     1.9937937 -0.6066941  -0.1480685 -0.351401019  0.09757915   0.36671402 -0.407156358
## NSALE4     0.9637697  0.7211725   0.8428552 -0.005961994  0.69405966   0.86280190 -0.496122471
## NSALE5    -0.4147939  0.2721955   0.2844473  0.004886992  0.02884490   0.04020195 -0.021491251
## NSALE6    -0.6720243  0.2854669   1.3906442  0.509588914  0.16743924   0.77558586 -0.030125194
## NSALE7     1.2294988 -0.0724503  -2.0590187  0.245503899  0.27764132  -1.67780746 -0.004936125
## LPRICE1   -5.1225754  0.8407906   3.0890073  0.683339339 -0.11384347   3.81927062  0.914629856
## LPRICE2    1.0451861 -3.7460418  -1.8067515 -0.997563203 -0.28016256   0.22614671 -0.232220724
## LPRICE3    8.5212020 -1.3526899 -22.3310971  0.762034375 -2.52342959 -13.92983469 -0.627547939
## LPRICE4    3.9312856  2.4825187  -6.8921826 -2.987481081  4.06841349  -1.37724156 -0.484910359
## LPRICE5    0.8080700 -0.3677838  12.9709864 -1.101818729 -6.81076437  -1.69780303 -0.459393781
## LPRICE6   -5.8970984  2.1119481  24.5489270 -2.058549217  3.42907251  32.07627090 -7.890724801
## LPRICE7    1.6976674 -0.2024230  -9.9204652  1.918824618  0.70354917  -4.31166964 -4.466990108
## 
## [[2]]$Sigma
##              MOVE1       MOVE2        MOVE3        MOVE4       MOVE5       MOVE6       MOVE7
## MOVE1  0.462506694 -0.06348751  0.138542542  0.008465615 -0.01694162  0.01851249 -0.09422843
## MOVE2 -0.063487506  0.07349867 -0.088111118  0.029005883  0.03317261 -0.07376150  0.02811246
## MOVE3  0.138542542 -0.08811112  1.173924092  0.008063392 -0.14678689  0.84098527 -0.10013659
## MOVE4  0.008465615  0.02900588  0.008063392  0.080053501  0.01148986 -0.02862116  0.02509401
## MOVE5 -0.016941616  0.03317261 -0.146786892  0.011489858  0.05605736 -0.12300356  0.03315975
## MOVE6  0.018512495 -0.07376150  0.840985267 -0.028621155 -0.12300356  0.73881130 -0.03049476
## MOVE7 -0.094228429  0.02811246 -0.100136595  0.025094013  0.03315975 -0.03049476  0.07795657

Besides the estimated parameter $\hat{\boldsymbol{\theta}}$, one of the fundamental tasks in reduced rank estimation is rank determination. The estimated ranks of $K$ coefficient matrices can be found by tuna.mod$est.rank as follows. In this case with $\lambda = 3$, both coefficient matrices turned out to have full rank since $\min(p,q)=7$.

# Estimated ranks of coefficient matrices
tuna.mod$est.rank

## [1] 7 7

The package provides summary() and plot() methods for the rrmix object. With the summary() function, we can see summarized information of the fitted model. We can see that the EM algorithm is terminated after 57 iterations and the BIC of the fitted model is 3115.056.

# Summarize the estimation results
summary(tuna.mod)

## 
## Call:
## rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP", lambda = 3, seed = 100, n.init = 100)
## 
## 
## 
## Method:  Rank penalized mixture 
## 
## Initialization:  K-means and random clustering 
## 
## Tuning Parameters:
##  lambda:  3 
## 
## Fitted Model:
##      Number of components:  2 
##            Log-likelihood:  -780.1514 
##  Penalized log-likelihood:  -843.1514 
##                       BIC:  3115.056 
## 
## 
## Number of Iterations:  57

The plot() function shows the log-likelihood and penalized log-likelihood values at each iteration as can be seen below. The ascent property of the penalized log-likelihood can be observed.

# Visualize the estimation results
plot(tuna.mod)

## 
## F: Penalized log-likelihood
## L: Log-likelihood

Parameter tuning with `tune.rrmix()`

As shown above, the rrmix() function estimates parameters with fixed $K$ (number of mixture components), $\lambda$ and $\gamma$ (tuning parameters). In practice, choosing proper values of $K$, $\lambda$ and $\gamma$ is important. This can be done using the tune.rrmix() function by performing a grid search over user-specified parameter ranges. Notice that the full-ranked method ("FR") has no tuning parameters, the rank penalized method ("RP") only has one tuning parameter, $\lambda$. The adaptive nuclear norm penalized method ("ANNP") has two tuning parameters, $\lambda$ and $\gamma$.

Suppose one wants to tune $\lambda$ for the rank penalized method with fixed $K=2$. We can try the following with a set of candidate values of $\lambda$:

# Parameter tuning with `tune.rrmix()` using the rank penalized method
tuna.tune1 <- tune.rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP",
                         lambda = exp(seq(0, log(20), length = 20)),
                         seed = 100, n.init = 100)

The tune.rrmix object also has summary() and plot() methods. Both functions find an optimal tuning parameter that provides the best performance metric - the smallest BIC in this case - among the given set of candidate values.

# Summarize the results
summary(tuna.tune1)

## 
## Call:
## tune.rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP", lambda = exp(seq(0, log(20), length = 20)), seed = 100, n.init = 100)
## 
## 
## Parameter tuning of 'rrmix':
## 
## - Method: Rank penalized mixture 
## 
## - Initialization:  K-means and random clustering 
## 
## - Performance metric: bic 
## 
## - Best performance: 2963.037 
## 
## - Best parameters:
##  K  lambda
##  2 9.09188
## 
## - Detailed performance results:
##    K    lambda     loglik    penalty  penloglik npar      bic est.rank1 est.rank2
## 1  2  1.000000  -780.1514   7.000000  -787.1514  267 3115.056         7         7
## 2  2  1.170780  -780.1514   9.595079  -789.7465  267 3115.056         7         7
## 3  2  1.370726  -780.1514  13.152221  -793.3036  267 3115.056         7         7
## 4  2  1.604818  -780.1514  18.028086  -798.1795  267 3115.056         7         7
## 5  2  1.878889  -780.1514  24.711559  -804.8630  267 3115.056         7         7
## 6  2  2.199765  -780.1514  33.872767  -814.0242  267 3115.056         7         7
## 7  2  2.575441  -780.1514  46.430269  -826.5817  267 3115.056         7         7
## 8  2  3.015274  -780.1514  63.643158  -843.7946  267 3115.056         7         7
## 9  2  3.530223  -780.1514  87.237306  -867.3887  267 3115.056         7         7
## 10 2  4.133114  -778.9185 111.037095  -889.9556  258 3060.183         7         6
## 11 2  4.838967  -778.9185 152.201389  -931.1199  258 3060.183         7         6
## 12 2  5.665365  -778.9185 208.626341  -987.5448  258 3060.183         7         6
## 13 2  6.632896  -756.4340 285.969468 -1042.4035  258 3015.214         7         6
## 14 2  7.765661  -798.5918 361.832928 -1160.4247  247 3035.476         7         5
## 15 2  9.091880  -800.2222 454.642521 -1254.8648  234 2963.037         7         4
## 16 2 10.644590  -906.4122 566.536496 -1472.9487  225 3123.010         6         4
## 17 2 12.462472 -1139.2845 543.596258 -1682.8807  186 3361.655         4         3
## 18 2 14.590812 -1149.9582 638.675408 -1788.6336  169 3284.011         4         2
## 19 2 17.082630 -1149.9582 875.448736 -2025.4070  169 3284.011         4         2
## 20 2 20.000000 -1350.4807 800.000000 -2150.4807  135 3487.073         3         1

When only one parameter is considered for tuning with other parameters being fixed, the parameter is on the x-axis and the performance metric is on the y-axis when using the plot() function. In this case, we use a grid of 20 $\lambda$ values equally spaced on the log scale, we can set transform.x = log for a better illustration.

# Visualize the results
plot(tuna.tune1, transform.x = log, xlab = "log(lambda)")

Now suppose one wants to tune both $K$ and $\lambda$. We can try the following command with setting K.max instead of K. If K.max is 3, the tune.rrmix() function try all cases where $K$ is 1 through 3. Note that $K=1$ indicates a non-mixture case. Hence, the tune.rrmix() function can automatically examine whether assuming heterogeneity of data is appropriate for a given data set.

# Parameter tuning with `tune.rrmix()` using the rank penalized method
tuna.tune2 <- tune.rrmix(K.max = 3, X = tunaX, Y = tunaY, est = "RP",
                         lambda = exp(seq(0, log(20), length = 20)),
                         seed = 100, n.init = 100)

# Summarize the results
summary(tuna.tune2)

## 
## Call:
## tune.rrmix(K.max = 3, X = tunaX, Y = tunaY, est = "RP", lambda = exp(seq(0, log(20), length = 20)), seed = 100, n.init = 100)
## 
## 
## Parameter tuning of 'rrmix':
## 
## - Method: Rank penalized mixture 
## 
## - Initialization:  K-means and random clustering 
## 
## - Performance metric: bic 
## 
## - Best performance: 2963.037 
## 
## - Best parameters:
##  K  lambda
##  2 9.09188
## 
## - Detailed performance results:
##    K    lambda     loglik     penalty  penloglik npar      bic est.rank1 est.rank2 est.rank3
## 1  1  1.000000 -1377.0748    7.000000 -1384.0748  133 3528.615         7        NA        NA
## 2  1  1.170780 -1377.0748    9.595079 -1386.6699  133 3528.615         7        NA        NA
## 3  1  1.370726 -1377.0748   13.152221 -1390.2271  133 3528.615         7        NA        NA
## 4  1  1.604818 -1377.0748   18.028086 -1395.1029  133 3528.615         7        NA        NA
## 5  1  1.878889 -1377.0748   24.711559 -1401.7864  133 3528.615         7        NA        NA
## 6  1  2.199765 -1377.0748   33.872767 -1410.9476  133 3528.615         7        NA        NA
## 7  1  2.575441 -1377.0748   46.430269 -1423.5051  133 3528.615         7        NA        NA
## 8  1  3.015274 -1434.5586   54.551278 -1489.1098  124 3591.175         6        NA        NA
## 9  1  3.530223 -1434.5586   74.774834 -1509.3334  124 3591.175         6        NA        NA
## 10 1  4.133114 -1434.5586  102.495780 -1537.0543  124 3591.175         6        NA        NA
## 11 1  4.838967 -1566.3163  117.077991 -1683.3943  113 3790.637         5        NA        NA
## 12 1  5.665365 -1566.3163  160.481801 -1726.7981  113 3790.637         5        NA        NA
## 13 1  6.632896 -1566.3163  219.976514 -1786.2928  113 3790.637         5        NA        NA
## 14 1  7.765661 -1566.3163  301.527440 -1867.8438  113 3790.637         5        NA        NA
## 15 1  9.091880 -1741.3884  247.986830 -1989.3752   85 3977.736         3        NA        NA
## 16 1 10.644590 -1741.3884  339.921898 -2081.3103   85 3977.736         3        NA        NA
## 17 1 12.462472 -1741.3884  465.939649 -2207.3280   85 3977.736         3        NA        NA
## 18 1 14.590812 -1941.6003  425.783606 -2367.3839   68 4279.168         2        NA        NA
## 19 1 17.082630 -2173.2818  291.816245 -2465.0980   49 4631.893         1        NA        NA
## 20 1 20.000000 -2173.2818  400.000000 -2573.2818   49 4631.893         1        NA        NA
## 21 2  1.000000  -780.1514    7.000000  -787.1514  267 3115.056         7         7        NA
## 22 2  1.170780  -780.1514    9.595079  -789.7465  267 3115.056         7         7        NA
## 23 2  1.370726  -780.1514   13.152221  -793.3036  267 3115.056         7         7        NA
## 24 2  1.604818  -780.1514   18.028086  -798.1795  267 3115.056         7         7        NA
## 25 2  1.878889  -780.1514   24.711559  -804.8630  267 3115.056         7         7        NA
## 26 2  2.199765  -780.1514   33.872767  -814.0242  267 3115.056         7         7        NA
## 27 2  2.575441  -780.1514   46.430269  -826.5817  267 3115.056         7         7        NA
## 28 2  3.015274  -780.1514   63.643158  -843.7946  267 3115.056         7         7        NA
## 29 2  3.530223  -780.1514   87.237306  -867.3887  267 3115.056         7         7        NA
## 30 2  4.133114  -778.9185  111.037095  -889.9556  258 3060.183         7         6        NA
## 31 2  4.838967  -778.9185  152.201389  -931.1199  258 3060.183         7         6        NA
## 32 2  5.665365  -778.9185  208.626341  -987.5448  258 3060.183         7         6        NA
## 33 2  6.632896  -756.4340  285.969468 -1042.4035  258 3015.214         7         6        NA
## 34 2  7.765661  -798.5918  361.832928 -1160.4247  247 3035.476         7         5        NA
## 35 2  9.091880  -800.2222  454.642521 -1254.8648  234 2963.037         7         4        NA
## 36 2 10.644590  -906.4122  566.536496 -1472.9487  225 3123.010         6         4        NA
## 37 2 12.462472 -1139.2845  543.596258 -1682.8807  186 3361.655         4         3        NA
## 38 2 14.590812 -1149.9582  638.675408 -1788.6336  169 3284.011         4         2        NA
## 39 2 17.082630 -1149.9582  875.448736 -2025.4070  169 3284.011         4         2        NA
## 40 2 20.000000 -1350.4807  800.000000 -2150.4807  135 3487.073         3         1        NA
## 41 3  1.000000  -416.3847   10.500000  -426.8847  401 3167.811         7         7         7
## 42 3  1.170780  -416.3847   14.392619  -430.7774  401 3167.811         7         7         7
## 43 3  1.370726  -416.3847   19.728331  -436.1131  401 3167.811         7         7         7
## 44 3  1.604818  -416.3847   27.042129  -443.4269  401 3167.811         7         7         7
## 45 3  1.878889  -416.3847   37.067338  -453.4521  401 3167.811         7         7         7
## 46 3  2.199765  -416.3847   50.809150  -467.1939  401 3167.811         7         7         7
## 47 3  2.575441  -416.3847   69.645403  -486.0301  401 3167.811         7         7         7
## 48 3  3.015274  -416.3847   95.464737  -511.8495  401 3167.811         7         7         7
## 49 3  3.530223  -416.3847  130.855959  -547.2407  401 3167.811         7         7         7
## 50 3  4.133114  -416.3847  179.367614  -595.7524  401 3167.811         7         7         7
## 51 3  4.838967  -434.3091  234.155983  -668.4651  392 3151.252         6         7         7
## 52 3  5.665365  -486.5614  304.915421  -791.4769  383 3203.349         6         7         6
## 53 3  6.632896  -555.2958  417.955376  -973.2512  383 3340.818         6         7         6
## 54 3  7.765661  -547.1041  512.596648 -1059.7007  361 3196.328         5         7         5
## 55 3  9.091880  -667.3193  619.967075 -1287.2864  339 3308.651         5         6         4
## 56 3 10.644590  -823.4459  736.497445 -1559.9434  315 3481.151         5         4         4
## 57 3 12.462472  -878.4625  931.879299 -1810.3418  300 3503.839         5         4         3
## 58 3 14.590812  -973.6083  958.013112 -1931.6215  255 3432.093         3         4         2
## 59 3 17.082630 -1090.1773  875.448736 -1965.6260  204 3368.256         1         3         2
## 60 3 20.000000 -1379.2821 1000.000000 -2379.2821  187 3847.474         1         2         2

When $K$ and $\lambda$ are tuned, a contour plot of the performance metric (BIC) can be drawn using the plot() function with $K$ and $\lambda$ being on the x- and y-axis, respectively.

# Visualize the results
plot(tuna.tune2, transform.y = log, ylab = "log(lambda)")

As we can see above, the best model is the one with $K=2$ and $\lambda = 8.728963$, implying that the data is heterogeneous and consisting of 2 subpopulations.

Note also that, if candidate values of $\lambda$ are not pre-specified, i.e., if lambda = NULL in tune.rrmix(), a data-adaptive range of $\lambda$ for tuning will internally be determined. In this case, users can set the argument n.lambda which specifies the number of $\lambda$ values to be explored in the range. Furthermore, if candidate values of $\gamma$ are not specified with est = "ANNP" in tune.rrmix(), gamma = 2 will be used by default since it generally showed good performance in Chen et al. (2013).

Interpretation of final model

Let’s see how we can interpret the final model determined by tune.rrmix(). As mentioned earlier, $\hat{\boldsymbol{\theta}}=(\hat\pi_1,\hat{B}_1,\hat\Sigma_1,\cdots,\hat\pi_K,\hat{B}_K,\hat\Sigma_K)^\top$ can be obtained by best.mod$para.

# The final model
best.K <- summary(tuna.tune2)$best.K
best.lambda <- summary(tuna.tune2)$best.lambda
best.mod <- rrmix(K = best.K, X = tunaX, Y = tunaY, est = "RP", lambda = best.lambda,
                  seed = 100, n.init = 100)
summary(best.mod)

## 
## Call:
## rrmix(K = best.K, X = tunaX, Y = tunaY, est = "RP", lambda = best.lambda, seed = 100, n.init = 100)
## 
## 
## 
## Method:  Rank penalized mixture 
## 
## Initialization:  K-means and random clustering 
## 
## Tuning Parameters:
##  lambda:  9.09188 
## 
## Fitted Model:
##      Number of components:  2 
##            Log-likelihood:  -800.2222 
##  Penalized log-likelihood:  -1254.865 
##                       BIC:  2963.037 
## 
## 
## Number of Iterations:  54

best.mod$para

## [[1]]
## [[1]]$pi
## [1] 0.881181
## 
## [[1]]$B
##                  MOVE1       MOVE2       MOVE3         MOVE4       MOVE5        MOVE6       MOVE7
## intercept  9.831832635  9.65864042 10.14738007 11.1979483392  9.48687846  6.153218054 17.18630762
## NSALE1     0.245720206 -0.07798723  0.08916616 -0.0549966149 -0.03548999  0.143390850 -0.36154379
## NSALE2    -0.036411994  0.14667386 -0.06105783 -0.1633427541 -0.03813799  0.100875057 -0.21237738
## NSALE3    -0.175968205 -0.03849338  0.11794301  0.0003528289  0.09136335 -0.006991503 -0.13725684
## NSALE4    -0.011456346 -0.03916083  0.03453038 -0.0711003217 -0.10492094 -0.024757594  0.02609806
## NSALE5     0.046909727  0.10183915 -0.03315014  0.1204480450  0.13378362 -0.015261578 -0.30355839
## NSALE6    -0.110186495 -0.11943755 -0.06486655 -0.1577068083 -0.09670243  0.409968442 -0.25879877
## NSALE7     0.004071193 -0.13539608 -0.01688634 -0.1759992155 -0.02189734 -0.137505047  0.50195911
## LPRICE1   -3.672011093  1.10326249  0.41067244  1.4165535971 -0.08589023  0.314033147  0.19650308
## LPRICE2    0.596909979 -5.25728296  0.06323593  1.0739359825 -0.16848094  0.229513404 -0.29174927
## LPRICE3   -1.306251346 -1.65267666 -4.21361349 -0.7814175830  0.99484564  0.165184982 -0.41794841
## LPRICE4    0.786561736  0.93243282 -0.01614248 -4.9895399813 -0.38006124 -0.058316769  0.28721163
## LPRICE5    1.375630514  1.88822889  0.44238323  0.8539790130 -3.87533825  0.407177105 -1.03070115
## LPRICE6   -0.502635251 -1.05582296 -0.16117212 -2.2058663320 -0.70630650  0.294804445 -6.68745467
## LPRICE7    0.275847305 -0.31984729 -0.42179987 -0.1370997993 -0.14293543 -0.555428679 -2.19622222
## 
## [[1]]$Sigma
##             MOVE1        MOVE2        MOVE3       MOVE4       MOVE5        MOVE6        MOVE7
## MOVE1  0.10433508  0.023733541  0.015521534  0.03247732 0.019452039 -0.014019707  0.011213734
## MOVE2  0.02373354  0.239529549  0.022272681  0.02428049 0.032421400 -0.011098250 -0.009117916
## MOVE3  0.01552153  0.022272681  0.042279754  0.03658732 0.010310018  0.008806337 -0.003876825
## MOVE4  0.03247732  0.024280488  0.036587324  0.23743596 0.017621679 -0.013435994 -0.010668746
## MOVE5  0.01945204  0.032421400  0.010310018  0.01762168 0.037183284  0.003938157  0.005520488
## MOVE6 -0.01401971 -0.011098250  0.008806337 -0.01343599 0.003938157  0.066793640 -0.034358863
## MOVE7  0.01121373 -0.009117916 -0.003876825 -0.01066875 0.005520488 -0.034358863  0.298840993
## 
## 
## [[2]]
## [[2]]$pi
## [1] 0.118819
## 
## [[2]]$B
##                 MOVE1       MOVE2        MOVE3       MOVE4        MOVE5        MOVE6       MOVE7
## intercept 10.75741485  3.05845114  24.58595902 11.65593374  4.356064495  6.222446295 15.38872468
## NSALE1    -0.05534936  0.05831940   0.12594001  0.14424588  0.006196908  0.034632335 -0.05045462
## NSALE2     0.06893836  0.49625290  -1.39789028 -0.62327880  0.183635380 -0.099207906 -0.36455311
## NSALE3     0.35931685 -0.27753088  -1.12758206 -0.34110929  0.310196484 -0.006793506  0.11600480
## NSALE4     0.17604880  0.86240476  -1.52812832 -0.32588892  0.437330138  0.065938270 -0.42558206
## NSALE5    -0.05791128  0.22839832   0.18319877  0.06554551 -0.028323271  0.021209575 -0.05856984
## NSALE6     0.01319062  0.28873577   0.57403352  0.53897724  0.115680162  0.190861213  0.04077665
## NSALE7     0.27858992 -0.01130092  -0.26308287 -0.06083649  0.185879055  0.061231734  0.20101446
## LPRICE1   -0.07084777  0.50773376   0.52030640  0.83309298  0.235197471  0.288331118 -0.15013008
## LPRICE2    0.02204864 -3.47196226  -2.94020979 -1.05188581  0.005805008 -0.563784272  0.02242193
## LPRICE3   -2.16412699 -1.48644936  -0.03924001  0.07432347 -1.213040603 -0.683191409 -1.80261974
## LPRICE4    3.57952823  2.78110427 -10.34939108 -4.40864029  2.944292619  0.068353213  0.42155101
## LPRICE5   -3.90373082  0.72735230  10.26220480 -2.58037770 -5.788255976 -2.167315583 -0.37680378
## LPRICE6    1.70570061  4.94764428 -20.55093102 -2.16448806  5.870869263  0.849327803 -5.32076485
## LPRICE7   -1.89520208  0.26344154  -5.69867364  0.60316534  0.825501480 -0.091956650 -3.72405036
## 
## [[2]]$Sigma
##             MOVE1       MOVE2      MOVE3       MOVE4       MOVE5       MOVE6       MOVE7
## MOVE1  1.55349212 -0.21819091  0.4478371  0.01521395 -0.08972691  0.02398342 -0.34739213
## MOVE2 -0.21819091  0.10541875 -0.2066950  0.03113594  0.04890543 -0.12597494  0.07756811
## MOVE3  0.44783706 -0.20669504  2.3161115 -0.09818200 -0.17691700  1.51793400 -0.35143336
## MOVE4  0.01521395  0.03113594 -0.0981820  0.08974138  0.02265828 -0.11688499  0.01242897
## MOVE5 -0.08972691  0.04890543 -0.1769170  0.02265828  0.09515097 -0.21162975  0.02438693
## MOVE6  0.02398342 -0.12597494  1.5179340 -0.11688499 -0.21162975  1.33305219 -0.08158394
## MOVE7 -0.34739213  0.07756811 -0.3514334  0.01242897  0.02438693 -0.08158394  0.17317664

The estimated ranks of $\hat{B}_1$ and $\hat{B}_2$ are 7 and 4, respectively, as follows. In this case, the first coefficient matrix turned out to have full rank and the second coefficient matrix had low-rank structure since full rank is $\min(p,q)=7$.

# Estimated ranks of coefficient matrices
best.mod$est.rank

## [1] 7 4

We can also get information on the membership of mixture components using the following command. The final model suggests that, among $n=338$ observations, 299 observations belong to the first mixture component and the remaining 39 observations belong to the second mixture component.

# Membership information of mixture components
best.mod$ind

##   [1] 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 2 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
##  [98] 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [195] 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [292] 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1

best.mod$n.est

## [1] 298  40

Comparison with existing R package `rrpack`

The package rrpack allows users to use a variety of reduced-rank estimation methods in multivariate regression. For example, the rrr() function with the argument penaltySVD = "ann" provides a solution of the adaptive nuclear norm penalized estimator (Chen et al., 2013). One difference between the rrr() function from the rrpack package and the rrmix() function from the rrMixture package is that rrmix() incorporates the idea of mixture models. Therefore, when the number of mixture components is $K=1$, rrmix() produces the same solution with that of rrr() with the same tuning parameter(s). Let’s see an example using the tuna data.

# rrpack package
require(rrpack)
rfit <- rrr(Y = as.matrix(tunaY), X = as.matrix(tunaX),
            modstr = list(lambda = 3, gamma = 2), penaltySVD = "ann")
# estimated coefficient matrix
coef(rfit)

##                  [,1]        [,2]        [,3]        [,4]        [,5]        [,6]        [,7]
## intercept 11.03998749  9.83249740  3.56373788 10.70868788 10.19160504  2.72918239 16.83806187
## NSALE1     0.07121642 -0.08071614  0.39618150 -0.06486088 -0.06339453  0.38363150 -0.30735354
## NSALE2    -0.05658269  0.24055106 -0.02932187 -0.26515306 -0.04092540  0.12900651 -0.21566104
## NSALE3     0.02024762 -0.10537125  0.12965432  0.01079927  0.07390348  0.03451341 -0.14329480
## NSALE4    -0.02895474 -0.03509104  0.16751885 -0.05675389 -0.12375623  0.02407959 -0.01758902
## NSALE5     0.02998639  0.12148697 -0.12686690  0.15788231  0.15495945 -0.04865510 -0.31089527
## NSALE6    -0.19737220 -0.10438770  0.37455663 -0.13469460 -0.12757906  0.56894979 -0.22060351
## NSALE7     0.04594728 -0.12055535 -0.06720603 -0.22011430 -0.04092932 -0.11660787  0.47469217
## LPRICE1   -4.28741270  1.03192398  1.39870205  1.42080926 -0.24789623  1.14221414  0.41978453
## LPRICE2    0.60813348 -4.65651982 -0.55593757  0.43919224 -0.18971462  0.01751022 -0.20724822
## LPRICE3   -0.09598038 -1.90877300 -5.42734434 -0.51100895  1.04589913 -0.75185626 -0.54730109
## LPRICE4    1.13781735  0.89342271 -1.16647791 -4.84616334 -0.20130401 -0.53258880  0.22146657
## LPRICE5    1.15223276  1.52539481  2.45035300  0.96955101 -4.02048740  0.86906867 -1.27826228
## LPRICE6   -1.85448362 -0.92699736  4.44614639 -2.04304693 -1.22949573  3.15448413 -6.28101320
## LPRICE7    0.59922556 -0.40628855 -1.51981370 -0.16617315 -0.15146697 -0.80683139 -2.38651214

# estimated rank of the coefficient matrix
rfit$rank

## [1] 7

# rrMixture package
rfit2 <- rrmix(K = 1, Y = tunaY, X = tunaX, lambda = 3, gamma = 2, est = "ANNP")
# estimated coefficient matrix
rfit2$para[[1]]$B

##                 MOVE1       MOVE2       MOVE3       MOVE4       MOVE5       MOVE6       MOVE7
## intercept 11.03998749  9.83249740  3.56373788 10.70868788 10.19160504  2.72918239 16.83806187
## NSALE1     0.07121642 -0.08071614  0.39618150 -0.06486088 -0.06339453  0.38363150 -0.30735354
## NSALE2    -0.05658269  0.24055106 -0.02932187 -0.26515306 -0.04092540  0.12900651 -0.21566104
## NSALE3     0.02024762 -0.10537125  0.12965432  0.01079927  0.07390348  0.03451341 -0.14329480
## NSALE4    -0.02895474 -0.03509104  0.16751885 -0.05675389 -0.12375623  0.02407959 -0.01758902
## NSALE5     0.02998639  0.12148697 -0.12686690  0.15788231  0.15495945 -0.04865510 -0.31089527
## NSALE6    -0.19737220 -0.10438770  0.37455663 -0.13469460 -0.12757906  0.56894979 -0.22060351
## NSALE7     0.04594728 -0.12055535 -0.06720603 -0.22011430 -0.04092932 -0.11660787  0.47469217
## LPRICE1   -4.28741270  1.03192398  1.39870205  1.42080926 -0.24789623  1.14221414  0.41978453
## LPRICE2    0.60813348 -4.65651982 -0.55593757  0.43919224 -0.18971462  0.01751022 -0.20724822
## LPRICE3   -0.09598038 -1.90877300 -5.42734434 -0.51100895  1.04589913 -0.75185626 -0.54730109
## LPRICE4    1.13781735  0.89342271 -1.16647791 -4.84616334 -0.20130401 -0.53258880  0.22146657
## LPRICE5    1.15223276  1.52539481  2.45035300  0.96955101 -4.02048740  0.86906867 -1.27826228
## LPRICE6   -1.85448362 -0.92699736  4.44614639 -2.04304693 -1.22949573  3.15448413 -6.28101320
## LPRICE7    0.59922556 -0.40628855 -1.51981370 -0.16617315 -0.15146697 -0.80683139 -2.38651214

# estimated rank of the coefficient matrix
rfit2$est.rank

## [1] 7

We can see that the two sets of estimation results are consistent.

Wrapping up

This document illustrates the usage of the rrMixture package. For illustrative purposes, some but not all functions are described. For details of the estimation methods and additional functions of the package, see Kang et al. (2022+) and the R package manual.

References

Bunea, F., She, Y. & Wegkamp, M. H. (2011), ‘Optimal selection of reduced rank estimators of high-dimensional matrices’, The Annals of Statistics 39(2), 1282–1309.
Chen, K., Dong, H. & Chan, K.-S. (2013), ‘Reduced rank regression via adaptive nuclear norm penalization’, Biometrika 100(4), 901–920.
Chevalier, J. A., Kashyap, A. K. & Rossi, P. E. (2003), ‘Why don’t prices rise during periods of peak demand? evidence from scanner data’, American Economic Review 93(1), 15–37.
Kang, S., Chen, K., & Yao, W. (2022+). ‘Reduced rank estimation in mixtures of multivariate linear regression’.
Zou, H. (2006), ‘The adaptive lasso and its oracle properties’, Journal of the American statistical association 101(476), 1418–1429.

Introduction to rrMixture

Background

Reduced-rank mixture models in multivariate regression

Introduction to rrMixture

Getting started: Installation

A real data example: tuna data

Parameter estimation using rrmix()

Parameter tuning with tune.rrmix()

Interpretation of final model

Comparison with existing R package rrpack

Wrapping up

References

Introduction to `rrMixture`

A real data example: `tuna` data

Parameter estimation using `rrmix()`

Parameter tuning with `tune.rrmix()`

Comparison with existing R package `rrpack`