Consider the following multivariate linear regression model where we want to study a relationship between multiple responses and a set of predictors: \[ Y=XB+E \] where \(Y\) is an \(n \times q\) response matrix, \(X\) is an \(n \times p\) design matrix, \(B\) is a \(p \times q\) coefficient matrix, and \(E\) is an \(n \times q\) error matrix. To explore the relationship between \(p\) predictors and \(q\) responses, our interest is to estimate the coefficient matrix \(B\). Traditionally, it can be estimated by \[ \hat{B}_L=\arg\min_{B} \|Y-XB\|^2_F=(X^\top X)^{-}X^\top Y \] where \(\|\cdot\|_F\) denotes the Frobenius norm and \((\cdot)^{-}\) denotes a Moore-Penrose inverse. This approach is simple and fast, but it has some limitations:
To overcome the drawbacks, rrmix
has been developed to fit reduced-rank mixture models, which simultaneously take into account the joint structure of the multivariate response, possibility of low-rank structure of coefficient matrix and heterogeneity of data.
Suppose the \(n\) observations belong to \(K\) different subpopulations, implying the heterogeneity of data. The marginal distribution of \(\mathbf{y}_i\) can be assumed to be \[\begin{equation}\label{mixmodel} \mathbf{y}_i|\mathbf{x}_i\sim \sum_{k=1}^K \pi_k N_q(B_k^\top\mathbf{x}_i,\Sigma_k) \end{equation}\] for \(i=1,\cdots,n\) with the Gaussian assumption. Accordingly, the objective function to maximize is the log-likelihood of a \(K\)-component finite Gaussian mixture model, i.e., \[\begin{equation*} \mathcal{L}(\boldsymbol{\theta}) = \sum_{i=1}^n \log \left[ \sum_{k=1}^K \pi_k \phi_q(\mathbf{y}_i;B_k^\top\mathbf{x}_i,\Sigma_k) \right],~~~\mathbf{y}_i \in \mathbb{R}^q, \end{equation*}\] where \(\boldsymbol{\theta}=(\pi_1,B_1,\Sigma_1,\cdots,\pi_K,B_K,\Sigma_K)^\top\) is a collection of the unknown parameters with \(\sum_{k=1}^K \pi_k=1\) and \(\phi_q(\cdot;\mu,\Sigma)\) denotes the density function of \(N_q(\mu,\Sigma)\). Under a reduced rank regression framework, our objective function is the penalized log-likelihood which can be written as \[\begin{equation} \label{F} \mathcal{F}(\boldsymbol{\theta}) = \mathcal{L}(\boldsymbol{\theta}) - \mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}), \end{equation}\] where \(\boldsymbol{\lambda}=(\lambda_1,\cdots,\lambda_K)^\top\) is the tuning parameter. For example, the rank penalty is \[\begin{equation} \label{rp} \mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \frac{1}{2}\sum_{k=1}^K \lambda_k^2 \text{rank}(B_k) \end{equation}\] and the adaptive nuclear norm penalty is \[\begin{equation} \label{annp} \mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \sum_{k=1}^K \lambda_k ||XB_k||_{*w} \end{equation}\] where \(||\cdot||_{*w}=\sum_{i=1}^{p \wedge q} w_id_i(\cdot)\), \(n \wedge q = \min(n,q)\), \(w_i = d_i^{-\gamma}(X\hat{B}_L)\) following Zou (2006), \(d_i(\cdot)\) denotes the \(i\)th largest singular value of a matrix, and \(\gamma\) is a nonnegative constant. The rank-penalized and adaptive nuclear norm penalized estimation have originally been developed in Bunea et al. (2011) and Chen et al. (2013) respectively, for non-mixture cases, and adapted for an extension to mixture models in Kang et al. (2022+).
rrMixture
rrMixture
is an R package for fitting reduced-rank mixture models in multivariate regression via an iterative algorithm. It allows users to
For simplicity, we assume \(\lambda_1=\cdots=\lambda_K=\lambda\) and try to find an optimal value of \(\lambda\). As of version 0.1-2, available methods are
FR
: full-ranked estimation with \(\mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta})=0\).RP
: rank-penalized estimation with \(\mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \frac{1}{2}\lambda^2\sum_{k=1}^K \text{rank}(B_k)\).ANNP
: adaptive nuclear norm penalized estimation with \(\mathcal{P}_{\boldsymbol{\lambda}}(\boldsymbol{\theta}) = \lambda \sum_{k=1}^K ||XB_k||_{*w}\).This document shows how to make use of rrMixture
functionalities. rrMixture
also provides a set of tools to summarize and visualize the estimation results.
rrMixture
is available from CRAN. Install the package and load the library:
install.packages("rrMixture")
library(rrMixture)
tuna
dataTo describe the usage of rrMixture
, we here consider a real data set which is available within the R package bayesm
. It contains the volume of weekly sales (Move
) for seven of the top 10 U.S. brands in the canned tuna product category for \(n=338\) weeks between September 1989 and May 1997 along with with a measure of the display activity (Nsale
) and the log price of each brand (Lprice
). See Chevalier et al. (2003) for details. The goal is to study the effect of prices and promotional activities on sales for these 7 products; therefore, we have the following \(X\) and \(Y\) matrices and thus \(n=338\), \(q=7\), and \(p=15\). Try ?tuna
for details.
# Load and pre-process a data set
library(bayesm)
data(tuna)
<- log(tuna[, c("MOVE1", "MOVE2", "MOVE3", "MOVE4", "MOVE5", "MOVE6", "MOVE7")])
tunaY <- tuna[, c("NSALE1", "NSALE2", "NSALE3", "NSALE4", "NSALE5", "NSALE6", "NSALE7",
tunaX "LPRICE1", "LPRICE2", "LPRICE3", "LPRICE4", "LPRICE5", "LPRICE6", "LPRICE7")]
<- cbind(intercept = 1, tunaX) tunaX
rrmix()
rrmix()
function estimates \(\boldsymbol{\theta}\) via an EM algorithm using either the full-ranked, rank penalized, or adaptive nuclear norm penalized mixture models when \(K\), \(\lambda\), and \(\gamma\) are given; Later we will discuss another function for finding optimal values of these parameters. For rrmix()
, we set several arguments such as:
K
: number of mixture components.X
and Y
: \(X\) and \(Y\) matrices of data.est
: estimation method to use, either "FR"
(full-ranked), "RP"
(rank penalized), or "ANNP"
(adaptive nuclear norm penalized).lambda
and gamma
: tuning parameter(s) to set. lambda
is only used for "RP"
and "ANNP"
, and gamma
is only used for "ANNP"
.In order to implement the EM algorithm, we need starting values of \(\boldsymbol{\theta}\) which can be set by the following arguments:
n.init
: number of repetitions of initialization to try. Note that the final estimates of \(\boldsymbol{\theta}\) depends on the starting values, so we try to get starting values n.init
times and use the one with the largest log-likelihood as the final starting values. In order to assign initial mixture membership to each observation, two methods are used: K-means and random clustering.seed
(optional): seed number for reproducibility of starting values.An example with tuna
data:
# Parameter estimation with `rrmix()` using the rank penalized method
<- rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP", lambda = 3,
tuna.mod seed = 100, n.init = 100)
We can find the estimated parameters, \(\hat{\boldsymbol{\theta}}=(\hat\pi_1,\hat{B}_1,\hat\Sigma_1,\cdots,\hat\pi_K,\hat{B}_K,\hat\Sigma_K)^\top\), with the command tuna.mod$para
.
# Estimated parameters
$para tuna.mod
## [[1]]
## [[1]]$pi
## [1] 0.8793335
##
## [[1]]$B
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## intercept 9.697739196 9.58004404 10.25972900 11.48293812 9.57257811 5.286435408 16.88180893
## NSALE1 0.264758797 -0.07357711 0.08848321 -0.06317495 -0.03905764 0.168597790 -0.34082103
## NSALE2 -0.030891006 0.15074311 -0.06805328 -0.17671419 -0.04559106 0.136535925 -0.19511789
## NSALE3 -0.167928967 -0.03270118 0.12373708 0.00584671 0.09893268 -0.006719683 -0.13678998
## NSALE4 -0.046126451 -0.05398677 0.02909986 -0.06003936 -0.11198196 -0.030637909 0.01746416
## NSALE5 0.037234871 0.10131262 -0.03173598 0.12825966 0.13640425 -0.019036824 -0.30948403
## NSALE6 -0.105596185 -0.11328508 -0.06710156 -0.16378112 -0.10056351 0.452642652 -0.24536842
## NSALE7 0.009898625 -0.13428886 -0.02874585 -0.19134459 -0.02986234 -0.103503684 0.50989322
## LPRICE1 -3.620489328 1.10874628 0.43853735 1.44290680 -0.07191063 0.274192590 0.21678455
## LPRICE2 0.605598440 -5.23779722 0.05152277 1.04481042 -0.16159788 0.315346800 -0.26706440
## LPRICE3 -1.312250355 -1.65032984 -4.18091735 -0.72328165 1.07934056 0.019059245 -0.48474486
## LPRICE4 0.757267996 0.91868374 -0.03822025 -5.00462654 -0.39856090 -0.043581011 0.28513770
## LPRICE5 1.245436719 1.85257681 0.32397858 0.74684669 -3.89550148 0.998771021 -0.92707576
## LPRICE6 -0.356928595 -0.98490400 -0.23129587 -2.43450428 -0.79775535 0.889300225 -6.44645553
## LPRICE7 0.245307808 -0.32982140 -0.43982712 -0.16059763 -0.13203062 -0.441057708 -2.18630625
##
## [[1]]$Sigma
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## MOVE1 0.10027979 0.022850112 0.014598890 0.03336132 0.019478733 -0.012177185 0.011459331
## MOVE2 0.02285011 0.239894750 0.021559759 0.02377434 0.031907926 -0.009215861 -0.008205517
## MOVE3 0.01459889 0.021559759 0.042124926 0.03644332 0.009534536 0.006753641 -0.004059046
## MOVE4 0.03336132 0.023774343 0.036443316 0.23853304 0.017091292 -0.018254510 -0.011638096
## MOVE5 0.01947873 0.031907926 0.009534536 0.01709129 0.035661859 0.001658638 0.006083920
## MOVE6 -0.01217719 -0.009215861 0.006753641 -0.01825451 0.001658638 0.086088152 -0.026781412
## MOVE7 0.01145933 -0.008205517 -0.004059046 -0.01163810 0.006083920 -0.026781412 0.302357944
##
##
## [[2]]
## [[2]]$pi
## [1] 0.1206665
##
## [[2]]$B
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## intercept 12.1618998 6.8135406 -19.7848459 11.134598356 8.36894956 -25.31234680 18.074540867
## NSALE1 -1.3799401 0.1082732 0.7756834 0.192467305 0.02149564 0.48124630 -0.024485074
## NSALE2 0.9309726 0.2748817 0.4271569 -0.706356643 0.10207620 0.97518008 -0.730762472
## NSALE3 1.9937937 -0.6066941 -0.1480685 -0.351401019 0.09757915 0.36671402 -0.407156358
## NSALE4 0.9637697 0.7211725 0.8428552 -0.005961994 0.69405966 0.86280190 -0.496122471
## NSALE5 -0.4147939 0.2721955 0.2844473 0.004886992 0.02884490 0.04020195 -0.021491251
## NSALE6 -0.6720243 0.2854669 1.3906442 0.509588914 0.16743924 0.77558586 -0.030125194
## NSALE7 1.2294988 -0.0724503 -2.0590187 0.245503899 0.27764132 -1.67780746 -0.004936125
## LPRICE1 -5.1225754 0.8407906 3.0890073 0.683339339 -0.11384347 3.81927062 0.914629856
## LPRICE2 1.0451861 -3.7460418 -1.8067515 -0.997563203 -0.28016256 0.22614671 -0.232220724
## LPRICE3 8.5212020 -1.3526899 -22.3310971 0.762034375 -2.52342959 -13.92983469 -0.627547939
## LPRICE4 3.9312856 2.4825187 -6.8921826 -2.987481081 4.06841349 -1.37724156 -0.484910359
## LPRICE5 0.8080700 -0.3677838 12.9709864 -1.101818729 -6.81076437 -1.69780303 -0.459393781
## LPRICE6 -5.8970984 2.1119481 24.5489270 -2.058549217 3.42907251 32.07627090 -7.890724801
## LPRICE7 1.6976674 -0.2024230 -9.9204652 1.918824618 0.70354917 -4.31166964 -4.466990108
##
## [[2]]$Sigma
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## MOVE1 0.462506694 -0.06348751 0.138542542 0.008465615 -0.01694162 0.01851249 -0.09422843
## MOVE2 -0.063487506 0.07349867 -0.088111118 0.029005883 0.03317261 -0.07376150 0.02811246
## MOVE3 0.138542542 -0.08811112 1.173924092 0.008063392 -0.14678689 0.84098527 -0.10013659
## MOVE4 0.008465615 0.02900588 0.008063392 0.080053501 0.01148986 -0.02862116 0.02509401
## MOVE5 -0.016941616 0.03317261 -0.146786892 0.011489858 0.05605736 -0.12300356 0.03315975
## MOVE6 0.018512495 -0.07376150 0.840985267 -0.028621155 -0.12300356 0.73881130 -0.03049476
## MOVE7 -0.094228429 0.02811246 -0.100136595 0.025094013 0.03315975 -0.03049476 0.07795657
Besides the estimated parameter \(\hat{\boldsymbol{\theta}}\), one of the fundamental tasks in reduced rank estimation is rank determination. The estimated ranks of \(K\) coefficient matrices can be found by tuna.mod$est.rank
as follows. In this case with \(\lambda = 3\), both coefficient matrices turned out to have full rank since \(\min(p,q)=7\).
# Estimated ranks of coefficient matrices
$est.rank tuna.mod
## [1] 7 7
The package provides summary()
and plot()
methods for the rrmix
object. With the summary()
function, we can see summarized information of the fitted model. We can see that the EM algorithm is terminated after 57 iterations and the BIC of the fitted model is 3115.056.
# Summarize the estimation results
summary(tuna.mod)
##
## Call:
## rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP", lambda = 3, seed = 100, n.init = 100)
##
##
##
## Method: Rank penalized mixture
##
## Initialization: K-means and random clustering
##
## Tuning Parameters:
## lambda: 3
##
## Fitted Model:
## Number of components: 2
## Log-likelihood: -780.1514
## Penalized log-likelihood: -843.1514
## BIC: 3115.056
##
##
## Number of Iterations: 57
The plot()
function shows the log-likelihood and penalized log-likelihood values at each iteration as can be seen below. The ascent property of the penalized log-likelihood can be observed.
# Visualize the estimation results
plot(tuna.mod)
##
## F: Penalized log-likelihood
## L: Log-likelihood
tune.rrmix()
As shown above, the rrmix()
function estimates parameters with fixed \(K\) (number of mixture components), \(\lambda\) and \(\gamma\) (tuning parameters). In practice, choosing proper values of \(K\), \(\lambda\) and \(\gamma\) is important. This can be done using the tune.rrmix()
function by performing a grid search over user-specified parameter ranges. Notice that the full-ranked method ("FR"
) has no tuning parameters, the rank penalized method ("RP"
) only has one tuning parameter, \(\lambda\). The adaptive nuclear norm penalized method ("ANNP"
) has two tuning parameters, \(\lambda\) and \(\gamma\).
Suppose one wants to tune \(\lambda\) for the rank penalized method with fixed \(K=2\). We can try the following with a set of candidate values of \(\lambda\):
# Parameter tuning with `tune.rrmix()` using the rank penalized method
<- tune.rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP",
tuna.tune1 lambda = exp(seq(0, log(20), length = 20)),
seed = 100, n.init = 100)
The tune.rrmix
object also has summary()
and plot()
methods. Both functions find an optimal tuning parameter that provides the best performance metric - the smallest BIC in this case - among the given set of candidate values.
# Summarize the results
summary(tuna.tune1)
##
## Call:
## tune.rrmix(K = 2, X = tunaX, Y = tunaY, est = "RP", lambda = exp(seq(0, log(20), length = 20)), seed = 100, n.init = 100)
##
##
## Parameter tuning of 'rrmix':
##
## - Method: Rank penalized mixture
##
## - Initialization: K-means and random clustering
##
## - Performance metric: bic
##
## - Best performance: 2963.037
##
## - Best parameters:
## K lambda
## 2 9.09188
##
## - Detailed performance results:
## K lambda loglik penalty penloglik npar bic est.rank1 est.rank2
## 1 2 1.000000 -780.1514 7.000000 -787.1514 267 3115.056 7 7
## 2 2 1.170780 -780.1514 9.595079 -789.7465 267 3115.056 7 7
## 3 2 1.370726 -780.1514 13.152221 -793.3036 267 3115.056 7 7
## 4 2 1.604818 -780.1514 18.028086 -798.1795 267 3115.056 7 7
## 5 2 1.878889 -780.1514 24.711559 -804.8630 267 3115.056 7 7
## 6 2 2.199765 -780.1514 33.872767 -814.0242 267 3115.056 7 7
## 7 2 2.575441 -780.1514 46.430269 -826.5817 267 3115.056 7 7
## 8 2 3.015274 -780.1514 63.643158 -843.7946 267 3115.056 7 7
## 9 2 3.530223 -780.1514 87.237306 -867.3887 267 3115.056 7 7
## 10 2 4.133114 -778.9185 111.037095 -889.9556 258 3060.183 7 6
## 11 2 4.838967 -778.9185 152.201389 -931.1199 258 3060.183 7 6
## 12 2 5.665365 -778.9185 208.626341 -987.5448 258 3060.183 7 6
## 13 2 6.632896 -756.4340 285.969468 -1042.4035 258 3015.214 7 6
## 14 2 7.765661 -798.5918 361.832928 -1160.4247 247 3035.476 7 5
## 15 2 9.091880 -800.2222 454.642521 -1254.8648 234 2963.037 7 4
## 16 2 10.644590 -906.4122 566.536496 -1472.9487 225 3123.010 6 4
## 17 2 12.462472 -1139.2845 543.596258 -1682.8807 186 3361.655 4 3
## 18 2 14.590812 -1149.9582 638.675408 -1788.6336 169 3284.011 4 2
## 19 2 17.082630 -1149.9582 875.448736 -2025.4070 169 3284.011 4 2
## 20 2 20.000000 -1350.4807 800.000000 -2150.4807 135 3487.073 3 1
When only one parameter is considered for tuning with other parameters being fixed, the parameter is on the x-axis and the performance metric is on the y-axis when using the plot()
function. In this case, we use a grid of 20 \(\lambda\) values equally spaced on the log scale, we can set transform.x = log
for a better illustration.
# Visualize the results
plot(tuna.tune1, transform.x = log, xlab = "log(lambda)")
Now suppose one wants to tune both \(K\) and \(\lambda\). We can try the following command with setting K.max
instead of K
. If K.max
is 3, the tune.rrmix()
function try all cases where \(K\) is 1 through 3. Note that \(K=1\) indicates a non-mixture case. Hence, the tune.rrmix()
function can automatically examine whether assuming heterogeneity of data is appropriate for a given data set.
# Parameter tuning with `tune.rrmix()` using the rank penalized method
<- tune.rrmix(K.max = 3, X = tunaX, Y = tunaY, est = "RP",
tuna.tune2 lambda = exp(seq(0, log(20), length = 20)),
seed = 100, n.init = 100)
# Summarize the results
summary(tuna.tune2)
##
## Call:
## tune.rrmix(K.max = 3, X = tunaX, Y = tunaY, est = "RP", lambda = exp(seq(0, log(20), length = 20)), seed = 100, n.init = 100)
##
##
## Parameter tuning of 'rrmix':
##
## - Method: Rank penalized mixture
##
## - Initialization: K-means and random clustering
##
## - Performance metric: bic
##
## - Best performance: 2963.037
##
## - Best parameters:
## K lambda
## 2 9.09188
##
## - Detailed performance results:
## K lambda loglik penalty penloglik npar bic est.rank1 est.rank2 est.rank3
## 1 1 1.000000 -1377.0748 7.000000 -1384.0748 133 3528.615 7 NA NA
## 2 1 1.170780 -1377.0748 9.595079 -1386.6699 133 3528.615 7 NA NA
## 3 1 1.370726 -1377.0748 13.152221 -1390.2271 133 3528.615 7 NA NA
## 4 1 1.604818 -1377.0748 18.028086 -1395.1029 133 3528.615 7 NA NA
## 5 1 1.878889 -1377.0748 24.711559 -1401.7864 133 3528.615 7 NA NA
## 6 1 2.199765 -1377.0748 33.872767 -1410.9476 133 3528.615 7 NA NA
## 7 1 2.575441 -1377.0748 46.430269 -1423.5051 133 3528.615 7 NA NA
## 8 1 3.015274 -1434.5586 54.551278 -1489.1098 124 3591.175 6 NA NA
## 9 1 3.530223 -1434.5586 74.774834 -1509.3334 124 3591.175 6 NA NA
## 10 1 4.133114 -1434.5586 102.495780 -1537.0543 124 3591.175 6 NA NA
## 11 1 4.838967 -1566.3163 117.077991 -1683.3943 113 3790.637 5 NA NA
## 12 1 5.665365 -1566.3163 160.481801 -1726.7981 113 3790.637 5 NA NA
## 13 1 6.632896 -1566.3163 219.976514 -1786.2928 113 3790.637 5 NA NA
## 14 1 7.765661 -1566.3163 301.527440 -1867.8438 113 3790.637 5 NA NA
## 15 1 9.091880 -1741.3884 247.986830 -1989.3752 85 3977.736 3 NA NA
## 16 1 10.644590 -1741.3884 339.921898 -2081.3103 85 3977.736 3 NA NA
## 17 1 12.462472 -1741.3884 465.939649 -2207.3280 85 3977.736 3 NA NA
## 18 1 14.590812 -1941.6003 425.783606 -2367.3839 68 4279.168 2 NA NA
## 19 1 17.082630 -2173.2818 291.816245 -2465.0980 49 4631.893 1 NA NA
## 20 1 20.000000 -2173.2818 400.000000 -2573.2818 49 4631.893 1 NA NA
## 21 2 1.000000 -780.1514 7.000000 -787.1514 267 3115.056 7 7 NA
## 22 2 1.170780 -780.1514 9.595079 -789.7465 267 3115.056 7 7 NA
## 23 2 1.370726 -780.1514 13.152221 -793.3036 267 3115.056 7 7 NA
## 24 2 1.604818 -780.1514 18.028086 -798.1795 267 3115.056 7 7 NA
## 25 2 1.878889 -780.1514 24.711559 -804.8630 267 3115.056 7 7 NA
## 26 2 2.199765 -780.1514 33.872767 -814.0242 267 3115.056 7 7 NA
## 27 2 2.575441 -780.1514 46.430269 -826.5817 267 3115.056 7 7 NA
## 28 2 3.015274 -780.1514 63.643158 -843.7946 267 3115.056 7 7 NA
## 29 2 3.530223 -780.1514 87.237306 -867.3887 267 3115.056 7 7 NA
## 30 2 4.133114 -778.9185 111.037095 -889.9556 258 3060.183 7 6 NA
## 31 2 4.838967 -778.9185 152.201389 -931.1199 258 3060.183 7 6 NA
## 32 2 5.665365 -778.9185 208.626341 -987.5448 258 3060.183 7 6 NA
## 33 2 6.632896 -756.4340 285.969468 -1042.4035 258 3015.214 7 6 NA
## 34 2 7.765661 -798.5918 361.832928 -1160.4247 247 3035.476 7 5 NA
## 35 2 9.091880 -800.2222 454.642521 -1254.8648 234 2963.037 7 4 NA
## 36 2 10.644590 -906.4122 566.536496 -1472.9487 225 3123.010 6 4 NA
## 37 2 12.462472 -1139.2845 543.596258 -1682.8807 186 3361.655 4 3 NA
## 38 2 14.590812 -1149.9582 638.675408 -1788.6336 169 3284.011 4 2 NA
## 39 2 17.082630 -1149.9582 875.448736 -2025.4070 169 3284.011 4 2 NA
## 40 2 20.000000 -1350.4807 800.000000 -2150.4807 135 3487.073 3 1 NA
## 41 3 1.000000 -416.3847 10.500000 -426.8847 401 3167.811 7 7 7
## 42 3 1.170780 -416.3847 14.392619 -430.7774 401 3167.811 7 7 7
## 43 3 1.370726 -416.3847 19.728331 -436.1131 401 3167.811 7 7 7
## 44 3 1.604818 -416.3847 27.042129 -443.4269 401 3167.811 7 7 7
## 45 3 1.878889 -416.3847 37.067338 -453.4521 401 3167.811 7 7 7
## 46 3 2.199765 -416.3847 50.809150 -467.1939 401 3167.811 7 7 7
## 47 3 2.575441 -416.3847 69.645403 -486.0301 401 3167.811 7 7 7
## 48 3 3.015274 -416.3847 95.464737 -511.8495 401 3167.811 7 7 7
## 49 3 3.530223 -416.3847 130.855959 -547.2407 401 3167.811 7 7 7
## 50 3 4.133114 -416.3847 179.367614 -595.7524 401 3167.811 7 7 7
## 51 3 4.838967 -434.3091 234.155983 -668.4651 392 3151.252 6 7 7
## 52 3 5.665365 -486.5614 304.915421 -791.4769 383 3203.349 6 7 6
## 53 3 6.632896 -555.2958 417.955376 -973.2512 383 3340.818 6 7 6
## 54 3 7.765661 -547.1041 512.596648 -1059.7007 361 3196.328 5 7 5
## 55 3 9.091880 -667.3193 619.967075 -1287.2864 339 3308.651 5 6 4
## 56 3 10.644590 -823.4459 736.497445 -1559.9434 315 3481.151 5 4 4
## 57 3 12.462472 -878.4625 931.879299 -1810.3418 300 3503.839 5 4 3
## 58 3 14.590812 -973.6083 958.013112 -1931.6215 255 3432.093 3 4 2
## 59 3 17.082630 -1090.1773 875.448736 -1965.6260 204 3368.256 1 3 2
## 60 3 20.000000 -1379.2821 1000.000000 -2379.2821 187 3847.474 1 2 2
When \(K\) and \(\lambda\) are tuned, a contour plot of the performance metric (BIC) can be drawn using the plot()
function with \(K\) and \(\lambda\) being on the x- and y-axis, respectively.
# Visualize the results
plot(tuna.tune2, transform.y = log, ylab = "log(lambda)")
As we can see above, the best model is the one with \(K=2\) and \(\lambda = 8.728963\), implying that the data is heterogeneous and consisting of 2 subpopulations.
Note also that, if candidate values of \(\lambda\) are not pre-specified, i.e., if lambda = NULL
in tune.rrmix()
, a data-adaptive range of \(\lambda\) for tuning will internally be determined. In this case, users can set the argument n.lambda
which specifies the number of \(\lambda\) values to be explored in the range. Furthermore, if candidate values of \(\gamma\) are not specified with est = "ANNP"
in tune.rrmix()
, gamma = 2
will be used by default since it generally showed good performance in Chen et al. (2013).
Let’s see how we can interpret the final model determined by tune.rrmix()
. As mentioned earlier, \(\hat{\boldsymbol{\theta}}=(\hat\pi_1,\hat{B}_1,\hat\Sigma_1,\cdots,\hat\pi_K,\hat{B}_K,\hat\Sigma_K)^\top\) can be obtained by best.mod$para
.
# The final model
<- summary(tuna.tune2)$best.K
best.K <- summary(tuna.tune2)$best.lambda
best.lambda <- rrmix(K = best.K, X = tunaX, Y = tunaY, est = "RP", lambda = best.lambda,
best.mod seed = 100, n.init = 100)
summary(best.mod)
##
## Call:
## rrmix(K = best.K, X = tunaX, Y = tunaY, est = "RP", lambda = best.lambda, seed = 100, n.init = 100)
##
##
##
## Method: Rank penalized mixture
##
## Initialization: K-means and random clustering
##
## Tuning Parameters:
## lambda: 9.09188
##
## Fitted Model:
## Number of components: 2
## Log-likelihood: -800.2222
## Penalized log-likelihood: -1254.865
## BIC: 2963.037
##
##
## Number of Iterations: 54
$para best.mod
## [[1]]
## [[1]]$pi
## [1] 0.881181
##
## [[1]]$B
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## intercept 9.831832635 9.65864042 10.14738007 11.1979483392 9.48687846 6.153218054 17.18630762
## NSALE1 0.245720206 -0.07798723 0.08916616 -0.0549966149 -0.03548999 0.143390850 -0.36154379
## NSALE2 -0.036411994 0.14667386 -0.06105783 -0.1633427541 -0.03813799 0.100875057 -0.21237738
## NSALE3 -0.175968205 -0.03849338 0.11794301 0.0003528289 0.09136335 -0.006991503 -0.13725684
## NSALE4 -0.011456346 -0.03916083 0.03453038 -0.0711003217 -0.10492094 -0.024757594 0.02609806
## NSALE5 0.046909727 0.10183915 -0.03315014 0.1204480450 0.13378362 -0.015261578 -0.30355839
## NSALE6 -0.110186495 -0.11943755 -0.06486655 -0.1577068083 -0.09670243 0.409968442 -0.25879877
## NSALE7 0.004071193 -0.13539608 -0.01688634 -0.1759992155 -0.02189734 -0.137505047 0.50195911
## LPRICE1 -3.672011093 1.10326249 0.41067244 1.4165535971 -0.08589023 0.314033147 0.19650308
## LPRICE2 0.596909979 -5.25728296 0.06323593 1.0739359825 -0.16848094 0.229513404 -0.29174927
## LPRICE3 -1.306251346 -1.65267666 -4.21361349 -0.7814175830 0.99484564 0.165184982 -0.41794841
## LPRICE4 0.786561736 0.93243282 -0.01614248 -4.9895399813 -0.38006124 -0.058316769 0.28721163
## LPRICE5 1.375630514 1.88822889 0.44238323 0.8539790130 -3.87533825 0.407177105 -1.03070115
## LPRICE6 -0.502635251 -1.05582296 -0.16117212 -2.2058663320 -0.70630650 0.294804445 -6.68745467
## LPRICE7 0.275847305 -0.31984729 -0.42179987 -0.1370997993 -0.14293543 -0.555428679 -2.19622222
##
## [[1]]$Sigma
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## MOVE1 0.10433508 0.023733541 0.015521534 0.03247732 0.019452039 -0.014019707 0.011213734
## MOVE2 0.02373354 0.239529549 0.022272681 0.02428049 0.032421400 -0.011098250 -0.009117916
## MOVE3 0.01552153 0.022272681 0.042279754 0.03658732 0.010310018 0.008806337 -0.003876825
## MOVE4 0.03247732 0.024280488 0.036587324 0.23743596 0.017621679 -0.013435994 -0.010668746
## MOVE5 0.01945204 0.032421400 0.010310018 0.01762168 0.037183284 0.003938157 0.005520488
## MOVE6 -0.01401971 -0.011098250 0.008806337 -0.01343599 0.003938157 0.066793640 -0.034358863
## MOVE7 0.01121373 -0.009117916 -0.003876825 -0.01066875 0.005520488 -0.034358863 0.298840993
##
##
## [[2]]
## [[2]]$pi
## [1] 0.118819
##
## [[2]]$B
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## intercept 10.75741485 3.05845114 24.58595902 11.65593374 4.356064495 6.222446295 15.38872468
## NSALE1 -0.05534936 0.05831940 0.12594001 0.14424588 0.006196908 0.034632335 -0.05045462
## NSALE2 0.06893836 0.49625290 -1.39789028 -0.62327880 0.183635380 -0.099207906 -0.36455311
## NSALE3 0.35931685 -0.27753088 -1.12758206 -0.34110929 0.310196484 -0.006793506 0.11600480
## NSALE4 0.17604880 0.86240476 -1.52812832 -0.32588892 0.437330138 0.065938270 -0.42558206
## NSALE5 -0.05791128 0.22839832 0.18319877 0.06554551 -0.028323271 0.021209575 -0.05856984
## NSALE6 0.01319062 0.28873577 0.57403352 0.53897724 0.115680162 0.190861213 0.04077665
## NSALE7 0.27858992 -0.01130092 -0.26308287 -0.06083649 0.185879055 0.061231734 0.20101446
## LPRICE1 -0.07084777 0.50773376 0.52030640 0.83309298 0.235197471 0.288331118 -0.15013008
## LPRICE2 0.02204864 -3.47196226 -2.94020979 -1.05188581 0.005805008 -0.563784272 0.02242193
## LPRICE3 -2.16412699 -1.48644936 -0.03924001 0.07432347 -1.213040603 -0.683191409 -1.80261974
## LPRICE4 3.57952823 2.78110427 -10.34939108 -4.40864029 2.944292619 0.068353213 0.42155101
## LPRICE5 -3.90373082 0.72735230 10.26220480 -2.58037770 -5.788255976 -2.167315583 -0.37680378
## LPRICE6 1.70570061 4.94764428 -20.55093102 -2.16448806 5.870869263 0.849327803 -5.32076485
## LPRICE7 -1.89520208 0.26344154 -5.69867364 0.60316534 0.825501480 -0.091956650 -3.72405036
##
## [[2]]$Sigma
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## MOVE1 1.55349212 -0.21819091 0.4478371 0.01521395 -0.08972691 0.02398342 -0.34739213
## MOVE2 -0.21819091 0.10541875 -0.2066950 0.03113594 0.04890543 -0.12597494 0.07756811
## MOVE3 0.44783706 -0.20669504 2.3161115 -0.09818200 -0.17691700 1.51793400 -0.35143336
## MOVE4 0.01521395 0.03113594 -0.0981820 0.08974138 0.02265828 -0.11688499 0.01242897
## MOVE5 -0.08972691 0.04890543 -0.1769170 0.02265828 0.09515097 -0.21162975 0.02438693
## MOVE6 0.02398342 -0.12597494 1.5179340 -0.11688499 -0.21162975 1.33305219 -0.08158394
## MOVE7 -0.34739213 0.07756811 -0.3514334 0.01242897 0.02438693 -0.08158394 0.17317664
The estimated ranks of \(\hat{B}_1\) and \(\hat{B}_2\) are 7 and 4, respectively, as follows. In this case, the first coefficient matrix turned out to have full rank and the second coefficient matrix had low-rank structure since full rank is \(\min(p,q)=7\).
# Estimated ranks of coefficient matrices
$est.rank best.mod
## [1] 7 4
We can also get information on the membership of mixture components using the following command. The final model suggests that, among \(n=338\) observations, 299 observations belong to the first mixture component and the remaining 39 observations belong to the second mixture component.
# Membership information of mixture components
$ind best.mod
## [1] 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 2 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [98] 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [195] 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## [292] 1 1 1 1 1 1 1 1 1 2 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1
$n.est best.mod
## [1] 298 40
rrpack
The package rrpack
allows users to use a variety of reduced-rank estimation methods in multivariate regression. For example, the rrr()
function with the argument penaltySVD = "ann"
provides a solution of the adaptive nuclear norm penalized estimator (Chen et al., 2013). One difference between the rrr()
function from the rrpack
package and the rrmix()
function from the rrMixture
package is that rrmix()
incorporates the idea of mixture models. Therefore, when the number of mixture components is \(K=1\), rrmix()
produces the same solution with that of rrr()
with the same tuning parameter(s). Let’s see an example using the tuna
data.
# rrpack package
require(rrpack)
<- rrr(Y = as.matrix(tunaY), X = as.matrix(tunaX),
rfit modstr = list(lambda = 3, gamma = 2), penaltySVD = "ann")
# estimated coefficient matrix
coef(rfit)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## intercept 11.03998749 9.83249740 3.56373788 10.70868788 10.19160504 2.72918239 16.83806187
## NSALE1 0.07121642 -0.08071614 0.39618150 -0.06486088 -0.06339453 0.38363150 -0.30735354
## NSALE2 -0.05658269 0.24055106 -0.02932187 -0.26515306 -0.04092540 0.12900651 -0.21566104
## NSALE3 0.02024762 -0.10537125 0.12965432 0.01079927 0.07390348 0.03451341 -0.14329480
## NSALE4 -0.02895474 -0.03509104 0.16751885 -0.05675389 -0.12375623 0.02407959 -0.01758902
## NSALE5 0.02998639 0.12148697 -0.12686690 0.15788231 0.15495945 -0.04865510 -0.31089527
## NSALE6 -0.19737220 -0.10438770 0.37455663 -0.13469460 -0.12757906 0.56894979 -0.22060351
## NSALE7 0.04594728 -0.12055535 -0.06720603 -0.22011430 -0.04092932 -0.11660787 0.47469217
## LPRICE1 -4.28741270 1.03192398 1.39870205 1.42080926 -0.24789623 1.14221414 0.41978453
## LPRICE2 0.60813348 -4.65651982 -0.55593757 0.43919224 -0.18971462 0.01751022 -0.20724822
## LPRICE3 -0.09598038 -1.90877300 -5.42734434 -0.51100895 1.04589913 -0.75185626 -0.54730109
## LPRICE4 1.13781735 0.89342271 -1.16647791 -4.84616334 -0.20130401 -0.53258880 0.22146657
## LPRICE5 1.15223276 1.52539481 2.45035300 0.96955101 -4.02048740 0.86906867 -1.27826228
## LPRICE6 -1.85448362 -0.92699736 4.44614639 -2.04304693 -1.22949573 3.15448413 -6.28101320
## LPRICE7 0.59922556 -0.40628855 -1.51981370 -0.16617315 -0.15146697 -0.80683139 -2.38651214
# estimated rank of the coefficient matrix
$rank rfit
## [1] 7
# rrMixture package
<- rrmix(K = 1, Y = tunaY, X = tunaX, lambda = 3, gamma = 2, est = "ANNP")
rfit2 # estimated coefficient matrix
$para[[1]]$B rfit2
## MOVE1 MOVE2 MOVE3 MOVE4 MOVE5 MOVE6 MOVE7
## intercept 11.03998749 9.83249740 3.56373788 10.70868788 10.19160504 2.72918239 16.83806187
## NSALE1 0.07121642 -0.08071614 0.39618150 -0.06486088 -0.06339453 0.38363150 -0.30735354
## NSALE2 -0.05658269 0.24055106 -0.02932187 -0.26515306 -0.04092540 0.12900651 -0.21566104
## NSALE3 0.02024762 -0.10537125 0.12965432 0.01079927 0.07390348 0.03451341 -0.14329480
## NSALE4 -0.02895474 -0.03509104 0.16751885 -0.05675389 -0.12375623 0.02407959 -0.01758902
## NSALE5 0.02998639 0.12148697 -0.12686690 0.15788231 0.15495945 -0.04865510 -0.31089527
## NSALE6 -0.19737220 -0.10438770 0.37455663 -0.13469460 -0.12757906 0.56894979 -0.22060351
## NSALE7 0.04594728 -0.12055535 -0.06720603 -0.22011430 -0.04092932 -0.11660787 0.47469217
## LPRICE1 -4.28741270 1.03192398 1.39870205 1.42080926 -0.24789623 1.14221414 0.41978453
## LPRICE2 0.60813348 -4.65651982 -0.55593757 0.43919224 -0.18971462 0.01751022 -0.20724822
## LPRICE3 -0.09598038 -1.90877300 -5.42734434 -0.51100895 1.04589913 -0.75185626 -0.54730109
## LPRICE4 1.13781735 0.89342271 -1.16647791 -4.84616334 -0.20130401 -0.53258880 0.22146657
## LPRICE5 1.15223276 1.52539481 2.45035300 0.96955101 -4.02048740 0.86906867 -1.27826228
## LPRICE6 -1.85448362 -0.92699736 4.44614639 -2.04304693 -1.22949573 3.15448413 -6.28101320
## LPRICE7 0.59922556 -0.40628855 -1.51981370 -0.16617315 -0.15146697 -0.80683139 -2.38651214
# estimated rank of the coefficient matrix
$est.rank rfit2
## [1] 7
We can see that the two sets of estimation results are consistent.
This document illustrates the usage of the rrMixture
package. For illustrative purposes, some but not all functions are described. For details of the estimation methods and additional functions of the package, see Kang et al. (2022+) and the R package manual.