The EAinference package is designed to facilitate simulation-based inference for lasso-type estimators. The package includes lassoFit, PBsampler, hdIS, MHLS, and postInference.MHLS, which we demonstrate below.
Let the unknown true model be \[ y = X\beta_0 + \epsilon,\] where \(\beta_0\) is an unknown true coefficient vector and \(\epsilon_i \sim_{iid} N(0,\sigma^2)\).
We use the following loss functions: \[ \ell_{grlasso}(\beta ; \lambda) = \frac{1}{2}||y-X\beta||^2_2 + n\lambda\sum_j w_j||\beta_{(j)}||_2,\] \[ \ell_{sgrlasso}(\beta, \sigma ; \lambda) = \frac{1}{2\sigma}||y-X\beta||^2_2 + \frac{\sigma}{2} + n\lambda\sum_j w_j||\beta_{(j)}||_2,\] where \(grlasso\) and \(sgrlasso\) denote the group lasso and the scaled group lasso, respectively, and \(w_j\) is the weight factor for the \(j\)-th group. The loss functions for the lasso and the scaled lasso are the special cases of the group lasso and scaled group lasso losses in which every group is a singleton.
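To make the notation concrete, here is a minimal sketch that evaluates the group lasso loss above; grlassoLoss is a hypothetical helper written for this vignette, not a function exported by EAinference.

# Hypothetical helper: evaluate the group lasso loss defined above.
grlassoLoss <- function(Y, X, beta, lbd, group, weights) {
  n <- nrow(X)
  # l2 norm of each coefficient group, ||beta_(j)||_2
  groupNorms <- sapply(unique(group), function(j) sqrt(sum(beta[group == j]^2)))
  0.5 * sum((Y - X %*% beta)^2) + n * lbd * sum(weights * groupNorms)
}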
lassoFit computes the lasso, group lasso, scaled lasso, and scaled group lasso solutions. Users can either provide the value of \(\lambda\) or choose cross-validation by setting lbd = "cv.min" or lbd = "cv.1se". With lbd = "cv.min", \(\lambda\) is the value that minimizes the cross-validated mean squared error, while lbd = "cv.1se" selects the largest \(\lambda\) whose error is within one standard error of that of "cv.min".
library(EAinference)
set.seed(1234)
n <- 30; p <- 50
Niter <- 10
group <- rep(1:(p/5), each = 5)
weights <- rep(1, p/5)
beta0 <- c(rep(1,5), rep(0, p-5))
X <- matrix(rnorm(n*p), n)
Y <- X %*% beta0 + rnorm(n)
# set lambda = .5
lassoFit(X = X, Y = Y, lbd = .5, group = group, weights = weights, type="grlasso")
## $B0
## [1] 0.5161666451 1.0188194987 0.6824793388 0.3931465866 0.6874994287
## [6] -0.0016276550 0.0217320293 0.0301747719 -0.0099519996 -0.0164145624
## [11] -0.0014113541 -0.0045361968 0.0020838981 0.0005231764 -0.0052660052
## [16] 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## [21] 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## [26] 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## [31] 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## [36] 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
## [41] -0.0017693962 -0.0011635879 0.0061087716 0.0010330827 -0.0028810035
## [46] 0.0000000000 0.0000000000 0.0000000000 0.0000000000 0.0000000000
##
## $S0
## [1] 0.33335824 0.65800255 0.44077947 0.25391509 0.44402031
## [6] -0.03886814 0.51891926 0.72052006 -0.23763656 -0.39194621
## [11] -0.19045450 -0.61212553 0.28120884 0.07059705 -0.71060849
## [16] 0.46931065 0.33721146 0.11896615 0.29502880 0.03505243
## [21] 0.05203069 -0.12061855 0.04603167 -0.40549602 -0.50026213
## [26] 0.17202391 0.45006926 -0.17412122 -0.80287493 -0.10071653
## [31] -0.25984436 0.01141286 0.24753340 0.62774880 0.04721167
## [36] 0.33710419 -0.09279505 -0.04843598 -0.43430001 0.40143342
## [41] -0.24735341 -0.16266440 0.85398270 0.14442166 -0.40275203
## [46] 0.53179122 -0.20151832 -0.07230133 -0.45279215 0.21106223
##
## $lbd
## [1] 0.5
##
## $weights
## [1] 1 1 1 1 1 1 1 1 1 1
##
## $group
## [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5
## [24] 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 10
## [47] 10 10 10 10
# use cross-validation
lassoFit(X = X, Y = Y, lbd = "cv.1se", group = group, weights = weights, type="grlasso")
## $B0
## [1] 0.2585567 0.5662969 0.3423216 0.1790861 0.3370465 0.0000000 0.0000000
## [8] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [15] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [22] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [29] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [36] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [43] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [50] 0.0000000
##
## $S0
## [1] 0.320602499 0.702188704 0.424466187 0.222062331 0.417926026
## [6] -0.082467976 0.277804727 0.096219628 -0.208633518 -0.139425279
## [11] -0.045785563 -0.085990733 0.311352884 0.011898323 -0.431104404
## [16] 0.434035959 0.080054096 0.099761888 0.383888283 -0.046916332
## [21] -0.142767164 -0.004633968 0.139308032 -0.385795767 -0.281788517
## [26] 0.100791387 0.170594118 0.002935687 -0.284587073 -0.021219450
## [31] -0.152129053 0.174903539 0.117935249 0.246820956 0.035916060
## [36] 0.306760303 -0.064360881 -0.114288527 -0.306805271 0.216458545
## [41] -0.179758354 -0.247343130 0.344814852 -0.033834872 -0.179308622
## [46] 0.257759246 0.067089513 -0.164422715 -0.196709827 0.228724776
##
## $lbd
## [1] 1.475293
##
## $weights
## [1] 1 1 1 1 1 1 1 1 1 1
##
## $group
## [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4 5 5 5
## [24] 5 5 6 6 6 6 6 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 10
## [47] 10 10 10 10
PBsampler supports two bootstrap methods: Gaussian bootstrap and wild multiplier bootstrap. Because the sampling distribution of \((\hat{\beta}, S)\), the coefficient estimator and its subgradient, is characterized by \((\beta_0, \sigma^2, \lambda)\), users are required to provide PE (an estimate of either \(\beta_0\) or \(E(y) = X\beta_0\)), sig2 (an estimate of \(\sigma^2\)) and lbd (the \(\lambda\) from the loss functions above).
By specifying two sets of arguments, (PE_1, sig2_1, lbd_1) and (PE_2, sig2_2, lbd_2), users can sample from a mixture distribution: each sample is drawn from (PE_1, sig2_1, lbd_1) with probability 1/2 and from (PE_2, sig2_2, lbd_2) with probability 1/2.
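Conceptually, the mixture draw works as in the following sketch; this is only an illustration of the scheme just described with placeholder parameter values, not EAinference internals.

# Illustration only: each bootstrap draw uses parameter set 1 or set 2
# with equal probability.
params1 <- list(PE = rep(0, 10), sig2 = 1, lbd = 0.5)
params2 <- list(PE = rep(1, 10), sig2 = 2, lbd = 0.3)
pickParams <- function() if (runif(1) < 0.5) params1 else params2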
set.seed(1234)
n <- 5; p <- 10
Niter <- 3
group <- rep(1:(p/5), each = 5)
weights <- rep(1, p/5)
beta0 <- c(rep(1,5), rep(0, p-5))
X <- matrix(rnorm(n*p), n)
Y <- X %*% beta0 + rnorm(n)
#
# Using non-mixture distribution
#
PBsampler(X = X, PE_1 = rep(0, p), sig2_1 = 1, lbd_1 = .5,
weights = weights, group = group, type = "grlasso", niter = Niter, parallel = FALSE)
## $beta
## [,1] [,2] [,3] [,4] [,5]
## [1,] -0.3525372 -0.04505232 -0.163496084 -0.20736594 0.04096835
## [2,] 0.0000000 0.00000000 0.000000000 0.00000000 0.00000000
## [3,] -0.1019592 -0.01026948 0.007230875 -0.04528641 0.02613062
## [,6] [,7] [,8] [,9] [,10]
## [1,] 0.07989826 0.01441626 -0.078991476 -0.01728012 -0.02587893
## [2,] 0.19450800 0.10014585 -0.449697335 -0.06017330 -0.11895590
## [3,] -0.48573785 -0.02475042 0.003966679 0.16608124 -0.22664921
##
## $subgrad
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -0.7928269 -0.10133122 -0.36770991 -0.4663535 0.09212409 0.6801195
## [2,] 0.5445986 0.06272791 -0.41645041 0.2135834 -0.29673172 0.3757996
## [3,] -0.8845247 -0.08909051 0.06272808 -0.3928755 0.22669099 -0.8647542
## [,7] [,8] [,9] [,10]
## [1,] 0.12271068 -0.672414957 -0.1471035 -0.2202979
## [2,] 0.19352241 -0.868867603 -0.1162979 -0.2298557
## [3,] -0.04405627 0.007096803 0.2956620 -0.4035373
##
## $X
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -1.2070657 0.5060559 -0.47719270 -0.1102855 0.1340882 -1.4482049
## [2,] 0.2774292 -0.5747400 -0.99838644 -0.5110095 -0.4906859 0.5747557
## [3,] 1.0844412 -0.5466319 -0.77625389 -0.9111954 -0.4405479 -1.0236557
## [4,] -2.3456977 -0.5644520 0.06445882 -0.8371717 0.4595894 -0.0151383
## [5,] 0.4291247 -0.8900378 0.95949406 2.4158352 -0.6937202 -0.9359486
## [,7] [,8] [,9] [,10]
## [1,] 1.1022975 -1.1676193 1.4494963 -0.9685143
## [2,] -0.4755931 -2.1800396 -1.0686427 -1.1073182
## [3,] -0.7094400 -1.3409932 -0.8553646 -1.2519859
## [4,] -0.5012581 -0.2942939 -0.2806230 -0.5238281
## [5,] -1.6290935 -0.4658975 -0.9943401 -0.4968500
##
## $PE
## [1] 0 0 0 0 0 0 0 0 0 0
##
## $sig2
## [1] 1
##
## $lbd
## [1] 0.5
##
## $weights
## [1] 1 1
##
## $group
## [1] 1 1 1 1 1 2 2 2 2 2
##
## $type
## [1] "grlasso"
##
## $PEtype
## [1] "coeff"
##
## $Btype
## [1] "gaussian"
##
## $Y
## NULL
##
## $mixture
## [1] FALSE
##
## $call
## PBsampler(X = X, PE_1 = rep(0, p), sig2_1 = 1, lbd_1 = 0.5, weights = weights,
## group = group, niter = Niter, type = "grlasso", parallel = FALSE)
##
## attr(,"class")
## [1] "PB"
#
# Using mixture distribution
#
PBsampler(X = X, PE_1 = rep(0, p), sig2_1 = 1, lbd_1 = .5,
PE_2 = rep(1, p), sig2_2 = 2, lbd_2 = .3, weights = weights,
group = group, type = "grlasso", niter = Niter, parallel = FALSE)
## $beta
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 0.0000000 0.0000000 0.00000000 0.0000000 0.00000000 -0.27036272
## [2,] -0.3321355 -0.3335305 -0.04145217 -0.4657345 0.06209445 -0.08106727
## [3,] 1.2724686 0.7145404 0.26382548 0.7718595 -0.07766709 0.28111670
## [,7] [,8] [,9] [,10]
## [1,] 0.1244449 -0.15833798 0.2244849 -0.1545494
## [2,] -0.3907588 -0.08506543 -0.3701445 -0.2274632
## [3,] 1.2572062 1.56849000 1.1991659 1.4822653
##
## $subgrad
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -0.6154729 0.1204288 -0.02636072 0.1638148 0.05509138 -0.6236691
## [2,] -0.4984384 -0.5005026 -0.06221061 -0.6989186 0.09317068 -0.1360223
## [3,] 0.7602823 0.4269187 0.15762952 0.4611765 -0.04640962 0.1009130
## [,7] [,8] [,9] [,10]
## [1,] 0.2870925 -0.3652346 0.5177945 -0.3565099
## [2,] -0.6555875 -0.1427033 -0.6210494 -0.3816455
## [3,] 0.4514828 0.5632085 0.4305853 0.5323487
##
## $X
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -1.2070657 0.5060559 -0.47719270 -0.1102855 0.1340882 -1.4482049
## [2,] 0.2774292 -0.5747400 -0.99838644 -0.5110095 -0.4906859 0.5747557
## [3,] 1.0844412 -0.5466319 -0.77625389 -0.9111954 -0.4405479 -1.0236557
## [4,] -2.3456977 -0.5644520 0.06445882 -0.8371717 0.4595894 -0.0151383
## [5,] 0.4291247 -0.8900378 0.95949406 2.4158352 -0.6937202 -0.9359486
## [,7] [,8] [,9] [,10]
## [1,] 1.1022975 -1.1676193 1.4494963 -0.9685143
## [2,] -0.4755931 -2.1800396 -1.0686427 -1.1073182
## [3,] -0.7094400 -1.3409932 -0.8553646 -1.2519859
## [4,] -0.5012581 -0.2942939 -0.2806230 -0.5238281
## [5,] -1.6290935 -0.4658975 -0.9943401 -0.4968500
##
## $PE
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## PE_1 0 0 0 0 0 0 0 0 0 0
## PE_2 1 1 1 1 1 1 1 1 1 1
##
## $sig2
## [1] 1 2
##
## $lbd
## [1] 0.5 0.3
##
## $weights
## [1] 1 1
##
## $group
## [1] 1 1 1 1 1 2 2 2 2 2
##
## $type
## [1] "grlasso"
##
## $PEtype
## [1] "coeff"
##
## $Btype
## [1] "gaussian"
##
## $Y
## NULL
##
## $mixture
## [1] TRUE
##
## $call
## PBsampler(X = X, PE_1 = rep(0, p), sig2_1 = 1, lbd_1 = 0.5, PE_2 = rep(1,
## p), sig2_2 = 2, lbd_2 = 0.3, weights = weights, group = group,
## niter = Niter, type = "grlasso", parallel = FALSE)
##
## attr(,"class")
## [1] "PB"
The importance sampler enables inference on an extreme region. This is done by using a proposal distribution that places more mass around that region.
Say that we are interested in computing the expectation of a function of a random variable, \(h(X)\). Let \(f(x)\) be the true or target distribution and \(g(x)\) be the proposal distribution. We can approximate the expectation by a weighted average of samples drawn from the proposal distribution as follows,
\[
\begin{eqnarray}
E_f\Big[h(X)\Big] &=& E_g \Big[h(X)\frac{f(X)}{g(X)} \Big]\\
&\approx& \frac{1}{N}\sum_{i=1}^N h(x_i)\frac{f(x_i)}{g(x_i)}.
\end{eqnarray}
\] The hdIS function computes the importance weight, that is, the ratio of the target density to the proposal density, \(f(x_i)/g(x_i)\), in the equation above.
Users can simply draw samples from the proposal distribution using PBsampler and plug the result into hdIS along with the target distribution parameters to compute the importance weights.
# Target distribution parameter
PETarget <- rep(0, p)
sig2Target <- .5
lbdTarget <- .37
# Proposal distribution parameter
PEProp1 <- rep(1, p)
sig2Prop1 <- .5
lbdProp1 <- 1
# Draw samples from proposal distribution
PB <- PBsampler(X = X, PE_1 = PEProp1, sig2_1 = sig2Prop1,
lbd_1 = lbdProp1, weights = weights, group = group, niter = Niter,
type="grlasso", PEtype = "coeff")
# Compute importance weights
hdIS(PB, PETarget = PETarget, sig2Target = sig2Target, lbdTarget = lbdTarget,
log = TRUE)
## [1] -96.45903 -126.87442 -69.81881
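As a usage sketch, the log weights can be exponentiated and combined with any function \(h\) of interest to form the weighted average above; the choice h(beta) = beta[1] below is an arbitrary illustration, not part of the package.

# Illustration: estimate E_f[h(beta)] with h(beta) = beta[1], reusing
# the log importance weights returned by hdIS above.
logWeights <- hdIS(PB, PETarget = PETarget, sig2Target = sig2Target,
                   lbdTarget = lbdTarget, log = TRUE)
h <- PB$beta[, 1]          # h(x_i) for each proposal draw
mean(h * exp(logWeights))  # (1/N) * sum of h(x_i) f(x_i) / g(x_i)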
In this section, we introduce MHLS, a Markov chain Monte Carlo (MCMC) sampler for the lasso estimator. Although bootstrapping is one of the most convenient sampling methods, it has a clear limitation: it cannot draw samples from a conditional distribution. In contrast, an MCMC sampler can do so easily. The MHLS function draws lasso samples under a fixed active set \(A\).
weights <- rep(1,p)
lbd <- .37
LassoResult <- lassoFit(X = X,Y = Y,lbd = lbd, type = "lasso", weights = weights)
B0 <- LassoResult$B0
S0 <- LassoResult$S0
Result <- MHLS(X = X, PE = B0, sig2 = 1, lbd = 1,
weights = weights, B0 = B0, S0 = S0, niter = 100, burnin = 0,
PEtype = "coeff")
Result
## ===========================
## Number of iteration: 100
##
## Burn-in period: 0
##
## Plug-in PE:
## [1] 0.7995661 0.0000000 0.0000000 0.9605262 0.0000000 0.0000000 0.0000000
## [8] 0.8458788 0.0000000 0.6141082
## PEtype:
## [1] "coeff"
##
## Last 10 steps of beta's:
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
## [1,] 0.6461477 0 0 1.0846858 0 0 0 0.4806811 0
## [2,] 0.3534608 0 0 1.0846858 0 0 0 0.3844643 0
## [3,] 0.3534608 0 0 0.4931794 0 0 0 0.9696950 0
## [4,] 0.3449473 0 0 0.4931794 0 0 0 0.9696950 0
## [5,] 0.3449473 0 0 0.4931794 0 0 0 0.9696950 0
## [6,] 0.3449473 0 0 0.4931794 0 0 0 0.9696950 0
## [7,] 0.3449473 0 0 0.4931794 0 0 0 0.9696950 0
## [8,] 0.3449473 0 0 0.4680849 0 0 0 0.9353218 0
## [9,] 0.3449473 0 0 0.4680849 0 0 0 0.3042383 0
## [10,] 0.3449473 0 0 0.4680849 0 0 0 0.3042383 0
## [,10]
## [1,] 0.2336445
## [2,] 0.2336445
## [3,] 0.2336445
## [4,] 0.2336445
## [5,] 0.2336445
## [6,] 0.2336445
## [7,] 0.2336445
## [8,] 0.2336445
## [9,] 0.7305437
## [10,] 0.7305437
##
## last 10 steps of subgradients:
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] 1.000003 0.20751105 0.5348776 0.9999994 -0.13120222 0.8152837
## [2,] 1.000003 0.19902287 0.5373069 0.9999994 -0.13266943 0.8230570
## [3,] 1.000003 0.14644498 0.5523550 0.9999994 -0.14175768 0.8712067
## [4,] 1.000003 0.14644498 0.5523550 0.9999994 -0.14175768 0.8712067
## [5,] 1.000003 0.09852526 0.5660699 0.9999994 -0.15004075 0.9150905
## [6,] 1.000003 0.03423921 0.5844690 0.9999994 -0.16115279 0.9739622
## [7,] 1.000003 0.10255316 0.5649171 0.9999994 -0.14934451 0.9114018
## [8,] 1.000003 0.10255316 0.5649171 0.9999994 -0.14934451 0.9114018
## [9,] 1.000003 0.46924288 0.4599684 0.9999994 -0.08596107 0.5755955
## [10,] 1.000003 0.31439579 0.5042865 0.9999994 -0.11272687 0.7174010
## [,7] [,8] [,9] [,10]
## [1,] -0.02095016 1.000024 -0.11176210 1
## [2,] -0.03408970 1.000024 -0.12631946 1
## [3,] -0.11547926 1.000024 -0.21649131 1
## [4,] -0.11547926 1.000024 -0.21649131 1
## [5,] -0.18965806 1.000024 -0.29867431 1
## [6,] -0.28917162 1.000024 -0.40892581 1
## [7,] -0.18342296 1.000024 -0.29176642 1
## [8,] -0.18342296 1.000024 -0.29176642 1
## [9,] 0.38420566 1.000024 0.33711177 1
## [10,] 0.14450535 1.000024 0.07154677 1
##
## Acceptance rate:
## -----------------------------
## beta subgrad
## # Accepted : 122 62
## # Moved : 396 99
## Acceptance rate : 0.308 0.626
We provide summary and plot functions for MHLS results.
summary(Result)
## $beta
## mean median s.d 2.5% 97.5%
## [1,] 0.3987769 0.3449473 0.2007944 0.11304538 0.8853218
## [2,] 0.0000000 0.0000000 0.0000000 0.00000000 0.0000000
## [3,] 0.0000000 0.0000000 0.0000000 0.00000000 0.0000000
## [4,] 0.5038473 0.4931794 0.3748674 0.04885877 1.0846858
## [5,] 0.0000000 0.0000000 0.0000000 0.00000000 0.0000000
## [6,] 0.0000000 0.0000000 0.0000000 0.00000000 0.0000000
## [7,] 0.0000000 0.0000000 0.0000000 0.00000000 0.0000000
## [8,] 0.5765659 0.6186559 0.3053770 0.04618138 1.1025196
## [9,] 0.0000000 0.0000000 0.0000000 0.00000000 0.0000000
## [10,] 0.3805857 0.3966923 0.2790838 0.03606252 0.8890657
##
## $subgradient
## mean median s.d 2.5% 97.5%
## [1,] 1.00000314 1.0000031 0.00000000 1.00000314 1.00000314
## [2,] 0.29514219 0.2500872 0.18012879 0.06101587 0.71133476
## [3,] 0.50979702 0.5226920 0.05155390 0.39068031 0.57680532
## [4,] 0.99999936 0.9999994 0.00000000 0.99999936 0.99999936
## [5,] -0.11605491 -0.1238428 0.03113581 -0.15652436 -0.04411475
## [6,] 0.73503303 0.7762934 0.16495796 0.35389307 0.94944074
## [7,] 0.11470115 0.0449569 0.27883589 -0.24772187 0.75895921
## [8,] 1.00002402 1.0000240 0.00000000 1.00002402 1.00002402
## [9,] 0.03852656 -0.0387434 0.30892348 -0.36300346 0.75230284
## [10,] 1.00000000 1.0000000 0.00000000 1.00000000 1.00000000
##
## attr(,"class")
## [1] "summary.MHLS"
plot(Result, index=c(1,4,9))
## Hit <Return> to see the next plot:
## Hit <Return> to see the next plot:
## Hit <Return> to see the next plot:
postInference.MHLS is a method for post-selection inference. The inference is based on MHLS results from multiple chains. To make the result robust, a different PE value is used for each chain. After drawing samples of \((\hat{\beta}, S)\) with the MH sampler, we refit the estimator to remove the bias of the lasso estimator. The final output is a \((1-\alpha)\) confidence interval for each active coefficient.
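The refitting step can be sketched as an ordinary least squares refit on the active set; this is a simplified illustration of the debiasing idea, not necessarily the package's exact internal routine.

# Illustration: unpenalized OLS refit on the lasso active set, reusing
# B0, X, and Y from above, to reduce the lasso shrinkage bias.
A <- which(B0 != 0)                     # active set of the lasso fit
refit <- lm(as.vector(Y) ~ X[, A] - 1)  # refit without penalty
coef(refit)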
postInference.MHLS(X = X, Y = Y, lbd = lbd, sig2.hat = 1, alpha = .05,
method = "coeff", nChain = 5, niterPerChain = 20,
parallel = !(.Platform$OS.type == "windows"))
## beta1 beta4 beta8 beta10
## LowerBound -0.2274856 -0.06173088 -0.8538049 -0.1946631
## UpperBound 2.5068937 2.09967452 0.9623414 2.4811133