Adaptive Huber Estimation and Regression
This package implements Huber-type estimators for the mean, the covariance matrix, regression, and l1-regularized Huber regression (Huber-Lasso). For all these methods, the robustification parameter τ is calibrated via a tuning-free principle.
Specifically, for Huber regression, assume the observed data vectors (Y, X) follow a linear model Y = θ0 + X θ + ε, where Y is an n-dimensional response vector, X is an n × d design matrix, and ε is an n-vector of noise variables whose distributions can be asymmetric and/or heavy-tailed. The package computes the standard Huber’s M-estimator when d < n and the Huber-Lasso estimator when d > n. The vector of coefficients θ and the intercept term θ0 are estimated successively via a two-step procedure. See Wang et al., 2021 for more details.
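For reference, the Huber loss with robustification parameter τ is usually written as below. This is the textbook definition, given as background only; it is not a description of the package's internal two-step procedure. The symbols ℓ_τ and X_i (the i-th row of X) are notation introduced for this note.

```latex
\ell_\tau(u) =
\begin{cases}
  u^2/2, & |u| \le \tau, \\
  \tau |u| - \tau^2/2, & |u| > \tau,
\end{cases}
\qquad
(\hat\theta_0, \hat\theta) \in \arg\min_{\theta_0,\,\theta}
  \sum_{i=1}^{n} \ell_\tau\bigl(Y_i - \theta_0 - X_i^\top \theta\bigr).
```

The loss is quadratic for residuals within τ of zero and linear beyond that, so large residuals have bounded influence on the fit. A very large τ behaves like least squares, and a small τ behaves more like least absolute deviations.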
2022-03-04
Version 1.1 is submitted to CRAN.
Install adaHuber from CRAN
install.packages("adaHuber")
Error: Compilation failed (with messages involving lgfortran, clang, etc.). Solution: This is a compilation error that affects Rcpp-based source packages. It happens shortly after a new version is submitted to CRAN, because it usually takes 3-5 days for the binary package to be built. Please use an older version, or wait 3-5 days and then install the updated version.
Error: unable to load shared object.. Symbol not found: _EXTPTR_PTR. Solution: This issue is common in some specific versions of R when loading Rcpp-based libraries. It is an error in R caused by a minor change to EXTPTR_PTR. Upgrading R to 4.0.2 will solve the problem.
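Before upgrading, it may help to check which R version is running. The snippet below is a small sketch using base R's getRversion(). The variable names required and r_ok are chosen here for illustration, and the comparison assumes that versions later than 4.0.2 also contain the fix.

```r
# Compare the running R version with 4.0.2, the version named above.
# Assumption: any later version also contains the EXTPTR_PTR fix.
required <- "4.0.2"
current <- as.character(getRversion())
r_ok <- getRversion() >= required

if (r_ok) {
  message("R ", current, " is at least ", required, ".")
} else {
  message("R ", current, " is older than ", required, "; consider upgrading.")
}
```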
There are five functions in this package:
adaHuber.mean: Adaptive Huber mean estimation.
adaHuber.cov: Adaptive Huber covariance estimation.
adaHuber.reg: Adaptive Huber regression.
adaHuber.lasso: Adaptive Huber-Lasso regression.
adaHuber.cv.lasso: Cross-validated adaptive Huber-Lasso regression.
Help on the functions can be accessed by typing ?, followed by the function name, at the R command prompt. For example, ?adaHuber.reg will present detailed documentation with inputs, outputs and examples of the function adaHuber.reg.
First, we present an example of Huber mean estimation. We generate data from a t distribution, which is heavy-tailed. We estimate its mean by the tuning-free Huber mean estimator.
library(adaHuber)
n = 1000
mu = 2
X = rt(n, 2) + mu
fit.mean = adaHuber.mean(X)
fit.mean$mu
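To give some intuition for what a Huber-type mean does, here is a minimal base-R sketch that computes a Huber mean for a fixed, user-supplied τ by iteratively reweighted averaging. It is not the package's algorithm: adaHuber.mean calibrates τ from the data, whereas the helper huber_mean_fixed_tau and the value tau = 3 are hypothetical choices made for this illustration.

```r
# Illustrative Huber mean with a FIXED tau (hypothetical helper, not the
# package's tuning-free procedure).
huber_mean_fixed_tau <- function(x, tau, max_iter = 100, tol = 1e-8) {
  mu <- median(x)                      # robust starting value
  for (iter in seq_len(max_iter)) {
    r <- x - mu                        # current residuals
    # Huber weights: 1 when |r| <= tau, tau / |r| for larger residuals
    w <- ifelse(abs(r) <= tau, 1, tau / abs(r))
    mu_new <- sum(w * x) / sum(w)      # weighted-average update
    converged <- abs(mu_new - mu) < tol
    mu <- mu_new
    if (converged) break
  }
  mu
}

set.seed(1)
x <- rt(1000, 2) + 2                   # heavy-tailed sample centered at 2
est_huber <- huber_mean_fixed_tau(x, tau = 3)   # tau = 3 is an arbitrary choice
est_mean <- mean(x)
c(huber = est_huber, sample_mean = est_mean)
```

Observations farther than τ from the current estimate are downweighted in proportion to their distance, which limits the pull of extreme draws from the t distribution. The result depends on the chosen τ, which is the calibration problem the package's tuning-free principle is meant to address.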
Then we present an example of Huber covariance matrix estimation. We generate data from a t distribution with df = 3, which is heavy-tailed.
n = 100
p = 5
X = matrix(rt(n * p, 3), n, p)
fit.cov = adaHuber.cov(X)
fit.cov$cov
Next, we present an example of adaptive Huber regression. Here we generate data from a linear model Y = θ0 + X θ + ε, where ε follows a t distribution, and estimate the intercept and coefficients by tuning-free Huber regression.
n = 200
p = 10
beta = rep(1.5, p + 1)
X = matrix(rnorm(n * p), n, p)
err = rt(n, 2)
Y = cbind(1, X) %*% beta + err
fit.adahuber = adaHuber.reg(X, Y, method = "adaptive")
beta.adahuber = fit.adahuber$coef
Finally, we illustrate the use of l1-regularized Huber regression. Again, we generate data from a linear model Y = θ0 + X θ + ε, where θ is a sparse high-dimensional vector and ε follows a t distribution. We estimate the intercept and coefficients by Huber-Lasso regression, where the regularization parameter λ is calibrated by K-fold cross-validation, and the robustification parameter τ is chosen by a tuning-free procedure.
n = 100; p = 200; s = 5
beta = c(rep(1.5, s + 1), rep(0, p - s))
X = matrix(rnorm(n * p), n, p)
err = rt(n, 2)
Y = cbind(rep(1, n), X) %*% beta + err
fit.lasso = adaHuber.cv.lasso(X, Y)
beta.lasso = fit.lasso$coef
License: GPL-3.0
System requirements: C++11
Authors: Xiaoou Pan xip024@ucsd.edu, Wen-Xin Zhou wez243@ucsd.edu
Maintainer: Xiaoou Pan xip024@ucsd.edu
Eddelbuettel, D. and Francois, R. (2011). Rcpp: Seamless R and C++ integration. J. Stat. Softw. 40 1-18.
Fan, J., Liu, H., Sun, Q. and Zhang, T. (2018). I-LAMM for sparse learning: Simultaneous control of algorithmic complexity and statistical error. Ann. Statist. 46 814-841.
Ke, Y., Minsker, S., Ren, Z., Sun, Q. and Zhou, W.-X. (2019). User-friendly covariance estimation for heavy-tailed distributions. Statist. Sci. 34 454-471.
Pan, X., Sun, Q. and Zhou, W.-X. (2021). Iteratively reweighted l1-penalized robust regression. Electron. J. Stat. 15 3287-3348.
Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Stat. Assoc. 115 254-265.
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2021). A new principle for tuning-free Huber regression. Stat. Sinica 31 2153-2177.