Maciej Nasinski
Check the miceFast website for more details
Fast imputations under the object-oriented programming
paradigm.
In addition, a few functions are provided that work with popular R
packages such as ‘data.table’ or ‘dplyr’. The biggest gain in
time performance can be achieved for calculations where a grouping
variable has to be used. A single evaluation of a quantitative model
for multiple imputations is another major enhancement. A more recent major
improvement is one of the fastest implementations of predictive mean matching in the R
world, thanks to presorting and binary search.
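To give an intuition for why presorting and binary search help predictive mean matching, here is a minimal base R sketch. It is an illustration only and not the package's implementation: `pmm_sketch` is a hypothetical helper, and the plain `lm` fit stands in for whatever model produces the predictions.

```r
# Hypothetical helper, for illustration only (not part of miceFast).
# For each predicted value of a missing case, find the k observed cases
# with the closest predictions and draw one of their observed values.
pmm_sketch <- function(y_obs, yhat_obs, yhat_mis, k = 5) {
  stopifnot(length(y_obs) == length(yhat_obs), k >= 1, k <= length(y_obs))
  ord <- order(yhat_obs)            # presort once: O(n log n)
  yhat_sorted <- yhat_obs[ord]
  y_sorted <- y_obs[ord]
  n <- length(yhat_sorted)
  vapply(yhat_mis, function(p) {
    # binary search: number of sorted predictions <= p
    pos <- findInterval(p, yhat_sorted)
    # the k nearest predictions must lie within k positions of pos
    lo <- max(1L, pos - k + 1L)
    hi <- min(n, pos + k)
    cand <- lo:hi
    nearest <- cand[order(abs(yhat_sorted[cand] - p))[seq_len(k)]]
    # draw one donor at random among the k nearest
    y_sorted[nearest[sample.int(k, 1L)]]
  }, numeric(1))
}

# Simulated data, purely for demonstration
set.seed(1234)
x <- rnorm(200)
y <- 2 * x + rnorm(200)
miss <- sample(200, 40)
obs <- data.frame(x = x[-miss], y = y[-miss])
fit <- lm(y ~ x, data = obs)
yhat_mis <- unname(predict(fit, newdata = data.frame(x = x[miss])))
imputed <- pmm_sketch(obs$y, unname(fitted(fit)), yhat_mis, k = 5)
```

Without presorting, every missing case would need a full scan over all observed predictions; with it, each lookup costs one binary search plus a small local sort over at most 2k candidates.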
Performance benchmarks (check performance_validity.R file at extdata).
install.packages('miceFast')
or
# install.packages("devtools")
devtools::install_github("polkas/miceFast")
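It is recommended to install an optimized BLAS library, which can be up to 100 times faster (the first command below is for Debian/Ubuntu Linux, the following two are for macOS):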
sudo apt-get install libopenblas-dev
cd /Library/Frameworks/R.framework/Resources/lib
ln -sf /System/Library/Frameworks/Accelerate.framework/Frameworks/vecLib.framework/Versions/Current/libBLAS.dylib libRblas.dylib
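After switching libraries, it can be worth confirming from within R which BLAS is actually linked, and timing a BLAS-heavy operation before and after the change. A minimal sketch using only base R (requires R >= 3.4.0); the matrix size is an arbitrary choice and the timing depends on the machine:

```r
# Which BLAS and LAPACK shared libraries does this R session use?
blas_path <- extSoftVersion()[["BLAS"]]   # may be "" if unknown
lapack_path <- La_library()
print(blas_path)
print(lapack_path)

# Rough timing of a matrix cross-product, which is delegated to BLAS
set.seed(1)
n <- 1000
m <- matrix(rnorm(n * n), n, n)
elapsed <- system.time(xtx <- crossprod(m))[["elapsed"]]
print(elapsed)
```

Running the same snippet before and after the change gives a quick, informal estimate of the speed-up on your own hardware.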
library(miceFast)
set.seed(1234)
data(air_miss)
# plot NA structure
upset_NA(air_miss, 6)
naive_fill_NA(air_miss)
# Check out the vignette for advanced usage
# A thorough examination is required
# Other packages - popular simple solutions
# Hmisc
data.frame(Map(function(x) Hmisc::impute(x, 'random'), air_miss))
# mice
mice::complete(mice::mice(air_miss, printFlag = FALSE))
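For a more targeted imputation than `naive_fill_NA()`, a single variable can be imputed with `fill_NA()`, optionally once per group. The sketch below uses the base R `airquality` data so that the column names are certain; the added `Intercept` column and the choice of `Month` as the grouping variable are assumptions made for this illustration, and the grouping is done with base R `split()` rather than ‘dplyr’ or ‘data.table’ to keep dependencies minimal.

```r
library(miceFast)

# Base R dataset: Solar.R has missing values, Wind and Temp are complete.
df <- airquality
df$Intercept <- 1  # explicit intercept column for the linear model

# Single imputation over the whole data set: linear-model predictions
df$Solar_R_imp <- fill_NA(
  x = df,
  model = "lm_pred",
  posit_y = "Solar.R",
  posit_x = c("Wind", "Temp", "Intercept")
)

# Grouped imputation: one model per Month (illustrative grouping choice)
by_month <- lapply(split(df, df$Month), function(d) {
  d$Solar_R_imp_grp <- fill_NA(
    x = d,
    model = "lm_pred",
    posit_y = "Solar.R",
    posit_x = c("Wind", "Temp", "Intercept")
  )
  d
})
df_grp <- do.call(rbind, by_month)
```

Observed values are returned unchanged; only the missing entries of `Solar.R` are replaced. See the vignette for the ‘dplyr’ and ‘data.table’ variants.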
Quick Reference Table
| Function | Description |
|---|---|
| new(miceFast) | OOP instance with a bunch of methods - check out the vignette |
| fill_NA() | imputation - lda, lm_pred, lm_bayes, lm_noise |
| fill_NA_N() | multiple imputation - pmm, lm_bayes, lm_noise |
| VIF() | Variance inflation factor |
| naive_fill_NA() | auto imputations |
| compare_imp() | comparing imputations |
| upset_NA() | visualize NA structure - UpSetR::upset |
Summing up, miceFast offers a relevant reduction in calculation time (the mice algorithm was improved too).
Environment: R 4.2.1, Mac M1
If you are interested in the procedure for testing performance and validity, check the performance_validity.R file in the extdata folder.