ML algorithms
qeAdaBoost(): AdaBoost, wraps JOUSBoost pkg
qeDeepnet(): wraps deepnet pkg
qeDT(): decision trees, wraps party pkg
qeGBoost(): gradient boosting, wraps gbm pkg
qeKNN(): k-Nearest Neighbors, wraps regtools pkg; includes predictor importance settings; allows linear interpolation within a bin (usage sketched after this list)
qeLASSO(): LASSO and ridge regression, wraps glmnet pkg
qeLightGBoost(): gradient boosting, wraps lightgbm pkg
qeliquidSVM(): SVM, wraps liquidSVM pkg
qeLin(): wraps R’s lm()
qeLinKNN(): first fits qeLin(), followed by k-NN on the residuals to correct deviations from linearity
qeLogit(): wraps R’s glm()
qeNCVregCV(): wraps ncvreg pkg; linear and generalized linear regression, regularized via SCAD etc.
qeNeural(): wraps keras package, including CNN
qePolyLASSO(): LASSO/ridge applied to polynomial regression; wraps glmnet, polyreg pkgs
qePolyLin(): polynomial regression on linear models; uses Moore-Penrose inverse if overfitting; wraps polyreg pkg
qePolyLog(): polynomial regression on logistic models; wraps polyreg pkg
qeRF(): random forests, wraps randomForest pkg
qeRFgrf(): random forests, wraps grf pkg; allows linear interpolation within a bin
qeRpart(): decision trees, wraps rpart pkg; colorful tree plot
qeRFranger(): random forests, wraps ranger pkg
qeskRF(): random forests, wraps Python's scikit-learn pkg
qeskSVM(): SVM, wraps Python's scikit-learn pkg
qeSVM(): SVM, wraps e1071 pkg
qeSVMliquid(): SVM, wraps liquidSVM pkg
qeXGBoost(): wraps xgboost pkg
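
All of the qe* ML functions above share a uniform calling convention: a data frame, the name of the column to be predicted, and optional method-specific hyperparameters; a holdout set is formed automatically and prediction accuracy on it is reported. A minimal sketch, assuming the svcensus dataset (census data on programmer/engineer wages) included with qeML:

    library(qeML)
    data(svcensus)
    # k-NN, predicting wage income from the remaining columns; the returned
    # object records accuracy on an automatically chosen holdout set
    knnOut <- qeKNN(svcensus, 'wageinc', k=25)
    knnOut$testAcc      # mean absolute prediction error on the holdout set
    # prediction of new cases uses the generic predict(); here an existing
    # row, minus the outcome column, stands in for a new case
    newCase <- svcensus[1, names(svcensus) != 'wageinc']
    predict(knnOut, newCase)

The same pattern, e.g. rfOut <- qeRF(svcensus, 'wageinc'), applies to the other functions in the list.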
feature importance/selection
qeFOCI(), qeFOCIrand(): fully nonparametric method for feature selection
qeLASSO(): for feature importance, apply coef() to the return value (see the sketch after this list)
qeLeaveOut1Var(): fits the full model, then refits it with each feature omitted in turn, reporting the resulting difference in predictive power; usable with any qeML predictive function
qeRareLevels(): investigates whether rare levels of a feature that is an R factor should be included
qeRFranger(): for feature importance, see the variable.importance component of the return value
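
A brief sketch of the two importance idioms above, again assuming the svcensus data: for LASSO, the features whose coefficients survive shrinkage are the influential ones, while qeRFranger() exposes importance values from the forest fit (assuming importance computation is enabled by default in qeRFranger()):

    library(qeML)
    data(svcensus)
    # LASSO: coefficients shrunk to 0 mark features of little importance
    lassoOut <- qeLASSO(svcensus, 'wageinc')
    coef(lassoOut)
    # ranger random forests: importance values from the fitted forest
    rfOut <- qeRFranger(svcensus, 'wageinc')
    rfOut$variable.importance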
model development
doubleD(): computation and plotting for exploring Double Descent
plotClassesUMAP(): plot first two UMAP components, color-coding classes
plotPairedResiduals(): plot residuals against pairs of features
qeCompare(): compare the accuracy of various ML methods on a given dataset (see the sketch after this list)
qeFT(): automated grid hyperparameter search, with Bonferroni-Dunn corrected standard errors
qePCA(): finds the principal components (number specified by the user), then fits the model using the qe* function specified by the user
qeROC(): ROC computation and plotting, wraps pROC pkg
qeUMAP(): same as qePCA() but using UMAP
replicMeans(): (from regtools, included in qeML) averages output, e.g. testAcc, over many holdout sets
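
Sketches of two of these tools, under my reading of their argument orders (treat these as assumptions and check the help pages): qeCompare() takes the data, the outcome name, a character vector of qe* function names, and a number of holdout replications; replicMeans() takes a replication count and a quoted expression to average:

    library(qeML)
    data(svcensus)
    # average test accuracy of three methods, 10 holdout sets each
    qeCompare(svcensus, 'wageinc', c('qeLin', 'qeKNN', 'qeRF'), 10)
    # the same idea for a single method, averaging testAcc over 25 holdout sets
    replicMeans(25, "qeKNN(svcensus,'wageinc')$testAcc")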
application-specific functions (elementary)
qeText(): text classification
qeTS(): time series (see the sketch after this list)
prediction with missing values
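
A sketch of qeTS() on a univariate series; I am assuming here that the first argument is the lag (window width) and that the underlying method is named via a qeName string, so consult the help page for the exact signature:

    library(qeML)
    # monthly airline passenger counts, as a plain numeric series
    x <- as.numeric(AirPassengers)
    # predict each value from the 5 values preceding it, using k-NN underneath
    tsOut <- qeTS(5, x, qeName='qeKNN')
    tsOut$testAcc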
utilities, exploratory tools
cartesianFactor(): given input R factors with n1, n2, … levels, creates a combined “superfactor” with n1 x n2 x … levels
dataToTopLevels(): applies factorToTopLevels() to all factors in the given data frame
factorToTopLevels(): removes rare levels from a factor
levelCounts(): performs a census of the levels of each R factor in the dataset (see the sketch after this list)
newDFRow(): creates a new case to input to predict()
qeParallel(): apply “Software Alchemy” to parallelize qe functions
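
A sketch of a couple of these utilities, again on svcensus; the threshold argument passed to factorToTopLevels() is an assumption about its interface, so check the help page:

    library(qeML)
    data(svcensus)
    # census of the levels of each factor column, to spot rare levels
    levelCounts(svcensus)
    # remove rare levels from the occupation factor; the argument name
    # lowCountThresh and the threshold of 50 cases are assumptions
    svcensus$occ <- factorToTopLevels(svcensus$occ, lowCountThresh=50)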