keras is an R-based interface to Keras, the Python deep learning library. It uses the TensorFlow backend engine.
The basic workflow is to define a model object of class keras.engine.training.Model by initialising it with the keras_model_sequential function and then adding layers to it. The function fit trains a Keras model. It requires the predictors (inputs) and responses (targets/labels) to be passed as two separate data objects, each a vector, matrix, or array.
Use the Diabetes in Pima Indian Women dataset from library MASS:
library(keras)
library(condvis2)
library(MASS)
set.seed(123)
Prepare data for Keras and Condvis:
# Training features
Pima.training <- Pima.tr[,1:7]
# Testing features
Pima.testing <- Pima.te[,1:7]

# Scale the data
Pima.training <- as.matrix(scale(Pima.training))
means <- attr(Pima.training,"scaled:center")
sds <- attr(Pima.training,"scaled:scale")
Pima.testing <- as.matrix(scale(Pima.testing, center=means, scale=sds))

# One hot encode training target values and keep the second column (0/1 labels)
Pima.trainLabels <- to_categorical(as.numeric(Pima.tr[,8]) -1)[, 2]
# One hot encode test target values and keep the second column (0/1 labels)
Pima.testLabels <- to_categorical(as.numeric(Pima.te[,8]) -1)[, 2]

# Create dataframes for Condvis
dtf <- data.frame(Pima.training)
dtf$Pima.trainLabels <- Pima.tr[,8]

dtf.te <- data.frame(Pima.testing)
dtf.te$Pima.testLabels <- Pima.te[,8]
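As a quick sanity check on the preparation above, the following base R sketch (assuming only the MASS package; the object names tr, te and y are illustrative and not part of the original code) confirms that the training columns end up with mean 0 and standard deviation 1, while the test columns, scaled with the training statistics, generally do not. It also shows how the factor levels map to 0/1 labels.

```r
library(MASS)

# Scale training predictors; scale() stores the centring and scaling values as attributes
tr <- scale(Pima.tr[, 1:7])
means <- attr(tr, "scaled:center")
sds <- attr(tr, "scaled:scale")

# Apply the training statistics to the test predictors
te <- scale(Pima.te[, 1:7], center = means, scale = sds)

round(colMeans(tr), 10)      # training column means after scaling
round(apply(tr, 2, sd), 10)  # training column standard deviations after scaling
round(colMeans(te), 2)       # test column means, scaled with training statistics

# Factor levels "No"/"Yes" become 0/1
y <- as.numeric(Pima.tr$type) - 1
table(y, Pima.tr$type)
```

Scaling the test set with the training means and standard deviations keeps the two sets on the same scale and avoids leaking test information into the preprocessing.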
Define and fit the model:
model <- keras_model_sequential()
# Add layers to the model
model %>%
  layer_dense(units = 8, activation = 'tanh', input_shape = c(7)) %>%
  layer_dense(units = 1, activation = 'sigmoid')

# Print a summary of a model
summary(model)

# Compile the model
model %>% compile(
  loss = 'binary_crossentropy',
  optimizer = 'adam',
  metrics = 'accuracy'
)

# Fit the model
history <- model %>% fit(Pima.training, Pima.trainLabels,
  epochs = 500,
  batch_size = 50,
  validation_split = 0.2,
  class_weight = as.list(c("0" = 1, "1"=3))
)
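The class_weight argument in the call to fit gives observations of class 1 three times the weight of observations of class 0 in the loss. A rough base R sketch of the idea follows. The function weighted_bce is hypothetical, and the exact normalisation Keras applies may differ, so treat it as an illustration of the weighting rather than a reproduction of the Keras loss.

```r
# Hypothetical helper: binary cross-entropy with per-class weights
weighted_bce <- function(y, p, w0 = 1, w1 = 3) {
  w <- ifelse(y == 1, w1, w0)                   # weight chosen by the true class
  loss <- -(y * log(p) + (1 - y) * log(1 - p))  # per-observation cross-entropy
  mean(w * loss)
}

# Made-up labels and predicted probabilities
y <- c(0, 0, 1, 1)
p <- c(0.2, 0.6, 0.7, 0.4)

weighted_bce(y, p, w0 = 1, w1 = 1)  # unweighted loss
weighted_bce(y, p, w0 = 1, w1 = 3)  # errors on class 1 cost more
```

Upweighting the minority class in this way pushes the fitted model towards detecting more positive cases, usually at the cost of more false positives.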
Condvis uses a generic CVpredict to provide a uniform interface to predict methods. For classification, the choice of ptype allows for output for each observation as:
- ptype = "pred" (default)
- ptype = "prob" (e.g. \(P(X=1)\) in binary classification).
- ptype = "probmatrix"
kresponse <- "Pima.testLabels"
kpreds <- setdiff(names(dtf.te),kresponse)
CVpredict(model, dtf.te[1:10,], response=kresponse, predictors=kpreds)
CVpredict(model, dtf.te[1:10,], response=kresponse, predictors=kpreds, ptype="prob")
CVpredict(model, dtf.te[1:10,], response=kresponse, predictors=kpreds, ptype="probmatrix")
Note that for keras models one needs to specify the names of the response and the predictors for CVpredict. When creating the Condvis shiny app, arguments for CVpredict can be passed to condvis using the predictArgs argument.
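To make the three output forms concrete without relying on a fitted keras model, here is a small sketch using a logistic regression from base R on the unscaled Pima data. It mimics the three shapes by hand; it is not how CVpredict is implemented, and the 0.5 threshold and the names fit.glm, prob, pred and probmatrix are assumptions made for illustration.

```r
library(MASS)

fit.glm <- glm(type ~ ., data = Pima.tr, family = binomial)

# "prob"-style output: one probability per observation, here P(type = "Yes")
prob <- predict(fit.glm, Pima.te[1:10, ], type = "response")

# "pred"-style output: a predicted class per observation (0.5 threshold assumed)
pred <- factor(ifelse(prob > 0.5, "Yes", "No"), levels = levels(Pima.tr$type))

# "probmatrix"-style output: one column of probabilities per class
probmatrix <- cbind(No = 1 - prob, Yes = prob)

pred
round(probmatrix, 3)
```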
Calculate model accuracy from:
mean(CVpredict(model, dtf.te, response=kresponse, predictors=kpreds) == dtf.te$Pima.testLabels)
Compare to LDA:
fit.lda <- lda(Pima.trainLabels~., data = dtf)
mean(CVpredict(fit.lda, dtf.te) == dtf.te$Pima.testLabels)
LDA scores higher on accuracy. It is known that a linear model performs best for this dataset.
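Overall accuracy can hide how the errors split between the two classes, so a confusion table is a useful companion. The sketch below calls predict() from MASS directly on the unscaled data rather than going through CVpredict; the object names fit, pred, acc and conf are illustrative.

```r
library(MASS)

fit <- lda(type ~ ., data = Pima.tr)
pred <- predict(fit, Pima.te)$class

# Overall test accuracy
acc <- mean(pred == Pima.te$type)
acc

# Rows: predicted class, columns: actual class
conf <- table(predicted = pred, actual = Pima.te$type)
conf

# Proportion of actual "Yes" cases that were detected (sensitivity)
conf["Yes", "Yes"] / sum(conf[, "Yes"])
```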
kresponse <- "Pima.trainLabels"
kArgs1 <- list(response=kresponse,predictors=kpreds)
condvis(dtf, list(model.keras = model, model.lda = fit.lda), sectionvars = c("bmi", "glu"), response="Pima.trainLabels",predictArgs = list(kArgs1, NULL), pointColor = "Pima.trainLabels")
Click the showprobs button to see class probabilities.
To view a tour through the space where the fits differ, select the Choose tour option Diff fits and click on the arrow below Tour Step to watch. You can increase the number of points via the Tour Length slider.
Use the Boston housing data. This example comes from one of the original keras tutorial vignettes, which is no longer available.
Prepare data:
boston_housing <- dataset_boston_housing()
c(train_data, train_labels) %<-% boston_housing$train
c(test_data, test_labels) %<-% boston_housing$test

# Normalize training data
train_data <- scale(train_data)

# Use means and standard deviations from training set to normalize test set
col_means_train <- attr(train_data, "scaled:center")
col_stddevs_train <- attr(train_data, "scaled:scale")
test_data <- scale(test_data, center = col_means_train, scale = col_stddevs_train)
Fit the model:
build_model <- function() {

  model <- keras_model_sequential() %>%
    layer_dense(units = 64, activation = "relu",
                input_shape = dim(train_data)[2]) %>%
    layer_dense(units = 64, activation = "relu") %>%
    layer_dense(units = 1)

  model %>% compile(
    loss = "mse",
    optimizer = optimizer_rmsprop(),
    metrics = list("mean_absolute_error")
  )

  model
}

model <- build_model()
model %>% summary()
# Display training progress by printing a single dot for each completed epoch.
print_dot_callback <- callback_lambda(
  on_epoch_end = function(epoch, logs) {
    if (epoch %% 80 == 0) cat("\n")
    cat(".")
  }
)

epochs <- 500
# Fit the model
early_stop <- callback_early_stopping(monitor = "val_loss", patience = 20)

model <- build_model()
history <- model %>% fit(
  train_data,
  train_labels,
  epochs = epochs,
  validation_split = 0.2,
  verbose = 0,
  callbacks = list(early_stop, print_dot_callback)
)
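With monitor = "val_loss" and patience = 20, callback_early_stopping halts training once the validation loss has failed to improve for 20 consecutive epochs. The following base R sketch illustrates that rule on a made-up loss sequence with a shorter patience. The function early_stop_epoch is a hypothetical helper and simplifies the Keras callback (for instance, it ignores any minimum improvement threshold).

```r
# Hypothetical helper: epoch at which patience-based early stopping would halt
early_stop_epoch <- function(val_loss, patience) {
  best <- Inf
  wait <- 0
  for (epoch in seq_along(val_loss)) {
    if (val_loss[epoch] < best) {
      best <- val_loss[epoch]  # new best value: reset the counter
      wait <- 0
    } else {
      wait <- wait + 1         # no improvement this epoch
      if (wait >= patience) return(epoch)
    }
  }
  length(val_loss)             # never triggered: training runs to the end
}

# Made-up validation losses: best value at epoch 3, no improvement afterwards
val_loss <- c(5, 4, 3, 3.1, 3.2, 3.05, 3.3)
early_stop_epoch(val_loss, patience = 3)  # 6
```

A larger patience tolerates longer plateaus in the validation loss before stopping, at the cost of extra training epochs.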
Create dataframes for condvis:
column_names <- c('CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
                  'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT')
train_df <- data.frame(train_data)
colnames(train_df) <- column_names
train_df$medv <- as.numeric(train_labels)

test_df <- data.frame(test_data)
colnames(test_df) <- column_names
test_df$medv <- as.numeric(test_labels)

kpreds <- column_names
kresponse <- "medv"
Fit some other models for comparison (a random forest and a generalised additive model):
suppressMessages(library(mgcv))
gam.model = gam(medv ~ s(LSTAT) + s(RM) + s(CRIM), data=train_df)

suppressMessages(library(randomForest))
rf.model <- randomForest(formula = medv ~ ., data = train_df)
Use CVpredict to compare the mean squared error (MSE) on the scaled test data; taking the square root of each value gives the RMSE:
mean((test_labels - CVpredict(model, test_df, response=kresponse, predictors=kpreds))^2)
mean((test_labels - CVpredict(gam.model, test_df))^2)
mean((test_labels - CVpredict(rf.model, test_df))^2, na.rm=TRUE)
RF gives the best fit.
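The expressions above average the squared errors, which gives the mean squared error; the RMSE is its square root, and the ranking of the models is the same under either measure. A minimal sketch with two hypothetical helpers, mse and rmse (neither is part of condvis2 or keras), on made-up numbers:

```r
# Hypothetical helpers, not part of condvis2 or keras
mse <- function(actual, predicted) mean((actual - predicted)^2, na.rm = TRUE)
rmse <- function(actual, predicted) sqrt(mse(actual, predicted))

# Made-up values for illustration
actual <- c(3, 5, 7)
predicted <- c(2, 5, 9)

mse(actual, predicted)   # (1 + 0 + 4) / 3
rmse(actual, predicted)  # square root of the value above
```

The RMSE is on the same scale as the response (here medv), which makes it easier to interpret than the MSE.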
kArgs <- list(response=kresponse,predictors=kpreds)
condvis(train_df, list(gam = gam.model, rf = rf.model, kerasmodel = model), sectionvars = c("LSTAT", "RM"),predictArgs = list(NULL, NULL, kArgs) )
Ticking Show 3d surface displays the 3D surface of each fit, and you can use the Rotate slider to rotate the surfaces around the z-axis.
The random forest gives a blockier fit than the smooth GAM and the neural net. The RF fit is more flexible in the areas where there are more data points.