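A Keras model consists of multiple components:
- The architecture, or configuration, which specifies what layers the model contains and how they are connected.
- A set of weight values (the "state of the model").
- An optimizer (defined by compiling the model).
- A set of losses and metrics (defined by compiling the model).
The Keras API saves all of these pieces together in a unified format, marked by the .keras extension. This is a zip archive consisting of the following:
- A JSON-based configuration file (config.json) that records the configuration of the model, its layers, and other trackables.
- An H5-based state file, such as model.weights.h5 (for the whole model), with directory keys for layers and their weights.
- A metadata file in JSON that stores things such as the current Keras version.
Let’s take a look at how this works.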
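Because the .keras file is a zip archive, its contents can be listed with base R. The following is a minimal sketch rather than part of the original guide: the tiny model and the names archive_path and archive_files are assumptions chosen for illustration, and the listing is expected to include config.json, metadata.json, and model.weights.h5 under Keras 3.

```r
library(keras3)

# Build and compile a tiny model so there is something to save.
inputs <- keras_input(shape = c(4))
outputs <- inputs |> layer_dense(1)
model <- keras_model(inputs, outputs)
model |> compile(optimizer = "adam", loss = "mean_squared_error")

# Hypothetical temporary path; any filename ending in .keras would do.
archive_path <- tempfile(fileext = ".keras")
model |> save_model(archive_path)

# unzip(list = TRUE) returns a data frame describing the archive members
# without extracting them.
archive_files <- unzip(archive_path, list = TRUE)$Name
print(archive_files)
```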
If you only have 10 seconds to read this guide, here’s what you need to know.
Saving a Keras model:
# Get model (Sequential, Functional Model, or Model subclass)
model <- ...
# The filename needs to end with the .keras extension
model |> save_model('path/to/location.keras')
Loading the model back:
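The loading snippet did not survive in this copy of the guide. A minimal round-trip sketch, assuming a throwaway model and a temporary file (model_path and restored_model are illustrative names, not taken from the guide):

```r
library(keras3)

# A small model to save; the architecture here is only an example.
inputs <- keras_input(shape = c(32))
outputs <- inputs |> layer_dense(1)
model <- keras_model(inputs, outputs)

# Save to a temporary path that ends with the .keras extension.
model_path <- tempfile(fileext = ".keras")
model |> save_model(model_path)

# Load the model back from the same path.
restored_model <- load_model(model_path)
```

In real code, the temporary path would be replaced by the location used when saving.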
Now, let’s look at the details.
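This section is about saving an entire model to a single file. The file will include:
- The model’s architecture/config
- The model’s weight values (which were learned during training)
- The model’s compilation information (if compile() was called)
- The optimizer and its state, if any (this enables you to restart training where you left off)
You can save a model with save_model(). You can load it back with load_model().
The only supported format in Keras 3 is the “Keras v3” format, which uses the .keras extension.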
Example:
get_model <- function() {
# Create a simple model.
inputs <- keras_input(shape(32))
outputs <- inputs |> layer_dense(1)
model <- keras_model(inputs, outputs)
model |> compile(optimizer = optimizer_adam(), loss = "mean_squared_error")
model
}
model <- get_model()
# Train the model.
test_input <- random_uniform(c(128, 32))
test_target <- random_uniform(c(128, 1))
model |> fit(test_input, test_target)
# Calling `save_model("my_model.keras")` creates a zip archive `my_model.keras`.
model |> save_model("my_model.keras")
# It can be used to reconstruct the model identically.
reconstructed_model <- load_model("my_model.keras")
# Let's check:
stopifnot(all.equal(
model |> predict(test_input),
reconstructed_model |> predict(test_input)
))
This section covers the basic workflows for handling custom layers, functions, and models in Keras saving and reloading.
When saving a model that includes custom objects, such as a subclassed Layer, you must define a get_config() method on the object class. If the arguments passed to the constructor (the initialize() method) of the custom object aren’t simple values (that is, they are anything other than base types such as ints and strings), then you must also explicitly deserialize these arguments in the from_config() class method.
Like this:
layer_custom <- Layer(
"CustomLayer",
initialize = function(sublayer, ...) {
super$initialize(...)
self$sublayer <- sublayer
},
call = function(x) {
self$sublayer(x)
},
get_config = function() {
base_config <- super$get_config()
config <- list(
sublayer = serialize_keras_object(self$sublayer)
)
c(base_config, config)
},
from_config = function(cls, config) {
sublayer_config <- config$sublayer
sublayer <- deserialize_keras_object(sublayer_config)
# Drop the serialized entry so that `sublayer` is not passed twice.
config$sublayer <- NULL
cls(sublayer, !!!config)
}
)
Please see the Defining the config methods section for more details and examples.
The saved .keras file is lightweight and does not store the code for custom objects. Therefore, to reload the model, load_model() requires access to the definition of any custom objects used, through one of the following methods:
- Registering custom objects (preferred)
- Passing custom objects directly when loading
- Using a custom object scope
Below are examples of each workflow:
Registering custom objects (preferred)
This is the preferred method, as custom object registration greatly simplifies saving and loading code. Calling register_keras_serializable() on a custom object registers the object globally in a master list, allowing Keras to recognize the object when loading the model.
Let’s create a custom model involving both a custom layer and a custom activation function to demonstrate this.
Example:
# Clear all previously registered custom objects
set_custom_objects(clear = TRUE)
## named list()
layer_custom <- Layer(
"CustomLayer",
initialize = function(self, factor) {
super$initialize()
self$factor <- factor
},
call = function(self, x) {
x * self$factor
},
get_config = function(self) {
list(factor = self$factor)
}
)
# Upon registration, you can optionally specify a package or a name.
# If left blank, the package defaults to "Custom" and the name defaults to
# the class name.
register_keras_serializable(layer_custom, package = "MyLayers")
custom_fn <- keras3:::py_func2(function(x) x^2, name = "custom_fn", convert = TRUE)
register_keras_serializable(custom_fn, name="custom_fn", package="my_package")
# Create the model.
get_model <- function() {
inputs <- keras_input(shape(4))
mid <- inputs |> layer_custom(0.5)
outputs <- mid |> layer_dense(1, activation = custom_fn)
model <- keras_model(inputs, outputs)
model |> compile(optimizer = "rmsprop", loss = "mean_squared_error")
model
}
# Train the model.
train_model <- function(model) {
input <- random_uniform(c(4, 4))
target <- random_uniform(c(4, 1))
model |> fit(input, target, verbose = FALSE, epochs = 1)
model
}
test_input <- random_uniform(c(4, 4))
test_target <- random_uniform(c(4, 1))
model <- get_model() |> train_model()
model |> save_model("custom_model.keras", overwrite = TRUE)
# Now, we can simply load without worrying about our custom objects.
reconstructed_model <- load_model("custom_model.keras")
# Let's check:
stopifnot(all.equal(
model |> predict(test_input, verbose = FALSE),
reconstructed_model |> predict(test_input, verbose = FALSE)
))
Passing custom objects to load_model()
model <- get_model() |> train_model()
# Calling `save_model("custom_model.keras")` creates a zip archive `custom_model.keras`.
model |> save_model("custom_model.keras", overwrite = TRUE)
# Upon loading, pass a named list containing the custom objects used in the
# `custom_objects` argument of `load_model()`.
reconstructed_model <- load_model(
"custom_model.keras",
custom_objects = list(CustomLayer = layer_custom,
custom_fn = custom_fn)
)
# Let's check:
stopifnot(all.equal(
model |> predict(test_input, verbose = FALSE),
reconstructed_model |> predict(test_input, verbose = FALSE)
))
Using a custom object scope
Any code within the custom object scope can recognize the custom objects passed to the scope. Therefore, loading the model inside the scope allows our custom objects to be resolved.
Example:
model <- get_model() |> train_model()
model |> save_model("custom_model.keras", overwrite = TRUE)
# Pass the named list of custom objects to a custom object scope and place
# the `load_model()` call within the scope.
custom_objects <- list(CustomLayer = layer_custom, custom_fn = custom_fn)
with_custom_object_scope(custom_objects, {
reconstructed_model <- load_model("custom_model.keras")
})
# Let's check:
stopifnot(all.equal(
model |> predict(test_input, verbose = FALSE),
reconstructed_model |> predict(test_input, verbose = FALSE)
))
This section is about saving only the model’s configuration, without its state. The model’s configuration (or architecture) specifies what layers the model contains, and how these layers are connected. If you have the configuration of a model, then the model can be created with a freshly initialized state (no weights or compilation information).
The following serialization APIs are available:
- clone_model(model): make a (randomly initialized) copy of a model.
- get_config() and from_config(): retrieve the configuration of a layer or model, and recreate a model instance from its config, respectively.
- model_to_json() and model_from_json(): similar, but as JSON strings.
- serialize_keras_object(): retrieve the configuration of any arbitrary Keras object.
- deserialize_keras_object(): recreate an object instance from its configuration.
You can do in-memory cloning of a model via clone_model(). This is equivalent to getting the config and then recreating the model from it (so it does not preserve compilation information or layer weight values).
Example:
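The example code is missing at this point. A minimal sketch of in-memory cloning, assuming a small functional model (the variable names are illustrative): the clone shares the architecture, so the weight shapes match, but its values are freshly initialized.

```r
library(keras3)

# A small functional model to clone.
inputs <- keras_input(shape = c(32))
outputs <- inputs |> layer_dense(1)
model <- keras_model(inputs, outputs)

# clone_model() rebuilds the architecture with newly initialized weights.
new_model <- clone_model(model)

# Compare the shapes (not the values) of the two sets of weights.
original_shapes <- lapply(get_weights(model), dim)
cloned_shapes <- lapply(get_weights(new_model), dim)
identical(original_shapes, cloned_shapes)
```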
get_config() and from_config()
Calling get_config(model) or get_config(layer) will return a named list containing the configuration of the model or layer, respectively. You should define get_config() to contain the arguments needed for the initialize() method of the model or layer. At loading time, the from_config(config) method will then call initialize() with these arguments to reconstruct the model or layer.
Layer example:
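The call that produced the listing below is not shown in this copy. A minimal sketch that should print a configuration of the same shape, assuming a standalone dense layer with 3 units and ReLU activation (layer and layer_config are illustrative names, and the auto-generated layer name may differ from the one in the listing):

```r
library(keras3)

# Leaving the first argument empty creates a standalone layer instance,
# the same idiom used elsewhere in this guide.
layer <- layer_dense(, 3, activation = "relu")

# Retrieve the layer configuration as a named list and inspect it.
layer_config <- get_config(layer)
str(layer_config)
```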
## List of 12
## $ name : chr "dense_4"
## $ trainable : logi TRUE
## $ dtype :List of 4
## ..$ module : chr "keras"
## ..$ class_name : chr "DTypePolicy"
## ..$ config :List of 1
## .. ..$ name: chr "float32"
## ..$ registered_name: NULL
## $ units : int 3
## $ activation : chr "relu"
## $ use_bias : logi TRUE
## $ kernel_initializer:List of 4
## ..$ module : chr "keras.initializers"
## ..$ class_name : chr "GlorotUniform"
## ..$ config :List of 1
## .. ..$ seed: NULL
## ..$ registered_name: NULL
## $ bias_initializer :List of 4
## ..$ module : chr "keras.initializers"
## ..$ class_name : chr "Zeros"
## ..$ config : Named list()
## ..$ registered_name: NULL
## $ kernel_regularizer: NULL
## $ bias_regularizer : NULL
## $ kernel_constraint : NULL
## $ bias_constraint : NULL
## - attr(*, "__class__")=<class 'keras.src.layers.core.dense.Dense'>
Now let’s reconstruct the layer using the from_config() method:
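The reconstruction code is missing here. A self-contained sketch under the same assumptions as the layer example (a dense layer with 3 units; new_layer is an illustrative name):

```r
library(keras3)

# Create a layer and capture its configuration.
layer <- layer_dense(, 3, activation = "relu")
layer_config <- get_config(layer)

# Rebuild an equivalent, freshly initialized layer from the config.
new_layer <- from_config(layer_config)
new_config <- get_config(new_layer)
```

The rebuilt layer carries the same configuration but no trained state, since a config describes only how the layer is constructed.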
Sequential model example:
model <- keras_model_sequential(input_shape = c(32)) |>
layer_dense(1)
config <- get_config(model)
new_model <- from_config(config)
Functional model example:
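The functional example did not survive extraction. A minimal sketch that mirrors the sequential example above, assuming a single dense layer on a 32-feature input:

```r
library(keras3)

# Build a small functional model.
inputs <- keras_input(shape = c(32))
outputs <- inputs |> layer_dense(1)
model <- keras_model(inputs, outputs)

# Round-trip the architecture through its config.
config <- get_config(model)
new_model <- from_config(config)
```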
save_model_config() and load_model_config()
This is similar to get_config() / from_config(), except it turns the model into a JSON file, which can then be loaded without the original model class. It is also specific to models; it isn’t meant for layers.
Example:
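The example is missing here. A minimal sketch, assuming a small sequential model and a temporary JSON file (config_path is a hypothetical name used only for illustration):

```r
library(keras3)

model <- keras_model_sequential(input_shape = c(32)) |>
  layer_dense(1)

# Write the architecture (no weights, no compile information) to JSON.
config_path <- tempfile(fileext = ".json")
save_model_config(model, config_path)

# Recreate a freshly initialized model from the JSON file.
new_model <- load_model_config(config_path)

# Remove the temporary file.
unlink(config_path)
```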
The serialize_keras_object() and deserialize_keras_object() APIs are general-purpose APIs that can be used to serialize or deserialize any Keras object and any custom object. They are the foundation of saving model architecture and are behind all serialize()/deserialize() calls in Keras.
Example:
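The code that produced the listing below is not shown. Judging from that listing (class L1 with l1 = 0.005), it was presumably close to the following sketch; my_reg is an assumed name:

```r
library(keras3)

# An L1 regularizer with the penalty factor seen in the listing.
my_reg <- regularizer_l1(0.005)

# Serialize it into a plain named list and inspect the structure.
config <- serialize_keras_object(my_reg)
str(config)
```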
## List of 4
## $ module : chr "keras.regularizers"
## $ class_name : chr "L1"
## $ config :List of 1
## ..$ l1: num 0.005
## $ registered_name: NULL
Note the serialization format containing all the necessary information for proper reconstruction:
- module, containing the name of the Keras module or other identifying module the object comes from.
- class_name, containing the name of the object’s class.
- config, with all the information needed to reconstruct the object.
- registered_name, for custom objects (covered in the discussion of custom object registration later in this guide).
Now we can reconstruct the regularizer.
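The reconstruction call is missing from this copy. A self-contained sketch, assuming the same L1 regularizer as in the serialization step (new_reg and round_trip are illustrative names):

```r
library(keras3)

# Serialize a regularizer, then rebuild it from the resulting config.
my_reg <- regularizer_l1(0.005)
config <- serialize_keras_object(my_reg)
new_reg <- deserialize_keras_object(config)
new_reg

# Serializing the rebuilt object again gives a config to compare with.
round_trip <- serialize_keras_object(new_reg)
```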
## <keras.src.regularizers.regularizers.L1 object>
## signature: (x)
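You can choose to only save & load a model’s weights. This can be useful if:
- You only need the model for inference: in this case you won’t need to restart training, so you don’t need the compilation information or optimizer state.
- You are doing transfer learning: in this case you will be training a new model reusing the state of a prior model, so you don’t need the compilation information of the prior model.
Weights can be copied between different objects by using get_weights() and set_weights():
- get_weights(<layer>): Returns a list of arrays of weight values.
- set_weights(<layer>, weights): Sets the model/layer weights to the values provided (as arrays).
Examples: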
Transferring weights from one layer to another, in memory
create_layer <- function() {
layer <- layer_dense(, 64, activation = "relu", name = "dense_2")
layer$build(shape(NA, 784))
layer
}
layer_1 <- create_layer()
layer_2 <- create_layer()
# Copy weights from layer 1 to layer 2
layer_2 |> set_weights(get_weights(layer_1))
Transferring weights from one model to another model with a compatible architecture, in memory
# Create a simple functional model
inputs <- keras_input(shape=c(784), name="digits")
outputs <- inputs |>
layer_dense(64, activation = "relu", name = "dense_1") |>
layer_dense(64, activation = "relu", name = "dense_2") |>
layer_dense(10, name = "predictions")
functional_model <- keras_model(inputs = inputs, outputs = outputs,
name = "3_layer_mlp")
# Define a subclassed model with the same architecture
SubclassedModel <- new_model_class(
"SubclassedModel",
initialize = function(output_dim, name = NULL) {
super$initialize(name = name)
self$output_dim <- output_dim |> as.integer()
self$dense_1 <- layer_dense(, 64, activation = "relu",
name = "dense_1")
self$dense_2 <- layer_dense(, 64, activation = "relu",
name = "dense_2")
self$dense_3 <- layer_dense(, self$output_dim,
name = "predictions")
},
call = function(inputs) {
inputs |>
self$dense_1() |>
self$dense_2() |>
self$dense_3()
},
get_config = function(self) {
list(output_dim = self$output_dim,
name = self$name)
}
)
subclassed_model <- SubclassedModel(10)
# Call the subclassed model once to create the weights.
subclassed_model(op_ones(c(1, 784))) |> invisible()
# Copy weights from functional_model to subclassed_model.
set_weights(subclassed_model, get_weights(functional_model))
stopifnot(all.equal(
get_weights(functional_model),
get_weights(subclassed_model)
))
The case of stateless layers
Because stateless layers do not change the order or number of weights, models can have compatible architectures even if there are extra/missing stateless layers.
input <- keras_input(shape = c(784), name = "digits")
output <- input |>
layer_dense(64, activation = "relu", name = "dense_1") |>
layer_dense(64, activation = "relu", name = "dense_2") |>
layer_dense(10, name = "predictions")
functional_model <- keras_model(input, output,
name = "3_layer_mlp")
input <- keras_input(shape = c(784), name = "digits")
output <- input |>
layer_dense(64, activation = "relu", name = "dense_1") |>
layer_dense(64, activation = "relu", name = "dense_2") |>
# Add a dropout layer, which does not contain any weights.
layer_dropout(0.5) |>
layer_dense(10, name = "predictions")
functional_model_with_dropout <-
keras_model(input, output, name = "3_layer_mlp")
set_weights(functional_model_with_dropout,
get_weights(functional_model))
Weights can be saved to disk by calling save_model_weights(filepath). The filename should end in .weights.h5.
Example:
sequential_model <- keras_model_sequential(input_shape = c(784),
input_name = "digits") |>
layer_dense(64, activation = "relu", name = "dense_1") |>
layer_dense(64, activation = "relu", name = "dense_2") |>
layer_dense(10, name = "predictions")
sequential_model |> save_model_weights("my_model.weights.h5")
sequential_model |> load_model_weights("my_model.weights.h5")
Note that calling freeze_weights() may change the ordering of the weights returned by get_weights(layer) when the model contains nested layers.
When loading pretrained weights from a weights file, it is recommended to load the weights into the original checkpointed model, and then extract the desired weights/layers into a new model.
Example:
create_functional_model <- function() {
inputs <- keras_input(shape = c(784), name = "digits")
outputs <- inputs |>
layer_dense(64, activation = "relu", name = "dense_1") |>
layer_dense(64, activation = "relu", name = "dense_2") |>
layer_dense(10, name = "predictions")
keras_model(inputs, outputs, name = "3_layer_mlp")
}
functional_model <- create_functional_model()
functional_model |> save_model_weights("pretrained.weights.h5")
# In a separate program:
pretrained_model <- create_functional_model()
pretrained_model |> load_model_weights("pretrained.weights.h5")
# Create a new model by extracting layers from the original model:
extracted_layers <- pretrained_model$layers |> head(-1)
model <- keras_model_sequential(layers = extracted_layers) |>
layer_dense(5, name = "dense_3")
summary(model)
## Model: "sequential_4"
## ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
## ┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
## ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
## │ dense_1 (Dense)                 │ (None, 64)             │        50,240 │
## ├─────────────────────────────────┼────────────────────────┼───────────────┤
## │ dense_2 (Dense)                 │ (None, 64)             │         4,160 │
## ├─────────────────────────────────┼────────────────────────┼───────────────┤
## │ dense_3 (Dense)                 │ (None, 5)              │           325 │
## └─────────────────────────────────┴────────────────────────┴───────────────┘
##  Total params: 54,725 (213.77 KB)
##  Trainable params: 54,725 (213.77 KB)
##  Non-trainable params: 0 (0.00 B)
Specifications:
- get_config() should return a JSON-serializable named list in order to be compatible with the Keras architecture and model-saving APIs.
- from_config(config) (a class method) should return a new layer or model object that is created from the config. The default implementation returns do.call(cls, config).
NOTE: If all your constructor arguments are already serializable, e.g. strings and ints, or non-custom Keras objects, overriding from_config() is not necessary. However, for more complex objects such as layers or models passed to initialize(), deserialization must be handled explicitly, either in initialize() itself or by overriding the from_config() method.
Example:
layer_my_dense <- register_keras_serializable(
package = "MyLayers", name = "KernelMult",
object = Layer(
"MyDense",
initialize = function(units,
...,
kernel_regularizer = NULL,
kernel_initializer = NULL,
nested_model = NULL) {
super$initialize(...)
self$hidden_units <- units
self$kernel_regularizer <- kernel_regularizer
self$kernel_initializer <- kernel_initializer
self$nested_model <- nested_model
},
get_config = function() {
config <- super$get_config()
# Update the config with the custom layer's parameters
config <- modifyList(config, list(
units = self$hidden_units,
kernel_regularizer = self$kernel_regularizer,
kernel_initializer = self$kernel_initializer,
nested_model = self$nested_model
))
config
},
build = function(input_shape) {
input_units <- tail(input_shape, 1)
self$kernel <- self$add_weight(
name = "kernel",
shape = shape(input_units, self$hidden_units),
regularizer = self$kernel_regularizer,
initializer = self$kernel_initializer
)
},
call = function(inputs) {
op_matmul(inputs, self$kernel)
}
)
)
layer <- layer_my_dense(units = 16,
kernel_regularizer = "l1",
kernel_initializer = "ones")
layer3 <- layer_my_dense(units = 64, nested_model = layer)
config <- serialize_keras_object(layer3)
str(config)
## List of 4
## $ module : chr "<r-globalenv>"
## $ class_name : chr "MyDense"
## $ config :List of 5
## ..$ name : chr "my_dense_1"
## ..$ trainable : logi TRUE
## ..$ dtype :List of 4
## .. ..$ module : chr "keras"
## .. ..$ class_name : chr "DTypePolicy"
## .. ..$ config :List of 1
## .. .. ..$ name: chr "float32"
## .. ..$ registered_name: NULL
## ..$ units : num 64
## ..$ nested_model:List of 4
## .. ..$ module : chr "<r-globalenv>"
## .. ..$ class_name : chr "MyDense"
## .. ..$ config :List of 6
## .. .. ..$ name : chr "my_dense"
## .. .. ..$ trainable : logi TRUE
## .. .. ..$ dtype :List of 4
## .. .. .. ..$ module : chr "keras"
## .. .. .. ..$ class_name : chr "DTypePolicy"
## .. .. .. ..$ config :List of 1
## .. .. .. .. ..$ name: chr "float32"
## .. .. .. ..$ registered_name: NULL
## .. .. ..$ units : num 16
## .. .. ..$ kernel_regularizer: chr "l1"
## .. .. ..$ kernel_initializer: chr "ones"
## .. ..$ registered_name: chr "MyLayers>KernelMult"
## $ registered_name: chr "MyLayers>KernelMult"
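# Reconstruct the layer from its serialized config (`new_layer` is an assumed name).
new_layer <- deserialize_keras_object(config)
new_layer
## <MyDense name=my_dense_1, built=False>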
## signature: (*args, **kwargs)
Note that overriding from_config is unnecessary above for MyDense because hidden_units, kernel_initializer, and kernel_regularizer are ints, strings, and a built-in Keras object, respectively. This means that the default from_config implementation of cls(!!!config) will work as intended.
For more complex objects, such as layers and models passed to initialize(), you must explicitly deserialize these objects. Let’s take a look at an example of a model where a from_config override is necessary.
Example:
`%||%` <- \(x, y) if(is.null(x)) y else x
layer_custom_model <- register_keras_serializable(
package = "ComplexModels",
object = Layer(
"CustomModel",
initialize = function(first_layer, second_layer = NULL, ...) {
super$initialize(...)
self$first_layer <- first_layer
self$second_layer <- second_layer %||% layer_dense(, 8)
},
get_config = function() {
config <- super$get_config()
config <- modifyList(config, list(
first_layer = self$first_layer,
second_layer = self$second_layer
))
config
},
from_config = function(config) {
config$first_layer %<>% deserialize_keras_object()
config$second_layer %<>% deserialize_keras_object()
# note that the class is available in methods under the classname symbol,
# (`CustomModel` for this class), and also under the symbol `__class__`
cls(!!!config)
# CustomModel(!!!config)
},
call = function(self, inputs) {
inputs |>
self$first_layer() |>
self$second_layer()
}
)
)
# Let's make our first layer the custom layer from the previous example (MyDense)
inputs <- keras_input(c(32))
outputs <- inputs |> layer_custom_model(first_layer=layer)
model <- keras_model(inputs, outputs)
config <- get_config(model)
new_model <- from_config(config)
The serialization format has a special key for custom objects registered via register_keras_serializable(). This registered_name key allows for easy retrieval at loading/deserialization time while also allowing users to add custom naming.
Let’s take a look at the config from serializing the custom layer MyDense we defined above.
Example:
layer <- layer_my_dense(
units = 16,
kernel_regularizer = regularizer_l1_l2(l1 = 1e-5, l2 = 1e-4),
kernel_initializer = "ones"
)
config <- serialize_keras_object(layer)
str(config)
## List of 4
## $ module : chr "<r-globalenv>"
## $ class_name : chr "MyDense"
## $ config :List of 6
## ..$ name : chr "my_dense_2"
## ..$ trainable : logi TRUE
## ..$ dtype :List of 4
## .. ..$ module : chr "keras"
## .. ..$ class_name : chr "DTypePolicy"
## .. ..$ config :List of 1
## .. .. ..$ name: chr "float32"
## .. ..$ registered_name: NULL
## ..$ units : num 16
## ..$ kernel_regularizer:List of 4
## .. ..$ module : chr "keras.regularizers"
## .. ..$ class_name : chr "L1L2"
## .. ..$ config :List of 2
## .. .. ..$ l1: num 1e-05
## .. .. ..$ l2: num 1e-04
## .. ..$ registered_name: NULL
## ..$ kernel_initializer: chr "ones"
## $ registered_name: chr "MyLayers>KernelMult"
As shown, the registered_name key contains the lookup information for the Keras master list, including the package MyLayers and the custom name KernelMult that we gave when calling register_keras_serializable(). Take another look at the custom class definition and registration in the MyDense example above.
Note that the class_name key contains the original name of the class, allowing for proper re-initialization in from_config.
Additionally, note that the module key does not name a Keras module (it is "<r-globalenv>" in the output above), since this is a custom object defined in the R session.
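To see how the package and name given at registration end up in registered_name, here is a small sketch with a throwaway layer. ScaleLayer, DemoPackage, and Scale are hypothetical names invented for this illustration, and the combined "package>name" form is assumed from the MyLayers>KernelMult value shown above.

```r
library(keras3)

# A hypothetical custom layer that multiplies its input by a constant.
layer_scale <- Layer(
  "ScaleLayer",
  initialize = function(factor = 2, ...) {
    super$initialize(...)
    self$factor <- factor
  },
  call = function(x) {
    x * self$factor
  },
  get_config = function() {
    config <- super$get_config()
    config$factor <- self$factor
    config
  }
)

# Register under a custom package and name (both hypothetical).
register_keras_serializable(layer_scale, package = "DemoPackage", name = "Scale")

# The serialized config should carry both the class name and the
# registered lookup name.
config <- serialize_keras_object(layer_scale(factor = 3))
config$class_name
config$registered_name
```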