BayesGrowth
combines length-at-age modelling with MCMC
implemented using Stan and the rstan package. Growth modelling
using models such as the von Bertalanffy growth function involves three
parameters: \(L_{\infty}\), \(k\) and either \(L_{0}\) or \(t_{0}\). Two of these parameters: \(L_{0}\) and \(L_{\infty}\) have direct biological meaning
as the length-at-birth and maximum length, respectively. This package
provides the tools to run an MCMC model with these two parameters
treated as size-at-birth and maximum length using a Stan model. This
MCMC model is pre-specified and built into wrapper functions.
The user can therefore run an MCMC growth model using knowledge of species length-at-birth and maximum size as priors.
This package provides a series of wrapper functions to the
rstan
package which will run MCMC models in Stan using No
U-turn sampling (NUTS). rstan
models are automatically run
by the package. However, you’ll probably want to install the
rstan
package anyway in order to work with the resulting
model outputs.
You can install the released version of BayesGrowth
from
Github
with:
The main BayesGrowth
function is
Estimate_MCMC_Growth()
which is the wrapper function around
an rstan
model. It requires a data input that includes
columns that can be identified “Age” and “Length”, the model needs to be
specified (several options are available) and the priors must be
specified. Priors include the max size with an error, length-at-birth
with an error and upper limits for \(k\) and \(\sigma\). These latter two parameters have
no informative priors and only require sensible upper bounds. Many fish
species (including this example) have a size at birth of zero.
Therefore, this can value can be used as a prior along with a very small
error to indicate high certainty of this prior. The L0.se
argument cannot be zero, but the model is specified to truncate \(L_{0}\) at zero and keep growth
positive.
Note that rstan
estimates parameter precision (\(\tau\)) rather than parameter standard
error (\(\theta{\sigma}\)). Therefore,
standard error is converted to precision as \(\tau = (1/\theta{\sigma})^2\). This is
handled by the BayesGrowth
package so the user does not
need to do this conversion themselves.
Here is an example of Estimate_MCMC_Growth()
in
action:
library(BayesGrowth)
# built in dataset for running examples
data("example_data")
## Biological info - lengths in mm
max_size <- 440
max_size_se <- 5
birth_size <- 0
birth_size_se <- 0.001 # cannot be zero
# Use the function to estimate the rstan model
MCMC_example_results <- Estimate_MCMC_Growth(example_data,
Model = "VB" ,
iter = 10000,
n.chains = 4, # minimum of 3 chains recommended
BurnIn = 1000, # default is 10% of iterations
thin = 10, # a thinning rate of 10 is applied to overcome autocorrelation
Linf = max_size,
Linf.se = max_size_se,
L0 = birth_size,
L0.se = birth_size_se,
sigma.max = 100,
verbose = TRUE,
k.max = 1)
rstan
and
bayesplot
packagesEstimate_MCMC_Growth
returns the rstan
outputs which is an object of class stanfit
. Therefore, all
accompanying R packages to rstan
can be used to interrogate
the results. A useful package for analysing the posteriors is
bayesplot
which has several functions to explore chain
convergence, posterior densities, autocorrelation, etc.
As an rstan
model object is returned from the function,
all of the diagnostics from the rstan
library can be used.
For example, a summary and structure of results can be produced.
library(rstan)
summary(MCMC_example_results,pars = c("Linf", "k", "L0", "sigma"))$summary
#> mean se_mean sd 2.5% 25%
#> Linf 3.180221e+02 6.880817e-02 4.1518326541 3.109460e+02 3.151457e+02
#> k 6.603822e-01 5.857158e-04 0.0344834676 5.928279e-01 6.373009e-01
#> L0 7.937532e-04 1.083977e-05 0.0005947255 3.223029e-05 3.237756e-04
#> sigma 2.431159e+01 1.502347e-02 0.8879123712 2.271614e+01 2.369463e+01
#> 50% 75% 97.5% n_eff Rhat
#> Linf 3.176650e+02 3.205548e+02 3.269781e+02 3640.824 0.9994656
#> k 6.602703e-01 6.843590e-01 7.264078e-01 3466.155 0.9996347
#> L0 6.680529e-04 1.142935e-03 2.152788e-03 3010.183 0.9998376
#> sigma 2.426977e+01 2.489364e+01 2.617737e+01 3493.011 0.9995634
str(MCMC_example_results,max.level = 2)
#> Formal class 'stanfit' [package "rstan"] with 10 slots
#> ..@ model_name: chr "VB_stan_model"
#> ..@ model_pars: chr [1:6] "L0" "Linf" "k" "sigma" ...
#> ..@ par_dims :List of 6
#> ..@ mode : int 0
#> ..@ sim :List of 12
#> ..@ inits :List of 4
#> ..@ stan_args :List of 4
#> ..@ stanmodel :Formal class 'stanmodel' [package "rstan"] with 5 slots
#> ..@ date : chr "Thu Nov 05 17:13:23 2020"
#> ..@ .MISC :<environment: 0x000002042e574400>
The chains of the MCMC model can be examined using a trace plot. The right hand panels show the chains for each parameter as they progress through the MCMC iterations. The left hand panels show the posterior distributions of each parameter. As this model is specified to have a normal distribution, all parameter posteriors should be normally distributed once convergence has been reached. \(L_{0}\) is an exception as this parameter is truncated to remain above zero. Therefore, as \(L_{0}\) approaches zero it will be right skewed. This plot shows good model convergence. All of the chains overlap and each of the parameters has the correct posterior distribution with a single mode.
Convergence can also be checked using a Gelman Rubin test. If all
point estimates of the Gelman-Rubin diagnostic (\(\hat{R}\)) for each parameter are less than
1.2 then convergence has been reached. Ideally these should be close to
one. If not, then the number of iterations should be increased. This
result is returned as part of the Estimate_MCMC_Growth
summary and can also be accessed using the rhat
function.
These can also be plotted using mcmc_rhat
.
Autocorrelation can be examined using the mcmc_acf
function in the bayesplot
package. This plot shows the
autocorrelation lag that occurs in the MCMC model for each parameter. If
each parameter has an autocorrelation value of zero at the end of the
lag series, then no autocorrelation has occurred.
BayesGrowth
functionsAdditional BayesGrowth
functions are available that help
the user manipulate the returned Estimate_MCMC_Growth
object. For example Get_MCMC_parameters
will return a
tibble of parameter results and statistics as a more manageable object
than summary()
.
Get_MCMC_parameters(MCMC_example_results)
#> Parameter mean se_mean sd 2.5% 50% 97.5% n_eff Rhat
#> 1 Linf 318.02 0.07 4.15 310.95 317.66 326.98 3640.82 1
#> 2 k 0.66 0.00 0.03 0.59 0.66 0.73 3466.16 1
#> 3 L0 0.00 0.00 0.00 0.00 0.00 0.00 3010.18 1
#> 4 sigma 24.31 0.02 0.89 22.72 24.27 26.18 3493.01 1
The Calculate_MCMC_growth_curve
function will provide
confidence intervals around the growth curve based on MCMC parameter
percentiles. This is essentially a wrapper around the
tidybayes::mean_qi()
function which means it can be passed
straight into a ggplot with the tidybayes::geom_line_ribbon
function.
# Return a growth curve with 50th and 95th percentiles
growth_curve <- Calculate_MCMC_growth_curve(obj = MCMC_example_results, Model = "VB",
max.age = max(example_data$Age), probs = c(.5,.95))
library(tidybayes)
library(ggplot2)
ggplot(growth_curve, aes(Age, LAA))+
geom_point(data = example_data, aes(Age, Length), alpha = .3)+
geom_lineribbon(aes( ymin = .lower, ymax = .upper,
fill = factor(.width)), size = .8) + labs(y = "Total Length (mm)", x = "Age (yrs)")+
scale_fill_brewer(palette="BuPu", direction=1,name =NULL)+
scale_y_continuous(expand = c(0,0))+
scale_x_continuous(expand = c(0,0), breaks = seq(0,13,1))+
theme_bw()+
theme(text = element_text(size = 14))
This represents a much improved fit over a standard non-linear
estimated model, even if the length-at-birth were fixed at zero. Here
the fit is compared using an nls model fit using the
AquaticLifeHistory
package (https://jonathansmart.github.io/AquaticLifeHistory/).
The Leave One Out Information Criterion (LOOIC) and/or Widely
Applicable Information Critereon (WAIC) should be calculated for
different candidate models to determine the best fitting growth model.
The Compare_Growth_Models
function can calculate both for
all three models in a single function call. The user should specify the
same arguments as the original Estimate_MCMC_Growth
function call to ensure consistency. If the arguments of
Estimate_MCMC_Growth
are updated (for example priors are
changed, more iterations are run, thinning is used, etc.) then
Compare_Growth_Models
should be re-run accordingly.
This function has a longer runtime than the initial model run as it will fit all three candidate Bayesian growth models. This is necessary as all models will be run with identical priors and model options which is important for determining the best model.
The model with the highest LOOIC or WAIC weights provides the best
fit to the data and should be used as the final growth model. LOOIC is
considered more robust than WAIC but comparing both can be useful. The
stats
argument in the function call lets the user decide if
they would like to return results for LooIC, WAIC or both. This model
comparison analysis should be the first step in the model fitting
process so that subsequent diagnostics and results are only produced for
the best fitting model.
From these results we can see that the best fitting model was the Von Bertalanffy growth model. Note that this function does not check model convergence (a large number of warning messages suggests that one or more models has struggled to converge). Therefore, full diagnostics need to be run on the best fitting model. If the model configurations are updated to deal with issues such as autocorrelation then model comparisons and selection should be undertaken again.
Looic_example_results <- Compare_Growth_Models(data = example_data,
stats = "LooIC",
iter = 10000,
n_cores = 3,
n.chains = 4,
BurnIn = 1000,
thin = 1,
Linf = max_size,
Linf.se = max_size_se,
L0 = birth_size,
L0.se = birth_size_se,
verbose = TRUE,
sigma.max = 100,
k.max = 1)
#> Model elpd_diff se_diff elpd_loo se_elpd_loo p_loo se_p_loo
#> 1 VB 0.000 0.00000 -3640.812 33.08032 8.864896494 1.476362e+00
#> 2 Gompertz -3669.257 40.17058 -7310.070 16.80033 0.295843815 2.751125e-02
#> 3 Logistic -3832.046 40.65479 -7472.858 20.21025 0.001686265 3.736302e-05
#> looic se_looic looic_Weight
#> 1 7281.625 66.16063 1
#> 2 14620.139 33.60067 0
#> 3 14945.716 40.42050 0