We will illustrate the use of the BayesGP package using the covid_canada dataset, which contains the daily death count of COVID-19 in Canada.
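Throughout, we assume that BayesGP has been installed and attached; in this example, attaching the package is also what makes the covid_canada dataset available:
library(BayesGP)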
head(covid_canada)
#> Date new_deaths t weekdays1 weekdays2 weekdays3 weekdays4
#> 1 2020-03-01 0 591.0323 -1 -1 -1 -1
#> 2 2020-03-02 0 591.0645 1 0 0 0
#> 3 2020-03-03 0 591.0968 0 1 0 0
#> 4 2020-03-04 0 591.1290 0 0 1 0
#> 5 2020-03-05 0 591.1613 0 0 0 1
#> 6 2020-03-06 0 591.1935 0 0 0 0
#> weekdays5 weekdays6 index
#> 1 -1 -1 1
#> 2 0 0 2
#> 3 0 0 3
#> 4 0 0 4
#> 5 0 0 5
#> 6 1 0 6
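Before fitting any model, it can help to glance at the raw series. A simple base-R plot of the daily counts against the time index t (not part of the BayesGP workflow itself) is:
plot(new_deaths ~ t, data = covid_canada, type = "l",
     xlab = "t", ylab = "daily deaths")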
For simplicity, let’s consider the following model: \[Y_i|\lambda_i \sim \text{Poisson}(\lambda_i)\] \[\log(\lambda_i) = \mathbf{x}_i^T\boldsymbol{\beta} + f(t_i)\] where \(\mathbf{x}_i\) denotes the vector of weekday covariates (fixed effects), and \(f\) is an unknown function to be inferred.
To make inference on the unknown function \(f\), we use the \(\text{IWP}_3(\sigma)\) model, an integrated Wiener process of order \(p = 3\): \[\frac{\partial^p{f}(t)}{\partial t^p} = \sigma \xi(t),\] with the boundary (initial) conditions \(\frac{\partial^q{f}(0)}{\partial t^q} = 0\) for all \(0\leq q <p\). Here \(\xi(t)\) is the standard Gaussian white noise process, which can be viewed as the distributional derivative of the standard Brownian motion.
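For concreteness, with \(p = 3\) (the order used in the fit below) the model on \(f\) reads \[\frac{\partial^3 f(t)}{\partial t^3} = \sigma \xi(t), \qquad f(0) = \frac{\partial f(0)}{\partial t} = \frac{\partial^2 f(0)}{\partial t^2} = 0.\]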
To fit the above model using the BayesGP package, we simply do the following:
fit_result <- model_fit(
  new_deaths ~ weekdays1 + weekdays2 + weekdays3 + weekdays4 + weekdays5 + weekdays6 +
    f(smoothing_var = t, model = "IWP", order = 3, k = 100,
      sd.prior = list(prior = "exp", param = list(u = 0.02, alpha = 0.5), h = 1)),
  data = covid_canada, method = "aghq", family = "Poisson")
We can take a look at the posterior summary of this model:
summary(fit_result)
#> Here are some posterior/prior summaries for the parameters:
#> name median q0.025 q0.975 prior prior:P1 prior:P2
#> 1 intercept 3.663 3.585 3.737 Normal 0.00 1e+03
#> 2 weekdays1 0.093 0.069 0.117 Normal 0.00 1e+03
#> 3 weekdays2 0.080 0.055 0.104 Normal 0.00 1e+03
#> 4 weekdays3 0.127 0.102 0.151 Normal 0.00 1e+03
#> 5 weekdays4 0.125 0.102 0.149 Normal 0.00 1e+03
#> 6 weekdays5 0.049 0.025 0.074 Normal 0.00 1e+03
#> 7 weekdays6 -0.152 -0.179 -0.125 Normal 0.00 1e+03
#> 8 t (PSD) 1.081 0.870 1.339 Exponential 0.02 5e-01
#> For Normal prior, P1 is its mean and P2 is its variance.
#> For Exponential prior, prior is specified as P(theta > P1) = P2.
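Because the linear predictor is on the log scale, the fixed-effect coefficients translate into multiplicative effects on the expected count. As a rough illustration using the posterior medians above (how the baseline is defined depends on the coding of the weekday columns):
# convert two posterior medians from the summary into rate ratios
exp(c(weekdays1 = 0.093, weekdays6 = -0.152))
# roughly 1.10 and 0.86: about 10% more expected deaths on weekday 1
# and about 14% fewer on weekday 6, relative to the implied baseline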
We can also visualize the inferred function \(f\): the predict function returns the posterior summary of \(f\) or its derivatives evaluated at the locations supplied through newdata.
For the function \(f\):
predict_f <- predict(fit_result, variable = "t", newdata = data.frame(t = seq(from = 605, to = 615, by = 0.1)))
matplot(x = predict_f[,1], y = predict_f[,2:4], type = 'l', ylab = "f", xlab = "t", col = c("grey", "grey", "black"), lty = c("dashed", "dashed", "solid"))
For the first derivative:
predict_f1st <- predict(fit_result, variable = "t", newdata = data.frame(t = seq(from = 605, to = 615, by = 0.1)), deriv = 1)
matplot(x = predict_f1st[,1], y = predict_f1st[,2:4], type = 'l', ylab = "f'", xlab = "t", col = c("grey", "grey", "black"), lty = c("dashed", "dashed", "solid"))
For the second derivative:
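This follows the same pattern as above, simply setting deriv = 2 in predict (a sketch mirroring the previous calls):
predict_f2nd <- predict(fit_result, variable = "t", newdata = data.frame(t = seq(from = 605, to = 615, by = 0.1)), deriv = 2)
matplot(x = predict_f2nd[,1], y = predict_f2nd[,2:4], type = 'l', ylab = "f''", xlab = "t", col = c("grey", "grey", "black"), lty = c("dashed", "dashed", "solid"))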