BayesGP: COVID-19 Example

library(BayesGP)

COVID-19 Example

Data and Model

We will illustrate the use of BayesGP using the covid_canada dataset, which contains the daily death count of COVID-19 in Canada.

head(covid_canada)
#>         Date new_deaths        t weekdays1 weekdays2 weekdays3 weekdays4
#> 1 2020-03-01          0 591.0323        -1        -1        -1        -1
#> 2 2020-03-02          0 591.0645         1         0         0         0
#> 3 2020-03-03          0 591.0968         0         1         0         0
#> 4 2020-03-04          0 591.1290         0         0         1         0
#> 5 2020-03-05          0 591.1613         0         0         0         1
#> 6 2020-03-06          0 591.1935         0         0         0         0
#>   weekdays5 weekdays6 index
#> 1        -1        -1     1
#> 2         0         0     2
#> 3         0         0     3
#> 4         0         0     4
#> 5         0         0     5
#> 6         1         0     6

For simplicity, let’s consider the following model: \[Y_i|\lambda_i \sim \text{Poisson}(\lambda_i)\] \[\log(\lambda_i) = \mathbf{x}_i^T\boldsymbol{\beta} + f(t_i)\] where \(\mathbf{x}_i\) denotes the fixed effect of weekdays, and \(f\) is an unknown function to be inferred.

To make inference of the unknown function \(f\), we use the \(\text{IWP}_3(\sigma)\) model: \[\frac{\partial^p{f}(t)}{\partial t^p} = \sigma \xi(t),\] with the boundary (initial) conditions that \(\frac{\partial^q{f}(0)}{\partial t^q} = 0\) for all \(0\leq q <p\). Here \(\xi(t)\) is the standard Gaussian white noise process, or can be viewed as the distributional derivative of the standard Brownian motion.

Inference

To fit the above model using the BayesGP package, we simply do the following:

fit_result <- model_fit(new_deaths ~ weekdays1 + weekdays2 + weekdays3 + weekdays4 + weekdays5 + weekdays6 +
                          f(smoothing_var = t, model = "IWP", order = 3, k = 100, sd.prior = list(prior = "exp", param = list(u = 0.02, alpha = 0.5), h = 1)), 
                        data = covid_canada, method = "aghq", family = "Poisson")

We can take a look at the posterior summary of this model:

summary(fit_result)
#> Here are some posterior/prior summaries for the parameters: 
#>        name median q0.025 q0.975       prior prior:P1 prior:P2
#> 1 intercept  3.663  3.585  3.737      Normal     0.00    1e+03
#> 2 weekdays1  0.093  0.069  0.117      Normal     0.00    1e+03
#> 3 weekdays2  0.080  0.055  0.104      Normal     0.00    1e+03
#> 4 weekdays3  0.127  0.102  0.151      Normal     0.00    1e+03
#> 5 weekdays4  0.125  0.102  0.149      Normal     0.00    1e+03
#> 6 weekdays5  0.049  0.025  0.074      Normal     0.00    1e+03
#> 7 weekdays6 -0.152 -0.179 -0.125      Normal     0.00    1e+03
#> 8   t (PSD)  1.081  0.870  1.339 Exponential     0.02    5e-01
#> For Normal prior, P1 is its mean and P2 is its variance. 
#> For Exponential prior, prior is specified as P(theta > P1) = P2.

We can also see the inferred function \(f\):

plot(fit_result)

We can use the predict function to obtain the posterior summary of \(f\) or its derivative at new_data:

For the function \(f\):

predict_f <- predict(fit_result, variable = "t", newdata = data.frame(t = seq(from = 605, to = 615, by = 0.1)))
matplot(x = predict_f[,1], y = predict_f[,2:4], type = 'l', ylab = "f", xlab = "t", col = c("grey", "grey", "black"), lty = c("dashed", "dashed", "solid"))

For the first derivative:

predict_f1st <- predict(fit_result, variable = "t", newdata = data.frame(t = seq(from = 605, to = 615, by = 0.1)), deriv = 1)
matplot(x = predict_f1st[,1], y = predict_f1st[,2:4], type = 'l', ylab = "f'", xlab = "t", col = c("grey", "grey", "black"), lty = c("dashed", "dashed", "solid"))

For the second derivative:

predict_f2nd <- predict(fit_result, variable = "t", newdata = data.frame(t = seq(from = 605, to = 617, by = 0.1)), deriv = 2)
matplot(x = predict_f2nd[,1], y = predict_f2nd[,2:4], type = 'l', ylab = "f''", xlab = "t", col = c("grey", "grey", "black"), lty = c("dashed", "dashed", "solid"))

par(oldpar)