This vignette aims to illustrate how the inclusion of covariates can
influence the severity of the claims generated using the SynthETIC
package. The distributional assumptions shown in this vignette are
consistent with the default assumptions of the SynthETIC package (an
Auto Liability portfolio). The inclusion of covariates is intended as a
minor adjustment to modelled claim sizes, applied after Step 2: Claim
size as discussed in the SynthETIC-demo vignette.
In particular, with this demo we will construct:
Description | R Object |
---|---|
Covariate Inputs | covariate_obj = various factors, their levels and relativities for covariate frequency and claim severity |
Covariate Outputs | covariates_data_obj = dataset of assigned covariates for each claim |
S_adj, claim size | claim_size_w_cov[[i]] = claim size for all claims that occurred in period i after adjustment for covariates |
To cite this package in publications, please use:
citation("SynthETIC")
SynthETIC Set Up
We set up package-wise global parameters demonstrated in the
SynthETIC-demo
vignette (which can be accessed via
vignette("SynthETIC-demo", package = "SynthETIC")
or online
documentation) and perform modelling Steps 1 and 2 to generate the
claim frequency and claim sizes under the default assumptions. Note that
changing these assumptions for Steps 1 and 2 does not affect how
covariates are implemented.
library(SynthETIC)
set.seed(20200131)
set_parameters(ref_claim = 200000, time_unit = 1/4)
ref_claim <- return_parameters()[1]
time_unit <- return_parameters()[2]
years <- 10
I <- years / time_unit
E <- c(rep(12000, I)) # effective annual exposure rates
lambda <- c(rep(0.03, I))
# Modelling Steps 1-2
n_vector <- claim_frequency(I = I, E = E, freq = lambda)
occurrence_times <- claim_occurrence(frequency_vector = n_vector)
claim_sizes <- claim_size(frequency_vector = n_vector)
To apply simulated covariates to SynthETIC claim sizes, a covariates
object is used in conjunction with the claim_size_adj() function to both
simulate covariate combinations and adjust the claim sizes accordingly.
The example covariates object below includes frequency and severity
relativities for three factors: Legal Representation, Injury Severity
and Age of Claimant.
test_covariates_obj <- SynthETIC::test_covariates_obj
print(test_covariates_obj)
#> $factors
#> $factors$`Legal Representation`
#> [1] "Y" "N"
#>
#> $factors$`Injury Severity`
#> [1] "1" "2" "3" "4" "5" "6"
#>
#> $factors$`Age of Claimant`
#> [1] "0-15" "15-30" "30-50" "50-65" "over 65"
#>
#>
#> $relativity_freq
#> factor_i factor_j level_ik level_jl relativity
#> 1 Legal Representation Legal Representation Y Y 1.000
#> 2 Legal Representation Legal Representation N N 1.000
#> 3 Legal Representation Injury Severity Y 1 0.950
#> 4 Legal Representation Injury Severity Y 2 1.000
#> 5 Legal Representation Injury Severity Y 3 1.000
#> 6 Legal Representation Injury Severity Y 4 1.000
#> 7 Legal Representation Injury Severity Y 5 1.000
#> 8 Legal Representation Injury Severity Y 6 1.000
#> 9 Legal Representation Injury Severity N 1 0.050
#> 10 Legal Representation Injury Severity N 2 0.000
#> 11 Legal Representation Injury Severity N 3 0.000
#> 12 Legal Representation Injury Severity N 4 0.000
#> 13 Legal Representation Injury Severity N 5 0.000
#> 14 Legal Representation Injury Severity N 6 0.000
#> 15 Legal Representation Age of Claimant Y 0-15 1.000
#> 16 Legal Representation Age of Claimant Y 15-30 1.000
#> 17 Legal Representation Age of Claimant Y 30-50 1.000
#> 18 Legal Representation Age of Claimant Y 50-65 1.000
#> 19 Legal Representation Age of Claimant Y over 65 1.000
#> 20 Legal Representation Age of Claimant N 0-15 1.000
#> 21 Legal Representation Age of Claimant N 15-30 1.000
#> 22 Legal Representation Age of Claimant N 30-50 1.000
#> 23 Legal Representation Age of Claimant N 50-65 1.000
#> 24 Legal Representation Age of Claimant N over 65 1.000
#> 25 Injury Severity Injury Severity 1 1 0.530
#> 26 Injury Severity Injury Severity 2 2 0.300
#> 27 Injury Severity Injury Severity 3 3 0.100
#> 28 Injury Severity Injury Severity 4 4 0.050
#> 29 Injury Severity Injury Severity 5 5 0.010
#> 30 Injury Severity Injury Severity 6 6 0.010
#> 31 Injury Severity Age of Claimant 1 0-15 1.000
#> 32 Injury Severity Age of Claimant 1 15-30 1.000
#> 33 Injury Severity Age of Claimant 1 30-50 1.000
#> 34 Injury Severity Age of Claimant 1 50-65 1.000
#> 35 Injury Severity Age of Claimant 1 over 65 1.000
#> 36 Injury Severity Age of Claimant 2 0-15 1.000
#> 37 Injury Severity Age of Claimant 2 15-30 1.000
#> 38 Injury Severity Age of Claimant 2 30-50 1.000
#> 39 Injury Severity Age of Claimant 2 50-65 1.000
#> 40 Injury Severity Age of Claimant 2 over 65 1.000
#> 41 Injury Severity Age of Claimant 3 0-15 1.000
#> 42 Injury Severity Age of Claimant 3 15-30 1.000
#> 43 Injury Severity Age of Claimant 3 30-50 1.000
#> 44 Injury Severity Age of Claimant 3 50-65 1.000
#> 45 Injury Severity Age of Claimant 3 over 65 1.000
#> 46 Injury Severity Age of Claimant 4 0-15 1.000
#> 47 Injury Severity Age of Claimant 4 15-30 1.000
#> 48 Injury Severity Age of Claimant 4 30-50 1.000
#> 49 Injury Severity Age of Claimant 4 50-65 1.000
#> 50 Injury Severity Age of Claimant 4 over 65 1.000
#> 51 Injury Severity Age of Claimant 5 0-15 1.000
#> 52 Injury Severity Age of Claimant 5 15-30 1.000
#> 53 Injury Severity Age of Claimant 5 30-50 1.000
#> 54 Injury Severity Age of Claimant 5 50-65 1.000
#> 55 Injury Severity Age of Claimant 5 over 65 1.000
#> 56 Injury Severity Age of Claimant 6 0-15 1.000
#> 57 Injury Severity Age of Claimant 6 15-30 1.000
#> 58 Injury Severity Age of Claimant 6 30-50 1.000
#> 59 Injury Severity Age of Claimant 6 50-65 1.000
#> 60 Injury Severity Age of Claimant 6 over 65 1.000
#> 61 Age of Claimant Age of Claimant 0-15 0-15 0.183
#> 62 Age of Claimant Age of Claimant 15-30 15-30 0.192
#> 63 Age of Claimant Age of Claimant 30-50 30-50 0.274
#> 64 Age of Claimant Age of Claimant 50-65 50-65 0.180
#> 65 Age of Claimant Age of Claimant over 65 over 65 0.171
#>
#> $relativity_sev
#> factor_i factor_j level_ik level_jl relativity
#> 1 Legal Representation Legal Representation Y Y 2.00
#> 2 Legal Representation Legal Representation N N 1.00
#> 3 Legal Representation Injury Severity Y 1 1.00
#> 4 Legal Representation Injury Severity Y 2 1.00
#> 5 Legal Representation Injury Severity Y 3 1.00
#> 6 Legal Representation Injury Severity Y 4 1.00
#> 7 Legal Representation Injury Severity Y 5 1.00
#> 8 Legal Representation Injury Severity Y 6 1.00
#> 9 Legal Representation Injury Severity N 1 1.00
#> 10 Legal Representation Injury Severity N 2 1.00
#> 11 Legal Representation Injury Severity N 3 1.00
#> 12 Legal Representation Injury Severity N 4 1.00
#> 13 Legal Representation Injury Severity N 5 1.00
#> 14 Legal Representation Injury Severity N 6 1.00
#> 15 Legal Representation Age of Claimant Y 0-15 1.00
#> 16 Legal Representation Age of Claimant Y 15-30 1.00
#> 17 Legal Representation Age of Claimant Y 30-50 1.00
#> 18 Legal Representation Age of Claimant Y 50-65 1.00
#> 19 Legal Representation Age of Claimant Y over 65 1.00
#> 20 Legal Representation Age of Claimant N 0-15 1.00
#> 21 Legal Representation Age of Claimant N 15-30 1.00
#> 22 Legal Representation Age of Claimant N 30-50 1.00
#> 23 Legal Representation Age of Claimant N 50-65 1.00
#> 24 Legal Representation Age of Claimant N over 65 1.00
#> 25 Injury Severity Injury Severity 1 1 0.60
#> 26 Injury Severity Injury Severity 2 2 1.20
#> 27 Injury Severity Injury Severity 3 3 2.50
#> 28 Injury Severity Injury Severity 4 4 5.00
#> 29 Injury Severity Injury Severity 5 5 8.00
#> 30 Injury Severity Injury Severity 6 6 0.40
#> 31 Injury Severity Age of Claimant 1 0-15 1.00
#> 32 Injury Severity Age of Claimant 1 15-30 1.00
#> 33 Injury Severity Age of Claimant 1 30-50 1.00
#> 34 Injury Severity Age of Claimant 1 50-65 1.00
#> 35 Injury Severity Age of Claimant 1 over 65 1.00
#> 36 Injury Severity Age of Claimant 2 0-15 1.00
#> 37 Injury Severity Age of Claimant 2 15-30 1.00
#> 38 Injury Severity Age of Claimant 2 30-50 1.00
#> 39 Injury Severity Age of Claimant 2 50-65 1.00
#> 40 Injury Severity Age of Claimant 2 over 65 1.00
#> 41 Injury Severity Age of Claimant 3 0-15 1.00
#> 42 Injury Severity Age of Claimant 3 15-30 1.00
#> 43 Injury Severity Age of Claimant 3 30-50 1.00
#> 44 Injury Severity Age of Claimant 3 50-65 1.00
#> 45 Injury Severity Age of Claimant 3 over 65 1.00
#> 46 Injury Severity Age of Claimant 4 0-15 1.00
#> 47 Injury Severity Age of Claimant 4 15-30 1.00
#> 48 Injury Severity Age of Claimant 4 30-50 1.00
#> 49 Injury Severity Age of Claimant 4 50-65 0.97
#> 50 Injury Severity Age of Claimant 4 over 65 0.95
#> 51 Injury Severity Age of Claimant 5 0-15 1.00
#> 52 Injury Severity Age of Claimant 5 15-30 1.00
#> 53 Injury Severity Age of Claimant 5 30-50 1.00
#> 54 Injury Severity Age of Claimant 5 50-65 0.95
#> 55 Injury Severity Age of Claimant 5 over 65 0.90
#> 56 Injury Severity Age of Claimant 6 0-15 1.00
#> 57 Injury Severity Age of Claimant 6 15-30 1.00
#> 58 Injury Severity Age of Claimant 6 30-50 1.00
#> 59 Injury Severity Age of Claimant 6 50-65 1.00
#> 60 Injury Severity Age of Claimant 6 over 65 1.00
#> 61 Age of Claimant Age of Claimant 0-15 0-15 1.25
#> 62 Age of Claimant Age of Claimant 15-30 15-30 1.15
#> 63 Age of Claimant Age of Claimant 30-50 30-50 1.00
#> 64 Age of Claimant Age of Claimant 50-65 50-65 0.85
#> 65 Age of Claimant Age of Claimant over 65 over 65 0.70
#>
#> attr(,"class")
#> [1] "covariates"
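Most interaction entries in the tables above equal 1, which presumably leaves the outcome unchanged, so it can help to list only the entries that differ from 1. The sketch below does this in base R on a small excerpt of the severity table copied by hand from the printout (rows 48 to 50 and 53 to 55). The names sev_excerpt and non_neutral are illustrative only and are not package objects.

```r
# Excerpt of $relativity_sev copied by hand from the printout above
# (Injury Severity x Age of Claimant, severity levels 4 and 5 only).
sev_excerpt <- data.frame(
  factor_i   = "Injury Severity",
  factor_j   = "Age of Claimant",
  level_ik   = rep(c("4", "5"), each = 3),
  level_jl   = rep(c("30-50", "50-65", "over 65"), times = 2),
  relativity = c(1.00, 0.97, 0.95, 1.00, 0.95, 0.90),
  stringsAsFactors = FALSE
)

# Keep only the entries that differ from the (assumed neutral) value of 1.
non_neutral <- sev_excerpt[sev_excerpt$relativity != 1, ]
print(non_neutral)
```

Since the printout shows $relativity_sev as a data frame with these same columns, the same filter could likely be applied to test_covariates_obj$relativity_sev directly.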
The claim_size_adj()
function simulates the covariate
levels for each claim and then adjusts the claim sizes according to the
relativities defined above. The covariate levels for each claim can be
accessed in the covariates_data$data
attribute of the
function output.
claim_size_covariates <- claim_size_adj(test_covariates_obj, claim_sizes)
covariates_data_obj <- claim_size_covariates$covariates_data
head(data.frame(covariates_data_obj$data))
#> Legal.Representation Injury.Severity Age.of.Claimant
#> 1 Y 2 30-50
#> 2 Y 4 15-30
#> 3 Y 1 50-65
#> 4 N 1 50-65
#> 5 Y 1 30-50
#> 6 Y 1 50-65
The adjusted claim sizes are stored in the
claim_size_adj
attribute.
claim_size_w_cov <- claim_size_covariates$claim_size_adj
claim_size_w_cov[[1]]
#> [1] 7.477094e+05 9.804361e+05 1.252920e+04 1.010922e+01 6.833450e+03
#> [6] 2.757582e+05 3.752743e+03 5.747681e+03 4.664947e+04 4.286590e+03
#> [11] 4.350318e+04 3.435734e+04 5.755410e+04 2.216435e+03 2.415741e+05
#> [16] 7.590815e+05 3.886404e+02 2.408333e+05 1.302333e+03 7.377977e+05
#> [21] 1.857832e+03 1.898380e+05 1.457273e+04 1.791514e+05 9.434147e+03
#> [26] 5.853758e+05 2.092573e+05 4.875323e+04 6.683424e+04 8.283404e+04
#> [31] 2.477044e+04 6.766438e+04 4.677135e+04 2.032482e+05 1.490704e+05
#> [36] 7.542428e+04 3.037978e+02 6.649810e+04 3.941017e+04 1.392369e+04
#> [41] 1.324766e+05 2.873885e+04 6.168764e+03 1.463038e+03 1.933198e+05
#> [46] 8.009703e+04 6.263902e+04 2.216208e+04 1.344780e+03 3.021240e+04
#> [51] 5.451734e+04 3.538998e+05 2.802154e+05 5.368758e+05 2.883399e+03
#> [56] 6.251865e+04 4.694719e+02 1.140956e+04 1.099047e+04 1.303593e+04
#> [61] 9.220802e+04 5.803419e+04 4.771224e+04 1.600308e+05 1.731276e+04
#> [66] 6.139491e+04 2.290593e+06 1.286403e+04 1.338381e+04 1.305596e+05
#> [71] 9.896327e+04 1.186354e+05 1.152881e+05 1.141948e+04 5.474989e+04
#> [76] 6.483547e+04 1.019475e+06 3.984001e+05 1.160131e+05 2.451292e+04
#> [81] 1.801576e+05 2.217363e+05 1.209265e+05 5.536457e+03 6.819783e+04
#> [86] 1.028982e+03 6.795593e+03 5.854764e+04 5.908643e+04 7.346891e+05
As with Steps 1-2, Steps 3 onwards do not require any specific
adjustment to accommodate covariates. Guidance on implementing these
modelling steps can be found in the SynthETIC-demo vignette. The example
below shows that covariates primarily affect claim sizes, and therefore
any later modelling steps that depend on claim size. Note that the
number of claims (n_vector) and the times at which they occur
(occurrence_times) are unaffected by covariates.
generate_claims_dataset <- function(claim_size_list) {
  # SynthETIC Steps 3-5
  notidel <- claim_notification(n_vector, claim_size_list)
  setldel <- claim_closure(n_vector, claim_size_list)
  no_payments <- claim_payment_no(n_vector, claim_size_list)
  claim_dataset <- generate_claim_dataset(
    frequency_vector = n_vector,
    occurrence_list = occurrence_times,
    claim_size_list = claim_size_list,
    notification_list = notidel,
    settlement_list = setldel,
    no_payments_list = no_payments
  )
  claim_dataset
}
claim_dataset <- generate_claims_dataset(claim_size_list = claim_sizes)
claim_dataset_w_cov <- generate_claims_dataset(claim_size_list = claim_size_w_cov)
head(claim_dataset)
#> claim_no occurrence_period occurrence_time claim_size notidel setldel
#> 1 1 1 0.6238351 783769.11073 0.1884900 10.8513767
#> 2 2 1 0.1206679 214480.60483 1.9166580 25.7448252
#> 3 3 1 0.2220436 30902.21786 0.1652771 3.6818905
#> 4 4 1 0.4538309 49.86708 1.8143808 0.6020508
#> 5 5 1 0.5910992 14326.01244 2.3496118 3.0210980
#> 6 6 1 0.9524492 680134.40835 0.8705656 24.7423916
#> no_payment
#> 1 7
#> 2 4
#> 3 5
#> 4 1
#> 5 3
#> 6 8
head(claim_dataset_w_cov)
#> claim_no occurrence_period occurrence_time claim_size notidel setldel
#> 1 1 1 0.6238351 747709.38874 2.0436447 7.139480
#> 2 2 1 0.1206679 980436.14768 0.7016332 46.925459
#> 3 3 1 0.2220436 12529.19794 0.7584349 2.828873
#> 4 4 1 0.4538309 10.10922 0.5253115 2.173519
#> 5 5 1 0.5910992 6833.44997 4.5098711 2.073417
#> 6 6 1 0.9524492 275758.15658 1.7474071 7.026188
#> no_payment
#> 1 7
#> 2 15
#> 3 3
#> 4 2
#> 5 1
#> 6 12
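A quick way to confirm this on the six rows printed above is to compare the two data frames column by column. The sketch below re-enters those rows by hand, so the names base_head and cov_head are illustrative only. With the full objects, the same comparison could be run on claim_dataset and claim_dataset_w_cov.

```r
# First six rows of each dataset, copied by hand from the output above.
occ_time <- c(0.6238351, 0.1206679, 0.2220436, 0.4538309, 0.5910992, 0.9524492)
base_head <- data.frame(
  claim_no = 1:6,
  occurrence_time = occ_time,
  claim_size = c(783769.11073, 214480.60483, 30902.21786,
                 49.86708, 14326.01244, 680134.40835)
)
cov_head <- data.frame(
  claim_no = 1:6,
  occurrence_time = occ_time,
  claim_size = c(747709.38874, 980436.14768, 12529.19794,
                 10.10922, 6833.44997, 275758.15658)
)

# Occurrence times are identical in both datasets ...
same_times <- identical(base_head$occurrence_time, cov_head$occurrence_time)

# ... whereas claim sizes change, by a different ratio for each claim.
size_ratio <- cov_head$claim_size / base_head$claim_size

print(same_times)
print(round(size_ratio, 2))
```

The ratios differ from claim to claim because each claim is assigned its own combination of covariate levels.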
This section shows the impact of using a set of covariates that differs
from the default values within the SynthETIC package.
The included framework allows a user to easily construct any set of covariates required for simulation and/or analysis. This gives the user flexibility in choosing both the number of factors in the set of covariates and the number of levels within each factor.
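The example below compares claim sizes adjusted with the default
SynthETIC covariates against claim sizes adjusted with a new set of
covariates based on vehicle type and business use.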
factors_tmp <- list(
  "Vehicle Type" = c("Passenger", "Light Commerical", "Medium Goods", "Heavy Goods"),
  "Business Use" = c("Y", "N")
)
relativity_freq_tmp <- relativity_template(factors_tmp)
relativity_sev_tmp <- relativity_template(factors_tmp)
# Default Values
relativity_freq_tmp$relativity <- c(
  5, 1.5, 0.35, 0.25,
  1, 4,
  1, 0.6,
  0.35, 0.01,
  0.25, 0,
  2.5, 5
)
relativity_sev_tmp$relativity <- c(
  0.25, 0.75, 1, 3,
  1, 1,
  1, 1,
  1, 1,
  1, 1,
  1.3, 1
)
test_covariates_obj_veh <- covariates(factors_tmp)
test_covariates_obj_veh <- set.covariates_relativity(
  covariates = test_covariates_obj_veh,
  relativity = relativity_freq_tmp,
  freq_sev = "freq"
)
test_covariates_obj_veh <- set.covariates_relativity(
  covariates = test_covariates_obj_veh,
  relativity = relativity_sev_tmp,
  freq_sev = "sev"
)
claim_size_covariates_veh <- claim_size_adj(test_covariates_obj_veh, claim_sizes)
# Comparison of the same claim size except with adjustments due to covariates
data.frame(
  Claim_Size = head(round(claim_sizes[[1]])),
  Claim_Size_Original_Covariates = head(round(claim_size_covariates$claim_size_adj[[1]])),
  Claim_Size_New_Covariates = head(round(claim_size_covariates_veh$claim_size_adj[[1]]))
)
#> Claim_Size Claim_Size_Original_Covariates Claim_Size_New_Covariates
#> 1 783769 747709 2399307
#> 2 214481 980436 218859
#> 3 30902 12529 72769
#> 4 50 10 39
#> 5 14326 6833 14618
#> 6 680134 275758 533861
# Covariate Levels
head(claim_size_covariates$covariates_data$data)
#> Legal Representation Injury Severity Age of Claimant
#> 1 Y 2 30-50
#> 2 Y 4 15-30
#> 3 Y 1 50-65
#> 4 N 1 50-65
#> 5 Y 1 30-50
#> 6 Y 1 50-65
head(claim_size_covariates_veh$covariates_data$data)
#> Vehicle Type Business Use
#> 1 Light Commerical Y
#> 2 Passenger Y
#> 3 Light Commerical N
#> 4 Passenger N
#> 5 Passenger Y
#> 6 Passenger N
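The relativity vectors assigned above each contain 14 values. Judging from the printed default object, a relativity table appears to hold one row per level of each factor, plus one row per combination of levels for each pair of distinct factors. Under that assumption, the helper below computes how many values a relativity vector needs for a given list of factors. The helper expected_relativity_rows is hypothetical and not part of SynthETIC.

```r
# Hypothetical helper: expected number of rows in a relativity table,
# assuming one row per level within each factor plus one row per
# combination of levels for every pair of distinct factors.
expected_relativity_rows <- function(factors) {
  n_levels <- lengths(factors)
  level_products <- outer(n_levels, n_levels)
  sum(n_levels) + sum(level_products[upper.tri(level_products)])
}

vehicle_factors <- list(
  "Vehicle Type" = c("Passenger", "Light Commerical", "Medium Goods", "Heavy Goods"),
  "Business Use" = c("Y", "N")
)
default_factors <- list(
  "Legal Representation" = c("Y", "N"),
  "Injury Severity" = as.character(1:6),
  "Age of Claimant" = c("0-15", "15-30", "30-50", "50-65", "over 65")
)

expected_relativity_rows(vehicle_factors)  # 14, the length of each vector above
expected_relativity_rows(default_factors)  # 65, the number of rows printed earlier
```

A count like this can serve as a sanity check before assigning a hand-written vector to the relativity column of a template.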
To apply specific covariate values for each claim occurrence, we can
use the parameter covariates_id when constructing the covariates_data
object. This maps each claim to a corresponding known covariate value
from a dataset and applies the relevant severity relativities. Note that
in this case the frequency relativities are not used, as no simulation
of covariate values is performed.
In the example below, we have a known dataset of covariates, which can be mapped to each of the claim sizes. In the covariates dataset, we know:
As a result, we can use the indices for each of these rows to map each set of covariates to its associated claim. In this case, the first 50 claims are related to the last 50 rows in the covariates dataset in reverse order, and claims 51–100 are related to the first 50 rows in the covariates dataset.
claim_sizes_known <- list(c(
  rexp(n = 100, rate = 1.5)
))
known_covariates_dataset <- data.frame(
  "Vehicle Type" = rep(rep(c("Passenger", "Light Commerical"), each = 25), times = 2),
  "Business Use" = c(rep("N", times = 50), rep("Y", times = 50))
)
colnames(known_covariates_dataset) <- c("Vehicle Type", "Business Use")
covariates_data_veh <- covariates_data(
  test_covariates_obj_veh,
  data = known_covariates_dataset,
  covariates_id = list(c(100:51, 1:50))
)
claim_sizes_adj_tmp <- claim_size_adj.fit(
  covariates_data = covariates_data_veh,
  claim_size = claim_sizes_known
)
head(claim_sizes_adj_tmp[[1]])
#> [1] 2.0607335 0.4954515 1.4427630 0.4708760 3.8770597 2.4275293
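The index vector passed as covariates_id can be checked without the package. Following the mapping described above, where the k-th claim takes row id[k] of the covariates dataset, the base R sketch below rebuilds the same data frame and looks up the covariates for a few claims. The names id and covariates_by_claim are illustrative only, and check.names = FALSE is used here simply to keep the column names with spaces.

```r
known_covariates_dataset <- data.frame(
  "Vehicle Type" = rep(rep(c("Passenger", "Light Commerical"), each = 25), times = 2),
  "Business Use" = c(rep("N", times = 50), rep("Y", times = 50)),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Claim k is assumed to take row id[k] of the covariates dataset.
id <- c(100:51, 1:50)
covariates_by_claim <- known_covariates_dataset[id, ]
rownames(covariates_by_claim) <- NULL  # re-index rows by claim number

# Claims 1 and 50 map to rows 100 and 51; claims 51 and 100 map to rows 1 and 50.
print(covariates_by_claim[c(1, 50, 51, 100), ])
```

Printing a few rows this way makes it easy to confirm that each claim picks up the intended covariate levels before the severity relativities are applied.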