Main Functions
1. Generating a Covariate Table with COVID-19 Risk Factors
The risk_factor
function generates a covariate table
with risk factors using UKBB main tab data. Automatically returns sex,
age at birthday in 2020, SES, self-reported ethnicity, most recently
reported BMI, most recently reported pack-years, whether they reside in
aged care (based on hospital admissions data, and covid test data) and
blood type. Function also allows user to specify fields of interest
(field codes, provided by UK Biobank), and allows the users to specify
more intuitive names, for selected fields.
Note: the ukb.tab file must include fields: f.eid, f.31.0.0, f.34.0.0, f.189.0.0, f.21001., f.21000., f.20161.
covar <- risk_factor(ukb.data=covid_example("sim_ukb.tab.gz"),
ABO.data=covid_example("sim_covid19_misc.txt.gz"),
hesin.file=covid_example("sim_hesin.txt.gz"),
res.eng=covid_example("sim_result_england.txt.gz"))
head(covar)
#> ID sex age bmi ethnic other.ppl black asian mixed white SES smoke
#> 1 1 1 74 39.0947 1001 0 0 0 0 1 5.43719 0.000
#> 2 2 1 58 25.3177 1001 0 0 0 0 1 -2.10787 0.000
#> 3 3 0 51 32.2349 1002 0 0 0 0 1 7.36321 25.625
#> 4 4 0 56 21.7955 1001 0 0 0 0 1 -5.62047 0.000
#> 5 5 0 72 22.6203 1001 0 0 0 0 1 -1.09293 0.000
#> 6 6 1 67 25.9823 1001 0 0 0 0 1 -3.90245 0.000
#> blood_group O AB B A inAgedCare
#> 1 AO 0 0 0 1 0
#> 2 AO 0 0 0 1 0
#> 3 AO 0 0 0 1 0
#> 4 AO 0 0 0 1 0
#> 5 AO 0 0 0 1 0
#> 6 OO 1 0 0 0 0
2. Summarizing COVID-19 Test Results
The makePhenotypes
function summarizes COVID-19 test
results, death register data, and hospital inpatient data. It generates
phenotypes for COVID-19 susceptibility, severity, and mortality.
COVID-19 susceptibility
susceptibility <- makePhenotypes(ukb.data=covid_example("sim_ukb.tab.gz"),
res.eng=covid_example("sim_result_england.txt.gz"),
death.file=covid_example("sim_death.txt.gz"),
death.cause.file=covid_example("sim_death_cause.txt.gz"),
hesin.file=covid_example("sim_hesin.txt.gz"),
hesin_diag.file=covid_example("sim_hesin_diag.txt.gz"),
hesin_oper.file=covid_example("sim_hesin_oper.txt.gz"),
hesin_critical.file=covid_example("sim_hesin_critical.txt.gz"),
code.file=covid_example("coding240.txt.gz"),
pheno.type = "susceptibility")
#> [1] "965 participants got tested until 2021-04-05."
#> [1] "218 participants got positive test results until 2021-04-05."
#> [1] "There are 21 deaths with COVID-19. 20 of them primary death cause is COVID-19."
#> [1] "50 patients admitted to hospital were diagnosed as COVID-19 until 2021-04-05."
#> [1] "32 patients' primary diagnosis is COVID-19."
#> [1] "1 patients in hospitalization with COVID-19 diagnosis but show negative in the result file. Modified their test results."
#> [1] "There are 219 COVID-19 patients identified. 32 individuals are admitted to hospital. 3 had been in ICU. 1 had been in advanced ICU."
COVID-19 severity
severity <- makePhenotypes(ukb.data=covid_example("sim_ukb.tab.gz"),
res.eng=covid_example("sim_result_england.txt.gz"),
death.file=covid_example("sim_death.txt.gz"),
death.cause.file=covid_example("sim_death_cause.txt.gz"),
hesin.file=covid_example("sim_hesin.txt.gz"),
hesin_diag.file=covid_example("sim_hesin_diag.txt.gz"),
hesin_oper.file=covid_example("sim_hesin_oper.txt.gz"),
hesin_critical.file=covid_example("sim_hesin_critical.txt.gz"),
code.file=covid_example("coding240.txt.gz"),
pheno.type = "severity")
#> [1] "726 participants got tested until 2020-12-31."
#> [1] "150 participants got positive test results until 2020-12-31."
#> [1] "There are 10 deaths with COVID-19. 9 of them primary death cause is COVID-19."
#> [1] "50 patients admitted to hospital were diagnosed as COVID-19 until 2020-12-31."
#> [1] "32 patients' primary diagnosis is COVID-19."
#> [1] "1 patients in hospitalization with COVID-19 diagnosis but show negative in the result file. Modified their test results."
#> [1] "There are 151 COVID-19 patients identified. 32 individuals are admitted to hospital. 3 had been in ICU. 1 had been in advanced ICU."
COVID-19 mortality
mortality <- makePhenotypes(ukb.data=covid_example("sim_ukb.tab.gz"),
res.eng=covid_example("sim_result_england.txt.gz"),
death.file=covid_example("sim_death.txt.gz"),
death.cause.file=covid_example("sim_death_cause.txt.gz"),
hesin.file=covid_example("sim_hesin.txt.gz"),
hesin_diag.file=covid_example("sim_hesin_diag.txt.gz"),
hesin_oper.file=covid_example("sim_hesin_oper.txt.gz"),
hesin_critical.file=covid_example("sim_hesin_critical.txt.gz"),
code.file=covid_example("coding240.txt.gz"),
pheno.type = "mortality")
#> [1] "939 participants got tested until 2021-03-19."
#> [1] "217 participants got positive test results until 2021-03-19."
#> [1] "There are 21 deaths with COVID-19. 20 of them primary death cause is COVID-19."
#> [1] "50 patients admitted to hospital were diagnosed as COVID-19 until 2021-03-19."
#> [1] "32 patients' primary diagnosis is COVID-19."
#> [1] "1 patients in hospitalization with COVID-19 diagnosis but show negative in the result file. Modified their test results."
#> [1] "There are 218 COVID-19 patients identified. 32 individuals are admitted to hospital. 3 had been in ICU. 1 had been in advanced ICU."
3. Perfroming Association Tests
The log_cov
function performs association tests using
logistic regressions. This is an example of association tests between
COVID-19 susceptibility and three risk factors: sex, age and BMI.
log_cov(pheno=susceptibility, covariates=covar, phe.name="pos.neg", cov.name=c("sex", "age", "bmi"))
#> Waiting for profiling to be done...
#> Estimate OR 2.5 % 97.5 % p
#> (Intercept) -0.16475743 0.8480994 0.1954585 3.6381032 0.824991899
#> sex1 0.04207813 1.0429760 0.7644672 1.4215535 0.790121307
#> age -0.03080456 0.9696651 0.9519878 0.9876397 0.001009957
#> bmi 0.03625193 1.0369170 1.0076088 1.0667564 0.012568486
4. Generating a Comorbidity Summary File
The comorbidity.summary
function scans all the
hospitalisation records with a given time period and generates
comorbidity summary table. The following example is to generate a
comorbidity summary table that includes all the primary and secondary
diagnoses in the hospital inpatient data after 16 March 2020.
comorb <- comorbidity_summary(ukb.data=covid_example("sim_ukb.tab.gz"),
hesin.file=covid_example("sim_hesin.txt.gz"),
hesin_diag.file=covid_example("sim_hesin_diag.txt.gz"),
ICD10.file=covid_example("ICD10.coding19.txt.gz"),
primary = FALSE,
Date.start = "16/03/2020")
comorb[1:6,1:10]
#> ID A00-A09 A15-A19 A20-A28 A30-A49 A50-A64 A65-A69 A70-A74 A75-A79 A80-A89
#> 1 1 1 0 0 1 0 0 0 0 0
#> 2 10 0 0 0 0 0 0 0 0 0
#> 3 100 0 0 0 0 0 0 0 0 0
#> 4 1000 0 0 0 0 0 0 0 0 0
#> 5 101 0 0 0 0 0 0 0 0 0
#> 6 102 0 0 0 0 0 0 0 0 0
5. Performing Association Tests between COVID-19 Phenotype and Comorbidities
The comorbidity.asso
function performs association tests
between comorbidity categories and selected phenotype using logistic
regression models.This is an example of association tests between
COVID-19 susceptibility and all comorbidities. It shows NAs when fitted
probabilities numerically 0 or 1 occurred in the logistic regression
models.
comorb.asso <- comorbidity_asso(pheno=susceptibility,
covariates=covar,
cormorbidity=comorb,
population="white",
cov.name=c("sex","age","bmi","SES","smoke","inAgedCare"),
phe.name="pos.neg",
ICD10.file=covid_example("ICD10.coding19.txt.gz"))
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning in regularize.values(x, y, ties, missing(ties), na.rm = na.rm):
#> collapsing to unique 'x' values
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
#> Waiting for profiling to be done...
head(comorb.asso, 4)
#> ICD10 Estimate OR 2.5%
#> A00-A09 A00-A09 Intestinal infectious diseases 0.4722864 1.603657 0.756784
#> A15-A19 A15-A19 Tuberculosis NA NA NA
#> A20-A28 A20-A28 Certain zoonotic bacterial diseases NA NA NA
#> A30-A49 A30-A49 Other bacterial diseases 1.2246077 3.402831 1.633209
#> 97.5% p
#> A00-A09 3.240022 0.199664372
#> A15-A19 NA NA
#> A20-A28 NA NA
#> A30-A49 6.978689 0.000873076