The fourth Intergovernmental Panel on Climate Change (IPCC) report uses verbal phrases such as “likely” and “unlikely” to describe uncertainties in climate science (e.g., “The Greenland ice sheet and other Arctic ice fields likely contributed no more than 4 m of the observed sea level rise.”). The report also provides guidelines that let readers interpret these phrases as numerical intervals (e.g., “likely” refers to probabilities between .66 and 1).
Budescu, Broomell, and Por (2009) conducted an experimental study of lay interpretations of these phrases, using 13 sentences from the IPCC report. They asked participants to provide lower, “best”, and upper numerical estimates of the probability to which they believed each sentence referred. Participants’ “best” estimates were nearer to the middle of the [0, 1] interval than the IPCC guidelines prescribe. In a reanalysis of these data using beta regression, Smithson et al. (2012) reported that this tendency was stronger for negatively worded phrases (e.g., “unlikely”) than for positively worded phrases. They also found greater dispersion of responses (i.e., less consensus) for negative than for positive phrases.
The IPCC data-set comprises the lower, best, and upper estimates for the phrases “likely” and “unlikely” in six IPCC report sentences. There are 18 observations for each of 223 participants, consisting of lower, best, and upper estimates for 6 sentences. The “likely” sentence data are in the rows where max(Q4, Q5, Q6) = 1, and the “unlikely” sentence data are in the rows where max(Q8, Q9, Q10) = -1. A variable named valence takes a value of 1 for “likely” and 0 for “unlikely”. Lower, best, and upper estimates are identified by the variables “mid” and “high”, such that both are 0 for the lower estimates, mid = 1 and high = 0 for the best estimates, and mid = 1 and high = 1 for the upper estimates.
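The mid/high coding can be collapsed into a single label. The following is a minimal sketch in base R that uses a small made-up data frame rather than the packaged data; the object name coding and the column name estimate are introduced here for illustration only.

```r
# Toy rows covering the three (mid, high) combinations described above
coding <- data.frame(mid = c(0, 1, 1), high = c(0, 0, 1))

# mid + high counts how many indicators are switched on:
# 0 -> lower estimate, 1 -> best estimate, 2 -> upper estimate
coding$estimate <- factor(coding$mid + coding$high,
                          levels = 0:2,
                          labels = c("lower", "best", "upper"))
print(coding)
```

Selecting the rows with mid == 1 and high == 0 therefore keeps only the best estimates.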
The raw estimates themselves are the variable named prob, and probm is a transformation that shifts prob away from the boundary values of 0 and 1. Thus, probm is the appropriate dependent variable for a cdfquantreg model.
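The text does not state which transformation produces probm. The six best-estimate rows printed later in this example are, however, consistent with the common boundary compression (y(n - 1) + 0.5)/n with n = 223, the number of participants. The sketch below checks this. Both the formula and the value of n are inferences from those rows, not something stated in this text, and squeeze() is a hypothetical helper name.

```r
# Hypothetical helper: compress values in [0, 1] into the open interval (0, 1)
squeeze <- function(y, n) (y * (n - 1) + 0.5) / n

# prob and probm values copied from the six rows shown for subject 1
prob  <- c(0.56, 0.51, 0.52, 0.35, 0.42, 0.90)
probm <- c(0.5597309, 0.5099552, 0.5199103, 0.3506726, 0.4203587, 0.8982063)

# Assumption: n = 223 (inferred from the rows, not documented here)
probm_check <- squeeze(prob, n = 223)
print(round(probm_check, 7))

# Largest discrepancy from the printed probm values (rounding error only)
print(max(abs(probm_check - probm)))

# Exact boundary values are pulled strictly inside (0, 1)
print(squeeze(c(0, 1), n = 223))
```

If the inference holds, a raw estimate of exactly 0 or 1 never reaches the model, which matters because cdfquantreg distributions are defined on the open unit interval.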
The remaining three variables (treat, narrow, and wide) represent the experimental conditions in the Budescu et al. study. The “treat” variable codes two conditions: treat = 0 if participants were given a table with the IPCC guidelines in it, and treat = 1 if the IPCC guideline was included in the sentence itself. Budescu et al. (2009) reported that embedding the guideline in the sentence caused respondents’ estimates to be less regressive and closer to the IPCC guidelines.
library(cdfquantreg)
data(cdfqrExampleData)
# Keep only the "best" estimates (mid = 1, high = 0)
ipcc_mid <- subset(IPCC, mid == 1 & high == 0)
# Overview of the data
knitr::kable(head(ipcc_mid), row.names = FALSE)
subj | treat | prob | probm | mid | high | Question | valence |
---|---|---|---|---|---|---|---|
1 | 1 | 0.56 | 0.5597309 | 1 | 0 | Q4 | 1 |
1 | 1 | 0.51 | 0.5099552 | 1 | 0 | Q5 | 1 |
1 | 1 | 0.52 | 0.5199103 | 1 | 0 | Q6 | 1 |
1 | 1 | 0.35 | 0.3506726 | 1 | 0 | Q8 | 0 |
1 | 1 | 0.42 | 0.4203587 | 1 | 0 | Q9 | 0 |
1 | 1 | 0.90 | 0.8982063 | 1 | 0 | Q10 | 0 |
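To make the comparison with the IPCC guidelines concrete, the sketch below checks the six best estimates shown above against the guideline intervals. The .66 lower bound for “likely” is stated in the introduction; the .33 upper bound for “unlikely” comes from the IPCC AR4 guidance and is an addition here. The column name in_guideline is hypothetical, and one participant's rows only illustrate the check; they are not evidence for the group-level finding.

```r
# Best estimates for subject 1, copied from the table above
rows <- data.frame(
  Question = c("Q4", "Q5", "Q6", "Q8", "Q9", "Q10"),
  valence  = c(1, 1, 1, 0, 0, 0),
  prob     = c(0.56, 0.51, 0.52, 0.35, 0.42, 0.90)
)

# Guideline intervals: "likely" = [.66, 1]; "unlikely" = [0, .33] (assumed from AR4)
rows$in_guideline <- ifelse(rows$valence == 1,
                            rows$prob >= 0.66,
                            rows$prob <= 0.33)
print(rows)
```

For this participant, none of the six best estimates falls inside its guideline interval.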
# Distribution of the data
MASS::truehist(ipcc_mid$probm)
# Choice of CDF distribution: finite tailed
cdfqrFamily(shape='FT')
## Overview cdfquantreg distributions:
Distributions | fd | sd | shape |
---|---|---|---|
ArcSinh-ArcSinh | arcsinh | arcsinh | Finite-tailed |
ArcSinh-Cauchy | arcsinh | cauchy | Finite-tailed |
Cauchit-ArcSinh | cauchit | arcsinh | Finite-tailed |
Cauchit-Cauchy | cauchit | cauchy | Finite-tailed |
T2-T2 | T2 | T2 | Finite-tailed |
# We use the T2-T2 distribution
fd <- "t2"
sd <- "t2"
# Fit the null model
fit_null <- cdfquantreg(probm ~ 1 | 1, fd, sd, data = ipcc_mid)
# Fit the target model
fit <- cdfquantreg(probm ~ valence | valence, fd, sd, data = ipcc_mid)
# Obtain the statistics for the target model
summary(fit)
## Family: t2 t2
## Call: cdfquantreg(formula = probm ~ valence | valence, data = ipcc_mid,
## fd = fd, sd = sd)
##
## Mu coefficients (Location submodel)
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.79843    0.03436  23.240  < 2e-16 ***
## valence     -0.18599    0.04120  -4.514 6.37e-06 ***
##
## Sigma coefficients (Dispersion submodel)
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept) -0.36790    0.04500  -8.176 2.22e-16 ***
## valence     -0.42062    0.06228  -6.754 1.44e-11 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Converge: successful completion
## Log-Likelihood: 435.2941
##
## Gradient: -0.0376 -0.0387 0.0129 -0.0011
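The z values and p-values in the summary are Wald statistics: each estimate divided by its standard error, referred to a standard normal distribution. The short sketch below recomputes them from the rounded numbers printed above, so small differences in the last digits are expected. The data frame name wald is introduced here for illustration.

```r
# Estimates and standard errors copied from the summary output above
wald <- data.frame(
  term     = c("mu: (Intercept)", "mu: valence",
               "sigma: (Intercept)", "sigma: valence"),
  estimate = c(0.79843, -0.18599, -0.36790, -0.42062),
  se       = c(0.03436, 0.04120, 0.04500, 0.06228)
)

# Wald z statistic and two-sided p-value under a standard normal reference
wald$z <- wald$estimate / wald$se
wald$p <- 2 * pnorm(-abs(wald$z))
print(wald)
```

Because valence = 1 codes “likely”, the negative valence coefficient in the dispersion submodel points the same way as the finding reported by Smithson et al. (2012): responses to the negative phrase are more dispersed.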
# Compare the empirical distribution and the fitted values distribution
plot(fit)
# Plot the fitted values
plot(fitted(fit, "full"))
# Check Residuals
plot(residuals(fit, "raw"))
Budescu, D. V., Broomell, S., & Por, H. H. (2009). Improving communication of uncertainty in the reports of the Intergovernmental Panel on Climate Change. Psychological Science, 20(3), 299-308.
Smithson, M., Budescu, D. V., Broomell, S. B., & Por, H. H. (2012). Never say “not”: Impact of negative wording in probability phrases on imprecise probability judgments. International Journal of Approximate Reasoning, 53(8), 1262-1270.