Aggregate scores refer to rating best-worst scaling scores across the entire sample. Sometimes, that is what researchers want to know: What are the most-preferred options? The bwsTools
package provides two functions for doing so: ae_mnl
and elo
. First, we load needed packages.
library(bwsTools)
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)
First, I aggregate up my individual-level data, vdata
, to the format needed for ae_mnl
. See more on the tidying vignette for this.
dat <- vdata %>%
group_by(issue) %>%
summarise(
totl = n(),
best = sum(value == 1),
wrst = sum(value == -1)
)
The data look like:
dat
#> # A tibble: 13 x 4
#> issue totl best wrst
#> <chr> <int> <int> <int>
#> 1 abortion 1400 302 352
#> 2 biasmedia 1400 124 808
#> 3 corruption 1400 318 351
#> 4 crime 1400 286 282
#> 5 drugs 1400 163 463
#> 6 economy 1400 634 119
#> 7 education 1400 467 252
#> 8 foreignaffairs 1400 121 545
#> 9 guns 1400 354 290
#> 10 healthcare 1400 731 125
#> 11 natsecurity 1400 364 221
#> 12 race 1400 300 392
#> 13 taxes 1400 386 350
Such that abortion was picked as the most important issue facing our country (“best”) 302 times and the least important issue (“worst”) 352 times. This goes into the ae_mnl
function, which stands for analytical estimation of the multinomial logistic model. See the documentation for references on how this is calculated.
This result shows the regression coefficient for the multinomial model, the standard error, lower and upper bounds, and the choice probability for each item.
res1
#> b se lb ub choice_p
#> 1 -0.071458964 0.03782058 -0.14558729 0.002669364 0.06323940
#> 2 -1.068364236 0.04331852 -1.15326854 -0.983459929 0.02333658
#> 3 -0.047151591 0.03780695 -0.12125322 0.026950034 0.06479542
#> 4 0.005714301 0.03779660 -0.06836704 0.079795640 0.06831305
#> 5 -0.435318071 0.03869530 -0.51116086 -0.359475284 0.04395070
#> 6 0.771885257 0.04064648 0.69221815 0.851552367 0.14697636
#> 7 0.309592182 0.03825019 0.23462181 0.384562551 0.09257126
#> 8 -0.625324584 0.03965899 -0.70305621 -0.547592957 0.03634519
#> 9 0.091492340 0.03783600 0.01733378 0.165650906 0.07443147
#> 10 0.926814507 0.04192792 0.84463579 1.008993227 0.17160598
#> 11 0.205000644 0.03799517 0.13053011 0.279471181 0.08337822
#> 12 -0.131618249 0.03787832 -0.20585976 -0.057376738 0.05954714
#> 13 0.051439911 0.03780895 -0.02266563 0.125545452 0.07150922
I like to bind these columns back to my original data so that I can get the item labels. Then, I usually plot them in ggplot
the following way:
dat %>%
bind_cols(res1) %>%
arrange(b) %>%
mutate(issue = factor(issue, issue)) %>%
ggplot(aes(x = issue, y = b)) +
geom_point() +
geom_errorbar(aes(ymin = lb, ymax = ub), width = 0) +
coord_flip()
From here, it is pretty easy to see that healthcare and economy are what people said they cared about most. There’s a bunch in the middle, and then drugs, foreign affairs, and bias in the media all fall in the least important range.
The ae_mnl
function assumes a balanced incomplete block design. But sometimes that is not possible. What if you wanted to compare, say, 150 different items? A balanced incomplete block design would be too much for any one participant to realistically be able to do. One way is to calculate Elo scores (again, see the documentation for references).
This time, the function looks like an individual-level scoring function, even though it does aggregate ratings. You give the elo
function the participant identification, the block number, the item, and the value (1 for best, 0 for not chosen, -1 for worst).
The result looks similar to the above. All Elo scores start at 1000, so we are looking for how above or below an item is relative to 1000.
res2
#> # A tibble: 13 x 2
#> item elo
#> <chr> <dbl>
#> 1 abortion 984.
#> 2 biasmedia 775.
#> 3 corruption 987.
#> 4 crime 1006.
#> 5 drugs 908.
#> 6 economy 1167.
#> 7 education 1065.
#> 8 foreignaffairs 858.
#> 9 guns 1023.
#> 10 healthcare 1194.
#> 11 natsecurity 1044.
#> 12 race 975.
#> 13 taxes 1014.
Since these data both came from a balanced incomplete block design, we would expect the two methods to agree. And they do: