Calculating Aggregate Scores

Aggregate scores refer to rating best-worst scaling scores across the entire sample. Sometimes, that is what researchers want to know: What are the most-preferred options? The bwsTools package provides two functions for doing so: ae_mnl and elo. First, we load needed packages.

library(bwsTools)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
library(ggplot2)

First, I aggregate up my individual-level data, vdata, to the format needed for ae_mnl. See more on the tidying vignette for this.

dat <- vdata %>% 
  group_by(issue) %>% 
  summarise(
    totl = n(),
    best = sum(value == 1),
    wrst = sum(value == -1)
  )

dat
#> # A tibble: 13 x 4
#>    issue           totl  best  wrst
#>    <chr>          <int> <int> <int>
#>  1 abortion        1400   302   352
#>  2 biasmedia       1400   124   808
#>  3 corruption      1400   318   351
#>  4 crime           1400   286   282
#>  5 drugs           1400   163   463
#>  6 economy         1400   634   119
#>  7 education       1400   467   252
#>  8 foreignaffairs  1400   121   545
#>  9 guns            1400   354   290
#> 10 healthcare      1400   731   125
#> 11 natsecurity     1400   364   221
#> 12 race            1400   300   392
#> 13 taxes           1400   386   350

Such that abortion was picked as the most important issue facing our country (“best”) 302 times and the least important issue (“worst”) 352 times. This goes into the ae_mnl function, which stands for analytical estimation of the multinomial logistic model. See the documentation for references on how this is calculated.

res1 <- ae_mnl(dat, "totl", "best", "wrst")

This result shows the regression coefficient for the multinomial model, the standard error, lower and upper bounds, and the choice probability for each item.

res1
#>               b         se          lb           ub   choice_p
#> 1  -0.071458964 0.03782058 -0.14558729  0.002669364 0.06323940
#> 2  -1.068364236 0.04331852 -1.15326854 -0.983459929 0.02333658
#> 3  -0.047151591 0.03780695 -0.12125322  0.026950034 0.06479542
#> 4   0.005714301 0.03779660 -0.06836704  0.079795640 0.06831305
#> 5  -0.435318071 0.03869530 -0.51116086 -0.359475284 0.04395070
#> 6   0.771885257 0.04064648  0.69221815  0.851552367 0.14697636
#> 7   0.309592182 0.03825019  0.23462181  0.384562551 0.09257126
#> 8  -0.625324584 0.03965899 -0.70305621 -0.547592957 0.03634519
#> 9   0.091492340 0.03783600  0.01733378  0.165650906 0.07443147
#> 10  0.926814507 0.04192792  0.84463579  1.008993227 0.17160598
#> 11  0.205000644 0.03799517  0.13053011  0.279471181 0.08337822
#> 12 -0.131618249 0.03787832 -0.20585976 -0.057376738 0.05954714
#> 13  0.051439911 0.03780895 -0.02266563  0.125545452 0.07150922

I like to bind these columns back to my original data so that I can get the item labels. Then, I usually plot them in ggplot the following way:

dat %>% 
  bind_cols(res1) %>% 
  arrange(b) %>% 
  mutate(issue = factor(issue, issue)) %>% 
  ggplot(aes(x = issue, y = b)) +
  geom_point() +
  geom_errorbar(aes(ymin = lb, ymax = ub), width = 0) +
  coord_flip()

From here, it is pretty easy to see that healthcare and economy are what people said they cared about most. There’s a bunch in the middle, and then drugs, foreign affairs, and bias in the media all fall in the least important range.

The ae_mnl function assumes a balanced incomplete block design. But sometimes that is not possible. What if you wanted to compare, say, 150 different items? A balanced incomplete block design would be too much for any one participant to realistically be able to do. One way is to calculate Elo scores (again, see the documentation for references).

This time, the function looks like an individual-level scoring function, even though it does aggregate ratings. You give the elo function the participant identification, the block number, the item, and the value (1 for best, 0 for not chosen, -1 for worst).

set.seed(1839)
res2 <- elo(vdata, "id", "block", "issue", "value")

The result looks similar to the above. All Elo scores start at 1000, so we are looking for how above or below an item is relative to 1000.

res2
#> # A tibble: 13 x 2
#>    item             elo
#>    <chr>          <dbl>
#>  1 abortion        984.
#>  2 biasmedia       775.
#>  3 corruption      987.
#>  4 crime          1006.
#>  5 drugs           908.
#>  6 economy        1167.
#>  7 education      1065.
#>  8 foreignaffairs  858.
#>  9 guns           1023.
#> 10 healthcare     1194.
#> 11 natsecurity    1044.
#> 12 race            975.
#> 13 taxes          1014.

Since these data both came from a balanced incomplete block design, we would expect the two methods to agree. And they do:

cor(res1$b, res2$elo)
#> [1] 0.9995125
plot(res1$b, res2$elo)