Forced-choice (FC) tests are attracting growing interest from researchers because, when well designed, they are resistant to faking. In a well-designed FC test, the items within a block typically measure different latent traits and are similar in social desirability, either in magnitude or in having high inter-item agreement (IIA). Some scoring models may additionally require that differences in factor loadings or item locations within a block be maximized or minimized.
Either way, deciding which items should be assigned to the same block (item pairing) is a crucial issue in building a well-designed FC test, and it is currently carried out manually. However, because multiple objectives often need to be met simultaneously, manual pairing becomes impractical or even infeasible, especially when the number of latent traits and/or the number of items per trait is relatively large.
The R package autoFC was developed to address these difficulties by providing a tool for automatic FC test construction. It lets users:
* Customize one or more item pairing criteria and calculate a composite pairing index, termed “energy”, with user-specified weights for each criterion.
* Automatically optimize the energy of the whole test by sequentially or simultaneously optimizing each matching rule, through the exchange of items among blocks or replacement with unused items.
* Construct parallel forms of the same test following the same pairing rules.
Users can create an FC test of any block size (e.g., pairs, triplets, quadruplets).
Below is a brief explanation of the functions provided by autoFC. Details and usage can be found in the next section.
cal_block_energy() and cal_block_energy_with_iia() both calculate the total energy for a single item block, or for a full FC test with multiple blocks, given a data frame of item characteristics. The latter function also incorporates IIA metrics into the energy calculation.
cal_block_energy_with_iia() incorporates four IIA metrics, so that items are paired by maximizing the IIA within each block. Each IIA metric has a default weight of 1.
make_random_block() takes the number of items and the block size as input arguments and produces a test with blocks of randomly paired item numbers. Information about item characteristics is not required.
get_iia() takes item responses and a single item block (or a full FC test with multiple blocks), then returns IIA metrics for each item block.
sa_pairing_generalized() is the automatic pairing function. It takes item characteristics (and, when IIAs are used, individual responses to all items) and an initial FC test, then optimizes the energy of the test with the Simulated Annealing (SA) algorithm.
SA is a probabilistic technique for approximating the global optimum of a given function. Each iteration cools down the “Temperature”, and the process stops once the temperature reaches a certain value. Within each iteration, a new solution (FC test) is produced and compared with the current solution in terms of energy (calculated by calling cal_block_energy() or cal_block_energy_with_iia()). The new solution is then accepted or rejected based on this comparison.
If all items in the item characteristic data frame are used to construct the FC test, sa_pairing_generalized() produces new solutions by randomly exchanging items between two blocks; otherwise, it randomly chooses between exchanging items and replacing items with unused ones, based on the proportion of items used to construct the test.
sa_pairing_generalized() has built-in default values for most of its arguments; for example, if the block argument is not provided, an FC test with block size 2 using all of the given items is constructed by default. See the tutorial below for the meanings and default values of the other arguments.
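The accept-or-reject step and the item-exchange move can be illustrated with a minimal base R sketch. It assumes the textbook Metropolis rule for a maximization problem (always accept a better solution; accept a worse one with probability exp((E_new - E_old) / Temperature)) and a simple geometric cooling schedule. The helpers toy_energy(), propose_swap() and accept_move() are hypothetical names introduced for illustration; they are not part of autoFC, whose internal rule may differ.

```r
# Toy energy (assumption): negative sum of within-block variances of one
# numeric characteristic, so that higher energy means better-matched blocks.
toy_energy <- function(block, x) {
  -sum(apply(block, 1, function(items) var(x[items])))
}

# Proposal: exchange one randomly chosen item between two random blocks.
propose_swap <- function(block) {
  rows <- sample(nrow(block), 2)
  cols <- sample(ncol(block), 2, replace = TRUE)
  tmp <- block[rows[1], cols[1]]
  block[rows[1], cols[1]] <- block[rows[2], cols[2]]
  block[rows[2], cols[2]] <- tmp
  block
}

# Textbook Metropolis acceptance for maximization (assumed, not confirmed
# to be the exact rule used inside sa_pairing_generalized()).
accept_move <- function(e_new, e_old, temperature) {
  e_new >= e_old || runif(1) < exp((e_new - e_old) / temperature)
}

set.seed(1)
x <- runif(12)                                   # toy item characteristic
current <- matrix(sample(12), ncol = 3, byrow = TRUE)
e_start <- toy_energy(current, x)

temperature <- 1
while (temperature > 1e-4) {
  candidate <- propose_swap(current)
  if (accept_move(toy_energy(candidate, x), toy_energy(current, x),
                  temperature)) {
    current <- candidate
  }
  temperature <- temperature * 0.99              # geometric cooling (assumed)
}
e_end <- toy_energy(current, x)
c(start = e_start, end = e_end)
```

At high temperature almost every swap is accepted, which lets the search escape poor local solutions; as the temperature falls, the rule becomes close to greedy and only improvements survive.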
In this tutorial, we suppose that 60 five-point Likert items measuring the Big Five traits, each with a certain item location, are used to build an FC scale with block size 3. We also simulate social desirability ratings of the 60 items from 1,000 participants.
set.seed(2021)
# Simulate 1,000 respondents on 60 items. A more realistic simulation would
# generate responses from specific IRT parameters.
s1 <- sample(1:5, 500*60, replace = TRUE,
             prob = c(0.10, 0.15, 0.20, 0.25, 0.30))
s2 <- sample(1:5, 500*60, replace = TRUE,
             prob = c(0.50, 0.10, 0.10, 0.15, 0.15))
# Columns 1-30 are filled from s1 and columns 31-60 from s2
item_responses <- matrix(c(s1, s2), ncol = 60)
item_dims <- sample(c("Openness","Conscientiousness","Neuroticism",
                      "Extraversion","Agreeableness"), 60, replace = TRUE)
item_mean <- colMeans(item_responses)
item_difficulty <- runif(60, -1, 1)
# Then we build a data frame with item characteristics
item_chars <- data.frame(DIM = item_dims, SD_Mean = item_mean, DIFF = item_difficulty)
# Weights for DIM, SD_Mean and DIFF, in column order
char_weights <- c(1, -1, -3)
Next, we build a random FC scale with block size 3 from the 60 items. Printing initial_FC shows that all 60 items are now divided into 20 triplets.
initial_FC <- make_random_block(total_items = 60, item_per_block = 3)
knitr::kable(initial_FC)
59 | 14 | 43 |
42 | 16 | 34 |
41 | 60 | 22 |
26 | 19 | 2 |
28 | 10 | 45 |
12 | 52 | 8 |
35 | 1 | 51 |
23 | 49 | 33 |
40 | 7 | 57 |
55 | 54 | 18 |
38 | 5 | 4 |
31 | 11 | 39 |
13 | 3 | 6 |
32 | 58 | 48 |
15 | 29 | 47 |
27 | 20 | 56 |
24 | 9 | 30 |
44 | 21 | 46 |
53 | 17 | 25 |
50 | 36 | 37 |
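Conceptually, random blocking amounts to shuffling the item numbers and filling them into a matrix with one block per row. The base R sketch below illustrates this idea; random_block_sketch() is a hypothetical helper, not the package function, and it assumes that the number of items is a multiple of the block size.

```r
# Hypothetical stand-in for random blocking: shuffle the item numbers and
# lay them out row by row, one block per row.
random_block_sketch <- function(total_items, item_per_block) {
  stopifnot(total_items %% item_per_block == 0)  # assumption of this sketch
  matrix(sample(seq_len(total_items)), ncol = item_per_block, byrow = TRUE)
}

set.seed(42)
toy_FC <- random_block_sketch(total_items = 12, item_per_block = 3)
dim(toy_FC)          # 4 blocks of 3 items
```

Because every item number appears exactly once, any such matrix is a valid starting point for the optimization, regardless of item characteristics.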
Let’s also see what the item characteristics look like for each of the 20 triplets.
First, the underlying latent traits. In some blocks, two or more items measure the same trait, which is something we want to avoid.
knitr::kable(matrix(item_chars$DIM[t(initial_FC)], ncol = 3, byrow = TRUE))
Conscientiousness | Conscientiousness | Neuroticism |
Extraversion | Openness | Agreeableness |
Openness | Conscientiousness | Openness |
Conscientiousness | Extraversion | Openness |
Neuroticism | Extraversion | Extraversion |
Openness | Openness | Conscientiousness |
Neuroticism | Openness | Agreeableness |
Openness | Extraversion | Extraversion |
Extraversion | Conscientiousness | Extraversion |
Openness | Agreeableness | Extraversion |
Neuroticism | Neuroticism | Neuroticism |
Extraversion | Conscientiousness | Openness |
Neuroticism | Conscientiousness | Neuroticism |
Extraversion | Extraversion | Neuroticism |
Openness | Openness | Neuroticism |
Neuroticism | Conscientiousness | Conscientiousness |
Neuroticism | Openness | Neuroticism |
Agreeableness | Neuroticism | Openness |
Conscientiousness | Conscientiousness | Agreeableness |
Neuroticism | Extraversion | Openness |
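Rather than scanning the table by eye, blocks with repeated traits can be flagged programmatically. The following is a small base R sketch on toy data; has_repeated_trait() is a hypothetical helper introduced here for illustration.

```r
# Flag blocks in which at least two items share the same latent trait.
has_repeated_trait <- function(block, traits) {
  apply(block, 1, function(items) any(duplicated(traits[items])))
}

toy_traits <- c("Openness", "Openness", "Neuroticism",
                "Extraversion", "Agreeableness", "Conscientiousness")
toy_block <- matrix(1:6, ncol = 3, byrow = TRUE)   # two triplets

flagged <- has_repeated_trait(toy_block, toy_traits)
which(flagged)   # indices of blocks that need attention
```

With the tutorial objects, sum(has_repeated_trait(initial_FC, item_chars$DIM)) would count such blocks; in the table above, 13 of the 20 blocks contain a repeated trait.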
Then, the social desirability scores. In many blocks, items differ in social desirability by more than 1 point on the 5-point scale, which is undesirable.
sd_initial <- matrix(item_chars$SD_Mean[t(initial_FC)], ncol = 3, byrow = TRUE)
knitr::kable(sd_initial)
2.327 | 3.549 | 2.257 |
2.269 | 3.417 | 2.410 |
2.331 | 2.386 | 3.549 |
3.575 | 3.481 | 3.492 |
3.628 | 3.469 | 2.430 |
3.470 | 2.326 | 3.418 |
2.306 | 3.518 | 2.299 |
3.455 | 2.367 | 2.382 |
2.372 | 3.429 | 2.291 |
2.383 | 2.275 | 3.501 |
2.346 | 3.499 | 3.493 |
2.371 | 3.476 | 2.381 |
3.429 | 3.540 | 3.478 |
2.375 | 2.383 | 2.283 |
3.551 | 3.484 | 2.367 |
3.458 | 3.521 | 2.296 |
3.515 | 3.474 | 3.524 |
2.275 | 3.501 | 2.408 |
2.354 | 3.526 | 3.449 |
2.314 | 2.294 | 2.344 |
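The "more than 1 point" observation can be checked with the within-block range. The sketch below uses the first four rows of the table above as input; the threshold of 1 simply mirrors the wording of the text and is not a rule imposed by the package.

```r
# First four blocks of the social desirability table shown above.
sd_rows <- matrix(c(2.327, 3.549, 2.257,
                    2.269, 3.417, 2.410,
                    2.331, 2.386, 3.549,
                    3.575, 3.481, 3.492),
                  ncol = 3, byrow = TRUE)

# Within-block range: largest minus smallest score in each block.
block_range <- apply(sd_rows, 1, function(x) diff(range(x)))
block_range > 1
```

Applied to the full sd_initial matrix, the same check flags 15 of the 20 blocks shown above.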
Lastly, item difficulty. Item difficulties within a block are also inconsistent.
diff_initial <- matrix(item_chars$DIFF[t(initial_FC)], ncol = 3, byrow = TRUE)
knitr::kable(diff_initial)
0.5768038 | -0.1649317 | -0.1609612 |
0.7562551 | 0.3665809 | -0.8802425 |
-0.3477869 | 0.4033506 | 0.9447096 |
0.8602120 | -0.5168748 | -0.1734361 |
0.3712718 | -0.6267511 | 0.0917744 |
-0.9688906 | 0.9139843 | 0.3986206 |
0.2507761 | -0.0490111 | -0.5887666 |
-0.1364442 | 0.7372855 | 0.2383125 |
0.6598719 | 0.5910397 | 0.3847467 |
0.1906729 | -0.1072381 | -0.2764722 |
0.8652686 | 0.3189794 | -0.7635046 |
-0.8770999 | 0.5814682 | 0.7113918 |
0.9814553 | -0.0455446 | 0.6105311 |
-0.7874764 | -0.1775934 | -0.2743858 |
0.5481408 | -0.1896666 | -0.7339903 |
-0.7614264 | 0.6233055 | 0.2106074 |
-0.4487723 | 0.9388348 | -0.7846000 |
0.8782261 | -0.9898177 | 0.0105288 |
0.4304538 | 0.3800486 | -0.4484543 |
0.9194196 | 0.8647261 | -0.0951617 |
Next we calculate the energy of initial_FC, with FUN left at its default. weights is set to -1 for social desirability and -3 for item difficulty because we want the discrepancy in these characteristics to be as low as possible within a block.
The weight for item difficulty is larger in magnitude to account for its smaller range compared with social desirability. Be aware of scaling differences among item characteristics and choose the weights accordingly.
cal_block_energy(block = initial_FC, item_chars = item_chars, weights = char_weights)
#> [1] -25.28441
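To make the weighting concrete, here is a minimal base R sketch of a composite energy, assuming that the energy is the sum over blocks of the weighted criterion values, with within-block variance for numeric characteristics and an indicator of "all traits distinct" for the trait column. block_energy_sketch() and all_distinct() are hypothetical stand-ins, not the package implementation.

```r
# Hypothetical stand-in for the factor criterion:
# 1 if all traits in the block are distinct, 0 otherwise.
all_distinct <- function(x) as.numeric(!any(duplicated(x)))

# Assumed form of the energy: for each block, apply one function per
# characteristic (column), weight the result, and add everything up.
block_energy_sketch <- function(block, item_chars, weights, funs) {
  total <- 0
  for (b in seq_len(nrow(block))) {
    items <- block[b, ]
    for (j in seq_along(funs)) {
      total <- total + weights[j] * funs[[j]](item_chars[[j]][items])
    }
  }
  total
}

toy_chars <- data.frame(
  DIM     = c("O", "C", "E", "O", "O", "N"),
  SD_Mean = c(3.5, 3.4, 3.6, 2.3, 3.5, 2.4),
  DIFF    = c(0.1, 0.2, 0.0, -0.5, 0.9, 0.3),
  stringsAsFactors = FALSE
)
toy_block   <- matrix(1:6, ncol = 3, byrow = TRUE)
toy_weights <- c(1, -1, -3)
toy_funs    <- list(all_distinct, var, var)

toy_energy_value <- block_energy_sketch(toy_block, toy_chars,
                                        toy_weights, toy_funs)
toy_energy_value
```

This reading is consistent with the numbers in this tutorial: 7 of the 20 initial blocks contain three distinct traits, and the mean within-block variances (reported later in this tutorial) are 0.3317092 for social desirability and 0.4275037 for item difficulty, so 7 - 20 × 0.3317092 - 3 × 20 × 0.4275037 ≈ -25.2844.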
If IIAs are included, the energy is lower. This is because randomly generated responses are unlikely to be consistent with each other, so the IIAs are very low or even negative:
cal_block_energy_with_iia(block = initial_FC, item_chars = item_chars,
weights = char_weights,
rater_chars = item_responses)
#> [,1]
#> [1,] -33.31589
Notice that if we give zero weights to all IIAs, we get the same energy value as with cal_block_energy():
cal_block_energy_with_iia(block = initial_FC, item_chars = item_chars,
weights = char_weights,
rater_chars = item_responses,
iia_weights = c(0, 0, 0, 0))
#> [,1]
#> [1,] -25.28441
Also, if you want to see the inter-item agreement metrics for each block, you can use get_iia(). The values here are unimpressive because the responses are random; they are shown for demonstration purposes only. Users are encouraged to use real-world response data to examine the IIA within each block.
knitr::kable(get_iia(block = initial_FC, data = item_responses))
BPlin | BPquad | AClin | ACquad |
---|---|---|---|
-0.17500 | -0.38133 | -0.07767 | -0.17000 |
-0.11958 | -0.27900 | -0.03922 | -0.10768 |
-0.17125 | -0.35733 | -0.09333 | -0.18799 |
0.09583 | 0.14883 | 0.13905 | 0.22648 |
-0.10458 | -0.20183 | -0.08040 | -0.15034 |
-0.09542 | -0.18250 | -0.08069 | -0.15112 |
-0.15000 | -0.31867 | -0.06953 | -0.14620 |
-0.14583 | -0.30900 | -0.08054 | -0.16786 |
-0.13167 | -0.29967 | -0.04673 | -0.11820 |
-0.14208 | -0.33417 | -0.05416 | -0.14343 |
-0.11500 | -0.21550 | -0.09354 | -0.16960 |
-0.18083 | -0.36500 | -0.09805 | -0.18615 |
0.07375 | 0.10000 | 0.11266 | 0.17256 |
-0.03125 | -0.19833 | 0.14874 | 0.15781 |
-0.12875 | -0.26000 | -0.09955 | -0.19646 |
-0.12708 | -0.22517 | -0.11082 | -0.19031 |
0.11208 | 0.16633 | 0.15302 | 0.23982 |
-0.15458 | -0.32067 | -0.07820 | -0.15677 |
-0.09208 | -0.18267 | -0.07283 | -0.14170 |
-0.05833 | -0.26133 | 0.14720 | 0.14891 |
To produce an optimally paired FC scale, our objectives are:
* Keeping items in the same block from different latent traits;
* Minimizing the variance of social desirability within each block;
* Minimizing the variance of item difficulty within each block.
For IIAs, we also want to maximize the mean of the four IIAs within each block.
Below is an example run that produces an automatically paired FC scale. Arguments that may be of interest to users include:
* block: The initial paired FC scale, which can be produced as in Step 2. If left empty, an FC scale with block size 2 and items presented sequentially is produced, with the total number of items equal to the number of rows in item_chars.
* total_items: Defaults to the number of unique values in block. It can be larger than this value, which represents cases where only some items in the item pool are used to build the FC scale.
* Temperature: The initial temperature of the automatic pairing method. A higher temperature is associated with a higher probability of accepting a worse solution. It is recommended to leave this value blank and let it be scaled on the energy of block by specifying eta_Temperature.
* r: Determines the rate at which Temperature decreases. Should be a value between 0 and 1. Larger r values allow more iterations in the optimization process but slow down the program.
* end_criteria: Determines the end criterion for the automatic pairing process. A proportion scaled on Temperature. Should be a value between 0 and 1. Smaller values allow more iterations in the optimization process but slow down the program.
* item_chars: A data frame with item characteristics for all items. It is recommended that information irrelevant to pairing be discarded beforehand, but users can also set the corresponding position in weights to 0 to bypass irrelevant item characteristics (such as item ID).
* FUN: A vector of function names for optimizing each item characteristic within each block, for example FUN = c('mean', 'var', 'sum'). Customized functions are also supported. Defaults to var for numeric variables and facfun for factor/character variables.
* n_exchange: Determines how many blocks are exchanged to produce a new solution in each iteration. Should be a value less than nrow(block).
* weights: A vector of integer values indicating the relative weight of each item characteristic after it is calculated by FUN. Defaults to a vector of all 1s.
* prob_newitem: Probability of choosing the strategy of picking a new item, when not all candidate items are used to build the FC scale.
If you wish to use IIAs as a pairing criterion, the following arguments may be useful. Note that rater_chars and iia_weights are ignored when use_IIA is FALSE.
* use_IIA: Logical. Indicates whether IIA metrics are used as matching criteria.
* rater_chars: Item responses to all items from a certain number of participants.
* iia_weights: A vector of length 4 giving the weights for the four IIA metrics, which include the linearly and quadratically weighted AC (Gwet, 2008; 2014) and Brennan-Prediger index (Brennan & Prediger, 1981; Gwet, 2014). Defaults to a vector of all 1s.
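To get a feel for how r and end_criteria affect run time, the sketch below counts cooling steps under one plausible reading of these arguments: the temperature is multiplied by r at every iteration, and the process stops once it falls below end_criteria times the initial temperature. This geometric schedule is an assumption made for illustration; the package's exact schedule may differ.

```r
# Count cooling steps by simulation (assumed geometric schedule).
cooling_steps_loop <- function(r, end_criteria, initial_temperature = 1) {
  temperature <- initial_temperature
  n <- 0
  while (temperature > end_criteria * initial_temperature) {
    temperature <- temperature * r
    n <- n + 1
  }
  n
}

# Closed form for the same schedule: smallest n with r^n <= end_criteria.
cooling_steps_closed <- function(r, end_criteria) {
  ceiling(log(end_criteria) / log(r))
}

n_loop   <- cooling_steps_loop(r = 0.995, end_criteria = 10^(-6))
n_closed <- cooling_steps_closed(r = 0.995, end_criteria = 10^(-6))
c(loop = n_loop, closed = n_closed)

# Moving r closer to 1 increases the number of steps sharply.
cooling_steps_closed(r = 0.999, end_criteria = 10^(-6))
```

Under this assumption, the number of steps does not depend on the initial temperature, only on r and end_criteria, which is why these two arguments are the main levers for trading run time against solution quality.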
# Note that this will take some time to run! (~ 1-2 minutes with this setting)
# Weights for social desirability score and item difficulty are negative
# (-1 and -3), because we want the variance of these characteristics to be small.
result <- sa_pairing_generalized(block = initial_FC, eta_Temperature = 0.01,
                                 r = 0.995, end_criteria = 10^(-6),
                                 weights = char_weights,
                                 item_chars = item_chars, use_IIA = TRUE,
                                 rater_chars = item_responses)
Finally, let’s see how this pairing method improves on the initial solution!
Let’s first compare the total energy with the previous value. First is the initial energy, which is identical to what we calculated in Step 4.
# Initial energy with IIA
cal_block_energy_with_iia(block = result$block_initial, item_chars = item_chars,
weights = char_weights, rater_chars = item_responses)
#> [,1]
#> [1,] -33.31589
# Alternative way to calculate initial energy
print(result$energy_initial)
#> [1] -33.31589
And the final result:
# Final energy with IIA
cal_block_energy_with_iia(block = result$block_final, item_chars = item_chars,
weights = char_weights, rater_chars = item_responses)
#> [,1]
#> [1,] 21.69018
# Alternative way to calculate final energy
print(result$energy_final)
#> [1] 21.69018
Let’s take a look at how items are matched within each block. First, the underlying latent traits.
This time, the three items within each block all come from three distinct latent traits!
(Note: Pairing does not guarantee that items will ALWAYS come from different latent traits. To make such a result more likely, you can increase the weight for the item dimension.)
knitr::kable(matrix(item_chars$DIM[t(result$block_final)], ncol = 3, byrow = TRUE))
Neuroticism | Openness | Extraversion |
Extraversion | Neuroticism | Agreeableness |
Openness | Conscientiousness | Neuroticism |
Neuroticism | Extraversion | Conscientiousness |
Openness | Conscientiousness | Neuroticism |
Extraversion | Conscientiousness | Openness |
Openness | Extraversion | Neuroticism |
Conscientiousness | Openness | Neuroticism |
Openness | Agreeableness | Neuroticism |
Agreeableness | Extraversion | Neuroticism |
Neuroticism | Openness | Extraversion |
Conscientiousness | Neuroticism | Openness |
Extraversion | Openness | Neuroticism |
Conscientiousness | Extraversion | Openness |
Agreeableness | Extraversion | Conscientiousness |
Neuroticism | Agreeableness | Extraversion |
Conscientiousness | Extraversion | Openness |
Neuroticism | Openness | Conscientiousness |
Extraversion | Openness | Conscientiousness |
Neuroticism | Openness | Conscientiousness |
Next, let’s look at the differences in social desirability within each block. Social desirability scores are now much closer within each block, although two blocks (the 7th and the 17th) still show big discrepancies.
sd_final <- matrix(item_chars$SD_Mean[t(result$block_final)], ncol = 3, byrow = TRUE)
knitr::kable(sd_final)
3.493 | 3.518 | 3.481 |
3.469 | 3.501 | 3.449 |
3.549 | 3.429 | 3.478 |
2.306 | 2.382 | 2.354 |
3.417 | 3.418 | 3.499 |
2.372 | 2.327 | 2.383 |
3.470 | 2.375 | 3.458 |
3.575 | 3.474 | 3.429 |
2.344 | 2.299 | 2.283 |
2.275 | 2.269 | 2.314 |
2.346 | 2.326 | 2.294 |
3.540 | 3.524 | 3.492 |
2.383 | 2.331 | 2.257 |
2.386 | 2.291 | 2.408 |
2.275 | 2.430 | 2.296 |
2.367 | 2.410 | 2.371 |
3.476 | 2.367 | 2.381 |
3.515 | 3.455 | 3.526 |
3.501 | 3.484 | 3.549 |
3.628 | 3.551 | 3.521 |
A more intuitive summary: how much has the average within-block variance improved? It is much lower than before, although there is still room for improvement.
# Initial
print(mean(apply(sd_initial, 1, var)))
#> [1] 0.3317092
# Final
print(mean(apply(sd_final, 1, var)))
#> [1] 0.04195382
Finally, we look at item difficulty. Good improvement is observed here as well: differences in item difficulty within a block have also decreased.
diff_final <- matrix(item_chars$DIFF[t(result$block_final)], ncol = 3, byrow = TRUE)
knitr::kable(diff_final)
-0.7635046 | -0.0490111 | -0.5168748 |
-0.6267511 | -0.9898177 | -0.4484543 |
0.9447096 | 0.5910397 | 0.6105311 |
0.2507761 | 0.2383125 | 0.4304538 |
0.3665809 | 0.3986206 | 0.3189794 |
0.6598719 | 0.5768038 | 0.1906729 |
-0.9688906 | -0.7874764 | -0.7614264 |
0.8602120 | 0.9388348 | 0.9814553 |
-0.0951617 | -0.5887666 | -0.2743858 |
0.8782261 | 0.7562551 | 0.9194196 |
0.8652686 | 0.9139843 | 0.8647261 |
-0.0455446 | -0.7846000 | -0.1734361 |
-0.1775934 | -0.3477869 | -0.1609612 |
0.4033506 | 0.3847467 | 0.0105288 |
-0.1072381 | 0.0917744 | 0.2106074 |
-0.7339903 | -0.8802425 | -0.8770999 |
0.5814682 | 0.7372855 | 0.7113918 |
-0.4487723 | -0.1364442 | 0.3800486 |
-0.2764722 | -0.1896666 | -0.1649317 |
0.3712718 | 0.5481408 | 0.6233055 |
How much have we improved on the average variance for all blocks in this case?
print(mean(apply(diff_initial, 1, var)))
#> [1] 0.4275037
print(mean(apply(diff_final, 1, var)))
#> [1] 0.04305659
We also list the IIAs for demonstration purposes; they improve as well.
colMeans(get_iia(result$block_final, data = item_responses))
#> BPlin BPquad AClin ACquad
#> 0.0195425 -0.0498910 0.1263995 0.1595815
In some cases, users may want to optimize item characteristics sequentially rather than simultaneously. This makes sense because simultaneous optimization may favor improvement in one characteristic at the cost of the best fit for another, as we observed in Step 5.
There are two ways to address this problem:
1. Pay careful attention to the distribution of each item characteristic and try out different weights for characteristics on different scales. Alternatively, try a smaller end_criteria or larger r and n_exchange values to allow more iterations to be run;
2. Use a multi-step optimization process, in which some item characteristics are optimized first, then others. This involves running sa_pairing_generalized() several times, each time optimizing more item characteristics. Characteristics that have already been optimized keep their weights in later stages, while those not yet optimized have a weight of 0.
Using the previous example, we show how method 2 works, starting from initial_FC. First we optimize on latent traits:
FC_1 <- sa_pairing_generalized(initial_FC, eta_Temperature = 0.01,
                               r = 0.995, end_criteria = 10^(-6),
                               weights = c(1, 0, 0),
                               item_chars = item_chars, use_IIA = TRUE,
                               rater_chars = item_responses)
Then, we additionally minimize the variance in social desirability within a block.
FC_2 <- sa_pairing_generalized(FC_1$block_final, eta_Temperature = 0.01,
                               r = 0.995, end_criteria = 10^(-6),
                               weights = c(1, -1, 0),
                               item_chars = item_chars, use_IIA = TRUE,
                               rater_chars = item_responses)
Finally, we additionally minimize the variance in item difficulty.
FC_3 <- sa_pairing_generalized(FC_2$block_final, eta_Temperature = 0.01,
                               r = 0.995, end_criteria = 10^(-6),
                               weights = c(1, -1, -3),
                               item_chars = item_chars, use_IIA = TRUE,
                               rater_chars = item_responses)
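The three weight vectors used above follow a simple pattern: at stage k, the first k characteristics keep their weights and the rest are set to 0. A small base R helper can generate them for any number of characteristics; staged_weights() is a hypothetical convenience function, not part of autoFC.

```r
# Build one weight vector per stage: stage k keeps the first k weights
# and zeroes out the characteristics that are not yet being optimized.
staged_weights <- function(w) {
  lapply(seq_along(w), function(k) replace(w, seq_along(w) > k, 0))
}

stages <- staged_weights(c(1, -1, -3))
stages

# Sketch of how the stages could drive the sequential runs shown above
# (left commented out because it needs autoFC and the tutorial objects):
# FC <- list(block_final = initial_FC)
# for (w in stages) {
#   FC <- sa_pairing_generalized(FC$block_final, eta_Temperature = 0.01,
#                                r = 0.995, end_criteria = 10^(-6),
#                                weights = w, item_chars = item_chars,
#                                use_IIA = TRUE, rater_chars = item_responses)
# }
```

The order of the columns in item_chars determines the order in which characteristics enter the optimization, so the characteristic that matters most should come first.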
First, the underlying latent traits. The result looks almost as good as in Step 5: only one block (the 17th) contains two items from the same trait.
knitr::kable(matrix(item_chars$DIM[t(FC_3$block_final)], ncol = 3, byrow = TRUE))
Extraversion | Openness | Neuroticism |
Agreeableness | Extraversion | Neuroticism |
Conscientiousness | Neuroticism | Openness |
Extraversion | Openness | Conscientiousness |
Agreeableness | Openness | Extraversion |
Neuroticism | Openness | Conscientiousness |
Neuroticism | Agreeableness | Extraversion |
Neuroticism | Openness | Extraversion |
Openness | Neuroticism | Extraversion |
Openness | Neuroticism | Conscientiousness |
Neuroticism | Openness | Agreeableness |
Neuroticism | Openness | Conscientiousness |
Conscientiousness | Extraversion | Neuroticism |
Conscientiousness | Neuroticism | Openness |
Extraversion | Conscientiousness | Openness |
Extraversion | Agreeableness | Neuroticism |
Conscientiousness | Conscientiousness | Openness |
Conscientiousness | Openness | Extraversion |
Openness | Neuroticism | Extraversion |
Conscientiousness | Neuroticism | Extraversion |
Next, let’s look at the differences in social desirability within each block. This solution performs better than the one in Step 5: the large within-block discrepancies are gone!
sd_FC3 <- matrix(item_chars$SD_Mean[t(FC_3$block_final)], ncol = 3, byrow = TRUE)
knitr::kable(sd_FC3)
2.430 | 2.408 | 2.257 |
2.299 | 2.371 | 2.283 |
3.429 | 3.478 | 3.549 |
2.291 | 2.383 | 2.296 |
2.275 | 2.331 | 2.383 |
3.458 | 3.484 | 3.549 |
2.367 | 2.410 | 2.375 |
3.501 | 3.470 | 3.469 |
2.326 | 2.346 | 2.367 |
3.518 | 3.499 | 3.526 |
3.493 | 3.455 | 3.449 |
3.429 | 3.474 | 3.575 |
3.540 | 3.501 | 3.515 |
3.418 | 3.628 | 3.417 |
2.294 | 2.354 | 2.381 |
2.269 | 2.275 | 2.314 |
3.521 | 3.476 | 3.551 |
2.386 | 2.344 | 2.382 |
3.492 | 3.524 | 3.481 |
2.327 | 2.306 | 2.372 |
As before, let’s see how much the average within-block variance has improved. The decrease in variance confirms the improvement in social desirability.
# Initial solution
print(mean(apply(sd_initial, 1, var)))
#> [1] 0.3317092
# Simultaneous optimization
print(mean(apply(sd_final, 1, var)))
#> [1] 0.04195382
# Sequential optimization
print(mean(apply(sd_FC3, 1, var)))
#> [1] 0.002573133
Finally we look at item difficulty.
diff_fc3 <- matrix(item_chars$DIFF[t(FC_3$block_final)], ncol = 3, byrow = TRUE)
knitr::kable(diff_fc3)
Average variance of item difficulty also decreases!
# Initial solution
print(mean(apply(diff_initial, 1, var)))
#> [1] 0.4275037
# Simultaneous optimization
print(mean(apply(diff_final, 1, var)))
#> [1] 0.04305659
# Sequential optimization
print(mean(apply(diff_fc3, 1, var)))
#> [1] 0.04012038
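The variances reported above are easier to compare as percentage reductions relative to the initial solution. The sketch below simply re-uses the printed values; pct_reduction() is a hypothetical helper.

```r
# Percentage reduction of `after` relative to `before`.
pct_reduction <- function(before, after) 100 * (before - after) / before

# Mean within-block variances as printed in this tutorial.
sd_var   <- c(initial = 0.3317092, simultaneous = 0.04195382,
              sequential = 0.002573133)
diff_var <- c(initial = 0.4275037, simultaneous = 0.04305659,
              sequential = 0.04012038)

sd_red   <- pct_reduction(sd_var[["initial"]], sd_var[-1])
diff_red <- pct_reduction(diff_var[["initial"]], diff_var[-1])
round(rbind(social_desirability = sd_red, item_difficulty = diff_red), 1)
```

For social desirability, the reduction is about 87% with simultaneous optimization versus about 99% with sequential optimization, while for item difficulty both approaches reach roughly 90%.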
How about IIAs?
colMeans(get_iia(FC_3$block_final, data = item_responses))
#> BPlin BPquad AClin ACquad
#> 0.0335635 -0.0304330 0.1459820 0.1892760
In this tutorial, we have shown how autoFC can be used to automatically build forced-choice scales with better matches on various item characteristics.
We note that this automatic tool does not guarantee a scale with the best match on all of these characteristics simultaneously; instead, it provides near-optimal solutions in a search space where exhaustive enumeration of every solution is unrealistic.
We also note that, to produce scales with better overall fit, users are encouraged to:
1. Customize their own functions for optimizing each item characteristic;
2. Try out different weights, iia_weights, Temperature (eta_Temperature) or r values;
3. Use a sequential process to optimize item characteristics one by one.
Good luck!