kstMatrix

Cord Hockemeyer

2024-10-03

Knowledge space theory applies prerequisite relationships between items of knowledge within a given domain for efficient adaptive assessment and training (Doignon & Falmagne, 1999). The kstMatrix package implements some basic functions for working with knowledge spaces. Furthermore, it provides several empirically obtained knowledge spaces in the form of their bases.

There is a certain overlap in functionality between the kst and kstMatrix packages; however, the former uses a set representation and the latter a matrix representation. The packages are to be seen as complementary, not as replacements for each other.

Different representations for knowledge spaces

Knowledge spaces can easily grow very large. Therefore, their bases are often used to store the knowledge spaces with reduced space requirements. kstMatrix offers two functions for computing bases from spaces and vice versa.

kmbasis()

The kmbasis function computes the basis for a given knowledge space (actually, it can be any family of sets represented by a binary matrix).

kmbasis(xpl$space)
#>      a b c d
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
#> [3,] 1 0 1 0
#> [4,] 0 1 1 0
#> [5,] 1 1 0 1
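
To make the idea of a basis concrete, here is a minimal base-R sketch. It is not the kstMatrix implementation; the name basis_sketch and the hard-coded copy of xpl$space are assumptions made for illustration. A state is kept if it is non-empty and is not the union of the states strictly contained in it.

```r
# Hard-coded copy of xpl$space so that the sketch runs without the package
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical helper: returns the rows of `fam` that form the basis
basis_sketch <- function(fam) {
  keep <- vapply(seq_len(nrow(fam)), function(i) {
    k <- fam[i, ]
    if (sum(k) == 0) return(FALSE)              # the empty set is never in the basis
    # states that are proper subsets of k
    is_sub <- apply(fam, 1, function(s) all(s <= k) && any(s != k))
    sub_union <- if (any(is_sub)) {
      apply(fam[is_sub, , drop = FALSE], 2, max)  # union of all proper sub-states
    } else {
      0 * k
    }
    any(sub_union != k)                         # keep k if that union does not reach k
  }, logical(1))
  fam[keep, , drop = FALSE]
}

b <- basis_sketch(space)
b
```

For the example space, this sketch selects the same five states as the kmbasis output shown above.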

kmunionclosure()

The kmunionclosure function computes the knowledge space for a basis (mathematically speaking, it computes the closure under union of the given family of sets).

kmunionclosure(xpl$basis)
#>       a b c d
#>  [1,] 0 0 0 0
#>  [2,] 1 0 0 0
#>  [3,] 0 1 0 0
#>  [4,] 1 1 0 0
#>  [5,] 1 0 1 0
#>  [6,] 1 1 1 0
#>  [7,] 0 1 1 0
#>  [8,] 1 1 0 1
#>  [9,] 1 1 1 1
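
The closure under union can be sketched in a few lines of base R. This is an illustrative sketch only, not the algorithm used by kmunionclosure; the function name union_closure_sketch and the hard-coded copy of xpl$basis are assumptions. The row order of the result may differ from the package output.

```r
# Hard-coded copy of xpl$basis so that the sketch runs without the package
basis <- matrix(c(1,0,0,0, 0,1,0,0, 1,0,1,0, 0,1,1,0, 1,1,0,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical helper: incremental closure under union
union_closure_sketch <- function(basis) {
  # start with the empty set only
  closed <- matrix(0, nrow = 1, ncol = ncol(basis),
                   dimnames = list(NULL, colnames(basis)))
  for (i in seq_len(nrow(basis))) {
    # union of basis element i with every set found so far
    elem <- matrix(basis[i, ], nrow(closed), ncol(closed), byrow = TRUE)
    closed <- unique(rbind(closed, pmax(closed, elem)))
  }
  closed
}

closure <- union_closure_sketch(basis)
nrow(closure)
```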

kmsurmiserelation()

The kmsurmiserelation function determines the surmise relation for a quasi-ordinal knowledge space. For a more general family of sets, it computes the surmise relation for the smallest quasi-ordinal knowledge space including that family.

kmsurmiserelation(xpl$space)
#>   a b c d
#> a 1 0 0 1
#> b 0 1 0 1
#> c 0 0 1 0
#> d 0 0 0 1

The surmise relation can also be used to easily close a knowledge space under intersection:

kmunionclosure(t(kmsurmiserelation(xpl$space)))
#>       a b c d
#>  [1,] 0 0 0 0
#>  [2,] 1 0 0 0
#>  [3,] 0 1 0 0
#>  [4,] 1 1 0 0
#>  [5,] 0 0 1 0
#>  [6,] 1 0 1 0
#>  [7,] 0 1 1 0
#>  [8,] 1 1 1 0
#>  [9,] 1 1 0 1
#> [10,] 1 1 1 1
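
The surmise relation itself can be read off the states directly: item i is a prerequisite of item j if every state containing j also contains i. The following base-R sketch illustrates this reading; surmise_relation_sketch is a hypothetical name, the space is a hard-coded copy of xpl$space, and the code is not taken from the package.

```r
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical helper: rel[i, j] == 1 means "i is contained in every state containing j"
surmise_relation_sketch <- function(fam) {
  items <- colnames(fam)
  n <- ncol(fam)
  rel <- matrix(0, n, n, dimnames = list(items, items))
  for (j in seq_len(n)) {
    with_j <- fam[fam[, j] == 1, , drop = FALSE]   # states containing item j
    # items present in all of these states are prerequisites of j
    rel[, j] <- as.numeric(colSums(with_j) == nrow(with_j))
  }
  rel
}

rel <- surmise_relation_sketch(space)
rel
```

Each column of the result is the intersection of all states containing the corresponding item, which is why the transposed matrix can serve as a basis in the call above.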

kmsurmisefunction()

The kmsurmisefunction function computes the surmise function for a knowledge space or basis. For a more general family of sets, it computes the surmise function for the smallest knowledge space including that family.

kmsurmisefunction(xpl$space)
#> 000 001 002 003 005 007 006 00b 00f 
#> 000 001 002 000 004 000 004 008 000 
#>      a b c d
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
#> [3,] 0 0 1 0
#> [4,] 0 0 1 0
#> [5,] 0 0 0 1
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    0    0    0
#> [1] 1
#>      [,1] [,2] [,3] [,4]
#> [1,]    0    1    0    0
#> [1] 1
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    0    1    0
#> [2,]    0    1    1    0
#> [1] 2
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    1    0    1
#> [1] 1
#>   Item a b c d
#> 1    a 1 0 0 0
#> 2    b 0 1 0 0
#> 3    c 1 0 1 0
#> 4    c 0 1 1 0
#> 5    d 1 1 0 1
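
Conceptually, the surmise function assigns to each item the minimal states containing it. A base-R sketch of that definition follows; it is an illustration under the assumption that the input is a knowledge space, surmise_function_sketch is a hypothetical name, and the list-of-matrices return format is chosen for the sketch and differs from the data frame returned by the package.

```r
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical helper: for each item, the minimal states containing it
surmise_function_sketch <- function(fam) {
  clauses <- lapply(seq_len(ncol(fam)), function(j) {
    with_j <- unique(fam[fam[, j] == 1, , drop = FALSE])
    # a state is minimal if no other state containing j is a proper subset of it
    minimal <- vapply(seq_len(nrow(with_j)), function(i) {
      k <- with_j[i, ]
      !any(apply(with_j, 1, function(s) all(s <= k) && any(s != k)))
    }, logical(1))
    with_j[minimal, , drop = FALSE]
  })
  names(clauses) <- colnames(fam)
  clauses
}

sf_sketch <- surmise_function_sketch(space)
sf_sketch$c   # item c has two minimal states: {a, c} and {b, c}
```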

kmsf2basis()

The kmsf2basis function determines the basis of the knowledge space corresponding to a given surmise function.

sf <- kmsurmisefunction(xpl$space)
#> 000 001 002 003 005 007 006 00b 00f 
#> 000 001 002 000 004 000 004 008 000 
#>      a b c d
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
#> [3,] 0 0 1 0
#> [4,] 0 0 1 0
#> [5,] 0 0 0 1
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    0    0    0
#> [1] 1
#>      [,1] [,2] [,3] [,4]
#> [1,]    0    1    0    0
#> [1] 1
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    0    1    0
#> [2,]    0    1    1    0
#> [1] 2
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    1    0    1
#> [1] 1
kmsf2basis(sf)
#>      a b c d
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
#> [3,] 1 0 1 0
#> [4,] 0 1 1 0
#> [5,] 1 1 0 1

Properties of knowledge structures

kmiswellgraded()

The kmiswellgraded function determines whether a knowledge structure is well-graded.

kmiswellgraded(xpl$space)
#> [1] TRUE

kmnotions()

The kmnotions function returns a matrix specifying the notions of a knowledge structure, i.e. the classes of equivalent items.

x <- matrix(c(0,0,0, 1,0,0, 1,1,1), nrow = 3, byrow = TRUE)
kmnotions(x)
#>      [,1] [,2] [,3]
#> [1,]    1    0    0
#> [2,]    0    1    1
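
Two items are equivalent when they occur in exactly the same states, i.e. when their columns in the matrix are identical. A base-R sketch of this grouping is shown below; notions_sketch is a hypothetical name and the code is not the package implementation.

```r
x <- matrix(c(0,0,0, 1,0,0, 1,1,1), nrow = 3, byrow = TRUE)

# Hypothetical helper: one row per class of equivalent items
notions_sketch <- function(fam) {
  # label each item by its column pattern; equal labels mean equivalent items
  pattern <- apply(fam, 2, paste, collapse = "")
  classes <- unique(pattern)
  # indicator row for every class
  unname(t(vapply(classes, function(p) as.numeric(pattern == p),
                  numeric(ncol(fam)))))
}

notions <- notions_sketch(x)
notions
```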

kmeqreduction()

The kmeqreduction function returns a matrix with only one item per equivalence class.

x <- matrix(c(0,0,0, 1,0,0, 1,1,1), nrow = 3, byrow = TRUE)
kmeqreduction(x)
#>      [,1] [,2]
#> [1,]    0    0
#> [2,]    1    0
#> [3,]    1    1
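
For this small example, the same reduction can be sketched in base R by keeping only the first column of each group of identical columns. This is an illustration of the idea, not the code behind kmeqreduction.

```r
x <- matrix(c(0,0,0, 1,0,0, 1,1,1), nrow = 3, byrow = TRUE)

# duplicated() compares rows of a matrix, so transpose to compare items (columns)
reduced <- x[, !duplicated(t(x)), drop = FALSE]
reduced
```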

Creating trivial knowledge spaces

For a given number of items, there are two trivial knowledge spaces. The maximal knowledge space represents the absence of any prerequisite relationships: the knowledge space is the power set of the item set, and the basis matrix is the identity matrix. The minimal knowledge space represents equivalence of all items: the knowledge space contains just the empty set and the full item set, and the basis matrix consists of a single row of 1s.

kmminimalspace()

Example:

kmminimalspace(5)
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    0    0    0    0    0
#> [2,]    1    1    1    1    1

kmmaximalspace()

Example:

kmmaximalspace(4)
#>       [,1] [,2] [,3] [,4]
#>  [1,]    0    0    0    0
#>  [2,]    1    0    0    0
#>  [3,]    0    1    0    0
#>  [4,]    1    1    0    0
#>  [5,]    0    0    1    0
#>  [6,]    1    0    1    0
#>  [7,]    0    1    1    0
#>  [8,]    1    1    1    0
#>  [9,]    0    0    0    1
#> [10,]    1    0    0    1
#> [11,]    0    1    0    1
#> [12,]    1    1    0    1
#> [13,]    0    0    1    1
#> [14,]    1    0    1    1
#> [15,]    0    1    1    1
#> [16,]    1    1    1    1
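
Both trivial spaces are easy to construct by hand, which may help when checking results. The base-R sketch below uses hypothetical function names and is not the package code.

```r
# Hypothetical helper: only the empty set and the full item set
minimal_space_sketch <- function(n) rbind(rep(0, n), rep(1, n))

# Hypothetical helper: the power set of n items as a binary matrix
maximal_space_sketch <- function(n) {
  # expand.grid() varies the first column fastest
  unname(as.matrix(expand.grid(rep(list(0:1), n))))
}

max4 <- maximal_space_sketch(4)
dim(max4)
min5 <- minimal_space_sketch(5)
min5
```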

Validating knowledge spaces

kmdist()

The kmdist function computes a frequency distribution for the distances between a data set and a knowledge space.

kmdist(xpl$data, xpl$space)
#> 0 1 2 3 4 
#> 5 2 0 0 0
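
The distance between a response pattern and a space is taken here to be the smallest number of items in which the pattern differs from any state. The base-R sketch below reproduces the distribution above under that assumption; the function names and the hard-coded copies of xpl$space and xpl$data are illustrative and not part of the package.

```r
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))
data <- matrix(c(0,0,1,0, 1,0,0,0, 0,0,0,1, 1,1,0,0,
                 1,1,1,0, 1,1,1,1, 1,1,0,0),
               ncol = 4, byrow = TRUE)

# Hypothetical helper: distance of one pattern to its closest state
min_distance <- function(pattern, fam) {
  min(rowSums(abs(sweep(fam, 2, pattern))))
}

# Hypothetical helper: frequency of each distance from 0 to the number of items
dist_distribution <- function(data, fam) {
  d <- apply(data, 1, min_distance, fam = fam)
  table(factor(d, levels = 0:ncol(fam)))
}

dd <- dist_distribution(data, space)
dd
```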

kmvalidate()

The kmvalidate function returns the distance distribution, the discrepancy index DI, and the distance agreement coefficient DA. The discrepancy index is the mean distance between the data and the space; the distance agreement coefficient is the ratio of this mean distance (ddat = DI) to the mean distance between the space and the power set (dpot).

kmvalidate(xpl$data, xpl$space)
#> $dist
#> 0 1 2 3 4 
#> 5 2 0 0 0 
#> 
#> $DI
#> [1] 0.2857143
#> 
#> $DA
#> [1] 0.5714286
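
The two indices can be recomputed by hand from their definitions. The following base-R sketch does so for the example data; it assumes that each distance is the minimal symmetric-difference cardinality to any state, and all names in it are hypothetical rather than taken from the package.

```r
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE)
data <- matrix(c(0,0,1,0, 1,0,0,0, 0,0,0,1, 1,1,0,0,
                 1,1,1,0, 1,1,1,1, 1,1,0,0),
               ncol = 4, byrow = TRUE)

# Distance of each row of `patterns` to its closest state in `fam`
min_distances <- function(patterns, fam) {
  apply(patterns, 1, function(p) min(rowSums(abs(sweep(fam, 2, p)))))
}

# ddat: mean distance between the data and the space (this is DI)
DI <- mean(min_distances(data, space))

# dpot: mean distance between the power set and the space
powerset <- as.matrix(expand.grid(rep(list(0:1), ncol(space))))
dpot <- mean(min_distances(powerset, space))

DA <- DI / dpot
c(DI = DI, dpot = dpot, DA = DA)
```

For the example, seven of the sixteen subsets are not states; six of them lie at distance 1 and one at distance 2, so dpot is 8/16 = 0.5 and DA is (2/7) / 0.5.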

Simulating response patterns

kmsimulate()

The kmsimulate function generates response patterns by applying the BLIM (Basic Local Independence Model; see Doignon & Falmagne, 1999) to a given knowledge structure. The beta and eta parameters of the BLIM can each be either a vector specifying a separate value for each item or a single number, in which case the same value is used for all items.

kmsimulate(xpl$space, 10, 0.2, 0.1)
#>       a b c d
#>  [1,] 1 1 1 0
#>  [2,] 1 1 0 0
#>  [3,] 0 1 1 0
#>  [4,] 1 1 0 0
#>  [5,] 0 0 0 0
#>  [6,] 0 1 1 0
#>  [7,] 0 1 1 0
#>  [8,] 1 1 0 1
#>  [9,] 1 1 0 1
#> [10,] 1 1 0 0
kmsimulate(xpl$space, 10, c(0.2, 0.25, 0.15, 0.2), c(0.1, 0.15, 0.05, 0.1))
#>       a b c d
#>  [1,] 1 1 0 0
#>  [2,] 1 1 0 1
#>  [3,] 0 0 0 0
#>  [4,] 1 1 1 1
#>  [5,] 1 1 0 1
#>  [6,] 1 1 0 1
#>  [7,] 0 1 1 0
#>  [8,] 1 1 1 0
#>  [9,] 0 1 0 0
#> [10,] 1 0 1 0
kmsimulate(xpl$space, 10, c(0.2, 0.25, 0.15, 0.2), 0)
#>       a b c d
#>  [1,] 1 1 0 0
#>  [2,] 0 1 1 0
#>  [3,] 1 0 1 0
#>  [4,] 1 1 1 1
#>  [5,] 1 0 0 1
#>  [6,] 0 0 0 0
#>  [7,] 0 0 1 0
#>  [8,] 1 0 0 0
#>  [9,] 1 0 0 0
#> [10,] 0 1 0 0
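
The mechanics of such a simulation can be sketched in base R. The sketch below makes two assumptions that the text above does not state: true states are drawn uniformly at random from the structure, and beta is treated as the careless error probability and eta as the lucky guess probability, as is usual for the BLIM. The name blim_sketch is hypothetical, and the code is not the package implementation.

```r
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical helper: simulate n response patterns
blim_sketch <- function(fam, n, beta, eta) {
  k <- ncol(fam)
  beta <- rep_len(beta, k)   # a single number is recycled to all items
  eta  <- rep_len(eta, k)
  # assumption: every state is equally likely to be the true state
  states <- fam[sample(nrow(fam), n, replace = TRUE), , drop = FALSE]
  # P(correct response) is 1 - beta for mastered items and eta otherwise
  p_correct <- states * matrix(1 - beta, n, k, byrow = TRUE) +
    (1 - states) * matrix(eta, n, k, byrow = TRUE)
  matrix(rbinom(n * k, 1, p_correct), n, k,
         dimnames = list(NULL, colnames(fam)))
}

set.seed(42)   # for reproducibility of the sketch only
sim <- blim_sketch(space, 10, c(0.2, 0.25, 0.15, 0.2), 0.1)
dim(sim)
```

With beta and eta both set to 0, every simulated pattern is itself a state of the structure, which gives a quick sanity check.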

Neighbourhood & Fringe

kmneighbourhood()

The kmneighbourhood function determines the neighbourhood of a state in a knowledge structure, i.e. the family of all states whose symmetric set difference from that state has cardinality 1.

kmneighbourhood(c(1,1,0,0), xpl$space)
#>      a b c d
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
#> [3,] 1 1 1 0
#> [4,] 1 1 0 1
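
In matrix terms, the neighbours of a state are the rows that differ from it in exactly one column. A base-R sketch (with the hypothetical name neighbourhood_sketch and a hard-coded copy of xpl$space, not the package code):

```r
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical helper: all states at distance 1 from `state`
neighbourhood_sketch <- function(state, fam) {
  d <- rowSums(abs(sweep(fam, 2, state)))   # distance of every state to `state`
  fam[d == 1, , drop = FALSE]
}

nb <- neighbourhood_sketch(c(1, 1, 0, 0), space)
nb
```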

kmfringe()

The kmfringe function determines the fringe of a knowledge state, i.e. the set of those items by which the state differs from its neighbouring states.

kmfringe(c(1,0,0,0), xpl$space)
#> a b c d 
#> 1 1 1 0
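
An equivalent way to read the fringe: an item belongs to it if adding the item to the state, or removing it from the state, yields another state. The base-R sketch below uses this reading; fringe_sketch is a hypothetical name and the code is not the package implementation.

```r
space <- matrix(c(0,0,0,0, 1,0,0,0, 0,1,0,0, 1,1,0,0, 1,0,1,0,
                  1,1,1,0, 0,1,1,0, 1,1,0,1, 1,1,1,1),
                ncol = 4, byrow = TRUE,
                dimnames = list(NULL, c("a", "b", "c", "d")))

# Hypothetical helper: indicator vector of the fringe items
fringe_sketch <- function(state, fam) {
  keys <- apply(fam, 1, paste, collapse = "")   # each state as a string key
  in_fringe <- vapply(seq_along(state), function(q) {
    toggled <- state
    toggled[q] <- 1 - toggled[q]                # add or remove item q
    paste(toggled, collapse = "") %in% keys     # is the result a state?
  }, logical(1))
  setNames(as.numeric(in_fringe), colnames(fam))
}

fr <- fringe_sketch(c(1, 0, 0, 0), space)
fr
```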

Utilities

kmsymmsetdiff()

The kmsymmsetdiff function returns the symmetric set difference between two sets represented as binary vectors.

kmsymmsetdiff(c(1,0,0), c(1,1,0))
#> [1] 0 1 0

kmsetdistance()

The kmsetdistance function returns the cardinality of the symmetric set difference between two sets represented as binary vectors.

kmsetdistance(c(1,0,0), c(1,1,0))
#> [1] 1
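
For binary vectors, both utilities reduce to an element-wise comparison. A base-R sketch with hypothetical function names (not the package code):

```r
# Items that belong to exactly one of the two sets
symm_set_diff_sketch <- function(a, b) as.numeric(a != b)

# Number of items in which the two sets differ
set_distance_sketch <- function(a, b) sum(a != b)

sd_vec <- symm_set_diff_sketch(c(1, 0, 0), c(1, 1, 0))
sd_len <- set_distance_sketch(c(1, 0, 0), c(1, 1, 0))
sd_vec
sd_len
```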

Plotting with kmhasse() and kmcolors()

The kmhasse function draws a Hasse diagram of a knowledge structure; the kmcolors function returns a color vector to be used with kmhasse().

kmhasse(xpl$space, horizontal = FALSE)

probability_vec <- (0:8)/8
colorvec <- kmcolors(probability_vec, cm.colors)
kmhasse(xpl$space, horizontal = TRUE, colors = colorvec)

Plotting with kmbasisdiagram()

The kmbasisdiagram function draws a Hasse diagram of a basis, similarly to the kmhasse function.

kmbasisdiagram(xpl$basis, horizontal=FALSE)

Datasets provided by kstMatrix

The provided datasets were obtained by the research group around Cornelia Dowling by querying experts in the respective fields.

cad

Six experts were queried about prerequisite relationships between 28 AutoCAD knowledge items (Dowling, 1991; 1993a). A seventh basis represents those prerequisite relationships on which the majority (4 out of 6) of the experts agree (Dowling & Hockemeyer, 1998).

summary(cad)
#>        Length Class  Mode   
#> cad1   1764   -none- numeric
#> cad2   2772   -none- numeric
#> cad3   4424   -none- numeric
#> cad4   1932   -none- numeric
#> cad5   2380   -none- numeric
#> cad6    952   -none- numeric
#> cadmaj 7168   -none- numeric

readwrite

Three experts were queried about prerequisite relationships between 48 items on reading and writing abilities (Dowling, 1991; 1993a). A fourth basis represents those prerequisite relationships on which the majority of the experts agree (Dowling & Hockemeyer, 1998).

summary(readwrite)
#>       Length Class  Mode   
#> rw1   6672   -none- numeric
#> rw2   7680   -none- numeric
#> rw3   4896   -none- numeric
#> rwmaj 1440   -none- numeric

fractions

Three experts were queried about prerequisite relationships between 77 items on fractions (Baumunk & Dowling, 1997). A fourth basis represents those prerequisite relationships on which the majority of the experts agree (Dowling & Hockemeyer, 1998).

summary(fractions)
#>         Length Class  Mode   
#> frac1   39039  -none- numeric
#> frac2   24409  -none- numeric
#> frac3   16016  -none- numeric
#> fracmaj  4235  -none- numeric

xpl

This is a small fictitious four-item example used for the examples in the documentation.

summary(xpl)
#>       Length Class  Mode   
#> basis 20     -none- numeric
#> space 36     -none- numeric
#> data  28     -none- numeric
xpl$basis
#>      a b c d
#> [1,] 1 0 0 0
#> [2,] 0 1 0 0
#> [3,] 1 0 1 0
#> [4,] 0 1 1 0
#> [5,] 1 1 0 1
xpl$space
#>       a b c d
#>  [1,] 0 0 0 0
#>  [2,] 1 0 0 0
#>  [3,] 0 1 0 0
#>  [4,] 1 1 0 0
#>  [5,] 1 0 1 0
#>  [6,] 1 1 1 0
#>  [7,] 0 1 1 0
#>  [8,] 1 1 0 1
#>  [9,] 1 1 1 1
xpl$data
#>      a b c d
#> [1,] 0 0 1 0
#> [2,] 1 0 0 0
#> [3,] 0 0 0 1
#> [4,] 1 1 0 0
#> [5,] 1 1 1 0
#> [6,] 1 1 1 1
#> [7,] 1 1 0 0

References