The R package lakhesis provides a heuristic-critical platform for
seriating binary data matrices through the exploration, selection, and
consensus of partially seriated sequences.
In brief, seriation (sequencing, ordination) involves putting a set
of things in an optimal order. In archaeology, seriation can be used to
establish a chronological order of contexts and find-types on the basis
of their similarity, i.e., on the premise that things come into and go
out of fashion with a peak moment of popularity. In ecology, the distribution of a
species may occur according to a preferred environmental condition that
diminishes as that environment changes. There are a number of R
functions and packages (especially seriation and vegan)
that provide means to seriate or ordinate matrices, especially for
frequency or count data. While binary (presence/absence) data are often
viewed as a reductive case of frequency data, they can also present
their own challenges for seriation. Moreover, not all “incidence
matrices” (matrices of 0/1s that record the joint incidence or
occurrence for each row-column pairing) will necessarily be well seriated.
The selection of row and column elements in the input is accordingly an
intrinsic part of the task of seriation. In this respect, lakhesis
seeks to complement existing methods in R by focusing on binary data.
It uses correspondence analysis, a mainstay technique for seriation,
the results of which are then fit to a reference curve that represents
“ideally” seriated data. Multiple
seriations can be run on partial subsets of the initial incidence
matrix, which are then recompiled into a single consensus seriation.
Critical measures are also provided.
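As a rough illustration of the correspondence analysis step only, the following base R sketch scrambles a small banded incidence matrix and reorders it by its first CA axis. It is not the lakhesis implementation: the reference-curve fit and the consensus step are not reproduced, and the matrix `ideal` and the helper `ca_axis1()` are hypothetical names introduced here.

```r
# Hypothetical incidence matrix with a clean banded ("ideally seriated") structure.
ideal <- matrix(
  c(1, 1, 0, 0, 0,
    1, 1, 1, 0, 0,
    0, 1, 1, 1, 0,
    0, 0, 1, 1, 1,
    0, 0, 0, 1, 1),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("r", 1:5), paste0("c", 1:5))
)

# Scramble rows and columns to mimic unordered input data.
set.seed(1)
scrambled <- ideal[sample(nrow(ideal)), sample(ncol(ideal))]

# Simple correspondence analysis: SVD of the standardized residuals.
# Returns the first-axis standard coordinates for rows and columns.
ca_axis1 <- function(m) {
  p  <- m / sum(m)          # correspondence matrix
  r  <- rowSums(p)          # row masses
  cc <- colSums(p)          # column masses
  s  <- diag(1 / sqrt(r)) %*% (p - r %o% cc) %*% diag(1 / sqrt(cc))
  dec <- svd(s)
  list(rows = dec$u[, 1] / sqrt(r), cols = dec$v[, 1] / sqrt(cc))
}

# Reorder rows and columns by their scores on the first axis.
scores    <- ca_axis1(scrambled)
reordered <- scrambled[order(scores$rows), order(scores$cols)]
print(reordered)
```

For cleanly banded data such as this, the first axis would be expected to pull the 1s back toward the diagonal (possibly in reversed order), whereas real datasets typically need the selection and critical steps described in this document.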
While command line functions can be run in R, the functionality of
lakhesis is primarily achieved via the Lakhesis Calculator, a graphical
platform in shiny that enables investigators to explore datasets for
potential seriated sequences, select them, and then harmonize them into
a single consensus seriation. The four panels in the calculator include
the following:
The sidebar contains the following commands:
ca.procrustes.curve() performs this task.
lakhesize() performs this task.
element.eval() performs this task.
.rds file, which is a list containing the following objects:
results: The results of lakhesize(), itself a list which contains the
consensus seriation, the row and column PCA, and coefficients of
agreement and concentration.
strands: The strands selected to produce results.
im.seriated: The seriated incidence matrix (this matrix only includes
row and column elements selected in the strands, not all rows and
columns of the initial dataset).
To obtain the current development version of lakhesis from GitHub, run
the following in the R command line:
library(devtools)
install_github("scollinselliott/lakhesis")
To start the Lakhesis Calculator, execute the function LC():
library(lakhesis)
LC()
Note that when uploading a csv file for analysis inside the Lakhesis
Calculator, the file should consist of just two columns without
headers. If data are already in incidence matrix format, the im.long()
function in lakhesis can be used to convert an incidence matrix into
the necessary long format, which can then be exported with the
write.table() function (see the documentation on im.long()).
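To make the shape of the two-column format concrete, here is a minimal base R sketch that writes a small incidence matrix as a headerless long-format csv file and reads it back. It deliberately does not call im.long(), whose documentation remains the reference for the package's own conversion. The object names, the comma separator, and the choice of putting row elements in the first column are assumptions made for illustration.

```r
# Hypothetical 3 x 4 incidence matrix: rows are contexts, columns are find-types.
im_demo <- matrix(
  c(1, 1, 0, 0,
    0, 1, 1, 0,
    0, 0, 1, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("ctx1", "ctx2", "ctx3"),
                  c("typeA", "typeB", "typeC", "typeD"))
)

# One output row per 1 in the matrix: (row element, column element).
# Column order is an assumption; check the im.long() documentation.
idx <- which(im_demo == 1, arr.ind = TRUE)
long_demo <- data.frame(
  row_element = rownames(im_demo)[idx[, "row"]],
  col_element = colnames(im_demo)[idx[, "col"]],
  stringsAsFactors = FALSE
)

# Export as two columns with no header and no row names.
csv_path <- tempfile(fileext = ".csv")
write.table(long_demo, file = csv_path, sep = ",",
            row.names = FALSE, col.names = FALSE, quote = FALSE)

# Read the file back and rebuild the incidence matrix as a sanity check.
long_back <- read.csv(csv_path, header = FALSE, stringsAsFactors = FALSE)
im_back <- unclass(table(long_back[[1]], long_back[[2]]))
print(im_back)
```

The round trip at the end is only a check that no pairing was lost; the file at `csv_path` is the part that corresponds to what the calculator expects to receive.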
Hahsler M, Hornik K, Buchta C (2008). “Getting Things in Order: An Introduction to the R Package seriation.” Journal of Statistical Software, 25, 1-34. doi:10.18637/jss.v025.i03.
Ihm P (2005). “A Contribution to the History of Seriation in Archaeology.” In Weihs C, Gaul W (eds.), Classification - The Ubiquitous Challenge, 307-16. Springer, Berlin.
Nenadic O, Greenacre MJ (2007). “Correspondence Analysis in R, with Two- and Three-dimensional Graphics: The ca Package.” Journal of Statistical Software, 20, 1-13. doi:10.18637/jss.v020.i03.
ter Braak CJF, Looman CWN (1986). “Weighted Averaging, Logistic Regression and the Gaussian Response Model.” Vegetatio, 65, 3-11. doi:10.1007/BF00032121.