library(JAGStree)
#> Registered S3 method overwritten by 'mcmcplots':
#> method from
#> as.mcmc.rjags R2jags
In many real-world applications, individuals or subjects appear in multiple data sets, which creates an opportunity to synthesize evidence and use all available data for inference. The relational structure of these data sources often admits a graphical representation, in which nodes represent data sources and edges represent the relationships between them (e.g., directed edges for sub-groupings or to represent time). One such representation is a tree (i.e., a cycle-free graph with directed edges). Inference on this structure can be carried out with MCMC methods, which yield estimates of the posterior distributions of a Bayesian hierarchical model.
This package is geared towards one subset of cases in which this general framework holds (Flynn and Gustafson 2024a).
In particular, we assume that a number of related data sources exist with
a known relational structure described by a tree. We also assume that
nodes are associated with integer counts, which are multinomially
distributed given the parent node, and that branches are associated with
probabilities, which are Dirichlet distributed within a given sibling
group. Then, given a data frame representing the tree structure, the
makeJAGStree() function generates the corresponding MCMC modeling code
for use with ‘JAGS’ to fit a Bayesian hierarchical model.
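To make these distributional assumptions concrete, the following base R sketch simulates counts on a small two-level tree. It is an illustration only and does not use or reproduce the code generated by the package: the fixed root count, the flat Dirichlet parameters, and the helper names `rdirichlet_one` and `split_node` are all assumptions made for this example.

```r
# Illustrative simulation only: hypothetical values, not produced by JAGStree.
set.seed(1)

# Draw one Dirichlet vector by normalising independent gamma draws.
rdirichlet_one <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

# Split a parent count among its children: branch probabilities are
# Dirichlet distributed within the sibling group, and the child counts
# are multinomial given the parent count.
split_node <- function(parent_count, n_children,
                       alpha = rep(1, n_children)) {
  p <- rdirichlet_one(alpha)
  counts <- as.vector(rmultinom(1, size = parent_count, prob = p))
  list(prob = p, counts = counts)
}

root_count <- 1000                           # assumed (fixed) root size
level1 <- split_node(root_count, 2)          # root -> two children
level2 <- split_node(level1$counts[1], 2)    # first child -> two children

level1$counts
level2$counts
```

By construction, the child counts in each sibling group sum to the count of their parent. In the model itself the root count is unknown and is given a prior rather than being fixed as it is here.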
The data frame must contain at least two columns:
‘from’ (string, node label)
‘to’ (string, node label)
The ‘from’ and ‘to’ labels encode the tree structure, describing the directed edge between nodes.
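As a rough illustration of this encoding, the base R sketch below builds a small edge list and checks that it describes a tree. The helper `describe_tree` is hypothetical and not part of the package; it simply confirms that there is a single root, that every other node has exactly one parent, and that every node can be reached from the root.

```r
# Hypothetical edge list: root Z with children A and B; A has children C and D.
edges <- data.frame(from = c("Z", "Z", "A", "A"),
                    to   = c("A", "B", "C", "D"),
                    stringsAsFactors = FALSE)

# Hypothetical helper (not part of JAGStree): structural checks on the edges.
describe_tree <- function(edges) {
  nodes    <- union(edges$from, edges$to)
  root     <- setdiff(edges$from, edges$to)   # nodes that are never a child
  leaves   <- setdiff(edges$to, edges$from)   # nodes that are never a parent
  children <- split(edges$to, edges$from)     # child labels grouped by parent

  if (length(root) != 1) stop("expected exactly one root node")
  if (anyDuplicated(edges$to) > 0) stop("a node has more than one parent")

  # Walk down from the root and confirm that every node is reached.
  reached  <- root
  frontier <- root
  while (length(frontier) > 0) {
    nxt      <- unlist(children[frontier], use.names = FALSE)
    frontier <- setdiff(nxt, reached)
    reached  <- union(reached, frontier)
  }
  if (!setequal(reached, nodes)) stop("some nodes are not connected to the root")

  list(root = root, leaves = leaves, children = children)
}

tree_info <- describe_tree(edges)
tree_info$root
tree_info$leaves
```

Each element of `tree_info$children` is one sibling group, which is the unit over which the branch probabilities are modelled.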
The makeJAGStree() function has three parameters, two of which are
required and one of which is optional:
data: This required argument must be a data frame with at least two
columns: from and to. Together, these columns specify the tree
structure. Other columns may also be included; the AutoWMM package can
additionally be used to render tree diagrams and, if appropriate, to
perform root node population size estimation with the weighted
multiplier method (Flynn, Gustafson, and Irvine 2024).
prior: This optional argument specifies the prior chosen for the root
node population size. The default is lognormal; a uniform prior is also
an option.
filename: This required argument is used to name the ‘JAGS’ code output
file. It must end in .mod or .txt for correct functionality.
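The argument requirements above can be checked before any model code is generated. The sketch below is a hypothetical pre-flight helper, `check_args`, written in base R; it is not part of the package, and the assumption that the priors are selected with the strings "lognormal" and "uniform" is made for illustration only.

```r
# Hypothetical pre-flight check (not part of JAGStree) mirroring the
# argument descriptions above.
check_args <- function(data, filename, prior = "lognormal") {
  if (!is.data.frame(data) || !all(c("from", "to") %in% names(data))) {
    stop("`data` must be a data frame with columns 'from' and 'to'")
  }
  if (!grepl("\\.(mod|txt)$", filename)) {
    stop("`filename` must end in .mod or .txt")
  }
  # Assumption: the two priors are selected by these exact strings.
  prior <- match.arg(prior, c("lognormal", "uniform"))
  invisible(list(data = data, filename = filename, prior = prior))
}

edges <- data.frame(from = c("Z", "Z"), to = c("A", "B"))

# A valid combination passes and falls back to the default prior.
ok <- check_args(edges, filename = "tree_model.mod")
ok$prior

# A file name without a supported extension is rejected.
bad <- tryCatch(check_args(edges, filename = "tree_model.jags"),
                error = function(e) conditionMessage(e))
bad
```

Once such a check passes, the same three values would be handed to makeJAGStree() through its data, prior, and filename arguments.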
The Bayesian model underlying this ‘JAGS’ code is described at length elsewhere (Flynn and Gustafson 2024a) and builds on previously developed methodology (Flynn and Gustafson 2024b). A more extensive application to real-world data can be found in Flynn, Gustafson, and Irvine (2024).
The functionality of the package is best demonstrated using simple
trees. The AutoWMM package can also be used to help render and
visualize each of these trees:
data <- data.frame("from" = c("Z", "Z", "A", "A"),
                   "to" = c("A", "B", "C", "D"),
                   "Estimate" = c(4, 34, 9, 1),
                   "Total" = c(11, 70, 10, 10),
                   "Count" = c(NA, 500, NA, 50),
                   "Population" = c(FALSE, FALSE, FALSE, FALSE),
                   "Description" = c("First child of the root",
                                     "Second child of the root",
                                     "First grandchild",
                                     "Second grandchild"))
# optional use of the AutoWMM package to show tree structure
Sys.setenv("RGL_USE_NULL" = TRUE)
library(AutoWMM)
library(DiagrammeR)
tree <- makeTree(data)
drawTree(tree)
The function can be directly applied to the data, and will produce a
.mod or .txt file, depending on which is specified. A uniform prior can
also be chosen for the root node, with parameters that will be specified
when running ‘JAGS’: