FIESTA Manual - Population Data

FIESTA’s Estimation Modules combine multiple functions from FIESTA or other packages to generate estimates across user-defined populations using different estimation strategies. Each module has an associated mod*pop function for compiling the population data and calculations, including adjustments for nonresponse and standardizing auxiliary data. The output from the mod*pop functions is input directly into the mod*estimation modules.
All Population functions require similar data inputs, including a set of inventory response data and summarized auxiliary data for post-stratification or other model-assisted and model-based (i.e., small area) estimation strategies.
This vignette describes the required input data sets, parameter inputs, and outputs from the mod*pop functions. Refer to the FIESTA_module_estimates vignette for more information on other parameter inputs to the mod* Estimation Modules, and to the module-specific vignettes for running specific examples.
The parameters for FIESTA modules are organized by different categories based on population data and resulting estimates.
The population type (i.e., Eval_Type) defines the set of sampled plots and data used for FIESTA estimation. For example, if you are only interested in area estimates (popType='CURR'), you do not need the tree data. Other population types will be available in the future, including GRM (growth, mortality, removals), P2VEG (understory vegetation), CHNG (change), and DWM (down woody material). These population types may have different sets of plots based on what was sampled.
The required data tables include forest inventory data from the FIA national database (Burrill et al. 2018). Data table inputs can be the name of a comma-delimited file (*.csv), a layer within a database (e.g., SQLite), or an R data frame or data table object already loaded into R. The pltassgn table can also be a point shapefile (*.shp), a spatial layer within a database, or an sf R object with one point per plot. The unique identifier for a plot must be provided in the corresponding parameter for each input table unless it matches the default variable name. See the required variables section for a list of variables necessary to include for estimation. All modules require at least one table.
popTables - A named list of data tables used for estimates (cond, plt, tree, seed, vsubpspp, vsubpstr, subplot, subp_cond). See below for more details about tables.
popTableIDs - A named list of variable names defining unique plot identifiers in each table listed in popTables. See below for more details about tables.
pltassgn - Plot-level data, with 1 record per plot and plot assignment of estimation unit and strata, if applying stratification. If nonsampled plots are included, the PLOT_STATUS_CD variable must be in the table so that these plots can be excluded from the analysis. Optional for all estimates.
pltassgnid - Unique identifier for plot in pltassgn (default="PLT_CN").
pjoinid - Join variable in plot (or cond) to match pltassgnid. Does not need to be unique.
dsn - Data source name of database where data table layers reside.
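To make the roles of pltassgnid and pjoinid concrete, here is a minimal base-R sketch that joins a condition table to a plot assignment table. The toy tables and their values are made up for illustration, and the merge only mimics the kind of join described above; it is not FIESTA's internal code.

```r
# Toy plot-assignment table: one record per plot (values are hypothetical).
pltassgn <- data.frame(
  PLT_CN    = c("p1", "p2", "p3"),
  ESTN_UNIT = c(1, 1, 3),
  STRATUMCD = c(1, 2, 1)
)

# Toy condition table: a plot can have several condition records,
# so the join variable (pjoinid) does not need to be unique here.
cond <- data.frame(
  PLT_CN         = c("p1", "p1", "p2", "p3"),
  CONDID         = c(1, 2, 1, 1),
  CONDPROP_UNADJ = c(0.75, 0.25, 1, 1)
)

pltassgnid <- "PLT_CN"  # unique plot identifier in pltassgn
pjoinid    <- "PLT_CN"  # matching variable in cond

# The plot identifier must be unique in pltassgn.
stopifnot(!anyDuplicated(pltassgn[[pltassgnid]]))

# Each condition record inherits the estimation unit and stratum of its plot.
condx <- merge(cond, pltassgn, by.x = pjoinid, by.y = pltassgnid)
condx
```

Both conditions on plot p1 end up with the same estimation unit and stratum, which is the situation the stratification requirements assume.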
Define information for area estimation.
Population filters subset the plot data set before population calculations are generated.
An estimation unit is a population, or area of interest, with known area and number of plots. As an example, for RMRS FIA, an estimation unit is generally an individual county. An estimation unit may be a sub-population of a larger population (e.g., counties within a state). For post-stratified estimation, sub-populations are mutually exclusive and independent within a population; therefore, estimated totals and variances are additive. Each plot is assigned to only one estimation unit based on plot center, and the assignment can be stored in either pltassgn or cond. For model-based, small area estimators, an estimation unit is a sub-population, referred to as a model domain unit, where each domain unit is a component in a model.
Note: If an estimation/domain unit has fewer than minplotnum.unit plots, the result depends on unit.action/dunit.action. With 'keep', NA is returned for the estimation/domain unit. With 'remove', the estimation/domain unit is removed from the returned output. With 'combine', an automated procedure groups each estimation/domain unit that has fewer than minplotnum.unit plots with the next estimation/domain unit in the stratalut or unitzonal table; if it is the last estimation/domain unit in the table, it is grouped with the preceding estimation/domain unit. The default values are the numbers of plots recommended for post-stratified estimation (Westfall and others, 2011).
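The 'combine' rule can be sketched in a few lines of base R. The helper combine_units below is hypothetical (it is not a FIESTA function), the threshold of 10 and the plot counts are made up, and the sketch assumes that a combined group keeps absorbing the next unit until its cumulative plot count reaches the minimum. FIESTA's actual procedure may differ in its details.

```r
# Hypothetical helper (not part of FIESTA): group units that have fewer than
# `minplotnum` plots with the next unit in table order; a group that is still
# too small at the end of the table is folded into the preceding group.
combine_units <- function(units, nplots, minplotnum = 10) {
  n <- length(units)
  group <- integer(n)
  g <- 1L    # current group id
  acc <- 0   # plots accumulated in the current group
  for (i in seq_len(n)) {
    group[i] <- g
    acc <- acc + nplots[i]
    if (acc >= minplotnum && i < n) {
      g <- g + 1L   # current group is large enough; start a new one
      acc <- 0
    }
  }
  # The last group has no "next" unit, so fold it into the preceding group.
  if (acc < minplotnum && g > 1L) {
    group[group == g] <- g - 1L
  }
  data.frame(unit = units, nplots = nplots, group = group)
}

# Units B and E are below the minimum: B joins C (next), E joins D (preceding).
res <- combine_units(c("A", "B", "C", "D", "E"), c(12, 4, 9, 15, 3))
res
```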
Post-stratification is used to reduce variance in population estimates by partitioning the population into homogeneous classes (strata), such as forest and nonforest. For stratified sampling methods, the strata sizes (weights) must be either known or estimated. Remotely sensed data are often used to generate strata weights from the proportion of pixels by stratum. If stratification is desired (strata=TRUE), the required data include: the stratum assignment for the center location of each plot, stored in either pltassgn or cond; and a look-up table with the area, pixel count, or proportion of the total area (strwt) of each stratum value by estimation unit. The name of the strata (and estimation unit) variable and its values must match the plot assignment name(s) and value(s). If strata (and estimation unit) variables are included in cond, all conditions in a plot must have the same strata (and estimation unit) value.
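As a small illustration of how strata weights follow from pixel counts, the base-R sketch below computes strwt as the proportion of pixels in each stratum within its estimation unit. The look-up table values are invented; only the column names follow this manual.

```r
# Toy strata look-up table with pixel counts (values are hypothetical).
stratalut <- data.frame(
  ESTN_UNIT  = c(1, 1, 3, 3),
  STRATUMCD  = c(1, 2, 1, 2),
  P1POINTCNT = c(6000, 2000, 1500, 4500)
)

# Strata weight = pixels in the stratum / pixels in its estimation unit.
unit_pixels <- ave(stratalut$P1POINTCNT, stratalut$ESTN_UNIT, FUN = sum)
stratalut$strwt <- stratalut$P1POINTCNT / unit_pixels

# Weights should sum to 1 within each estimation unit.
tapply(stratalut$strwt, stratalut$ESTN_UNIT, sum)
```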
In FIESTA, the plot assignments, strata proportions, and area are provided by the user and may be obtained through FIESTA or other means, given the proper format. These parameters are set by supplying a list to the strata_opts parameter. The possible parameters that can be set within the strata_opts parameter can be seen by running help(strata_options).
Note: If there are fewer than minplotnum.strat plots (default=2 plots) in any strata/estimation unit combination, the result depends on stratcombine. If stratcombine=FALSE, an error occurs with a message to collapse classes. If stratcombine=TRUE, an automated procedure collapses all strata with fewer than minplotnum.strat plots. The function collapses classes based on the order of strata in stratalut: if a stratum within an estimation unit has fewer than minplotnum.strat plots, it is grouped with the next strata class in stratalut.
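Before running a module, it can help to check which strata/estimation unit combinations fall below the threshold. The base-R sketch below only flags such combinations from a toy plot assignment table; the data are invented, and the counting is an illustration rather than FIESTA's own check.

```r
# Toy plot assignments (hypothetical values): one row per sampled plot.
pltassgn <- data.frame(
  ESTN_UNIT = c(1, 1, 1, 1, 3, 3, 3),
  STRATUMCD = c(1, 1, 2, 2, 1, 1, 2)
)
minplotnum.strat <- 2

# Count sampled plots in each estimation unit / stratum combination.
nstrata <- aggregate(
  list(n.strata = rep(1L, nrow(pltassgn))),
  by = pltassgn[c("ESTN_UNIT", "STRATUMCD")],
  FUN = sum
)

# Combinations below the threshold are the ones that would need collapsing.
too_small <- nstrata[nstrata$n.strata < minplotnum.strat, ]
too_small
```

Here only stratum 2 in estimation unit 3 is flagged, because it contains a single plot.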
Other Model-Assisted and Small Area estimation strategies require unit/dunit-level information, including auxiliary data summaries and predictor names. The following parameters are used to provide this information in the MA and SA FIESTA modules.
Data object parameters allow a user to pass the output of other FIESTA functions directly as input parameters:
Output from the FIESTA::an*data functions.
Output from the FIESTA::pltdat function.
Output from the FIESTA::spGetStrata function (GB module only).
Output from the FIESTA::spGetAuxiliary function (MA and SA modules only).

FIESTA module population functions (mod*pop)

Variable | Description |
---|---|
ESTN_UNIT | Estimation unit |
STRATUMCD | Strata value |
P1POINTCNT | Number of pixels by strata and estimation unit |
P2POINTCNT | Number of P2 plots in population data |
n.strata | Number of sampled plots in strata |
n.total | Number of sampled plots for estimation unit |
strwt | Proportion of pixels in strata (strata weight) |
CONDPROP_UNADJ_SUM | Summed condition proportion in strata |
cadjfac | Adjustment factor for nonsampled plots in strata (CONDPROP_UNADJ_SUM/n.strata) |
ACRES | Total acres for estimation unit |
expfac | Expansion factor, in acres, area in strata divided by number of sampled plots |
EXPNS | Expanded area, in acres, expfac multiplied by strwt |
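The relationship between strwt, expfac, and EXPNS can be checked with a few lines of base R. This sketch assumes that expfac is the estimation unit's total acres (ACRES) divided by n.strata, a reading chosen so that EXPNS = expfac × strwt gives the acres represented by each plot in a stratum. The numbers are invented.

```r
# Toy strata table for a single estimation unit (values are hypothetical).
stratalut <- data.frame(
  ESTN_UNIT = c(1, 1),
  STRATUMCD = c(1, 2),
  strwt     = c(0.75, 0.25),
  n.strata  = c(30, 20),
  ACRES     = c(100000, 100000)
)

# Assumption: expfac uses the estimation unit's total acres.
stratalut$expfac <- stratalut$ACRES / stratalut$n.strata
stratalut$EXPNS  <- stratalut$expfac * stratalut$strwt

# Sanity check: the acres represented by all plots add back up to ACRES.
total_acres <- sum(stratalut$EXPNS * stratalut$n.strata)
total_acres
```

Under this reading, each plot in the first stratum represents more acres than each plot in the second, because the first stratum covers three times the area with only 1.5 times the plots.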
The following variables by data table are required for successful FIESTA output.
Table | Variable | Description |
---|---|---|
tree | PLT_CN | popTableIDs - Unique identifier for each plot, for joining tables (e.g. PLT_CN) |
tree | TPA_UNADJ | Number of trees per acre each sample tree represents (e.g. DESIGNCD=1: TPA_UNADJ=6.018046 for trees on subplot; 74.965282 for trees on microplot) |
cond | PLT_CN | popTableIDs - Unique identifier for each plot, for joining tables (e.g., PLT_CN) |
cond | CONDPROP_UNADJ | Unadjusted proportion of condition on each plot. Optional if only 1 condition (record) per plot |
cond | COND_STATUS_CD | Status of each forested condition on plot (i.e. accessible forest, nonforest, water, etc.) |
cond | NF_COND_STATUS_CD | Only if ACI=TRUE. Status of each nonforest condition plot (i.e. accessible nonforest, nonsampled nonforest) |
cond | SITECLCD | Only if landarea=TIMBERLAND. Measure of site productivity |
cond | RESERVCD | If landarea=TIMBERLAND. Reserved status |
cond | SUBPROP_UNADJ | Unadjusted proportion of subplot conditions on each plot. Optional if only 1 condition (record) per plot |
cond | MICRPROP_UNADJ | If microplot tree attributes. Unadjusted proportion of microplot conditions on each plot. Optional if only 1 condition (record) per plot |
cond | MACRPROP_UNADJ | If macroplot tree attributes. Unadjusted proportion of macroplot conditions on each plot. Optional if only 1 condition (record) per plot |
plot | CN | popTableIDs - Unique identifier for each plot, for joining tables (e.g. CN) |
plot | STATECD | Identifies state each plot is located in. Optional if only 1 state |
plot | INVYR | Identifies inventory year of each plot. Optional. Assumes estimation time span is less than inventory cycle |
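The TPA_UNADJ values quoted in the table for DESIGNCD=1 can be reproduced from plot geometry. The sketch below assumes the standard FIA plot layout of four subplots with a 24-foot radius, each containing a microplot with a 6.8-foot radius; these dimensions are background knowledge about the FIA design and are not stated in this manual.

```r
sqft_per_acre <- 43560
n_subplots <- 4   # assumption: standard four-subplot FIA plot design

# Area of a circular plot of the given radius (in feet), expressed in acres.
plot_acres <- function(radius_ft) pi * radius_ft^2 / sqft_per_acre

# Trees per acre represented by one sampled tree = 1 / total sampled area.
tpa_subplot   <- 1 / (n_subplots * plot_acres(24))
tpa_microplot <- 1 / (n_subplots * plot_acres(6.8))

round(c(subplot = tpa_subplot, microplot = tpa_microplot), 6)
```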
Burrill, E.A., Wilson, A.M., Turner, J.A., Pugh, S.A., Menlove, J., Christiansen, G., Conkling, B.L., Winnie, D., 2018. Forest Inventory and Analysis Database [WWW Document]. St Paul MN US Dep. Agric. For. Serv. North. Res. Stn. URL https://apps.fs.usda.gov/fia/datamart/datamart.html (accessed 3.6.21).