The rosario package implements a null model analysis to
quantify concurrent temporal niche overlap (i.e., activity or phenology)
among biological identities (e.g., individuals, populations, or species)
using the Rosario randomization algorithm (Castro-Arellano et al. 2010).
It is designed for cyclical time data (e.g., hours of the day, months of
the year) and can be used to assess temporal overlap among two or more
biological identities.
This vignette provides a real-world example showing how to (1) use an
external camera-trap dataset, (2) convert detections into fixed time
bins, (3) build the matrix required by rosario, and (4) run
a null model test of assemblage-wide temporal overlap.
We use a subset of the SIMDeer project (Foster et al. 2020) representing 85,929 capture events from three cervid species recorded in British Columbia, Canada (51,691 Mule deer, 19,395 Elk, and 14,843 White-tailed deer). The full dataset is available through Wildlife Insights (https://app.wildlifeinsights.org/explore).
The dataset used in this vignette is:
Sim_data: a CSV file containing capture events
(timestamped detections) for three cervid species.

Note: To run this vignette locally with the full dataset, download and load the external data using the code below. The data are hosted on Figshare. This step is not run automatically, to comply with CRAN policies.
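The rosario workflow requires a matrix where rows are biological identities (here, species) and columns are time intervals (here, 30-minute bins), with each cell holding a detection count.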
We convert event-level timestamps into a single row of binned counts per species. Time intervals are treated as circular (e.g., hours of the day), so the cycle wraps from the last interval back to the first.
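To make the binning idea concrete, here is a minimal base-R sketch that is independent of the dplyr-based function used in this vignette. The timestamps are made up for illustration, and the variable names (`stamps`, `all_bins`, `counts`) are hypothetical. It assigns each detection to a 30-minute bin and declares all 48 bins of the daily cycle up front, so bins without detections appear with a count of zero.

```r
# Toy detections (made-up timestamps in month/day/year hour:minute form)
stamps <- c("7/1/2019 0:05", "7/1/2019 0:29", "7/2/2019 0:31",
            "7/3/2019 23:59", "7/4/2019 12:00")
bin_mins <- 30L

# Parse, then express each detection as minutes since midnight
when <- as.POSIXct(stamps, format = "%m/%d/%Y %H:%M", tz = "UTC")
mins <- as.integer(format(when, "%H")) * 60L + as.integer(format(when, "%M"))

# Start of the bin each detection falls in, in minutes since midnight
bin_start <- (mins %/% bin_mins) * bin_mins

# Format minutes since midnight as "HH:MM:SS"
fmt <- function(m) sprintf("%02d:%02d:00", m %/% 60L, m %% 60L)

# Declare every bin of the 24-hour cycle as a factor level so that
# bins with no detections are still counted (as zero)
all_bins <- fmt(seq(0L, 24L * 60L - bin_mins, by = bin_mins))
tab <- table(factor(fmt(bin_start), levels = all_bins))
counts <- setNames(as.integer(tab), names(tab))

counts[c("00:00:00", "00:30:00", "01:00:00")]
```

Fixing the set of bins in advance keeps every species on the same columns in the same order, which matters once the rows are stacked into a matrix.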
The function below converts timestamps to date-time, assigns each
detection to a 30-minute bin, and returns a one-row data frame where
columns are time bins (formatted as "HH:MM:SS") and values
are detection counts.
library(dplyr)
library(lubridate)

# Bin one species' detections into fixed intervals (default: 30 minutes)
bin_species <- function(dat, species_code, bin_mins = 30) {
  dat %>%
    filter(species == species_code) %>%
    mutate(
      timestamp = mdy_hm(timestamp),
      # Round down to the hour, then add the offset of the bin within the hour
      bin = floor_date(timestamp, "hour") +
        minutes(floor(minute(timestamp) / bin_mins) * bin_mins),
      hour_min_sec = format(as.POSIXct(bin), "%H:%M:%S")
    ) %>%
    # One count per bin, spread into a single wide row
    count(hour_min_sec, name = "count") %>%
    tidyr::pivot_wider(
      names_from = hour_min_sec,
      values_from = count,
      values_fill = 0
    )
}

Here we generate one binned row per species:
rosario input matrix

We bind the three species rows into a single matrix. A bin in which a species was never detected is absent from that species' row, so bind_rows() fills it with NA; we replace these with zeros, ensure all entries are numeric, and sort the columns so the bins run in chronological order.

binned_df <- dplyr::bind_rows(
  MuleDeer = mule_deer,
  Elk = elk,
  WTD = wtd,
  .id = "species"
)

# Convert to a numeric matrix (rows = species; columns = time bins)
data_matrix <- binned_df %>%
  select(-species) %>%
  mutate(across(everything(), ~ tidyr::replace_na(as.numeric(.x), 0))) %>%
  as.matrix()
rownames(data_matrix) <- binned_df$species
# "HH:MM:SS" column names sort chronologically as plain strings
data_matrix <- data_matrix[, sort(colnames(data_matrix)), drop = FALSE]
dim(data_matrix)

rosario on the real-world dataset

This section demonstrates how to run the rosario workflow using the matrix created above.
The rosario() function creates the set of cyclic shifts
and their mirror images (reverse order), preserving shape while changing
location along the cycle and maintaining temporal autocorrelation. The
suite of vectors and mirrors represents a complete set of possible
distributions.
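The idea of cyclic shifts and mirror images can be illustrated in base R without the package. The helper `cyclic_shifts()` below is hypothetical, written only for this illustration, and is not the package's implementation; the input vector is made up. For a vector with n intervals it yields n shifts plus n mirrors, each containing the same values as the original.

```r
# Hypothetical helper: every cyclic shift of x (shift 0 is x itself)
cyclic_shifts <- function(x) {
  n <- length(x)
  lapply(seq_len(n) - 1L, function(k) {
    # Move the start k intervals forward, wrapping past the last interval
    idx <- ((seq_len(n) - 1L + k) %% n) + 1L
    x[idx]
  })
}

x <- c(5, 3, 0, 1)                 # made-up counts over four time intervals
shifts <- cyclic_shifts(x)         # n cyclic shifts
mirrors <- lapply(shifts, rev)     # each shift in reverse order
candidates <- c(shifts, mirrors)   # 2 * n hypothetical distributions

length(candidates)
shifts[[2]]
mirrors[[1]]
```

Because values are only relocated, never reshuffled independently, neighbouring intervals stay neighbours (up to direction), which is how the shape and autocorrelation of the activity pattern are retained.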
rosario() function

Use plot_rosario() to visualize hypothetical time-use
distributions produced by rosario() for a single biological
identity. The function plots the first 10 cyclic shifts and their mirror
images. Each panel shows one hypothetical time-use distribution, with
the cyclic shift in dark gray and its mirror image in dark red.
Use temp_overlap() to compute the mean of all pairwise
overlaps among rows (species), using the chosen index: “pianka” or
“czekanowski”. This returns the observed assemblage-wide overlap among
the cervid species.
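For readers unfamiliar with the two indices, the following base-R sketch shows the standard Pianka and Czekanowski formulas applied to row proportions and averaged over all pairs of rows. The function names (`pianka()`, `czekanowski()`, `mean_pairwise_overlap()`) and the toy matrix are hypothetical; this is an illustration of the calculation, not the code behind temp_overlap().

```r
# Convert counts to proportions of total activity
to_prop <- function(x) x / sum(x)

# Pianka: cross-product of proportions, scaled by their magnitudes
pianka <- function(p, q) sum(p * q) / sqrt(sum(p^2) * sum(q^2))

# Czekanowski: one minus half the summed absolute differences
czekanowski <- function(p, q) 1 - 0.5 * sum(abs(p - q))

# Mean of the chosen index over every pair of rows
mean_pairwise_overlap <- function(mat, index = c("pianka", "czekanowski")) {
  index <- match.arg(index)
  f <- switch(index, pianka = pianka, czekanowski = czekanowski)
  props <- t(apply(mat, 1, to_prop))
  pairs <- combn(nrow(props), 2)
  mean(apply(pairs, 2, function(ij) f(props[ij[1], ], props[ij[2], ])))
}

# Made-up counts: a and b share the same pattern, c is active at other times
toy <- rbind(a = c(10, 10, 0, 0), b = c(5, 5, 0, 0), c = c(0, 0, 7, 7))
mean_pairwise_overlap(toy, "pianka")
mean_pairwise_overlap(toy, "czekanowski")
```

Both indices range from 0 (no shared intervals) to 1 (identical proportional use of time), so in this toy case one fully overlapping pair and two non-overlapping pairs average to one third.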
To test whether the observed overlap differs from random
expectations, get_null_model() generates a null
distribution of assemblage-wide overlap by repeatedly randomizing the
input matrix using rosario_sample() and recomputing mean
pairwise overlap (via temp_overlap()).
get_null_model() can run simulations in parallel by
setting parallel = TRUE. For CRAN checks and maximum
portability, this vignette uses parallel = FALSE.
We recommend using nsim = 1000 for applied analyses, but
runtime depends on hardware and dataset size.
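The logic of the null model can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: it assumes each row is shifted by a random number of intervals and mirrored with probability 0.5, independently of the other rows, and it uses the Pianka index. All helper names (`mean_pianka()`, `randomize_row()`, `null_overlap()`), the toy matrix, and the one-sided summary at the end are hypothetical.

```r
set.seed(1)

# Mean Pianka overlap across all pairs of rows
mean_pianka <- function(mat) {
  p <- mat / rowSums(mat)  # row-wise proportions
  pairs <- combn(nrow(p), 2)
  mean(apply(pairs, 2, function(ij) {
    a <- p[ij[1], ]
    b <- p[ij[2], ]
    sum(a * b) / sqrt(sum(a^2) * sum(b^2))
  }))
}

# Random cyclic shift of one row, mirrored half of the time
randomize_row <- function(x) {
  n <- length(x)
  k <- sample.int(n, 1L) - 1L
  out <- x[((seq_len(n) - 1L + k) %% n) + 1L]
  if (runif(1) < 0.5) rev(out) else out
}

# Null distribution: randomize every row, recompute overlap, repeat
null_overlap <- function(mat, nsim = 1000) {
  vapply(seq_len(nsim), function(i) {
    mean_pianka(t(apply(mat, 1, randomize_row)))
  }, numeric(1))
}

# Made-up counts for three identities over six intervals
toy <- rbind(a = c(9, 4, 1, 0, 0, 0),
             b = c(8, 5, 0, 1, 0, 0),
             c = c(7, 3, 2, 0, 0, 1))

observed <- mean_pianka(toy)
null_vals <- null_overlap(toy, nsim = 200)

# Share of simulations with overlap at least as high as observed
p_upper <- mean(null_vals >= observed)
```

A small share here would suggest more overlap than expected by chance; the mirrored comparison (`null_vals <= observed`) would address less overlap than expected.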
Use temp_overlap_plot() to compare the observed overlap
(from temp_overlap()) against the simulated null
distribution (from get_null_model()). The plot shows a
histogram of simulated overlap values and a vertical line indicating the
observed overlap.
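An equivalent figure can be drawn with base graphics, which may help when customizing the plot. The sketch below uses made-up placeholder values (`null_vals` drawn from a beta distribution and an arbitrary `observed`) purely to show the layout; it is not output from this vignette's data, and it does not reproduce the styling of temp_overlap_plot().

```r
set.seed(42)

# Placeholder inputs, invented for illustration only
null_vals <- rbeta(500, 4, 6)  # stands in for simulated mean overlaps
observed <- 0.75               # stands in for the observed mean overlap

# Histogram of the null distribution on the 0-1 overlap scale
h <- hist(null_vals, breaks = 30, plot = FALSE)
plot(h,
     main = "Null distribution of mean overlap",
     xlab = "Mean pairwise overlap",
     xlim = c(0, 1))

# Vertical line marking the observed value
abline(v = observed, col = "red", lwd = 2)
```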
This vignette demonstrates the key steps needed to analyze cyclical
temporal data using rosario: downloading event-level
detections, binning them into time intervals, building a species-by-time
matrix, computing observed assemblage-wide overlap, and comparing
observed overlap against a null model.
For more details on each function, see the package documentation and the examples in each function’s help page.