The raceland package implements a computational framework for a pattern-based, zoneless analysis and visualization of (ethno)racial topography. It is a reimagined approach to analyzing residential segregation and racial diversity, based on the concept of ‘landscape’ used in landscape ecology. An overview of the implemented method is presented in the first vignette. Here we demonstrate how the raceland package can be used to describe a racial landscape at different spatial scales.
The racial landscape method is based on gridded raster data and, unlike previous methods, does not depend on a division into specific zones (census tracts, census blocks, etc.). Racial diversity (entropy) and racial segregation (mutual information) can be calculated for the whole area of interest (e.g., a metropolitan area) without introducing any arbitrary divisions. The method also allows the calculation to be performed at different spatial scales.
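# install required packages
pkgs = c(
  "raceland",
  "comat",
  "terra",
  "sf",
  "dplyr"
)
to_install = !pkgs %in% installed.packages()
if(any(to_install)) {
  install.packages(pkgs[to_install])
}
# attach the packages used in the examples below
library(raceland)
library(terra)
library(sf)
library(dplyr)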
The computational framework requires a few steps (see the first vignette for details) before IT metrics can be calculated:
# reading input data
list_raster = list.files(system.file("rast_data", package = "raceland"),
                         full.names = TRUE)
race_raster = rast(list_raster)

# constructing racial landscape
real_raster = create_realizations(x = race_raster, n = 100)

# calculating local subpopulation densities
dens_raster = create_densities(real_raster, race_raster, window_size = 10)
Consider the example presented below. The racial landscape covers an area of 16 by 16 cells. Such an area can be divided into square-shaped blocks of cells. Each square of cells represents a local pattern (a local landscape), and IT metrics (entropy and mutual information) are calculated for each local pattern. The extent of a local pattern is defined by two parameters: size and shift.
shift == size - the input map is divided into a grid of non-overlapping square windows. Each square window defines the extent of a local pattern.
shift < size - results in a grid of overlapping square windows. A local pattern is calculated from the square window defined by the size parameter; the next square window is shifted (in the N-S and W-E directions) by the number of cells defined by the shift parameter.
The example presented below consists of a racial landscape of 16 by 16 cells. Setting size = 4 (and shift = 4) results in dividing the racial landscape into four square windows, each 4x4 cells. Each window represents a local pattern. For each local pattern, IT metrics can be calculated, and the results are assigned to the resultant grid of square windows. In effect, the original racial landscape with 16x16 cells is reduced to 2x2 ‘large cells’.
Setting size = 4 and shift = 2 results in overlapping square windows. First, a window of size 4x4 defines the local pattern (see the dark blue square). In the next step, this window is shifted by two cells to the right, and a new local pattern is selected (see the light blue square). This creates a resultant grid whose cell size is defined by the shift parameter.
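The effect of size and shift on window placement can be sketched in a few lines of base R. The helper window_starts() is hypothetical (it is not part of raceland), the 8-cell width is an illustrative value, and the sketch assumes that windows start at the first cell and that only complete windows are kept; the package may treat windows at the edge of the map differently.

```r
# Hypothetical helper, not part of raceland: first cell of every window
# along one axis of a landscape that is `n_cells` wide.
# Assumption: only windows that fit completely inside the map are kept.
window_starts = function(n_cells, size, shift = size) {
  seq(from = 1, to = n_cells - size + 1, by = shift)
}

# toy landscape, 8 cells wide (illustrative value)
non_overlapping = window_starts(n_cells = 8, size = 4)         # shift == size
overlapping = window_starts(n_cells = 8, size = 4, shift = 2)  # shift < size

non_overlapping
overlapping
```

With shift equal to size the windows tile the axis; with a smaller shift, consecutive windows share size - shift cells.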
The create_grid() function creates a spatial object with a grid (each ‘cell’ is defined by size and shift). This function requires a SpatRaster object with realizations (racial landscapes) and the size parameter. If the shift parameter is not set, it is assumed that shift = size. Below, such a grid is overlaid on the racial landscape to show the local patterns.
race_colors = c("#F16667", "#6EBE44", "#7E69AF", "#C77213", "#F8DF1D")
grid_sf = create_grid(real_raster, size = 20)
plot_realization(real_raster[[1]], race_raster, hex = race_colors)
plot(st_geometry(grid_sf), add = TRUE, lwd = 2)
The calculate_metrics() function is used to calculate IT metrics. The parameter size = 20 means that the area of interest is divided into a grid of local patterns of 20x20 cells (which in this case corresponds to squares of 0.6 km x 0.6 km). Setting neighbourhood = 4 means that adjacencies between cells are defined in four directions; fun = "mean" calculates the average population density of adjacent cells; and threshold = 0.5 means that the calculation is performed only if at least 50% of the cells in a window are non-NA.
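The threshold rule can be illustrated on a single toy window. The function below is a hypothetical sketch, not the package's internal code; it only assumes that "at least 50% of non-NA cells" means the share of non-missing cells is compared with the threshold using >=.

```r
# Hypothetical illustration of the threshold rule (not raceland internals):
# a window is used only if its share of non-NA cells reaches `threshold`.
has_enough_data = function(window, threshold = 0.5) {
  mean(!is.na(window)) >= threshold
}

# toy 4x4 window with 6 of 16 cells missing (62.5% of cells have data)
toy_window = matrix(c(rep(NA, 6), rep(1L, 10)), nrow = 4)

has_enough_data(toy_window, threshold = 0.5)   # enough data
has_enough_data(toy_window, threshold = 0.75)  # too many missing cells
```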
IT metrics are calculated for each local pattern for each realization. The output table will have 900 rows (there are nine local patterns of size 20x20 cells and 100 realizations).
metr_df_20 = calculate_metrics(x = real_raster, w = dens_raster,
                               neighbourhood = 4, fun = "mean",
                               size = 20, threshold = 0.5)
metr_df_20[metr_df_20$realization == 1, ]
#> realization row col ent joinent condent mutinf
#> 1 1 1 1 1.242992 2.473338 1.230346 0.012645856
#> 2 1 1 2 1.618187 3.101661 1.483474 0.134712712
#> 3 1 1 3 1.391554 2.766716 1.375162 0.016392488
#> 4 1 2 1 1.604952 2.985754 1.380802 0.224149925
#> 5 1 2 2 1.535208 3.029593 1.494386 0.040821771
#> 6 1 2 3 1.535544 2.904158 1.368614 0.166929892
#> 7 1 3 1 1.419478 2.696524 1.277046 0.142432066
#> 8 1 3 2 1.544421 3.080596 1.536174 0.008246697
#> 9 1 3 3 1.371094 2.704907 1.333813 0.037280600
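The printed values are consistent with two standard information-theoretic identities: joint entropy equals entropy plus conditional entropy, and mutual information equals entropy minus conditional entropy. The check below is a small sketch that copies the first two rows of the output above and compares them up to the rounding of the printed numbers.

```r
# values copied from the first two rows of the output above
ent = c(1.242992, 1.618187)
joinent = c(2.473338, 3.101661)
condent = c(1.230346, 1.483474)
mutinf = c(0.012645856, 0.134712712)

# joint entropy = entropy + conditional entropy
joinent_diff = max(abs(joinent - (ent + condent)))

# mutual information = entropy - conditional entropy
mutinf_diff = max(abs(mutinf - (ent - condent)))

# both differences are at the level of the printed rounding
c(joinent_diff, mutinf_diff)
```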
Racial topography at the analyzed scale is quantified as an ensemble average over multiple realizations. First, the average entropy and mutual information are calculated for each square window from the 100 realizations. The table below shows the mean (ent_mean, mutinf_mean) and standard deviation (ent_sd, mutinf_sd) values for each square window.
smr = metr_df_20 %>%
  group_by(row, col) %>%
  summarize(
    ent_mean = mean(ent, na.rm = TRUE),
    ent_sd = sd(ent, na.rm = TRUE),
    mutinf_mean = mean(mutinf, na.rm = TRUE),
    mutinf_sd = sd(mutinf, na.rm = TRUE)
  )
#> `summarise()` has grouped output by 'row'. You can override using the `.groups`
#> argument.
smr
#> # A tibble: 9 × 6
#> # Groups: row [3]
#> row col ent_mean ent_sd mutinf_mean mutinf_sd
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 1.27 0.0460 0.0193 0.00883
#> 2 1 2 1.59 0.0115 0.160 0.0289
#> 3 1 3 1.42 0.0318 0.0462 0.0154
#> 4 2 1 1.61 0.0258 0.151 0.0373
#> 5 2 2 1.52 0.0218 0.0347 0.0134
#> 6 2 3 1.52 0.0285 0.156 0.0293
#> 7 3 1 1.38 0.0470 0.136 0.0424
#> 8 3 2 1.54 0.0302 0.0152 0.00658
#> 9 3 3 1.42 0.0299 0.0477 0.0138
Then the window-level means of entropy and mutual information are averaged over all windows.
smr %>%
  ungroup() %>%
  select(-row, -col) %>%
  summarise_all(mean)
#> # A tibble: 1 × 4
#> ent_mean ent_sd mutinf_mean mutinf_sd
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1.47 0.0303 0.0851 0.0218
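The two-step averaging above (first over realizations within each window, then over windows) can also be sketched in base R. The table below is made up for illustration and has only two windows and three realizations; the column names follow the output of calculate_metrics() shown earlier.

```r
# toy metrics table: 2 windows x 3 realizations (made-up values)
toy = data.frame(
  realization = rep(1:3, times = 2),
  row = 1,
  col = rep(1:2, each = 3),
  ent = c(1.0, 1.2, 1.4, 0.5, 0.6, 0.7),
  mutinf = c(0.10, 0.20, 0.30, 0.02, 0.04, 0.06)
)

# step 1: average over realizations within each window
per_window = aggregate(cbind(ent, mutinf) ~ row + col, data = toy, FUN = mean)

# step 2: average the window means over the whole area
overall = colMeans(per_window[, c("ent", "mutinf")])

per_window
overall
```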
Racial diversity and segregation can be analyzed and displayed separately. However, much more information can be gained by visualizing the two measures at the same time. For this purpose, entropy and mutual information are each reclassified into three classes (1 - low, 2 - medium, 3 - high). By joining those two classifications, we obtain 9 classes, each describing the level of segregation and diversity: 11 - low segregation/low diversity; 12 - low segregation/medium diversity; 13 - low segregation/high diversity; 21 - medium segregation/low diversity; 22 - medium segregation/medium diversity; 23 - medium segregation/high diversity; 31 - high segregation/low diversity; 32 - high segregation/medium diversity; 33 - high segregation/high diversity.
Each class is assigned one color from a bivariate palette (see below).
The bivariate_classification() function presented below takes three arguments (entropy - a vector of entropy values, mutual_information - a vector of mutual information values, n - the number of categories in the racial landscape) and returns a vector of racial segregation/diversity classes coded with the 9 values listed above.
# n is the number of categories in the racial landscape
bivariate_classification = function(entropy, mutual_information, n) {

  # calculate bivariate classification
  nent = log2(n)
  ent_cat = cut(entropy, breaks = c(0, 0.66, 1.33, nent), labels = c(1, 2, 3),
                include.lowest = TRUE, right = TRUE)
  ent_cat = as.integer(as.character(ent_cat))

  mut_cat = cut(mutual_information, breaks = c(0, 0.33, 0.66, 1), labels = c(10, 20, 30),
                include.lowest = TRUE, right = TRUE)
  mut_cat = as.integer(as.character(mut_cat))

  bivar_cls = mut_cat + ent_cat
  bivar_cls = as.factor(bivar_cls)

  return(bivar_cls)
}

smr$bivar_cls = bivariate_classification(entropy = smr$ent_mean,
                                         mutual_information = smr$mutinf_mean,
                                         n = nlyr(race_raster))
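To see how the two-digit codes arise, the classification can be applied to a few hand-picked values. The block below restates the same logic compactly so that it runs on its own; the three entropy and mutual information values and n = 5 are made up for illustration.

```r
# compact restatement of the classification logic shown above,
# repeated here so that the example runs on its own
bivariate_classification = function(entropy, mutual_information, n) {
  ent_cat = cut(entropy, breaks = c(0, 0.66, 1.33, log2(n)),
                labels = c(1, 2, 3), include.lowest = TRUE, right = TRUE)
  mut_cat = cut(mutual_information, breaks = c(0, 0.33, 0.66, 1),
                labels = c(10, 20, 30), include.lowest = TRUE, right = TRUE)
  # tens digit: segregation class, units digit: diversity class
  as.factor(as.integer(as.character(mut_cat)) + as.integer(as.character(ent_cat)))
}

# made-up values for three windows in a landscape with 5 categories
toy_cls = bivariate_classification(
  entropy = c(0.50, 1.00, 1.60),
  mutual_information = c(0.10, 0.40, 0.70),
  n = 5
)
as.character(toy_cls)
```

Values that fall outside the breaks (for example, entropy above log2(n)) are returned as NA by cut(), so the resulting class is NA as well.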
The average values of entropy and mutual information, calculated for each square-shaped window from all realizations, can be joined to the spatial grid object. This makes it possible to map the metrics and to show how segregation and racial diversity change over the area.
# join IT-metric to the grid
attr_grid = dplyr::left_join(grid_sf, smr, by = c("row", "col"))

# calculate breaks parameter for plotting entropy and mutual information
# the values of entropy and mutual information are divided into equal breaks
ent_breaks = c(seq(0, 2, by = 0.25), log2(nlyr(race_raster)))
mut_breaks = seq(0, 1, by = 0.1)
plot(attr_grid["ent_mean"], breaks = ent_breaks, key.pos = 1,
     pal = rev(hcl.colors(length(ent_breaks) - 1, palette = "RdBu")),
     bty = "n", main = "Racial diversity (Entropy)")
plot(attr_grid["mutinf_mean"], breaks = mut_breaks, key.pos = 1,
     pal = rev(hcl.colors(length(mut_breaks) - 1, palette = "RdBu")),
     bty = "n", main = "Racial segregation (Mutual information)")
biv_colors = c("11" = "#e8e8e8", "12" = "#e4acac", "13" = "#c85a5a", "21" = "#b0d5df",
               "22" = "#ad9ea5", "23" = "#985356", "31" = "#64acbe", "32" = "#627f8c",
               "33" = "#574249")
bcat = biv_colors[names(biv_colors) %in% unique(attr_grid$bivar_cls)]
plot(attr_grid["bivar_cls"], pal = bcat, main = "Racial diversity and residential segregation")
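The palette subsetting used above can be tried on its own. In this small sketch the vector of classes is made up; it stands in for the bivar_cls column of the grid.

```r
biv_colors = c("11" = "#e8e8e8", "12" = "#e4acac", "13" = "#c85a5a",
               "21" = "#b0d5df", "22" = "#ad9ea5", "23" = "#985356",
               "31" = "#64acbe", "32" = "#627f8c", "33" = "#574249")

# made-up set of classes present in a map
present = factor(c("22", "12", "22", "21"))

# keep only the colors whose names match a class that occurs in the data;
# the result keeps the order of the full palette
bcat = biv_colors[names(biv_colors) %in% unique(present)]
bcat
```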
The next example shows how to calculate local patterns using overlapping windows. This option is recommended, especially for larger values of size. Using overlapping windows does not introduce arbitrary boundaries.
To obtain overlapping windows, the calculate_metrics() function requires an additional argument - shift.
# calculate metrics for overlapping windows
metr_df_10 = calculate_metrics(x = real_raster, w = dens_raster,
                               neighbourhood = 4, fun = "mean",
                               size = 20, shift = 10, threshold = 0.5)
smr10 = metr_df_10 %>%
  group_by(row, col) %>%
  summarize(
    ent_mean = mean(ent, na.rm = TRUE),
    ent_sd = sd(ent, na.rm = TRUE),
    mutinf_mean = mean(mutinf, na.rm = TRUE),
    mutinf_sd = sd(mutinf, na.rm = TRUE)
  )
#> `summarise()` has grouped output by 'row'. You can override using the `.groups`
#> argument.
smr10 %>%
  ungroup() %>%
  select(-row, -col) %>%
  summarise_all(mean)
#> # A tibble: 1 × 4
#> ent_mean ent_sd mutinf_mean mutinf_sd
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1.48 0.0333 0.0811 0.0224
# calculate bivariate classification
smr10$bivar_cls = bivariate_classification(
  entropy = smr10$ent_mean,
  mutual_information = smr10$mutinf_mean,
  n = nlyr(race_raster)
)
The average values of entropy and mutual information, calculated for each square-shaped window from all realizations, can be joined to the spatial grid object (created by create_grid()). For overlapping windows, the resolution of the grid is defined by the shift parameter.
# create spatial grid object
grid_sf10 = create_grid(real_raster, size = 20, shift = 10)

# join IT-metrics to the grid
attr_grid10 = dplyr::left_join(grid_sf10, smr10, by = c("row", "col"))
plot(attr_grid10["ent_mean"], breaks = ent_breaks, key.pos = 1,
     pal = rev(hcl.colors(length(ent_breaks) - 1, palette = "RdBu")),
     # pal = grDevices::hcl.colors(length(ent_breaks) - 1, palette = "Blue-Red"),
     bty = "n", main = "Racial diversity (Entropy)")
plot(attr_grid10["mutinf_mean"], breaks = mut_breaks, key.pos = 1,
     pal = rev(hcl.colors(length(mut_breaks) - 1, palette = "RdBu")),
     bty = "n", main = "Racial segregation (Mutual information)")
# `biv_colors` defines a bivariate palette,
# `bcat` selects only the colors for categories present in the analyzed area
biv_colors = c("11" = "#e8e8e8", "12" = "#e4acac", "13" = "#c85a5a", "21" = "#b0d5df",
               "22" = "#ad9ea5", "23" = "#985356", "31" = "#64acbe", "32" = "#627f8c",
               "33" = "#574249")
bcat = biv_colors[names(biv_colors) %in% unique(attr_grid10$bivar_cls)]
plot(attr_grid10["bivar_cls"], pal = bcat, main = "Racial diversity and residential segregation")