The geoheatmap R package aims to provide an easy way to build cartogram heatmaps for regions worldwide.
For the graph aesthetics, we took inspiration from the maps of the statebins package and extended its functionality by allowing user-defined grids, which means that the cartogram heatmaps can also represent territories outside the US. Though any well-defined grid will technically work, grids from geofacet are an excellent starting point.
The geoheatmap package imports grids from the geofacet package and can also produce interactive plots with hover tooltips, built on functionality from the plotly package.
The viridisLite package is used here to provide colour-blind-friendly palettes for the examples below.
In this vignette we show several uses of the package, for continuous and for discrete data, and we show how users can specify which grid column holds the region names and how to make an interactive (hover) plot instead of a static one.
The data used for the following examples is available in the geoheatmap package under the name internet, and is originally from the World Bank Group (2024), retrieved from https://data.worldbank.org/indicator/IT.NET.USER.ZS.
data(internet, package = "geoheatmap")
head(internet)
#> # A tibble: 6 × 3
#>   country      year users
#>   <chr>       <dbl> <dbl>
#> 1 Aruba        1990     0
#> 2 Afghanistan  1990     0
#> 3 Angola       1990     0
#> 4 Albania      1990     0
#> 5 Andorra      1990     0
#> 6 Arab World   1990     0
The internet dataset lists a large number of countries (country) worldwide, in alphabetical and chronological order (year: 1990 to 2016), with the column users giving the percentage of individuals in a given country who used the internet in some capacity in the previous 3 months.
For the examples in this vignette, we use only the data from one year, specifically from 2015.
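The subsetting step itself is not shown in the text. A minimal sketch, assuming the subset is stored under the name internet_2015 (the name used in the plotting calls) and that the packages mentioned in the introduction, plus ggplot2 for labs(), are attached:

```r
# Assumption: these packages are installed; the examples use functions and
# grid objects from each of them without a namespace prefix.
library(geoheatmap)
library(geofacet)     # grid objects such as europe_countries_grid1
library(ggplot2)      # labs() and the scale_fill_*() functions
library(viridisLite)  # viridis() and plasma() palettes

data(internet, package = "geoheatmap")

# Keep only the rows for the year 2015
internet_2015 <- subset(internet, year == 2015)
head(internet_2015)
```

Any equivalent filter, for example dplyr::filter(), would work just as well; base R is used here only to avoid an extra dependency.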
In the first example, a cartogram heatmap is shown for continuous data. This graph shows internet usage across Europe with a gradient color scale and the europe_countries_grid1 grid from the geofacet package. Though most countries are well above the halfway mark of population internet use, the highest percentages are found in Northwestern European countries.
geoheatmap(facet_data = internet_2015, grid_data = europe_countries_grid1,
           facet_col = "country", value_col = "users",
           low = "#56B1F7", high = "#132B43") +
  labs(title = "2015 Internet Usage in Europe")
The gradient scale is the default for continuous data, but you are of course not limited to this default. You can change the fill scale via the ggplot2_scale_function argument, and pass any additional arguments for that scale directly in the geoheatmap() call.
For example, if a particular midpoint is of interest, a diverging color scale can be applied to continuous data. This is done by passing scale_fill_gradient2 to the ggplot2_scale_function argument and adding the extra information this scale needs (low, mid, high, midpoint).
geoheatmap(facet_data = internet_2015,
           grid_data = europe_countries_grid1,
           facet_col = "country",
           value_col = "users",
           name = "Internet users: divergent",
           ggplot2_scale_function = scale_fill_gradient2,
           low = viridis(10)[1],
           mid = "white",
           high = viridis(10)[8],
           midpoint = 75,
           round = TRUE) +
  labs(title = "2015 Internet Usage in Europe")
Note that in this example we additionally set the argument round = TRUE to draw the cartogram heatmap with rounded tile corners.
To show discrete data, users can either ask ggplot2 to bin the data or bin the data themselves.
In this first example, the data is binned by ggplot2 via the scale_fill_binned function. This time we focus on Africa by using the grid africa_countries_grid1 from the geofacet package. The resulting graph shows how African countries differed in internet use in 2015, possibly tied to the economic development of a given country.
geoheatmap(facet_data = internet_2015, grid_data = africa_countries_grid1,
           facet_col = "country", value_col = "users",
           name = "Internet users: binned",
           ggplot2_scale_function = scale_fill_binned,
           type = "viridis") +
  labs(title = "Internet Usage in Africa")
Another option is to discretize the data ourselves, e.g. by specifying our own breaks, and then plot the binned column, as is done in the next graph.
# Bin the percentages into three classes; intervals are closed on the right
internet_2015$users_bin <- cut(internet_2015$users,
                               breaks = c(-Inf, 25, 50, Inf),
                               labels = c("0-25", "26-50", "51 and up"))
geoheatmap(facet_data = internet_2015, grid_data = africa_countries_grid1,
           facet_col = "country", value_col = "users_bin",
           name = "Internet users: binned",
           ggplot2_scale_function = scale_fill_brewer,
           type = "seq", palette = "Greens", na.value = "grey50") +
  labs(title = "Internet Usage in Africa")
With the manual breaks we can, for example, put more focus on countries that surpassed the halfway mark (50%) of population internet usage: Morocco, Mauritius, Seychelles and South Africa.
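To see exactly where values land with these breaks, here is a small sketch on made-up numbers (not taken from the internet data). By default cut() uses intervals that are closed on the right, so the labels are shorthand: a value of exactly 25 falls in "0-25", while 25.5 falls in "26-50".

```r
# Toy values, chosen only to probe the bin boundaries
x <- c(10, 25, 25.5, 50, 70, NA)

bins <- cut(x,
            breaks = c(-Inf, 25, 50, Inf),
            labels = c("0-25", "26-50", "51 and up"))

# Show each value next to its bin; NA stays NA
data.frame(x, bins)

# Count how many values fall in each bin, including missing values
table(bins, useNA = "ifany")
```

Missing values stay NA after binning, which is why the plotting call sets na.value to give those tiles a neutral fill.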
Sometimes grids contain local as well as English location names, with the default being set to the latter. If you would like to use the local version (e.g. because your data frame uses native names), you can select it with the additional argument merge_col.
As an example, let’s look at the grid de_states_grid1, which contains the states of Germany. In this grid, both name and name_de are available, and the two columns differ for half of the states. Via the merge_col argument, users can define which column they want to use in the cartogram.
de_states_grid1
#>    row col code                   name                name_de
#>  1   1   2   SH     Schleswig-Holstein     Schleswig-Holstein
#>  2   1   3   HH                Hamburg                Hamburg
#>  3   1   4   MV Mecklenburg-Vorpommern Mecklenburg-Vorpommern
#>  4   2   2   HB                 Bremen                 Bremen
#>  5   2   3   NI           Lower Saxony          Niedersachsen
#>  6   2   4   BE                 Berlin                 Berlin
#>  7   2   5   BB            Brandenburg            Brandenburg
#>  8   3   2   NW North Rhine-Westphalia    Nordrhein-Westfalen
#>  9   3   4   ST          Saxony-Anhalt         Sachsen-Anhalt
#> 10   4   1   SL               Saarland               Saarland
#> 11   4   2   RP   Rhineland-Palatinate        Rheinland-Pfalz
#> 12   4   3   HE                  Hesse                 Hessen
#> 13   4   4   TH              Thuringia              Thüringen
#> 14   4   5   SN                 Saxony                Sachsen
#> 15   5   3   BW      Baden-Württemberg      Baden-Württemberg
#> 16   5   4   BY                Bavaria                 Bayern
For illustration purposes, let’s make up a dataset that works with state names native to a German speaker and plot this data as a cartogram heatmap.
# Dummy data frame with German states and number of football teams
football_teams <- data.frame(
  state = c("Baden-Württemberg", "Bayern",
            "Berlin", "Brandenburg",
            "Bremen", "Hamburg",
            "Hessen", "Mecklenburg-Vorpommern",
            "Niedersachsen", "Nordrhein-Westfalen",
            "Rheinland-Pfalz", "Saarland",
            "Sachsen", "Sachsen-Anhalt",
            "Schleswig-Holstein", "Thüringen"),
  teams = c(18, 22, 8, 6, 4, 5, 14, 3,
            12, 28, 10, 3, 9, 5, 7, 4)
)
geoheatmap(facet_data = football_teams,
           grid_data = de_states_grid1,
           facet_col = "state", value_col = "teams", merge_col = "name_de",
           name = "No. of teams",
           low = "lightblue", high = plasma(2)[1],
           round = TRUE) +
  labs(title = "Football teams in German states")
By specifying merge_col = "name_de", the geoheatmap() function merges the data set with the matching grid column before producing the plot. Though the data is purely fictional, this plot shows Nordrhein-Westfalen leading in the number of football teams, something the authors suspect to be true regardless, as the state is Germany’s most populous.
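Before plotting, it can help to check which grid column your region names actually match. The following sketch uses a hand-copied three-row excerpt of the de_states_grid1 printout (so it runs without geofacet) and base R's setdiff(); it illustrates the idea only and is not how geoheatmap() performs the merge internally.

```r
# Excerpt copied by hand from the de_states_grid1 printout
grid_excerpt <- data.frame(
  code    = c("NI", "BY", "HH"),
  name    = c("Lower Saxony", "Bavaria", "Hamburg"),
  name_de = c("Niedersachsen", "Bayern", "Hamburg")
)

# Region names as they appear in the (German-language) data
states <- c("Niedersachsen", "Bayern", "Hamburg")

# Names in the data that have no counterpart in each grid column
unmatched_en <- setdiff(states, grid_excerpt$name)
unmatched_de <- setdiff(states, grid_excerpt$name_de)

unmatched_en  # "Niedersachsen" "Bayern": the English column misses two
unmatched_de  # character(0): the German column matches everything
```

An empty result for a column suggests it is the right choice for merge_col; with the full grid, replace grid_excerpt by de_states_grid1.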
You also have the option to make any plot created with geoheatmap() interactive with plotly directly in the function call, by specifying hover = TRUE.
geoheatmap(facet_data = football_teams,
           grid_data = de_states_grid1,
           facet_col = "state", value_col = "teams", merge_col = "name_de",
           name = "No. of teams",
           low = "lightblue", high = plasma(2)[1],
           hover = TRUE)
Note that the hover option is only available for cartogram heatmaps with un-rounded tiles. This means that setting round = TRUE together with hover = TRUE does not work (yet) and produces the warning shown below.
geoheatmap(facet_data = football_teams,
           grid_data = de_states_grid1,
           facet_col = "state", value_col = "teams", merge_col = "name_de",
           name = "No. of teams",
           low = "lightblue", high = plasma(2)[1],
           round = TRUE,
           hover = TRUE)
#> Warning in geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]): geom_GeomRtile() has yet to be implemented in plotly.
#> If you'd like to see this geom implemented,
#> Please open an issue with your example code at
#> https://github.com/ropensci/plotly/issues
In the above examples, we already saw three grids, but many more grids are available in the geofacet package. Additionally, users can specify their own grids.
To get the names of the grids currently available in the geofacet package, call:
geofacet::get_grid_names()
#> Note: More grids are available by name as listed here: https://raw.githubusercontent.com/hafen/grid-designer/master/grid_list.json
#> [1] "us_state_grid1"
#> [2] "us_state_grid2"
#> [3] "eu_grid1"
#> [4] "aus_grid1"
#> [5] "sa_prov_grid1"
#> [6] "gb_london_boroughs_grid"
#> [7] "nhs_scot_grid"
#> [8] "india_grid1"
#> [9] "india_grid2"
#> [10] "argentina_grid1"
#> [11] "br_states_grid1"
#> [12] "sea_grid1"
#> [13] "mys_grid1"
#> [14] "fr_regions_grid1"
#> [15] "de_states_grid1"
#> [16] "us_or_counties_grid1"
#> [17] "us_wa_counties_grid1"
#> [18] "us_in_counties_grid1"
#> [19] "us_in_central_counties_grid1"
#> [20] "se_counties_grid1"
#> [21] "sf_bay_area_counties_grid1"
#> [22] "ua_region_grid1"
#> [23] "mx_state_grid1"
#> [24] "mx_state_grid2"
#> [25] "scotland_local_authority_grid1"
#> [26] "us_state_without_DC_grid1"
#> [27] "italy_grid1"
#> [28] "italy_grid2"
#> [29] "be_province_grid1"
#> [30] "us_state_grid3"
#> [31] "jp_prefs_grid1"
#> [32] "ng_state_grid1"
#> [33] "bd_upazila_grid1"
#> [34] "spain_prov_grid1"
#> [35] "ch_cantons_grid1"
#> [36] "ch_cantons_grid2"
#> [37] "china_prov_grid1"
#> [38] "world_86countries_grid"
#> [39] "se_counties_grid2"
#> [40] "uk_regions1"
#> [41] "us_state_contiguous_grid1"
#> [42] "sk_province_grid1"
#> [43] "ch_aargau_districts_grid1"
#> [44] "jo_gov_grid1"
#> [45] "spain_ccaa_grid1"
#> [46] "spain_prov_grid2"
#> [47] "world_countries_grid1"
#> [48] "br_states_grid2"
#> [49] "china_city_grid1"
#> [50] "kr_seoul_district_grid1"
#> [51] "nz_regions_grid1"
#> [52] "sl_regions_grid1"
#> [53] "us_census_div_grid1"
#> [54] "ar_tucuman_province_grid1"
#> [55] "us_nh_counties_grid1"
#> [56] "china_prov_grid2"
#> [57] "pl_voivodeships_grid1"
#> [58] "us_ia_counties_grid1"
#> [59] "us_id_counties_grid1"
#> [60] "ar_cordoba_dep_grid1"
#> [61] "us_fl_counties_grid1"
#> [62] "ar_buenosaires_communes_grid1"
#> [63] "nz_regions_grid2"
#> [64] "oecd_grid1"
#> [65] "ec_prov_grid1"
#> [66] "nl_prov_grid1"
#> [67] "ca_prov_grid1"
#> [68] "us_nc_counties_grid1"
#> [69] "mx_ciudad_prov_grid1"
#> [70] "bg_prov_grid1"
#> [71] "us_hhs_regions_grid1"
#> [72] "tw_counties_grid1"
#> [73] "tw_counties_grid2"
#> [74] "af_prov_grid1"
#> [75] "us_mi_counties_grid1"
#> [76] "pe_prov_grid1"
#> [77] "sa_prov_grid2"
#> [78] "mx_state_grid3"
#> [79] "cn_bj_districts_grid1"
#> [80] "us_va_counties_grid1"
#> [81] "us_mo_counties_grid1"
#> [82] "cl_santiago_prov_grid1"
#> [83] "us_tx_capcog_counties_grid1"
#> [84] "sg_planning_area_grid1"
#> [85] "in_state_ut_grid1"
#> [86] "cn_fujian_prov_grid1"
#> [87] "ca_quebec_electoral_districts_grid1"
#> [88] "nl_prov_grid2"
#> [89] "cn_bj_districts_grid2"
#> [90] "ar_santiago_del_estero_prov_grid1"
#> [91] "ar_formosa_prov_grid1"
#> [92] "ar_chaco_prov_grid1"
#> [93] "ar_catamarca_prov_grid1"
#> [94] "ar_jujuy_prov_grid1"
#> [95] "ar_neuquen_prov_grid1"
#> [96] "ar_san_luis_prov_grid1"
#> [97] "ar_san_juan_prov_grid1"
#> [98] "ar_santa_fe_prov_grid1"
#> [99] "ar_la_rioja_prov_grid1"
#> [100] "ar_mendoza_prov_grid1"
#> [101] "ar_salta_prov_grid1"
#> [102] "ar_rio_negro_prov_grid1"
#> [103] "uy_departamentos_grid1"
#> [104] "ar_buenos_aires_prov_electoral_dist_grid1"
#> [105] "europe_countries_grid1"
#> [106] "argentina_grid2"
#> [107] "us_state_without_DC_grid2"
#> [108] "jp_prefs_grid2"
#> [109] "na_regions_grid1"
#> [110] "mm_state_grid1"
#> [111] "us_state_with_DC_PR_grid1"
#> [112] "fr_departements_grid1"
#> [113] "ar_salta_prov_grid2"
#> [114] "ie_counties_grid1"
#> [115] "sg_regions_grid1"
#> [116] "us_ny_counties_grid1"
#> [117] "ru_federal_subjects_grid1"
#> [118] "us_ca_counties_grid1"
#> [119] "lk_districts_grid1"
#> [120] "us_state_without_DC_grid3"
#> [121] "co_cali_subdivisions_grid1"
#> [122] "us_in_northern_counties_grid1"
#> [123] "italy_grid3"
#> [124] "us_state_with_DC_PR_grid2"
#> [125] "us_state_grid7"
#> [126] "sg_planning_area_grid2"
#> [127] "ch_cantons_fl_grid1"
#> [128] "europe_countries_grid2"
#> [129] "us_states_territories_grid1"
#> [130] "us_tn_counties_grid1"
#> [131] "us_il_chicago_community_areas_grid1"
#> [132] "us_state_with_DC_PR_grid3"
#> [133] "in_state_ut_grid2"
#> [134] "at_states_grid1"
#> [135] "us_pa_counties_grid1"
#> [136] "us_oh_counties_grid1"
#> [137] "fr_departements_grid2"
#> [138] "us_wi_counties_grid1"
#> [139] "africa_countries_grid1"
#> [140] "no_counties_grid1"
#> [141] "tr_provinces_grid1"
This list keeps growing because the authors of geofacet have made it possible for users to submit their own grids.
For detailed instructions on creating and submitting a custom grid, see the “Creating your own grid” section in the geofacet vignette.
You can find it at: https://cran.r-project.org/package=geofacet/vignettes/geofacet.html
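As a rough illustration of a user-defined grid, the sketch below builds a tiny grid by hand. It assumes that a grid is a data frame with the columns row, col, code and name, as in the de_states_grid1 printout; the Benelux layout and the values are hypothetical and made up for this example. The plotting call is only attempted if geoheatmap is installed.

```r
# Hypothetical 3-cell grid; the positions are for illustration only
benelux_grid <- data.frame(
  row  = c(1L, 2L, 3L),
  col  = c(1L, 1L, 2L),
  code = c("NL", "BE", "LU"),
  name = c("Netherlands", "Belgium", "Luxembourg")
)

# Basic sanity checks: expected columns present, no two regions in one cell
stopifnot(all(c("row", "col", "code", "name") %in% names(benelux_grid)))
stopifnot(!anyDuplicated(benelux_grid[c("row", "col")]))

# Made-up values to plot on the grid
toy_data <- data.frame(
  country = c("Netherlands", "Belgium", "Luxembourg"),
  value   = c(10, 20, 30)
)

if (requireNamespace("geoheatmap", quietly = TRUE)) {
  # Same argument names as in the examples of this vignette
  geoheatmap::geoheatmap(facet_data = toy_data, grid_data = benelux_grid,
                         facet_col = "country", value_col = "value",
                         low = "#56B1F7", high = "#132B43")
}
```

For real use, the grid designer described in the geofacet vignette is likely a more convenient way to lay out the cells than typing coordinates by hand.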
The default theme is set to theme_void(), as cartograms do not require axes and similar elements, but it can be either overwritten or added onto, depending on the purpose of the plot.
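A short sketch of both options, assuming (as the + labs() calls in the examples suggest) that geoheatmap() returns a regular ggplot object when hover is not set. The plot itself repeats the first Europe example; only the theme lines are new.

```r
library(geoheatmap)
library(geofacet)
library(ggplot2)

data(internet, package = "geoheatmap")
internet_2015 <- subset(internet, year == 2015)

p <- geoheatmap(facet_data = internet_2015, grid_data = europe_countries_grid1,
                facet_col = "country", value_col = "users",
                low = "#56B1F7", high = "#132B43")

# Add onto the default theme: keep theme_void() but move the legend
p + theme(legend.position = "bottom")

# Overwrite the default theme with another complete ggplot2 theme
p + theme_minimal()
```

Adding theme() tweaks individual elements, while adding a complete theme such as theme_minimal() replaces the default wholesale, so axes and grid lines reappear.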