This vignette is intended to demonstrate how to use the functions rtry_geocoding and rtry_revgeocoding within the ‘rtry’ package to perform geocoding and reverse geocoding for a list of locations or coordinates.
Geocoding is the process of converting an address into geographic coordinates (latitude and longitude), while reverse geocoding is the process of converting geographic coordinates (latitude and longitude) into an address.
The functions rtry_geocoding and rtry_revgeocoding are based on Nominatim, a search engine for OpenStreetMap (OSM) data. The data provided are free to use for any purpose, including commercial use, but note that they are governed by the Open Database License (ODbL). As part of the Nominatim Usage Policy, requests are limited to an absolute maximum of 1 request per second (no heavy usage), and a valid email address is required to identify the requests when using this OSM service. For details, please refer to: https://wiki.openstreetmap.org/wiki/Nominatim.
Note that the geographic reference system used is WGS84.
Make sure you have the ‘rtry’ package installed. If not, you may refer to the vignette “Introduction to rtry” (rtry-introduction).
To start, set the working directory to the desired location:
# Set the working directory
setwd("<path_to_dir>")
# Check the working directory
getwd()
Note: The character “\” is used as an escape character in R to give the following character a special meaning (e.g. “\n” for newline, “\t” for tab, “\r” for carriage return and so on). Therefore, for Windows users, it is important to use “/” in the file path of the command instead of “\” in order for R to correctly understand the input path.
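As a small illustration of the note above, the following sketch converts a Windows-style path into a form that R reads without escaping problems. The path itself is hypothetical and only serves as an example; file.path() is shown as an alternative that builds the path from its components.

```r
# Hypothetical Windows-style path, written with escaped backslashes
win_path <- "C:\\Users\\user\\Documents\\rtry"

# Replace every backslash with a forward slash
r_path <- gsub("\\", "/", win_path, fixed = TRUE)
r_path

# Alternatively, build the path from its components;
# file.path() joins them with "/" by default
built_path <- file.path("C:", "Users", "user", "Documents", "rtry")
built_path
```

Either result can then be passed to setwd().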
Load the required packages using the commands:
# Load the rtry package
library(rtry)
# Check the version of rtry
packageVersion("rtry")
# Load the dplyr package which is used for piping (%>%)
library(dplyr)
rtry_geocoding() takes two parameters, address and email, and returns a data frame that contains latitude (lat) and longitude (lon) in WGS84 projection.
rtry_geocoding(address = NULL, email = NULL)
Argument | Description |
---|---|
address | String of an address |
email | String of an email address |
In the context of this example workflow, we will use the location data provided within the ‘rtry’ package. In this specific case, the path to the file data_locations.csv can be obtained via system.file(), which finds the full path to files within the ‘rtry’ package:
# Obtain and print the path to the sample dataset within the rtry package
path_to_data <- system.file("testdata", "data_locations.csv", package = "rtry")
path_to_data
## [1] "C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_locations.csv"
To load the .csv file with location information, use rtry_import():
# Load the locations from a .csv file
input_locations <- rtry_import(path_to_data, separator = ",", encoding = "UTF-8", quote = "\"")

# View the location data in the data viewer
View(input_locations)
## input: C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_locations.csv
## dim: 20 3
## col: Country code Country Location
Then, the location data should be converted into the required format for address, i.e. <location>, <country>:
# Extract and combine the location and country names
input_addresses <- paste(input_locations$Location, input_locations$Country, sep = ", ")

# Display the first six rows
head(input_addresses)
## [1] "Hajdúdorog, Hungary" "Diósd, Hungary" "Fót, Hungary" "Bőcs, Hungary"
## [5] "Regéc, Hungary" "Sáska, Hungary"
Note that the file encoding UTF-8 is used, and it is normal for the RStudio console to display some characters as Unicode code points (<U+0000> notation), depending on the system language setting. For example, “Bőcs” might be displayed as “B<U+0151>cs”.
In order to apply the function rtry_geocoding() to the list input_addresses, use lapply(), and please remember to change the email address to your own email address.
Since OSM allows an absolute maximum of 1 request per second, a 2-second delay is set between searches in the following example.
# Prepare counter for printed progress messages
counter <- 1
# Initialised beforehand because 'no object found' errors were sometimes received
output_coordinates <- NULL

# Use lapply to apply function to the list of addresses
output_coordinates <- lapply(input_addresses, function(address) {
  # Calling the Nominatim OpenStreetMap API
  # Please change the email address into your own email address
  geocode_output <- rtry_geocoding(address, email = "john.doe@example.com")

  # No heavy uses (an absolute maximum of 1 request per second)
  # Here set to 2 seconds between each search
  Sys.sleep(2)

  # Print message in console to see the progress
  message("Geocoding ", counter, "/", nrow(input_locations), " completed.")
  counter <<- counter + 1

  # Return data.frame with the input address, output of the rtry_geocoding function
  return(data.frame(address = address, geocode_output))
}) %>%
  # Stack the list output into data.frame
  bind_rows() %>%
  data.frame()
## Geocoding 1/20 completed.
## Geocoding 2/20 completed.
## Geocoding 3/20 completed.
## ...
## Geocoding 20/20 completed.
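The same pattern of a throttled loop with progress messages can also be written without a global counter, by iterating over the indices of the addresses. The following is a minimal sketch under stated assumptions, not the package's own method: mock_geocode() is a hypothetical stand-in for rtry_geocoding() that performs no network request, geocode_all() is a hypothetical helper, and the tryCatch() fallback assumes that a failed lookup would raise an R error.

```r
# Hypothetical stand-in for rtry_geocoding(); returns NA values, no network access
mock_geocode <- function(address, email) {
  data.frame(lat = NA_real_, lon = NA_real_)
}

# Hypothetical helper: geocode all addresses with a pause between requests
geocode_all <- function(addresses, email, delay = 2, geocode_fun = mock_geocode) {
  results <- lapply(seq_along(addresses), function(i) {
    # Assumption: if a lookup raises an error, record NA instead of stopping
    res <- tryCatch(
      geocode_fun(addresses[i], email = email),
      error = function(e) data.frame(lat = NA_real_, lon = NA_real_)
    )

    # Pause to stay below 1 request per second
    Sys.sleep(delay)

    # The loop index replaces the global counter
    message("Geocoding ", i, "/", length(addresses), " completed.")

    data.frame(address = addresses[i], res)
  })

  # Stack the list output into a single data frame
  do.call(rbind, results)
}

# The delay is set to 0 here only because the stand-in makes no real request
demo <- geocode_all(c("Place A, Country X", "Place B, Country X"),
                    email = "john.doe@example.com", delay = 0)
demo
```

With the real service, the default delay of 2 seconds should be kept and geocode_fun set to rtry_geocoding.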
The progress of the geocoding can be seen in the console. Once the geocoding is completed, view the output_coordinates using the View function.
The output_coordinates would look like the following.
Note that for a location that is unknown to OSM, the resulting latitude and longitude will be marked as NA.
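To see which addresses could not be resolved, the rows with missing coordinates can be filtered out. The sketch below uses a small hypothetical data frame with made-up values in the same shape as output_coordinates (columns address, lat and lon), so that it runs without calling the service.

```r
# Hypothetical result in the same shape as output_coordinates (values are made up)
example_output <- data.frame(
  address = c("Place A, Country X", "Unknown place, Country Y"),
  lat = c(47.5, NA),
  lon = c(19.0, NA),
  stringsAsFactors = FALSE
)

# Keep only the rows where latitude or longitude is missing
not_found <- example_output[is.na(example_output$lat) | is.na(example_output$lon), ]

# Addresses that may need to be checked or corrected manually
not_found$address
```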
Substitute the coordinates into the corresponding columns of the input data.
# Add the output coordinates to the corresponding columns in the input data
input_locations$Latitude <- output_coordinates$lat
input_locations$Longitude <- output_coordinates$lon

# If necessary, re-arrange the columns
input_locations <- rtry_select_col(input_locations, "Country code", Country, Location, Latitude, Longitude, showOverview = FALSE)

# View data
head(input_locations)

# Export into .csv
output_file <- file.path(tempdir(), "locations_to_coordinates.csv")
rtry_export(input_locations, output_file)
## File saved at: C:/Users/user/AppData/Local/Temp/Rtmp4wJAvQ/locations_to_coordinates.csv
rtry_revgeocoding() takes two parameters, lat_lon and email, and returns a data frame that contains the corresponding location.
rtry_revgeocoding(lat_lon = NULL, email = NULL)
Argument | Description |
---|---|
lat_lon | A data frame containing latitude and longitude in WGS84 projection |
email | String of an email address |
Here, we will use the coordinates data provided within the ‘rtry’ package. In this specific case, the path to the file data_coordinates.csv can be obtained via system.file(), which finds the full path to files within the ‘rtry’ package:
# Obtain and print the path to the sample dataset within the rtry package
path_to_data <- system.file("testdata", "data_coordinates.csv", package = "rtry")
path_to_data
## [1] "C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_coordinates.csv"
To load the .csv file with coordinates information, use rtry_import():
input_coordinates <- rtry_import(path_to_data, separator = ",", encoding = "UTF-8", quote = "\"")
## input: C:/Program Files/R/R-4.0.5/library/rtry/testdata/data_coordinates.csv
## dim: 20 2
## col: Latitude Longitude
Then, the coordinates data should be converted into a data.frame:
# Extract and convert the coordinates into a data frame
input_lat_lon <- data.frame(lat = input_coordinates$Latitude, lon = input_coordinates$Longitude)
In order to apply the function rtry_revgeocoding to the input_lat_lon, use apply(), and please remember to change the email address to your own email address.
Since OSM allows an absolute maximum of 1 request per second, a 2-second delay is set between searches in the following example.
# Prepare counter for printed progress messages
counter <- 1
# Initialised beforehand because 'no object found' errors were sometimes received
output_locations <- NULL

# Use apply to apply function to the data.frame that contains the coordinates
# Please change the email address to your own email address
output_locations <- apply(input_lat_lon, 1, function(lat_lon) {
  # Calling the Nominatim OpenStreetMap API
  rev_geocode_output <- rtry_revgeocoding(lat_lon, email = "john.doe@example.com")

  # No heavy uses (an absolute maximum of 1 request per second)
  # Here set to 2 seconds between each search
  Sys.sleep(2)

  # Print message in console to see the progress
  message("Reverse Geocoding ", counter, "/", length(input_lat_lon$lat), " completed.")
  counter <<- counter + 1

  # Return data.frame with the input coordinates, output of the rtry_revgeocoding function
  return(data.frame(lat = lat_lon[1], lon = lat_lon[2], rev_geocode_output))
}) %>%
  # Stack the list output into data.frame
  bind_rows() %>%
  data.frame()
## Reverse Geocoding 1/20 completed.
## Reverse Geocoding 2/20 completed.
## Reverse Geocoding 3/20 completed.
## ...
## Reverse Geocoding 20/20 completed.
The progress of the reverse geocoding can be seen in the console. Once the reverse geocoding is completed, view the output_locations using the View function.
The output location information would look like the following. Note that for some coordinates, OpenStreetMap might not have the town/city information. In such cases, those columns will be marked as NA.
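When a function is applied to the rows of a data frame with apply(), each row reaches the function as a named numeric vector rather than as a one-row data frame. The sketch below illustrates the difference on a small hypothetical data frame with made-up coordinates, and shows row-index iteration as one way to keep each row as a data frame. Whether rtry_revgeocoding() treats both forms identically is not verified here.

```r
# Hypothetical coordinates (values are made up)
coords <- data.frame(lat = c(47.5, 46.9), lon = c(19.0, 17.4))

# apply() hands each row to the function as a named numeric vector
row_classes <- apply(coords, 1, function(row) class(row))
row_classes

# Iterating over row indices keeps each row as a one-row data frame
rows <- lapply(seq_len(nrow(coords)), function(i) coords[i, , drop = FALSE])
class(rows[[1]])
names(rows[[1]])
```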
Substitute the country_code and country into the corresponding columns of the input list, while the location information is extracted from either town or city.
# Add the output location information to the corresponding columns in the input data
input_coordinates$'Country code' <- output_locations$country_code
input_coordinates$Country <- output_locations$country
input_coordinates$Location <- ifelse(!is.na(output_locations$town), output_locations$town, output_locations$city)

# If necessary, re-arrange the columns
input_coordinates <- rtry_select_col(input_coordinates, Latitude, Longitude, "Country code", Country, Location, showOverview = FALSE)

# View data
head(input_coordinates)

# Export into .csv
output_file <- file.path(tempdir(), "coordinates_to_locations.csv")
rtry_export(input_coordinates, output_file)
## File saved at: C:/Users/user/AppData/Local/Temp/Rtmp4wJAvQ/coordinates_to_locations.csv
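The town-or-city fallback used when filling the Location column can be tried on its own. The following sketch uses a small hypothetical data frame with made-up place names. It shows that the town is preferred, the city is used when the town is missing, and the result stays NA when both are missing.

```r
# Hypothetical reverse geocoding output (values are made up)
example_locations <- data.frame(
  town = c("Town A", NA, NA),
  city = c(NA, "City B", NA),
  stringsAsFactors = FALSE
)

# Prefer the town; fall back to the city when the town is missing
location <- ifelse(!is.na(example_locations$town),
                   example_locations$town,
                   example_locations$city)
location
## [1] "Town A" "City B" NA
```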