Downloading data from Movebank

library(move2)

User credentials

User credentials are stored using the keyring package. The following command adds a user to the keyring. Run it once to store your credentials; after that, whenever you load move2 and call one of the Movebank download functions, your credentials are retrieved from the keyring.

movebank_store_credentials("myUserName", "myPassword")

The keyring package can store credentials using several mechanisms, called backends. Some backends are operating-system dependent, others are more general. Some of the operating-system-dependent backends have the advantage that they do not require entering credentials when opening a new R session.

The move2 package uses the default backend returned by keyring::default_backend(), so calling this function shows which backend move2 is using. To change the default, set the keyring_backend option; for more details see the documentation of the keyring package.
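To see which backend is active on a given machine, a minimal sketch along the following lines may help. It assumes the keyring package is installed. The value "env" is used purely as an example of a backend name accepted by the keyring_backend option; it is not a recommendation, since that backend keeps secrets in environment variables of the running session.

```r
# Inspect the backend that keyring (and therefore move2) uses by default.
backend <- keyring::default_backend()
class(backend)[1]

# Switch backends for the current session through the keyring_backend option.
# "env" is only an example value; it keeps secrets in environment variables,
# so they do not persist across R sessions.
old <- options(keyring_backend = "env")
class(keyring::default_backend())[1]

# Restore the previous setting.
options(old)
```

Saving the result of options() and passing it back restores whatever was configured before, including the case where the option was not set at all.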

macOS and Windows generally do not require entering an extra password for keyring. On Linux the default is often the file backend, which can be confusing because it creates an encrypted file of credentials that needs its own password to unlock. In that case a separate password for the keyring file has to be entered in each new R session before the Movebank password can be accessed. To avoid entering a keyring password each time, the Secret Service API can be used by installing the libsecret library (Debian/Ubuntu: libsecret-1-dev; recent RedHat, Fedora and CentOS systems: libsecret-devel).

Handling multiple Movebank accounts - use key_name

If you have multiple user accounts on Movebank, the easiest approach is to give each of them a key name with the key_name argument. The most used account can be stored without a key name, which makes it the default. movebank_store_credentials() only has to be executed once for each account; after that the credentials are retrieved from keyring.

## store credentials for the most used account.
movebank_store_credentials("myUserName", "myPassword")

## store credentials for another movebank account
movebank_store_credentials("myUserName_2", "myPassword_2", key_name = "myOtherAccount")

To download from Movebank with your default account, nothing has to be specified before calling the download functions. To download with another account, execute the line below first, specifying the key name of the account to use.

options("move2_movebank_key_name" = "myOtherAccount")

If you use several accounts in one script or R session, execute the line below to switch back to the credentials of the default account:

options("move2_movebank_key_name" = "movebank")
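When a script alternates between accounts, setting and resetting the option by hand is easy to forget. The sketch below wraps the switch in a small helper. Note that with_movebank_account() is a hypothetical name and not part of move2; it relies only on base R option handling and on the move2_movebank_key_name option shown above.

```r
# Hypothetical helper (not part of move2): evaluate `code` with the
# move2_movebank_key_name option set to `key_name`, then restore the
# previous value, even if `code` fails.
with_movebank_account <- function(key_name, code) {
  old <- options(move2_movebank_key_name = key_name)
  on.exit(options(old), add = TRUE)
  force(code)
}

# Usage: the option is only changed while the expression is evaluated.
with_movebank_account("myOtherAccount", getOption("move2_movebank_key_name"))
```

Because R evaluates arguments lazily, the expression passed as `code` runs only after the option has been set, so a download call can be passed as the second argument in the same way.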

To check which accounts are stored in keyring:

keyring::key_list()
#   service           username
# 1 movebank          myUserName
# 2 myOtherAccount    myUserName_2

The service column corresponds to the names provided in key_name. The account stored without a key name (the default) is called movebank. Note that key names have to be unique: storing several usernames under the same key name (service) will cause an error.
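Since a duplicated or missing key name leads to an error, it can be convenient to check the listing before downloading. The following sketch does this with a hypothetical helper, has_unique_key(), which is not part of move2 or keyring. It only assumes that the listing has a service column, as in the output above.

```r
# Hypothetical helper (not part of move2 or keyring): TRUE when exactly one
# stored key uses `key_name` as its service.
# `keys` defaults to the live keyring listing; a data frame with a `service`
# column can be supplied instead, for example for testing.
has_unique_key <- function(key_name = "movebank", keys = keyring::key_list()) {
  sum(keys$service == key_name) == 1L
}

# Example with a listing like the one shown above.
keys <- data.frame(
  service = c("movebank", "myOtherAccount"),
  username = c("myUserName", "myUserName_2")
)
has_unique_key("myOtherAccount", keys) # TRUE
has_unique_key("notStored", keys) # FALSE
```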

Removing user credentials from keyring

To delete credentials from keyring:

## for the default account
movebank_remove_credentials()
#> There is 1 key removed from the keyring.

## for an account with a key name
movebank_remove_credentials(key_name = "myOtherAccount")
#> There is 1 key removed from the keyring.

Next, check that the keys were removed:

keyring::key_list()

The movebank service should no longer appear in this listing.

Downloading data

library(dplyr)

The movebank_retrieve function gives direct access to the API. Here all studies with a Creative Commons 0 (CC0) license are returned. These are good candidates for exploration and testing.

movebank_retrieve(entity_type = "study", license_type = "CC_0") |>
  select(id, name, number_of_deployed_locations) |>
  filter(!is.na(number_of_deployed_locations))
#> # A tibble: 326 × 3
#>            id name                                            number_of_deployed_l…¹
#>       <int64> <fct>                                                          [count]
#>  1 1169957016 spectacledEider_USGS_ASC_argos                                   61299
#>  2 1199929756 Spatial ecology of urban copperheads                              2031
#>  3 1605798640 O_BALGZAND - Eurasian oystercatchers (Haematop…                 165891
#>  4 1605803389 O_AMELAND - Eurasian oystercatchers (Haematopu…                 216108
#>  5 1605797471 O_ASSEN - Eurasian oystercatchers (Haematopus …                  20152
#>  6 1605799506 O_SCHIERMONNIKOOG - Eurasian oystercatchers (H…                 602380
#>  7 1605802367 O_VLIELAND - Eurasian oystercatchers (Haematop…                4908942
#>  8 1402467516 Black kites of different age and sex show simi…                 231193
#>  9    7249090 Peregrine Falcon, High Arctic Institute, north…                   3004
#> 10  920008781 Ringed seals Igloolik                                             9519
#> # ℹ 316 more rows
#> # ℹ abbreviated name: ¹​number_of_deployed_locations
This code took: 0.712 [s]

A quicker way to retrieve this information is the following (the selection is performed on Movebank, so not all data is downloaded):

movebank_download_study_info(license_type = "CC_0")

By default all attributes are downloaded:

movebank_download_study(2911040, sensor_type_id = "gps")
#> A <move2> with `track_id_column` "individual_local_identifier" and `time_column`
#> "timestamp"
#> Containing 28 tracks lasting on average 37.1 days in a
#> Simple feature collection with 16414 features and 18 fields (with 386 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -91.3732 ymin: -12.79464 xmax: -77.51874 ymax: 0.1821983
#> Geodetic CRS:  WGS 84
#> # A tibble: 16,414 × 19
#>   sensor_type_id individual_local_iden…¹ eobs_battery_voltage eobs_fix_battery_vol…²
#>          <int64> <fct>                                   [mV]                   [mV]
#> 1            653 4264-84830852                           3686                   3437
#> 2            653 4264-84830852                           3701                   3452
#> 3            653 4264-84830852                           3701                   3482
#> 4            653 4264-84830852                           3691                   3476
#> 5            653 4264-84830852                           3691                   3541
#> # ℹ 16,409 more rows
#> # ℹ abbreviated names: ¹​individual_local_identifier, ²​eobs_fix_battery_voltage
#> # ℹ 15 more variables: eobs_horizontal_accuracy_estimate [m],
#> #   eobs_key_bin_checksum <int64>, eobs_speed_accuracy_estimate [m/s],
#> #   eobs_start_timestamp <dttm>, eobs_status <ord>, …
#> First 5 track features:
#> # A tibble: 28 × 52
#>   deployment_id  tag_id individual_id animal_life_stage attachment_type
#>         <int64> <int64>       <int64> <fct>             <fct>          
#> 1       2911170 2911124       2911090 adult             tape           
#> 2       2911150 2911126       2911091 adult             tape           
#> 3       2911167 2911127       2911092 adult             tape           
#> 4       2911168 2911129       2911093 adult             tape           
#> 5       2911178 2911132       2911094 adult             tape           
#> # ℹ 23 more rows
#> # ℹ 47 more variables: deployment_comments <chr>, deploy_on_timestamp <dttm>,
#> #   duty_cycle <chr>, deployment_local_identifier <fct>, manipulation_type <fct>, …
This code took: 3.93 [s]

To speed up the download you might want to add the argument attributes = NULL, which reduces the columns to download to the bare minimum. Note that all individual attributes are still downloaded, as this does not take much time.

movebank_download_study(1259686571, sensor_type_id = "gps", attributes = NULL)
#> ℹ In total 299228 records were omitted as they were not deployed (the
#>   `deployment_id` was `NA`).
#> A <move2> with `track_id_column` "deployment_id" and `time_column` "timestamp"
#> Containing 92 tracks lasting on average 146 days in a
#> Simple feature collection with 845865 features and 2 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -9.097052 ymin: 34.82506 xmax: 10.34339 ymax: 52.88891
#> Geodetic CRS:  WGS 84
#> # A tibble: 845,865 × 3
#>   deployment_id timestamp                      geometry
#>         <int64> <dttm>                      <POINT [°]>
#> 1    3029108353 2021-08-19 21:16:35  (2.84631 51.19662)
#> 2    3029108353 2021-08-20 09:16:35 (2.846492 51.19654)
#> 3    3029108353 2021-08-20 21:16:29 (2.847637 51.20317)
#> 4    3029108353 2021-08-21 09:16:35 (2.849055 51.20314)
#> 5    3029108353 2021-08-21 21:16:35  (2.846533 51.2034)
#> # ℹ 845,860 more rows
#> First 5 track features:
#> # A tibble: 92 × 56
#>   deployment_id    tag_id individual_id alt_project_id animal_life_stage animal_mass
#>         <int64>   <int64>       <int64> <fct>          <fct>                     [g]
#> 1    3029108356       3e9    3029107890 LBBG_JUVENILE  juvenile                  693
#> 2    3029108353       3e9    3029107816 LBBG_JUVENILE  juvenile                   NA
#> 3    3029108347       3e9    3029107819 LBBG_JUVENILE  juvenile                  883
#> 4    3029108346       3e9    3029107822 LBBG_JUVENILE  juvenile                  726
#> 5    3029108345       3e9    3029107891 LBBG_JUVENILE  juvenile                  816
#> # ℹ 87 more rows
#> # ℹ 50 more variables: attachment_type <fct>, deployment_comments <chr>,
#> #   deploy_off_timestamp <dttm>, deploy_on_timestamp <dttm>,
#> #   deployment_end_type <fct>, …
This code took: 26.2 [s]

If you want to download only specific attributes, you can state them in the attributes argument. The available attributes vary between studies and sensors. You can retrieve the list of available attributes for a specific sensor in a given study. Note that only one sensor can be stated at a time.

movebank_retrieve(
  entity_type = "study_attribute",
  study_id = 2911040,
  sensor_type_id = "gps"
)$short_name
#>  [1] "eobs_battery_voltage"              "eobs_fix_battery_voltage"         
#>  [3] "eobs_horizontal_accuracy_estimate" "eobs_key_bin_checksum"            
#>  [5] "eobs_speed_accuracy_estimate"      "eobs_start_timestamp"             
#>  [7] "eobs_status"                       "eobs_temperature"                 
#>  [9] "eobs_type_of_fix"                  "eobs_used_time_to_get_fix"        
#> [11] "ground_speed"                      "heading"                          
#> [13] "height_above_ellipsoid"            "location_lat"                     
#> [15] "location_long"                     "timestamp"                        
#> [17] "update_ts"                         "visible"

These attribute names can then be passed to the attributes argument:

movebank_download_study(
  study_id = 2911040,
  sensor_type_id = "gps",
  attributes = c(
    "height_above_ellipsoid",
    "eobs_temperature"
  )
)
#> A <move2> with `track_id_column` "deployment_id" and `time_column` "timestamp"
#> Containing 28 tracks lasting on average 37.1 days in a
#> Simple feature collection with 16414 features and 4 fields (with 386 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -91.3732 ymin: -12.79464 xmax: -77.51874 ymax: 0.1821983
#> Geodetic CRS:  WGS 84
#> # A tibble: 16,414 × 5
#>   height_above_ellipsoid eobs_temperature deployment_id timestamp          
#>                      [m]             [°C]       <int64> <dttm>             
#> 1                   16.5               12       9472219 2008-05-31 13:30:02
#> 2                   12.6               19       9472219 2008-05-31 15:00:44
#> 3                   17.4               24       9472219 2008-05-31 16:30:39
#> 4                   24.8               18       9472219 2008-05-31 18:00:49
#> 5                   19                 22       9472219 2008-05-31 19:30:18
#> # ℹ 16,409 more rows
#> # ℹ 1 more variable: geometry <POINT [°]>
#> First 5 track features:
#> # A tibble: 28 × 52
#>   deployment_id  tag_id individual_id animal_life_stage attachment_type
#>         <int64> <int64>       <int64> <fct>             <fct>          
#> 1       2911170 2911124       2911090 adult             tape           
#> 2       2911150 2911126       2911091 adult             tape           
#> 3       2911167 2911127       2911092 adult             tape           
#> 4       2911168 2911129       2911093 adult             tape           
#> 5       2911178 2911132       2911094 adult             tape           
#> # ℹ 23 more rows
#> # ℹ 47 more variables: deployment_comments <chr>, deploy_on_timestamp <dttm>,
#> #   duty_cycle <chr>, deployment_local_identifier <fct>, manipulation_type <fct>, …
This code took: 3.92 [s]

To load only GPS records (sensor type id 653):

movebank_download_study(1259686571, sensor_type_id = 653)
#> ℹ In total 299228 records were omitted as they were not deployed (the
#>   `deployment_id` was `NA`).
#> A <move2> with `track_id_column` "individual_local_identifier" and `time_column`
#> "timestamp"
#> Containing 92 tracks lasting on average 146 days in a
#> Simple feature collection with 845865 features and 25 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -9.097052 ymin: 34.82506 xmax: 10.34339 ymax: 52.88891
#> Geodetic CRS:  WGS 84
#> # A tibble: 845,865 × 26
#>   sensor_type_id individual_local_identifier acceleration_raw_x acceleration_raw_y
#>          <int64> <fct>                                    <dbl>              <dbl>
#> 1            653 H911406                                    177                 60
#> 2            653 H911406                                    283               -262
#> 3            653 H911406                                    278                574
#> 4            653 H911406                                    506                -32
#> 5            653 H911406                                    467               -222
#> # ℹ 845,860 more rows
#> # ℹ 22 more variables: acceleration_raw_z <dbl>, barometric_height [m],
#> #   battery_charge_percent [%], battery_charging_current [mA],
#> #   external_temperature [°C], …
#> First 5 track features:
#> # A tibble: 92 × 56
#>   deployment_id    tag_id individual_id alt_project_id animal_life_stage animal_mass
#>         <int64>   <int64>       <int64> <fct>          <fct>                     [g]
#> 1    3029108356       3e9    3029107890 LBBG_JUVENILE  juvenile                  693
#> 2    3029108353       3e9    3029107816 LBBG_JUVENILE  juvenile                   NA
#> 3    3029108347       3e9    3029107819 LBBG_JUVENILE  juvenile                  883
#> 4    3029108346       3e9    3029107822 LBBG_JUVENILE  juvenile                  726
#> 5    3029108345       3e9    3029107891 LBBG_JUVENILE  juvenile                  816
#> # ℹ 87 more rows
#> # ℹ 50 more variables: attachment_type <fct>, deployment_comments <chr>,
#> #   deploy_off_timestamp <dttm>, deploy_on_timestamp <dttm>,
#> #   deployment_end_type <fct>, …
This code took: 1.05 [min]

Note that sensor_type_id can be specified either as an integer or as a character string, giving the id or the name of the sensor respectively. In some cases additional data is downloaded if a specific sensor is selected, for example the column eobs_accelerations_raw:

movebank_download_study(2911040, sensor_type_id = "acceleration")
#> A <move2> with `track_id_column` "individual_local_identifier" and `time_column`
#> "timestamp"
#> Containing 28 tracks lasting on average 37.1 days in a
#> Simple feature collection with 98515 features and 10 fields (with 98515 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
#> Geodetic CRS:  WGS 84
#> # A tibble: 98,515 × 11
#>   sensor_type_id individual_local_identifier eobs_acceleration_axes
#>          <int64> <fct>                       <fct>                 
#> 1        2365683 4264-84830852               XY                    
#> 2        2365683 4264-84830852               XY                    
#> 3        2365683 4264-84830852               XY                    
#> 4        2365683 4264-84830852               XY                    
#> 5        2365683 4264-84830852               XY                    
#> # ℹ 98,510 more rows
#> # ℹ 8 more variables: eobs_acceleration_sampling_frequency_per_axis [Hz],
#> #   eobs_accelerations_raw <chr>, eobs_key_bin_checksum <int64>,
#> #   eobs_start_timestamp <dttm>, timestamp <dttm>, …
#> First 5 track features:
#> # A tibble: 28 × 52
#>   deployment_id  tag_id individual_id animal_life_stage attachment_type
#>         <int64> <int64>       <int64> <fct>             <fct>          
#> 1       2911170 2911124       2911090 adult             tape           
#> 2       2911150 2911126       2911091 adult             tape           
#> 3       2911167 2911127       2911092 adult             tape           
#> 4       2911168 2911129       2911093 adult             tape           
#> 5       2911178 2911132       2911094 adult             tape           
#> # ℹ 23 more rows
#> # ℹ 47 more variables: deployment_comments <chr>, deploy_on_timestamp <dttm>,
#> #   duty_cycle <chr>, deployment_local_identifier <fct>, manipulation_type <fct>, …
This code took: 9.51 [s]

The following sensors are available:

movebank_retrieve(
  entity_type = "tag_type",
  attributes = c("external_id", "id")
)
#> # A tibble: 21 × 2
#>    external_id                  id
#>    <chr>                   <int64>
#>  1 bird-ring                   397
#>  2 gps                         653
#>  3 radio-transmitter           673
#>  4 argos-doppler-shift       82798
#>  5 natural-mark            2365682
#>  6 acceleration            2365683
#>  7 solar-geolocator        3886361
#>  8 accessory-measurements  7842954
#>  9 solar-geolocator-raw    9301403
#> 10 barometer              77740391
#> # ℹ 11 more rows
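Because downloaded data identifies the sensor by id in the sensor_type_id column, a small lookup can make results easier to read. The sketch below hard-codes a few rows copied from the output above rather than querying Movebank. The helper sensor_name() is hypothetical, and in real downloads the ids are of type int64 and may need converting (for example with as.numeric()) before matching.

```r
# Sensor names and ids copied from the first rows of the output above.
sensors <- data.frame(
  external_id = c("bird-ring", "gps", "radio-transmitter", "acceleration"),
  id = c(397, 653, 673, 2365683)
)

# Hypothetical helper (not part of move2): translate sensor ids into names.
# Ids that are not in the lookup table give NA.
sensor_name <- function(id, lookup = sensors) {
  lookup$external_id[match(id, lookup$id)]
}

sensor_name(c(653, 2365683, 1))
```

In practice the lookup table would come from the movebank_retrieve() call shown above instead of being typed in by hand.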

Alternatively, more informative values can be used for some arguments. For example, you can identify a study by a character string, or give a timestamp as a POSIXct:

movebank_download_study("LBBG_JUVENILE",
  sensor_type_id = "gps",
  timestamp_start = as.POSIXct("2021-02-03 00:00:00"),
  timestamp_end = as.POSIXct("2021-03-03 00:00:00")
)
#> ℹ In total 7001 records were omitted as they were not deployed (the `deployment_id`
#>   was `NA`).
#> A <move2> with `track_id_column` "individual_local_identifier" and `time_column`
#> "timestamp"
#> Containing 6 tracks lasting on average 20.3 days in a
#> Simple feature collection with 8763 features and 25 fields
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: -7.169092 ymin: 35.18931 xmax: 3.229445 ymax: 49.06081
#> Geodetic CRS:  WGS 84
#> # A tibble: 8,763 × 26
#>   sensor_type_id individual_local_identifier acceleration_raw_x acceleration_raw_y
#>          <int64> <fct>                                    <dbl>              <dbl>
#> 1            653 L930074                                    313                -18
#> 2            653 L930074                                    308                -18
#> 3            653 L930074                                    310                -18
#> 4            653 L930074                                    314                -17
#> 5            653 L930074                                    312                -18
#> # ℹ 8,758 more rows
#> # ℹ 22 more variables: acceleration_raw_z <dbl>, barometric_height [m],
#> #   battery_charge_percent [%], battery_charging_current [mA],
#> #   external_temperature [°C], …
#> First 5 track features:
#> # A tibble: 6 × 56
#>   deployment_id    tag_id individual_id alt_project_id animal_life_stage animal_mass
#>         <int64>   <int64>       <int64> <fct>          <fct>                     [g]
#> 1    3029108271       3e9    3029107866 LBBG_JUVENILE  juvenile                  661
#> 2    3029108241       3e9    3029107889 LBBG_JUVENILE  juvenile                  885
#> 3    3029108205       3e9    3029107883 LBBG_JUVENILE  juvenile                  738
#> 4    3029108176       3e9    3029107876 LBBG_JUVENILE  juvenile                  711
#> 5    3029108161       3e9    3029107863 LBBG_JUVENILE  juvenile                  841
#> # ℹ 1 more row
#> # ℹ 50 more variables: attachment_type <fct>, deployment_comments <chr>,
#> #   deploy_off_timestamp <dttm>, deploy_on_timestamp <dttm>,
#> #   deployment_end_type <fct>, …
This code took: 5.77 [s]
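as.POSIXct() interprets a string in the time zone of the R session unless tz is given, so the same script can describe a slightly different window on another machine. The sketch below builds the window explicitly in UTC and wraps the download in a function so that nothing is requested until it is called. download_window() is a hypothetical name, and using UTC is an assumption about what is wanted here, not a requirement stated by move2.

```r
# Build the time window in UTC so it does not depend on the local time zone.
start <- as.POSIXct("2021-02-03 00:00:00", tz = "UTC")
end <- as.POSIXct("2021-03-03 00:00:00", tz = "UTC")

# Sanity check on the window length before downloading.
difftime(end, start, units = "days")

# Hypothetical wrapper around the call shown above; nothing is downloaded
# until the function is called.
download_window <- function(study, start, end) {
  move2::movebank_download_study(study,
    sensor_type_id = "gps",
    timestamp_start = start,
    timestamp_end = end
  )
}

# download_window("LBBG_JUVENILE", start, end) # requires stored credentials
```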

Deployments

If you are interested in the deployment information you can use the movebank_download_deployment function.

movebank_download_deployment("Galapagos Albatrosses")
#> # A tibble: 28 × 26
#>    deployment_id  tag_id individual_id animal_life_stage attachment_type
#>          <int64> <int64>       <int64> <fct>             <fct>          
#>  1       2911170 2911124       2911090 adult             tape           
#>  2       2911150 2911126       2911091 adult             tape           
#>  3       2911167 2911127       2911092 adult             tape           
#>  4       2911168 2911129       2911093 adult             tape           
#>  5       2911178 2911132       2911094 adult             tape           
#>  6       2911163 2911133       2911095 adult             tape           
#>  7       9472225 2911114       2911061 adult             tape           
#>  8       9472224 2911120       2911062 adult             tape           
#>  9       9472223 2911121       2911086 adult             tape           
#> 10       9472222 2911134       2911065 adult             tape           
#> # ℹ 18 more rows
#> # ℹ 21 more variables: deployment_comments <chr>, deploy_on_timestamp <dttm>,
#> #   duty_cycle <chr>, deployment_local_identifier <fct>, manipulation_type <fct>, …
This code took: 4.1 [s]

Advanced usage

For specific requests it might be useful to retrieve information directly from the Movebank API. The movebank_retrieve function provides this functionality. The first argument is the entity type you would like to retrieve information for (e.g. tag or event). Other arguments make it possible to narrow the selection; a study id is required for most entity types (the study and tag_type requests shown earlier are exceptions). For more details on how to use the API, see the documentation.

Downloading undeployed data

One common reason to use this option is to retrieve undeployed locations. In some cases a set of locations is collected before the tag is attached to the animal, for quality control or error measurements. The example below shows how all records for a specific tag can be retrieved. Filtering for records where deployment_id is NA returns the locations that were collected while the tag was not deployed. The timestamp_start and timestamp_end arguments can be used to filter the data down further in the call to movebank_retrieve. By omitting the argument tag_local_identifier the entire study can be downloaded. The sensors can be specified with the argument sensor_type_id.

movebank_retrieve("event",
  study_id = 1259686571,
  tag_local_identifier = "193967", attributes = "all"
) |>
  filter(is.na(deployment_id))
#> # A tibble: 57 × 33
#>    individual_id deployment_id tag_id study_id sensor_type_id individual_local_ide…¹
#>          <int64>       <int64> <int6>  <int64>        <int64> <fct>                 
#>  1            NA            NA    3e9      1e9            653 <NA>                  
#>  2            NA            NA    3e9      1e9            653 <NA>                  
#>  3            NA            NA    3e9      1e9            653 <NA>                  
#>  4            NA            NA    3e9      1e9            653 <NA>                  
#>  5            NA            NA    3e9      1e9            653 <NA>                  
#>  6            NA            NA    3e9      1e9            653 <NA>                  
#>  7            NA            NA    3e9      1e9            653 <NA>                  
#>  8            NA            NA    3e9      1e9            653 <NA>                  
#>  9            NA            NA    3e9      1e9            653 <NA>                  
#> 10            NA            NA    3e9      1e9            653 <NA>                  
#> # ℹ 47 more rows
#> # ℹ abbreviated name: ¹​individual_local_identifier
#> # ℹ 27 more variables: tag_local_identifier <fct>,
#> #   individual_taxon_canonical_name <fct>, acceleration_raw_x <dbl>,
#> #   acceleration_raw_y <dbl>, acceleration_raw_z <dbl>, …
This code took: 1.07 [s]
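As a variation on the call above, the following sketch combines the sensor_type_id, timestamp_start and timestamp_end arguments mentioned in the text to retrieve undeployed records for a whole study within a time window. retrieve_undeployed() is a hypothetical wrapper, not part of move2, and the dates in the example call are illustrative only.

```r
# Hypothetical wrapper (not part of move2): retrieve events for one study and
# time window, keeping only records without a deployment.
retrieve_undeployed <- function(study_id, start, end, sensor = "gps") {
  events <- move2::movebank_retrieve("event",
    study_id = study_id,
    sensor_type_id = sensor,
    timestamp_start = start,
    timestamp_end = end,
    attributes = "all"
  )
  # Records collected while the tag was not deployed have no deployment_id.
  events[is.na(events$deployment_id), ]
}

# Example call (requires stored credentials and access to the study):
# retrieve_undeployed(
#   1259686571,
#   start = as.POSIXct("2021-08-01 00:00:00", tz = "UTC"),
#   end = as.POSIXct("2021-09-01 00:00:00", tz = "UTC")
# )
```

Restricting the request on the Movebank side should reduce the amount of data transferred compared with downloading everything and filtering afterwards.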