library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
It is common when processing DAS data to use a text-based DAT file to provide additional information. For instance, Ships.dat is used to determine the ship code based on cruise number, while SpCodes.dat is used to match species codes to species names. In this document, we examine how to read in DAT files into R and join them with DAS data. If the exact format of Ships.dat or SpCodes.dat changes in the future, you can change the code introduced in this document to match the new format. Also note that you can use this workflow to join processed DAS data with data from any file type.
First we read in and process the sample DAS data
y <- system.file("das_sample.das", package = "swfscDAS")
y.proc <- das_process(y)
y.sight <- das_sight(y.proc, return.format = "default")
This package includes Ships_sample.dat and SpCodes_sample.dat files,
which have the same format as the commonly used Ships.dat and
SpCodes.dat files. Because these DAT files are fixed width text files,
we use the read_fwf
function from the readr
package to read the DAT files into data frames. You could also use the
read.fwf
file from base R. Note that the
data.frame
call is not necessary if you are comfortable
working with tibbles.
ships.df <- data.frame(read_fwf(
system.file("Ship_sample.dat", package = "swfscDAS"),
col_positions = fwf_widths(c(6, NA), col_names = c("Cruise", "Ship")),
col_types = cols(Cruise = col_double(), Ship = col_character()),
trim_ws = TRUE
), stringsAsFactors = FALSE)
spcodes.df <- data.frame(read_fwf(
system.file("SpCodes_sample.dat", package = "swfscDAS"),
col_positions = fwf_widths(c(4, 13, 40, NA), col_names = c("SpCode", "Abbr", "SciName", "CommonName")),
col_types = cols(.default = col_character()),
trim_ws = TRUE
), stringsAsFactors = FALSE)
Now that we have both the DAS data and external data in data frames,
we can use the the dplyr package, and specifically the
left_join
function, to combine the data