Create the location table

create_location(
  L0_flat,
  location_id,
  location_name,
  latitude = NULL,
  longitude = NULL,
  elevation = NULL
)

Arguments

L0_flat

(tbl_df, tbl, data.frame) The fully joined source L0 dataset, in "flat" format (see details).

location_id

(character) Column in L0_flat containing the identifier assigned to each unique location at the observation level.

location_name

(character) One or more columns in L0_flat of sampling locations ordered from high to low in terms of nesting, where the lowest is the level of observation (e.g. location_name = c("plot", "subplot")).

latitude

(character) An optional column in L0_flat containing the latitude in decimal degrees of location_id. Latitudes south of the equator are negative.

longitude

(character) An optional column in L0_flat containing the longitude in decimal degrees of location_id. Longitudes west of the prime meridian are negative.

elevation

(character) An optional column in L0_flat containing the elevation in meters relative to sea level of location_id. Above sea level is positive. Below sea level is negative.

Value

(tbl_df, tbl, data.frame) The location table.

Details

This function collects the specified columns from L0_flat and creates a data frame for each location_name level. It assigns latitude, longitude, and elevation to the lowest nesting level (i.e. the observation level) and returns NA for higher levels, which must be filled in manually afterwards. It then determines the relationships between location_id and parent_location_id from L0_flat and location_name.
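The parent-child relationship described above can be illustrated with a minimal base R sketch. It is not the package's implementation: flat_example, pairs, and relationships are hypothetical names, and the rows are made up to resemble a block/plot nesting.

```r
# Hypothetical flat rows: each observation names its block and its plot.
flat_example <- data.frame(
  block = c("Ridge", "Ridge", "Valley", "Valley"),
  plot  = c("4", "4", "1", "2"),
  stringsAsFactors = FALSE
)

# location_name is ordered from the highest to the lowest nesting level.
location_name <- c("block", "plot")

# Each unique (block, plot) pair yields one child location and its parent.
pairs <- unique(flat_example[, location_name])
relationships <- data.frame(
  child  = pairs[[location_name[2]]],
  parent = pairs[[location_name[1]]],
  stringsAsFactors = FALSE
)

relationships
```

Repeated observations at the same plot collapse to a single row, so each child location is listed once with its parent.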

To prevent duplicate location_name values, and to enable flatten_data() to return the location_name columns, location_name values are prefixed with the name of the column they came from: paste0(<column name>, "__", <column value>). Example: a column named "plot" with values "1", "2", "3" in L0_flat would otherwise be listed in the location_name column of the location table as "1", "2", "3", leaving no way to tell that these values correspond to "plot". With the prefix, the location table lists "plot__1", "plot__2", "plot__3", and flatten_data() returns the column "plot" with values c("1", "2", "3").
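The naming scheme paste0(<column name>, "__", <column value>) can be tried out in base R. The sketch below uses the "plot" example from the text. The splitting step is only an assumption about how the column name and values could be recovered, not a description of how flatten_data() works internally.

```r
# Values mirroring the "plot" example in the text.
column_name <- "plot"
column_values <- c("1", "2", "3")

# Build the location_name entries as <column name>__<column value>.
location_name <- paste0(column_name, "__", column_values)

# Recover the column name and the values by splitting on the first "__".
recovered_column <- unique(sub("__.*$", "", location_name))
recovered_values <- sub("^.*?__", "", location_name, perl = TRUE)
```

Splitting on the first "__" assumes the column name itself does not contain "__".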

"flat" format refers to the fully joined source L0 dataset in "wide" form with the exception of the core observation variables, which are in "long" form (i.e. using the variable_name, value, unit columns of the observation table). This "flat" format is the "widest" an L1 ecocomDP dataset can be consistently spread due to the frequent occurrence of L0 source datasets with > 1 core observation variable.

Latitude, longitude, and elevation of sites nested above the observation level must be added manually after the location table is returned.
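One way to fill in the higher-level values by hand is sketched below in base R. The location table is a hypothetical, shortened stand-in for the one create_location() returns, and the block-level coordinates in block_coords are invented for illustration only.

```r
# Hypothetical location table shaped like the output of create_location().
location <- data.frame(
  location_id = c("a1", "a2", "1", "4"),
  location_name = c("block__Ridge", "block__Valley", "plot__1", "plot__4"),
  latitude = c(NA, NA, 42.5, 42.5),
  longitude = c(NA, NA, -72.2, -72.2),
  elevation = c(NA, NA, 220, 220),
  parent_location_id = c(NA, NA, "a2", "a1"),
  stringsAsFactors = FALSE
)

# Invented block-level values that the data creator would supply.
block_coords <- data.frame(
  location_id = c("a1", "a2"),
  latitude = c(42.6, 42.4),
  longitude = c(-72.1, -72.3),
  elevation = c(300, 180),
  stringsAsFactors = FALSE
)

# Match on location_id and overwrite the NA values for the higher levels.
cols <- c("latitude", "longitude", "elevation")
idx <- match(block_coords$location_id, location$location_id)
location[idx, cols] <- block_coords[, cols]
```

Matching on location_id rather than on row position keeps the assignment correct if the rows of the location table are reordered.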

Examples

flat <- ants_L0_flat

location <- create_location(
  L0_flat = flat, 
  location_id = "location_id", 
  location_name = c("block", "plot"), 
  latitude = "latitude", 
  longitude = "longitude", 
  elevation = "elevation")

location
#> # A tibble: 10 x 6
#>    location_id location_name latitude longitude elevation parent_location_id
#>    <chr>       <chr>            <dbl>     <dbl>     <dbl> <chr>             
#>  1 a1          block__Ridge      NA        NA          NA NA                
#>  2 a2          block__Valley     NA        NA          NA NA                
#>  3 1           plot__1           42.5     -72.2       220 a2                
#>  4 2           plot__2           42.5     -72.2       220 a2                
#>  5 3           plot__3           42.5     -72.2       220 a2                
#>  6 4           plot__4           42.5     -72.2       220 a1                
#>  7 5           plot__5           42.5     -72.2       220 a1                
#>  8 6           plot__6           42.5     -72.2       220 a1                
#>  9 7           plot__7           42.5     -72.2       220 a1                
#> 10 8           plot__8           42.5     -72.2       220 a2