Create the location table
create_location(
  L0_flat,
  location_id,
  location_name,
  latitude = NULL,
  longitude = NULL,
  elevation = NULL
)
L0_flat: (tbl_df, tbl, data.frame) The fully joined source L0 dataset, in "flat" format (see details).
location_id: (character) Column in L0_flat containing the identifier assigned to each unique location at the observation level.
location_name: (character) One or more columns in L0_flat containing the sampling locations, ordered from the highest to the lowest level of nesting, where the lowest is the level of observation (e.g. location_name = c("plot", "subplot")).
latitude: (character) An optional column in L0_flat containing the latitude, in decimal degrees, of location_id. Latitudes south of the equator are negative.
longitude: (character) An optional column in L0_flat containing the longitude, in decimal degrees, of location_id. Longitudes west of the prime meridian are negative.
elevation: (character) An optional column in L0_flat containing the elevation, in meters relative to sea level, of location_id. Above sea level is positive. Below sea level is negative.
(tbl_df, tbl, data.frame) The location table.
This function collects the specified columns from L0_flat, creates a data frame for each location_name, assigns latitude, longitude, and elevation to the lowest nesting level (i.e. the observation level) while returning NA for higher levels (these will have to be filled manually afterwards), and determines the relationships between location_id and parent_location_id from L0_flat and location_name.
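The parent/child relationship described above can be illustrated with a rough base R sketch. This is not the code used by create_location(); the toy flat table, the "a1"/"a2" identifier scheme, and all object names are assumptions made for illustration only.

```r
# Hypothetical flat table with two nesting levels: block (higher) and plot (lower)
flat <- data.frame(
  block = c("Ridge", "Ridge", "Valley", "Valley"),
  plot  = c("4", "5", "1", "2"),
  stringsAsFactors = FALSE
)

# One row per unique block/plot combination
pairs <- unique(flat[, c("block", "plot")])

# Assign an identifier to each higher-level location (assumed naming scheme)
parents    <- unique(pairs$block)
parent_ids <- paste0("a", seq_along(parents))

# Each observation-level location points to the identifier of its block
child <- data.frame(
  location_id        = pairs$plot,
  location_name      = paste0("plot__", pairs$plot),
  parent_location_id = parent_ids[match(pairs$block, parents)],
  stringsAsFactors   = FALSE
)
child
```

Rows for the higher level (the blocks) would carry NA in parent_location_id, since nothing sits above them in this toy hierarchy.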
To prevent the listing of duplicate location_name values, and to enable the return of location_name columns by flatten_data(), location_name values are prefixed with the name of the column they came from according to: paste0(<column name>, "__", <column value>). Example: A column named "plot" with values "1", "2", "3" in L0_flat would otherwise be listed in the location_name column of the resulting location table as "1", "2", "3", leaving no way to discern that these values correspond to "plot". Applying the above solution returns "plot__1", "plot__2", "plot__3" in the location table, and flatten_data() returns the column "plot" with values c("1", "2", "3").
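A minimal base R sketch of this naming scheme follows. The paste0() call mirrors the pattern given above; the reverse step using sub() is only an illustration of how the column name and value can be separated again, and is not necessarily how flatten_data() does it. The object names are hypothetical.

```r
# Hypothetical values of a "plot" column in L0_flat
plot_values <- c("1", "2", "3")

# Build location_name values as <column name>__<column value>
location_name <- paste0("plot", "__", plot_values)
location_name
#> [1] "plot__1" "plot__2" "plot__3"

# Split each value at the first "__" to recover the column name and the value
col_name  <- sub("__.*$", "", location_name)
col_value <- sub("^.*?__", "", location_name, perl = TRUE)

unique(col_name)
#> [1] "plot"
col_value
#> [1] "1" "2" "3"
```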
"flat" format refers to the fully joined source L0 dataset in "wide" form with the exception of the core observation variables, which are in "long" form (i.e. using the variable_name, value, unit columns of the observation table). This "flat" format is the "widest" an L1 ecocomDP dataset can be consistently spread due to the frequent occurrence of L0 source datasets with > 1 core observation variable.
Additionally, latitude, longitude, and elevation of sites nested above the observation level will have to be manually added after the location table is returned.
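One possible way to add the missing values by hand is sketched below in base R. The location table here is a small stand-in built for the example, and the block-level coordinates in block_coords are made-up placeholders, not values from any real dataset.

```r
# Small stand-in for a table returned by create_location()
location <- data.frame(
  location_id        = c("a1", "a2", "1", "4"),
  location_name      = c("block__Ridge", "block__Valley", "plot__1", "plot__4"),
  latitude           = c(NA, NA, 42.5, 42.5),
  longitude          = c(NA, NA, -72.2, -72.2),
  elevation          = c(NA, NA, 220, 220),
  parent_location_id = c(NA, NA, "a2", "a1"),
  stringsAsFactors   = FALSE
)

# Placeholder coordinates for the higher-level locations (hypothetical values)
block_coords <- data.frame(
  location_id = c("a1", "a2"),
  latitude    = c(42.6, 42.4),
  longitude   = c(-72.1, -72.3),
  elevation   = c(300, 200),
  stringsAsFactors = FALSE
)

# Find the matching rows and overwrite their NA values
cols <- c("latitude", "longitude", "elevation")
idx  <- match(block_coords$location_id, location$location_id)
location[idx, cols] <- block_coords[, cols]
location
```

Matching on location_id rather than on row position keeps the update correct even if the row order of the location table changes.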
flat <- ants_L0_flat

location <- create_location(
  L0_flat = flat,
  location_id = "location_id",
  location_name = c("block", "plot"),
  latitude = "latitude",
  longitude = "longitude",
  elevation = "elevation")
location
#> # A tibble: 10 x 6
#>    location_id location_name latitude longitude elevation parent_location_id
#>    <chr>       <chr>            <dbl>     <dbl>     <dbl> <chr>
#>  1 a1          block__Ridge        NA        NA        NA NA
#>  2 a2          block__Valley       NA        NA        NA NA
#>  3 1           plot__1           42.5     -72.2       220 a2
#>  4 2           plot__2           42.5     -72.2       220 a2
#>  5 3           plot__3           42.5     -72.2       220 a2
#>  6 4           plot__4           42.5     -72.2       220 a1
#>  7 5           plot__5           42.5     -72.2       220 a1
#>  8 6           plot__6           42.5     -72.2       220 a1
#>  9 7           plot__7           42.5     -72.2       220 a1
#> 10 8           plot__8           42.5     -72.2       220 a2