Create the observation table

create_observation(
  L0_flat,
  observation_id,
  event_id = NULL,
  package_id,
  location_id,
  datetime,
  taxon_id,
  variable_name,
  value,
  unit = NULL
)

Arguments

L0_flat

(tbl_df, tbl, data.frame) The fully joined source L0 dataset, in "flat" format (see Details).

observation_id

(character) Column in L0_flat containing the identifier assigned to each unique observation.

event_id

(character) An optional column in L0_flat containing the identifier assigned to each unique sampling event.

package_id

(character) Column in L0_flat containing the identifier of the derived L1 dataset.

location_id

(character) Column in L0_flat containing the identifier assigned to each unique location at the observation level.

datetime

(character) Column in L0_flat containing the date, and time if applicable, of the observation in ISO 8601 format (e.g. YYYY-MM-DD hh:mm:ss).

taxon_id

(character) Column in L0_flat containing the identifier assigned to each unique organism at the observation level.

variable_name

(character) Column in L0_flat containing the names of variables measured.

value

(character) Column in L0_flat containing the values of variable_name.

unit

(character) An optional column in L0_flat containing the units of variable_name.
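
The datetime argument expects values already written as YYYY-MM-DD hh:mm:ss. As a minimal sketch, assuming the source timestamps arrive in a month/day/year layout (the raw values and the UTC time zone below are hypothetical, chosen only for illustration), base R can rewrite them before the flat table is assembled:

```r
# Hypothetical raw timestamps in month/day/year order (illustrative only)
raw <- c("06/01/2003 14:30", "06/02/2003 09:05")

# Parse with an explicit format and time zone so nothing is guessed
parsed <- as.POSIXct(raw, format = "%m/%d/%Y %H:%M", tz = "UTC")

# Write back as character in the YYYY-MM-DD hh:mm:ss layout
datetime <- format(parsed, "%Y-%m-%d %H:%M:%S")

datetime
#> [1] "2003-06-01 14:30:00" "2003-06-02 09:05:00"
```

Date-only sources can use as.Date() with a matching format string instead.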

Value

(tbl_df, tbl, data.frame) The observation table.

Details

This function collects specified columns from L0_flat and returns distinct rows.
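
The select-then-deduplicate behavior described above can be sketched in base R. This is an illustration of the idea only, not the package's implementation; the table flat_small, its values, and the site_notes column are hypothetical.

```r
# Hypothetical miniature flat table; the first two rows are identical
flat_small <- data.frame(
  observation_id = c("1", "1", "2"),
  package_id     = c("edi.0.0", "edi.0.0", "edi.0.0"),
  location_id    = c("a", "a", "b"),
  site_notes     = c("dry", "dry", "wet"),  # extra column, not collected
  stringsAsFactors = FALSE
)

# Column names to collect, passed as character strings
cols <- c("observation_id", "package_id", "location_id")

# Keep only the named columns, then drop duplicate rows
observation_small <- unique(flat_small[, cols])
rownames(observation_small) <- NULL

observation_small
#>   observation_id package_id location_id
#> 1              1    edi.0.0           a
#> 2              2    edi.0.0           b
```

Columns that are not named, such as site_notes here, are simply left out of the result.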

The "flat" format is the fully joined source L0 dataset in "wide" form, except for the core observation variables, which are in "long" form (i.e. stored in the variable_name, value, and unit columns of the observation table). It is the "widest" form into which an L1 ecocomDP dataset can be consistently spread, because L0 source datasets frequently contain more than one core observation variable.

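To make the "flat" layout concrete, here is a small hand-built sketch. All values are hypothetical, and the latitude column stands in for any contextual attribute. Contextual attributes each get their own column (wide), while the two core observation variables share the variable_name, value, and unit columns (long), so each measurement is its own row.

```r
# Hypothetical flat table: one taxon at one location and date,
# with two core observation variables measured
flat_long <- data.frame(
  observation_id = c("1", "2"),
  location_id    = c("plot_1", "plot_1"),
  datetime       = c("2003-06-01", "2003-06-01"),
  taxon_id       = c("1", "1"),
  latitude       = c(42.5, 42.5),                # wide: one column per attribute
  variable_name  = c("abundance", "biomass"),    # long: one row per variable
  value          = c(12, 0.8),
  unit           = c("number", "gram"),
  stringsAsFactors = FALSE
)

# One row per measured variable; contextual values repeat across those rows
table(flat_long$variable_name)
```

Spreading abundance and biomass into separate columns would give a layout that differs from dataset to dataset, which is why the core variables stay in long form.
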
Examples

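# Assumes the ecocomDP package, which provides create_observation()
# and the ants_L0_flat example dataset
library(ecocomDP)

# Flat-format example dataset
flat <- ants_L0_flat

# Each argument names the column of `flat` to use
observation <- create_observation(
  L0_flat = flat,
  observation_id = "observation_id",
  event_id = "event_id",
  package_id = "package_id",
  location_id = "location_id",
  datetime = "datetime",
  taxon_id = "taxon_id",
  variable_name = "variable_name",
  value = "value",
  unit = "unit")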

observation
#> # A tibble: 2,931 x 9
#>    observation_id event_id package_id location_id datetime   taxon_id
#>    <chr>          <chr>    <chr>      <chr>       <date>     <chr>   
#>  1 1              1        edi.193.5  4           2003-06-01 1       
#>  2 2              1        edi.193.5  4           2003-06-01 2       
#>  3 3              1        edi.193.5  4           2003-06-01 53      
#>  4 4              1        edi.193.5  4           2003-06-01 2       
#>  5 5              1        edi.193.5  4           2003-06-01 2       
#>  6 6              1        edi.193.5  4           2003-06-01 8       
#>  7 7              1        edi.193.5  4           2003-06-01 24      
#>  8 8              1        edi.193.5  4           2003-06-01 42      
#>  9 9              1        edi.193.5  4           2003-06-01 53      
#> 10 10             1        edi.193.5  4           2003-06-01 1       
#> # ... with 2,921 more rows, and 3 more variables: variable_name <chr>,
#> #   value <dbl>, unit <chr>