Flatten a dataset
flatten_data(data)
data: (list) The dataset object returned by read_data(), or a named list of ecocomDP tables.
(tbl_df, tbl, data.frame) A single flat table created by joining and spreading all tables, except the observation table. See details for more information on this "flat" format.
The "flat" format refers to the fully joined source L0 dataset in "wide" form, with the exception of the core observation variables, which remain in "long" form (i.e., they use the variable_name, value, and unit columns of the observation table). This is the "widest" form to which an L1 ecocomDP dataset can be consistently spread, because L0 source datasets frequently contain more than one core observation variable.
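To make the "flat" shape concrete, here is a minimal base R sketch, not the implementation of flatten_data(). It uses three made-up toy tables whose column names are only loosely modeled on ecocomDP tables. It covers the joining step only and leaves out the spreading of ancillary tables. The name flat_sketch and all values are hypothetical.

```r
# Toy tables loosely modeled on ecocomDP tables (all values are made up).
observation <- data.frame(
  observation_id = c("1", "2", "3", "4"),
  location_id    = c("a", "a", "b", "b"),
  taxon_id       = c("t1", "t1", "t2", "t2"),
  variable_name  = c("abundance", "biomass", "abundance", "biomass"),
  value          = c(2, 0.5, 1, 0.3),
  unit           = c("number", "gram", "number", "gram"),
  stringsAsFactors = FALSE
)
location <- data.frame(
  location_id   = c("a", "b"),
  location_name = c("plot_a", "plot_b"),
  elevation     = c(340, 355),
  stringsAsFactors = FALSE
)
taxon <- data.frame(
  taxon_id   = c("t1", "t2"),
  taxon_name = c("Formica", "Lasius"),
  stringsAsFactors = FALSE
)

# Left-join the supporting tables onto the observation table.
flat_sketch <- merge(observation, location, by = "location_id", all.x = TRUE)
flat_sketch <- merge(flat_sketch, taxon, by = "taxon_id", all.x = TRUE)

# Still one row per observation: location and taxon attributes are now
# columns ("wide"), while variable_name / value / unit stay "long".
flat_sketch <- flat_sketch[order(flat_sketch$observation_id), ]
flat_sketch
```

Because the toy data carries two core variables (abundance and biomass) with different units, spreading value into one column per variable would need a separate unit column for each, which is why the observation variables are kept long.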
Warnings and errors from flatten_data() can most often be fixed by addressing any validation issues reported by read_data() (e.g. non-unique composite keys).
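As a rough illustration of what a non-unique composite key looks like, the base R sketch below flags rows of a made-up observation table that share the same key. The choice of key columns (key_cols) is an assumption made for this example and may differ from what read_data() actually validates.

```r
# Hypothetical observation table; rows 1 and 2 share the same composite key.
observation <- data.frame(
  observation_id = c("1", "2", "3"),
  location_id    = c("a", "a", "b"),
  datetime       = c("2003-06-01", "2003-06-01", "2003-06-01"),
  taxon_id       = c("t1", "t1", "t1"),
  variable_name  = c("abundance", "abundance", "abundance"),
  value          = c(2, 3, 1),
  stringsAsFactors = FALSE
)

# Assumed composite key for this sketch (not taken from the package source).
key_cols <- c("location_id", "datetime", "taxon_id", "variable_name")
keys <- observation[, key_cols]

# Flag every row whose key occurs more than once, in either direction.
is_dup <- duplicated(keys) | duplicated(keys, fromLast = TRUE)
observation[is_dup, ]
```

Rows flagged this way are candidates to fix in the source tables before flattening.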
Ancillary identifiers are dropped from the returned object.
# Flatten a dataset object
flat <- flatten_data(ants_L1)
flat
#> # A tibble: 2,931 x 46
#> observation_id event_id datetime variable_name value unit trap.type
#> <chr> <chr> <date> <chr> <dbl> <chr> <chr>
#> 1 1 1 2003-06-01 abundance 2 number bait
#> 2 2 1 2003-06-01 abundance 2 number bait
#> 3 3 1 2003-06-01 abundance 1 number bait
#> 4 4 1 2003-06-01 abundance 2 number bait
#> 5 5 1 2003-06-01 abundance 1 number hand
#> 6 6 1 2003-06-01 abundance 1 number hand
#> 7 7 1 2003-06-01 abundance 1 number hand
#> 8 8 1 2003-06-01 abundance 1 number hand
#> 9 9 1 2003-06-01 abundance 1 number hand
#> 10 10 1 2003-06-01 abundance 1 number litter
#> # ... with 2,921 more rows, and 39 more variables: trap.num <chr>,
#> # moose.cage <chr>, location_id <chr>, location_name <chr>, block <chr>,
#> # plot <chr>, latitude <dbl>, longitude <dbl>, elevation <dbl>,
#> # treatment <chr>, taxon_id <chr>, taxon_rank <chr>, taxon_name <chr>,
#> # authority_system <chr>, authority_taxon_id <chr>, behavior <chr>,
#> # biogeographic.affinity <chr>, colony.size <chr>, feeding.preference <chr>,
#> # hl <dbl>, unit_hl <chr>, nest.substrate <chr>, primary.habitat <chr>, ...
# Flatten a list of tables
tables <- ants_L1$tables
flat <- flatten_data(tables)
flat
#> # A tibble: 2,931 x 46
#> observation_id event_id datetime variable_name value unit trap.type
#> <chr> <chr> <date> <chr> <dbl> <chr> <chr>
#> 1 1 1 2003-06-01 abundance 2 number bait
#> 2 2 1 2003-06-01 abundance 2 number bait
#> 3 3 1 2003-06-01 abundance 1 number bait
#> 4 4 1 2003-06-01 abundance 2 number bait
#> 5 5 1 2003-06-01 abundance 1 number hand
#> 6 6 1 2003-06-01 abundance 1 number hand
#> 7 7 1 2003-06-01 abundance 1 number hand
#> 8 8 1 2003-06-01 abundance 1 number hand
#> 9 9 1 2003-06-01 abundance 1 number hand
#> 10 10 1 2003-06-01 abundance 1 number litter
#> # ... with 2,921 more rows, and 39 more variables: trap.num <chr>,
#> # moose.cage <chr>, location_id <chr>, location_name <chr>, block <chr>,
#> # plot <chr>, latitude <dbl>, longitude <dbl>, elevation <dbl>,
#> # treatment <chr>, taxon_id <chr>, taxon_rank <chr>, taxon_name <chr>,
#> # authority_system <chr>, authority_taxon_id <chr>, behavior <chr>,
#> # biogeographic.affinity <chr>, colony.size <chr>, feeding.preference <chr>,
#> # hl <dbl>, unit_hl <chr>, nest.substrate <chr>, primary.habitat <chr>, ...