Create the dataset_summary table

create_dataset_summary(
  L0_flat,
  package_id,
  original_package_id = NULL,
  length_of_survey_years,
  number_of_years_sampled,
  std_dev_interval_betw_years,
  max_num_taxa,
  geo_extent_bounding_box_m2 = NULL
)

Arguments

L0_flat

(tbl_df, tbl, data.frame) The fully joined source L0 dataset, in "flat" format (see Details).

package_id

(character) Column in L0_flat containing the identifier of the derived L1 dataset.

original_package_id

(character) An optional column in L0_flat containing the identifier of the source L0 dataset.

length_of_survey_years

(character) Column in L0_flat containing the number of years the study has been ongoing. Use calc_length_of_survey_years() to calculate this value.

number_of_years_sampled

(character) Column in L0_flat containing the number of years within the study period in which samples were taken. Use calc_number_of_years_sampled() to calculate this value.

std_dev_interval_betw_years

(character) Column in L0_flat containing the standard deviation of the interval between sampling events. Use calc_std_dev_interval_betw_years() to calculate this value.

max_num_taxa

(character) Column in L0_flat containing the number of unique taxa in the source L0 dataset.

geo_extent_bounding_box_m2

(character) An optional column in L0_flat containing the area (in square meters) of the study location, if applicable (some L0 datasets were collected at a single point). Use calc_geo_extent_bounding_box_m2() to calculate this value.
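
To make the three temporal quantities above concrete, here is a rough base R sketch computed from a hypothetical vector of sampling dates. The definitions used here (whole-year span, count of distinct calendar years, intervals expressed in years of 365.25 days) are assumptions for illustration only and may not match what the calc_*() helpers do. Use those helpers to produce the actual column values.

```r
# Hypothetical sampling dates (not from any real dataset)
dates <- as.Date(c("2000-06-01", "2001-06-01", "2003-06-01", "2004-06-01"))
years <- as.integer(format(dates, "%Y"))

# Assumption: survey length is the span between first and last sampled year
length_of_survey_years <- max(years) - min(years)

# Number of distinct calendar years in which at least one sample was taken
number_of_years_sampled <- length(unique(years))

# Assumption: intervals between consecutive sampling dates, converted to years
intervals_years <- as.numeric(diff(sort(unique(dates)))) / 365.25
std_dev_interval_betw_years <- round(sd(intervals_years), 2)

print(c(length_of_survey_years, number_of_years_sampled, std_dev_interval_betw_years))
```

Because each of these is a single value per dataset, it would be repeated on every row of L0_flat before being passed to this function by column name.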

Value

(tbl_df, tbl, data.frame) The dataset_summary table.

Details

This function collects specified columns from L0_flat and returns distinct rows.
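
The select-then-deduplicate behavior described above can be sketched in base R as follows. This is an illustration of the idea rather than the package's implementation. The data frame and its values are hypothetical stand-ins for L0_flat.

```r
# Hypothetical stand-in for L0_flat: one row per observation, with
# dataset-level values repeated on every row
flat <- data.frame(
  observation_id = c("o1", "o2", "o3"),
  value = c(4, 7, 2),
  package_id = "edi.193.5",
  length_of_survey_years = 15,
  number_of_years_sampled = 13,
  stringsAsFactors = FALSE
)

summary_cols <- c("package_id", "length_of_survey_years", "number_of_years_sampled")

# Keep only the dataset-level columns, then drop duplicate rows
dataset_summary <- unique(flat[, summary_cols, drop = FALSE])
rownames(dataset_summary) <- NULL

print(dataset_summary)
```

Since the dataset-level values are constant across rows, the result collapses to a single row per dataset.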

"Flat" format refers to the fully joined source L0 dataset in "wide" form, except for the core observation variables, which are in "long" form (i.e., they use the variable_name, value, and unit columns of the observation table). This is the widest form into which an L1 ecocomDP dataset can be consistently spread, because L0 source datasets frequently contain more than one core observation variable.
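
As a small illustration of this layout, the hypothetical table below keeps location and taxon attributes in wide form while two core observation variables share the long-form variable_name, value, and unit columns. Only those three column names come from the description above. All other column names and all values are made up for the sketch.

```r
# Hypothetical "flat" table: wide for context columns, long for observations
flat_example <- data.frame(
  observation_id = c("o1", "o2", "o3", "o4"),
  location_id = c("site_a", "site_a", "site_b", "site_b"),
  latitude = c(42.5, 42.5, 42.6, 42.6),
  taxon_id = c("t1", "t1", "t2", "t2"),
  variable_name = c("abundance", "biomass", "abundance", "biomass"),
  value = c(12, 0.8, 5, 0.3),
  unit = c("number", "gram", "number", "gram"),
  stringsAsFactors = FALSE
)

# Both core observation variables live in the same three columns
core_variables <- unique(flat_example$variable_name)
print(core_variables)
```

Spreading variable_name into separate columns would give each dataset a different set of columns, which is why the observation variables stay in long form.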

Examples

flat <- ants_L0_flat

dataset_summary <- create_dataset_summary(
  L0_flat = flat, 
  package_id = "package_id", 
  original_package_id = "original_package_id", 
  length_of_survey_years = "length_of_survey_years",
  number_of_years_sampled = "number_of_years_sampled", 
  std_dev_interval_betw_years = "std_dev_interval_betw_years", 
  max_num_taxa = "max_num_taxa", 
  geo_extent_bounding_box_m2 = "geo_extent_bounding_box_m2")

dataset_summary
#> # A tibble: 1 x 7
#>   package_id original_packag~ length_of_surve~ number_of_years~ std_dev_interva~
#>   <chr>      <chr>                       <dbl>            <dbl>            <dbl>
#> 1 edi.193.5  knb-lter-hfr.11~               15               13             0.67
#> # ... with 2 more variables: max_num_taxa <dbl>,
#> #   geo_extent_bounding_box_m2 <dbl>