Create the dataset_summary table
create_dataset_summary(
  L0_flat,
  package_id,
  original_package_id = NULL,
  length_of_survey_years,
  number_of_years_sampled,
  std_dev_interval_betw_years,
  max_num_taxa,
  geo_extent_bounding_box_m2 = NULL
)
L0_flat: (tbl_df, tbl, data.frame) The fully joined source L0 dataset, in "flat" format (see details).
package_id: (character) Column in L0_flat containing the identifier of the derived L1 dataset.
original_package_id: (character) An optional column in L0_flat containing the identifier of the source L0 dataset.
length_of_survey_years: (character) Column in L0_flat containing the number of years the study has been ongoing. Use calc_length_of_survey_years() to calculate this value.
number_of_years_sampled: (character) Column in L0_flat containing the number of years within the period of study in which samples were taken. Use calc_number_of_years_sampled() to calculate this value.
std_dev_interval_betw_years: (character) Column in L0_flat containing the standard deviation of the interval between sampling events. Use calc_std_dev_interval_betw_years() to calculate this value.
max_num_taxa: (character) Column in L0_flat containing the number of unique taxa in the source L0 dataset.
geo_extent_bounding_box_m2: (character) An optional column in L0_flat containing the area (in square meters) of the study location, if applicable (some L0 datasets were collected at a single point). Use calc_geo_extent_bounding_box_m2() to calculate this value.
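To make the three temporal columns more concrete, the following base R sketch derives comparable values from a vector of sampling dates. It is an illustration only, not the code behind the calc_*() helpers: the dates vector is hypothetical, and the year-based definitions used here (span between first and last sampled year, count of distinct sampled years, standard deviation of gaps between consecutive sampled years) are assumptions that may differ from the package's own rounding and units.

```r
# Hypothetical sampling dates (illustration only, not real data)
dates <- as.Date(c("2001-06-15", "2002-06-20", "2004-07-01", "2005-06-10"))

# Distinct calendar years in which at least one sample was taken
years <- sort(unique(as.integer(format(dates, "%Y"))))

# Assumed definition: span in years between the first and last sampled year
length_of_survey_years <- max(years) - min(years)

# Assumed definition: number of distinct years with samples
number_of_years_sampled <- length(years)

# Assumed definition: standard deviation of the gaps (in years)
# between consecutive sampled years, rounded to two decimals
std_dev_interval_betw_years <- round(sd(diff(years)), 2)
```

Values computed this way would then be stored as constant columns in L0_flat, so that each can be referenced by name in the call to create_dataset_summary().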
(tbl_df, tbl, data.frame) The dataset_summary table.
This function collects the specified columns from L0_flat and returns distinct rows.
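The collect-then-deduplicate behavior described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the function's actual implementation: the toy_flat table, its values, and the summary_cols vector are hypothetical.

```r
# Toy stand-in for a flat L0 table (hypothetical values, not real data).
# Dataset-level columns repeat on every observation row.
toy_flat <- data.frame(
  observation_id = c("o1", "o2", "o3"),
  value = c(4, 7, 1),
  package_id = "edi.000.1",
  length_of_survey_years = 15,
  number_of_years_sampled = 13,
  std_dev_interval_betw_years = 0.67,
  max_num_taxa = 9,
  stringsAsFactors = FALSE
)

# Names of the dataset-level columns to collect
summary_cols <- c(
  "package_id", "length_of_survey_years", "number_of_years_sampled",
  "std_dev_interval_betw_years", "max_num_taxa"
)

# Keep only those columns, then drop duplicate rows
toy_summary <- unique(toy_flat[, summary_cols, drop = FALSE])
nrow(toy_summary)
```

Because the dataset-level values are identical on every observation row, the three input rows collapse to a single summary row.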
The "flat" format is the fully joined source L0 dataset in "wide" form, except for the core observation variables, which remain in "long" form (i.e., they use the variable_name, value, and unit columns of the observation table). This is the "widest" form into which an L1 ecocomDP dataset can be consistently spread, because L0 source datasets frequently have more than one core observation variable.
# Example flat L0 table
flat <- ants_L0_flat

# Each argument after L0_flat names a column of flat
dataset_summary <- create_dataset_summary(
  L0_flat = flat,
  package_id = "package_id",
  original_package_id = "original_package_id",
  length_of_survey_years = "length_of_survey_years",
  number_of_years_sampled = "number_of_years_sampled",
  std_dev_interval_betw_years = "std_dev_interval_betw_years",
  max_num_taxa = "max_num_taxa",
  geo_extent_bounding_box_m2 = "geo_extent_bounding_box_m2"
)

dataset_summary
#> # A tibble: 1 x 7
#> package_id original_packag~ length_of_surve~ number_of_years~ std_dev_interva~
#> <chr> <chr> <dbl> <dbl> <dbl>
#> 1 edi.193.5 knb-lter-hfr.11~ 15 13 0.67
#> # ... with 2 more variables: max_num_taxa <dbl>,
#> # geo_extent_bounding_box_m2 <dbl>