Create the taxa table, which maps resolved taxa back to the raw taxa in the original data table and is populated with provenance information about the taxa cleaning process.

create_taxa_map(path, x, col)

Arguments

path

A character string specifying the path to which the taxa table will be written.

x

A data frame containing the vector of taxa names to be cleaned.

col

A character string specifying the column in x containing taxa names to be cleaned.

Value

(data frame; taxa_map.csv) A data frame, also written to path as taxa_map.csv, with the fields:

  • 'taxa_raw' Unique taxa names listed in x.

  • 'taxa_trimmed' The contents of taxa_raw, but with white space and common abbreviations (e.g. "Spp.", "C.f.") trimmed. Column contents are outputs from `trim_taxon`.

  • 'taxa_replacement' The taxa name used as a replacement for taxa_raw. Column contents are outputs from `replace_taxon`.

  • 'taxa_removed' A logical value indicating whether the corresponding taxa_raw should be removed. Column contents are outputs from `remove_taxon`.

  • 'taxa_clean' Cleaned taxa names that have been resolved to a taxonomic authority. Column contents are outputs from `resolve_taxa` and `resolve_common`.

  • 'rank' Taxonomic rank of the resolved taxon. Column contents are outputs from `resolve_taxa` and `resolve_common`.

  • 'authority' Taxonomic authorities against which taxa_clean was resolved. Column contents are outputs from `resolve_taxa` and `resolve_common`.

  • 'authority_id' Unique identification numbers within each authority. Column contents are outputs from `resolve_taxa` and `resolve_common`.

  • 'score' A numeric score, supplied by the authority, indicating the strength of match between taxa_raw and taxa_clean. Column contents are outputs from `resolve_taxa` and `resolve_common`.

  • 'difference' A logical value indicating whether the contents of taxa_clean differ from taxa_raw.
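
Examples

The structure of the returned table can be illustrated with a minimal base R sketch. This is not the package's implementation: `sketch_taxa_map`, the example data frame, and the column types (for instance, storing authority_id as character) are assumptions made for illustration only. It assumes the table starts with one row per unique raw name and that the remaining fields stay empty until the later cleaning functions fill them.

```r
# Hypothetical input: a data frame with a column of raw taxa names.
x <- data.frame(
  site = c("A", "A", "B", "B"),
  taxon = c("Quercus alba", "Quercus alba", "Acer spp.", " Pinus strobus "),
  stringsAsFactors = FALSE
)
col <- "taxon"

# Illustrative stand-in for create_taxa_map(), not the real function:
# one row per unique raw name, all other fields NA until they are
# populated by the later cleaning steps (trim, replace, remove, resolve).
sketch_taxa_map <- function(x, col) {
  taxa_raw <- unique(x[[col]])
  n <- length(taxa_raw)
  data.frame(
    taxa_raw = taxa_raw,
    taxa_trimmed = rep(NA_character_, n),
    taxa_replacement = rep(NA_character_, n),
    taxa_removed = rep(NA, n),
    taxa_clean = rep(NA_character_, n),
    rank = rep(NA_character_, n),
    authority = rep(NA_character_, n),
    authority_id = rep(NA_character_, n),  # type is an assumption
    score = rep(NA_real_, n),
    difference = rep(NA, n),
    stringsAsFactors = FALSE
  )
}

taxa_map <- sketch_taxa_map(x, col)

# Write the table to a temporary directory as taxa_map.csv.
path <- tempdir()
write.csv(taxa_map, file.path(path, "taxa_map.csv"), row.names = FALSE)
```

With the documented function, the equivalent call would take the form `create_taxa_map(path = path, x = x, col = "taxon")`. Duplicate raw names collapse to a single row, so each later cleaning step is recorded once per distinct name.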