Render the contents of metadata templates into EML, validate, and write to file.

make_eml(
  path,
  data.path = path,
  eml.path = path,
  dataset.title = NULL,
  temporal.coverage = NULL,
  geographic.description = NULL,
  geographic.coordinates = NULL,
  maintenance.description = NULL,
  data.table = NULL,
  data.table.name = data.table,
  data.table.description = NULL,
  data.table.quote.character = NULL,
  data.table.url = NULL,
  other.entity = NULL,
  other.entity.name = other.entity,
  other.entity.description = NULL,
  other.entity.url = NULL,
  provenance = NULL,
  user.id = NULL,
  user.domain = NULL,
  package.id = NULL,
  write.file = TRUE,
  return.obj = FALSE,
  x = NULL
)

Arguments

path

(character) Path to the metadata template directory.

data.path

(character) Path to the data directory.

eml.path

(character) Path to the EML directory, where EML files are written.

dataset.title

(character) Title of the dataset.

temporal.coverage

(character) Beginning and ending dates of the dataset in the format "YYYY-MM-DD" (e.g. temporal.coverage = c('2012-05-01', '2014-11-30')).

geographic.description

(character) Description of the dataset's geographic extent. Don't use this argument if geographic coverage is supplied by geographic_coverage.txt.

geographic.coordinates

(character) Coordinates of the dataset's geographic extent. Coordinates are listed in this order: North, East, South, West (e.g. geographic.coordinates = c('28.38', '-119.95', '28.38', '-119.95')). Longitudes west of the prime meridian and latitudes south of the equator are negative. Don't use this argument if geographic coverage is supplied by geographic_coverage.txt.

maintenance.description

(character) A description of data collection status (e.g. "ongoing", "complete"), communicating the frequency of updates.

data.table

(character; optional) Table file name. If more than one, then supply as a vector of character strings (e.g. data.table = c("nitrogen.csv", "decomp.csv")).

data.table.name

(character; optional) A short descriptive name for the table. Defaults to data.table. If more than one, then supply as a vector of character strings in the same order as listed in data.table.

data.table.description

(character; optional) Table description. If more than one, then supply as a vector of character strings in the same order as listed in data.table.

data.table.quote.character

(character; optional) Quote character used in data.table. If more than one, then supply as a vector of character strings in the same order as listed in data.table. If the quote character is a quotation mark, then enter '"'. If the quote character is an apostrophe, then enter "'". To include quote characters for some but not all data.table, use "" for those that don't have a quote character (e.g. data.table.quote.character = c("'", "")).

data.table.url

(character; optional) The publicly accessible URL from which data.table can be downloaded. If more than one, then supply as a vector of character strings in the same order as listed in data.table. To include URLs for some but not all data.table, use "" for those that don't have a URL (e.g. data.table.url = c("", "/url/to/decomp.csv")).

other.entity

(character; optional) Name of other.entity(s) in this dataset. Use other.entity for all non-data.table files. other.entity(s) should be stored at data.path. If more than one, then supply as a vector of character strings (e.g. other.entity = c('ancillary_data.zip', 'quality_control.R')).

other.entity.name

(character; optional) A short descriptive name for the other.entity. Defaults to other.entity. If more than one, then supply as a vector of character strings in the same order as listed in other.entity.

other.entity.description

(character; optional) Description(s) of other.entity(s). If more than one, then supply as a vector of descriptions in the same order as listed in other.entity.

other.entity.url

(character; optional) The publicly accessible URL from which other.entity can be downloaded. If more than one, then supply as a vector of character strings in the same order as listed in other.entity. To include URLs for some but not all other.entity, use "" for those that don't have a URL (e.g. other.entity.url = c("", "/url/to/quality_control.R")).

provenance

(character; optional) EDI Data Repository data package ID(s) of the parent datasets from which this dataset was created (e.g. knb-lter-cap.46.3).

user.id

(character; optional) Repository user identifier. If more than one, then enter as a vector of character strings (e.g. c("user_id_1", "user_id_2")). user.id sets the /eml/access/principal element, except when user.domain is "KNB", "ADC", or NULL.

user.domain

(character; optional) Repository domain associated with user.id. Currently supported values are "EDI" (Environmental Data Initiative), "LTER" (Long-Term Ecological Research Network), "KNB" (The Knowledge Network for Biocomplexity), and "ADC" (The Arctic Data Center). If you'd like your system supported, please contact the maintainers of the EMLassemblyline R package. If using more than one user.domain, then enter as a vector of character strings (e.g. c("user_domain_1", "user_domain_2")) in the same order as the corresponding user.id. If user.domain is missing, then a default value of "unknown" is assigned. user.domain sets the EML header "system" attribute and, for all user.domain except "KNB" and "ADC", sets the /eml/access/principal element attributes and values.

package.id

(character; optional) Data package ID for the dataset described by this EML. Ask your data repository for a package ID. If package.id is missing, a UUID is assigned.

write.file

(logical; optional) Whether to write the EML file.

return.obj

(logical; optional) Whether to return the EML as an R object. This EML object can be modified and written to file with the EML R package.

x

(named list; optional) Alternative input to make_eml(). Use template_arguments() to create x.
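Several of the arguments above are parallel vectors that must follow the order of data.table, with "" standing in for entities that lack a value. A minimal sketch of assembling and checking such vectors before calling make_eml() follows; the list name table_args and the length check are illustrative assumptions, not part of EMLassemblyline, and the file names and URL are the placeholder values used in the argument descriptions.

```r
# Parallel vectors describing two tables, in the same order as data.table.
# table_args is a hypothetical name used only for this sketch.
table_args <- list(
  data.table = c("nitrogen.csv", "decomp.csv"),
  data.table.description = c("Nitrogen data", "Decomposition data"),
  # First table uses an apostrophe as quote character, second has none.
  data.table.quote.character = c("'", ""),
  # Only the second table has a public URL.
  data.table.url = c("", "/url/to/decomp.csv")
)

# Every vector should have one element per table; stop early if not.
stopifnot(all(lengths(table_args) == length(table_args$data.table)))
```

Checking lengths up front is a design choice for this sketch: a vector that is one element short would otherwise silently pair a description or URL with the wrong table.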

Value

  • EML file written to eml.path when write.file = TRUE.

  • EML object when return.obj = TRUE.

Details

make_eml() reads the contents of metadata templates, auto-extracts additional metadata from the data entities, appends value-added content (e.g. resolving keywords to controlled vocabularies), and adds all the metadata content to locations in the EML schema in accordance with best practice recommendations of scientists, data managers, and data repositories. The EML is then validated against the schema and written to file.

Character encodings in tabular metadata templates are converted to UTF-8 via enc2utf8(). Characters in TextType metadata templates are not yet converted. Note: This may lead to a mismatch between values in the data objects and what is reported in the EML (e.g. a categorical variable listed in the EML may not be the same as its corresponding value in the data object). For this reason it's important to work with UTF-8 encoded data and metadata.
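One way to catch encoding problems before running make_eml() is to scan text files for lines that are not valid UTF-8. The sketch below uses only base R; find_non_utf8() is a hypothetical helper (not an EMLassemblyline function), the temporary file stands in for a template or data file, and the assumption that the offending line is Latin-1 encoded is made purely for illustration.

```r
# Hypothetical helper: return the line numbers of a text file whose bytes
# are not valid UTF-8.
find_non_utf8 <- function(file) {
  lines <- readLines(file, warn = FALSE)
  which(!validUTF8(lines))
}

# Demonstration file: line 1 is plain ASCII, line 2 is "caf" followed by
# the Latin-1 byte 0xE9 (e with acute accent), which is invalid as UTF-8.
tmp <- tempfile(fileext = ".txt")
writeBin(
  c(charToRaw("site\tname\n"), as.raw(c(0x63, 0x61, 0x66, 0xe9, 0x0a))),
  tmp
)
bad <- find_non_utf8(tmp)

# Assuming the source encoding is Latin-1, mark the string accordingly and
# convert it to UTF-8 with enc2utf8().
x <- readLines(tmp, warn = FALSE)[bad]
Encoding(x) <- "latin1"
x_utf8 <- enc2utf8(x)
```

enc2utf8() can only convert strings whose encoding is known or declared, so the source encoding has to be identified first; guessing it wrongly produces valid but incorrect UTF-8.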

Examples

if (FALSE) {

# Set working directory
setwd("/Users/me/Documents/data_packages/pkg_260")

# For 2 tables and 2 other entities
make_eml(
  path = "./metadata_templates",
  data.path = "./data_objects",
  eml.path = "./eml",
  dataset.title = "Sphagnum and Vascular Plant Decomposition under Increasing Nitrogen Additions",
  temporal.coverage = c("2014-05-01", "2015-10-31"),
  maintenance.description = "Completed: No updates to these data are expected",
  data.table = c("decomp.csv", "nitrogen.csv"),
  data.table.description = c("Decomposition data", "Nitrogen data"),
  other.entity = c("ancillary_data.zip", "processing_and_analysis.R"),
  other.entity.description = c("Ancillary data", "Data processing and analysis script"),
  user.id = "myid",
  user.domain = "EDI",
  package.id = "edi.260.1")
}
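
The return.obj and write.file arguments can be combined to obtain the EML as an R object without writing a file, for example to adjust it before writing. The sketch below reuses the paths and values from the example above; the edited title, the output file name, and the guard that skips the call when the packages or template directory are absent are assumptions made for illustration. It assumes the returned object is a list-like EML object that EML::eml_validate() and EML::write_eml() accept.

```r
# Arguments taken from the example above, but returning the EML object
# instead of writing it to file.
eml_args <- list(
  path = "./metadata_templates",
  data.path = "./data_objects",
  dataset.title = "Sphagnum and Vascular Plant Decomposition under Increasing Nitrogen Additions",
  temporal.coverage = c("2014-05-01", "2015-10-31"),
  maintenance.description = "Completed: No updates to these data are expected",
  data.table = c("decomp.csv", "nitrogen.csv"),
  data.table.description = c("Decomposition data", "Nitrogen data"),
  user.id = "myid",
  user.domain = "EDI",
  package.id = "edi.260.1",
  write.file = FALSE,
  return.obj = TRUE
)

# Run only when the packages and the template directory are available.
if (requireNamespace("EMLassemblyline", quietly = TRUE) &&
    requireNamespace("EML", quietly = TRUE) &&
    dir.exists(eml_args$path)) {
  eml <- do.call(EMLassemblyline::make_eml, eml_args)

  # Illustrative edit, then re-validate and write with the EML package.
  eml$dataset$title <- paste(eml$dataset$title, "(revised)")
  EML::eml_validate(eml)
  EML::write_eml(eml, "./eml/edi.260.1.xml")
}
```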