Render the contents of metadata templates into EML, validate, and write to file.
make_eml(
  path,
  data.path = path,
  eml.path = path,
  dataset.title = NULL,
  temporal.coverage = NULL,
  geographic.description = NULL,
  geographic.coordinates = NULL,
  maintenance.description = NULL,
  data.table = NULL,
  data.table.name = data.table,
  data.table.description = NULL,
  data.table.quote.character = NULL,
  data.table.url = NULL,
  other.entity = NULL,
  other.entity.name = other.entity,
  other.entity.description = NULL,
  other.entity.url = NULL,
  provenance = NULL,
  user.id = NULL,
  user.domain = NULL,
  package.id = NULL,
  write.file = TRUE,
  return.obj = FALSE,
  x = NULL
)
(character) Path to the metadata template directory.
(character) Path to the data directory.
(character) Path to the EML directory, where EML files are written.
(character) Title of the dataset.
(character) Beginning and ending dates of the dataset in the format
"YYYY-MM-DD" (e.g. temporal.coverage = c('2012-05-01', '2014-11-30')).
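A minimal sketch of checking a candidate temporal.coverage value before calling make_eml(). The helper check_temporal_coverage() is hypothetical and not part of EMLassemblyline, and the rule that the beginning date must not follow the ending date is an assumption rather than documented behavior.

```r
# Hypothetical helper (not part of EMLassemblyline): validate a candidate
# temporal.coverage value of the form c("YYYY-MM-DD", "YYYY-MM-DD").
check_temporal_coverage <- function(x) {
  stopifnot(is.character(x), length(x) == 2)
  # Require the literal YYYY-MM-DD shape first ...
  if (!all(grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x))) {
    stop("Dates must be formatted as YYYY-MM-DD")
  }
  # ... then confirm each string is a real calendar date
  dates <- as.Date(x, format = "%Y-%m-%d")
  if (anyNA(dates)) {
    stop("Dates must be valid calendar dates")
  }
  # Assumption: the beginning date should not come after the ending date
  if (dates[1] > dates[2]) {
    stop("Beginning date must not follow ending date")
  }
  invisible(x)
}

temporal_coverage <- check_temporal_coverage(c("2012-05-01", "2014-11-30"))
```

The checked vector can then be passed as temporal.coverage = temporal_coverage.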
(character) Description of the dataset's geographic extent. Don't use this argument if geographic coverage is supplied by geographic_coverage.txt.
(character) Coordinates of the dataset's geographic extent, listed in
this order: North, East, South, West (e.g.
geographic.coordinates = c('28.38', '-119.95', '28.38', '-119.95')).
Longitudes west of the prime meridian and latitudes south of the equator
are negative. Don't use this argument if geographic coverage is supplied
by geographic_coverage.txt.
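Because the North, East, South, West order is easy to get wrong, one option is to build the vector from named values. This is only a sketch: bounding_box() is a hypothetical helper, not an EMLassemblyline function, and its range checks are assumptions based on ordinary latitude and longitude limits.

```r
# Hypothetical helper (not part of EMLassemblyline): assemble the character
# vector expected by geographic.coordinates in North, East, South, West order.
bounding_box <- function(north, east, south, west) {
  stopifnot(
    north >= south,                   # northern edge is not below southern edge
    all(abs(c(north, south)) <= 90),  # latitudes in decimal degrees
    all(abs(c(east, west)) <= 180)    # longitudes in decimal degrees
  )
  as.character(c(north, east, south, west))
}

# The single point used in the example above: west longitudes are negative
geographic_coordinates <- bounding_box(
  north = 28.38, east = -119.95, south = 28.38, west = -119.95
)
```

Naming each edge at the call site keeps the order explicit, and the result can be passed as geographic.coordinates = geographic_coordinates.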
(character) A description of data collection status (e.g. "ongoing", "complete"), communicating the frequency of updates.
(character; optional) Table file name. If more than one, then supply
as a vector of character strings (e.g.
data.table = c("nitrogen.csv", "decomp.csv")).
(character; optional) A short descriptive name for the table. Defaults
to data.table. If more than one, then supply as a vector of character
strings in the same order as listed in data.table.
(character; optional) Table description. If more than one, then supply
as a vector of character strings in the same order as listed in
data.table.
(character; optional) Quote character used in data.table. If more than
one, then supply as a vector of character strings in the same order as
listed in data.table. If the quote character is a quotation mark, then
enter '"'. If the quote character is an apostrophe, then enter "'". To
include quote characters for some but not all data.table files, use ""
for those that don't have a quote character (e.g.
data.table.quote.character = c("'", "")).
(character; optional) The publicly accessible URL from which data.table
can be downloaded. If more than one, then supply as a vector of
character strings in the same order as listed in data.table. To include
URLs for some but not all data.table files, use "" for those that don't
have a URL (e.g. data.table.url = c("", "/url/to/decomp.csv")).
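The per-table arguments above must all follow the order of data.table, with "" standing in for tables that have no value. A small sketch of one way to keep them aligned is shown here; align_to_tables() is a hypothetical helper, not part of EMLassemblyline, and the URL is the placeholder from the example above.

```r
# File names, in the order they will be passed to data.table
tables <- c("nitrogen.csv", "decomp.csv")

# Named lookups listing only the tables that actually have a value
urls   <- c("decomp.csv" = "/url/to/decomp.csv")  # placeholder URL
quotes <- c("nitrogen.csv" = "'")

# Hypothetical helper (not part of EMLassemblyline): expand a named lookup to
# one entry per table, in table order, using "" where no value is given.
align_to_tables <- function(lookup, tables) {
  out <- unname(lookup[tables])  # tables missing from the lookup become NA
  out[is.na(out)] <- ""
  out
}

data_table_url <- align_to_tables(urls, tables)
data_table_quote_character <- align_to_tables(quotes, tables)
```

Keying the optional values by file name means reordering tables cannot silently mismatch a URL or quote character with the wrong file.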
(character; optional) Name of other.entity(s) in this dataset. Use
other.entity for all non-data.table files. other.entity(s) should be
stored at data.path. If more than one, then supply as a vector of
character strings (e.g.
other.entity = c('ancillary_data.zip', 'quality_control.R')).
(character; optional) A short descriptive name for the other.entity.
Defaults to other.entity. If more than one, then supply as a vector of
character strings in the same order as listed in other.entity.
(character; optional) Description(s) of other.entity(s). If more than
one, then supply as a vector of descriptions in the same order as listed
in other.entity.
(character; optional) The publicly accessible URL from which
other.entity can be downloaded. If more than one, then supply as a
vector of character strings in the same order as listed in other.entity.
To include URLs for some but not all other.entity files, use "" for
those that don't have a URL (e.g.
other.entity.url = c("", "/url/to/quality_control.R")).
(character; optional) EDI Data Repository data package ID(s)
corresponding to parent datasets from which this dataset was created
(e.g. knb-lter-cap.46.3).
(character; optional) Repository user identifier. If more than one,
then enter as a vector of character strings (e.g.
c("user_id_1", "user_id_2")). user.id sets the /eml/access/principal
element for every user.domain except "KNB" and "ADC", and except when
user.domain = NULL.
(character; optional) Repository domain associated with user.id.
Currently supported values are "EDI" (Environmental Data Initiative),
"LTER" (Long-Term Ecological Research Network), "KNB" (The Knowledge
Network for Biocomplexity), and "ADC" (The Arctic Data Center). If you'd
like your system supported, please contact the maintainers of the
EMLassemblyline R package. If using more than one user.domain, then
enter as a vector of character strings (e.g.
c("user_domain_1", "user_domain_2")) in the same order as the
corresponding user.id. If user.domain is missing, the default value
"unknown" is assigned. user.domain sets the EML header "system"
attribute and, for every user.domain except "KNB" and "ADC", sets the
/eml/access/principal element attributes and values.
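Since user.id and user.domain are matched by position, it can help to keep each pair together until the call. The sketch below is illustrative only: the accounts data frame and its identifiers are made up, and the supported vector simply restates the four domains listed above.

```r
# Illustrative accounts: each row pairs one user.id with its user.domain
accounts <- data.frame(
  user.id = c("user_id_1", "user_id_2"),
  user.domain = c("EDI", "LTER"),
  stringsAsFactors = FALSE
)

# Domains listed as currently supported in the argument description
supported <- c("EDI", "LTER", "KNB", "ADC")
stopifnot(all(accounts$user.domain %in% supported))

# Two vectors in matching order, ready for user.id and user.domain
user_id <- accounts$user.id
user_domain <- accounts$user.domain
```

Extracting both vectors from the same rows keeps each identifier lined up with its domain.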
(character; optional) Data package ID for the dataset described by this
EML. Ask your data repository for a package ID. If package.id is
missing, a UUID is assigned.
(logical; optional) Whether to write the EML file.
(logical; optional) Whether to return the EML as an R object of class EML object. This EML object can be modified and written to file using the EML R library.
(named list; optional) Alternative input to make_eml(). Use
template_arguments() to create x.
EML file written to eml.path.
EML object when return.obj = TRUE.
make_eml() reads the contents of metadata templates, auto-extracts
additional metadata from the data entities, appends value-added content
(e.g. resolving keywords to controlled vocabularies), and adds all the
metadata content to locations in the EML schema in accordance with
best-practice recommendations of scientists, data managers, and data
repositories. The EML is then validated against the schema and written
to file.
Character encodings in tabular metadata templates are converted to UTF-8
via enc2utf8(). Characters in TextType metadata templates are not yet
converted. Note: This may lead to inaccuracies and a disconnect between
values in the data objects and what is reported in the EML (e.g. a
categorical variable listed in the EML may not be the same as its
corresponding value in the data object). For this reason it's important
to work with UTF-8 encoded data and metadata.
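As a rough illustration of the conversion described above, the following base R sketch shows a Latin-1 string failing a UTF-8 validity check until it is passed through enc2utf8(). It does not reproduce how make_eml() applies the conversion internally; the string is built from raw bytes so the example does not depend on the encoding of the script itself.

```r
# "café" encoded as Latin-1: the final byte 0xe9 is not valid UTF-8 on its own
x <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))
Encoding(x) <- "latin1"   # declare the encoding so R knows how to convert it

valid_before <- validUTF8(x)  # the raw bytes are not a valid UTF-8 sequence

# enc2utf8() converts from the declared encoding to UTF-8
y <- enc2utf8(x)
valid_after <- validUTF8(y)

# After conversion, "é" is the two-byte UTF-8 sequence c3 a9
utf8_bytes <- charToRaw(y)
```

A check such as validUTF8() on values read from data objects and templates is one way to spot non-UTF-8 content before building EML.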
if (FALSE) {
  # Set working directory
  setwd("/Users/me/Documents/data_packages/pkg_260")

  # For 2 tables and 2 other entities
  make_eml(
    path = "./metadata_templates",
    data.path = "./data_objects",
    eml.path = "./eml",
    dataset.title = "Sphagnum and Vascular Plant Decomposition under Increasing Nitrogen Additions",
    temporal.coverage = c("2014-05-01", "2015-10-31"),
    maintenance.description = "Completed: No updates to these data are expected",
    data.table = c("decomp.csv", "nitrogen.csv"),
    data.table.description = c("Decomposition data", "Nitrogen data"),
    other.entity = c("ancillary_data.zip", "processing_and_analysis.R"),
    other.entity.description = c("Ancillary data", "Data processing and analysis script"),
    user.id = "myid",
    user.domain = "EDI",
    package.id = "edi.260.1"
  )
}