NEWS.md
template_geographic_coverage() now accepts numeric values to the site.col parameter (#121).
emld complicate validation of EML files input to some EAL functions. Validation of such inputs has been suspended until a permanent fix can be made. These changes have no impact on EAL working correctly, unless an invalid EML record is input, in which case the function will fail if there is a critical issue.
data.table::fread().
taxize::gnr_datasources(). This issue has been fixed in the taxonomyCleanr dependency.
make_eml(). See the template_taxonomic_coverage() function docs for more details. NOTE: These new methods don’t facilitate expansion of a taxon resolved in an unsupported system to the full classification hierarchy that is currently available when using ITIS, WORMS, or GBIF. That will require additional effort. Additionally, template_taxonomic_coverage() now has an empty argument for returning an empty template. This enhancement partially addresses issue #50.
Semantic annotation: EML can now be annotated. This implementation supports two use cases:
template_annotations() to create the annotations template
make_eml()
Note: All annotated elements are assigned ids and their annotations are placed both immediately under the parent element (subject) and within the /eml/annotations node through id+reference pairs. This redundant approach supports variation in where EML metadata consumers prefer to harvest this information and supports annotation of EML elements requiring id+reference pairs.
Provenance metadata template: This extends support for provenance metadata of data sources external to the EDI Data Repository. Create the template with template_provenance(). Fixes issue #8.
Allow creation of partial EML (part 2): This completes implementation of issue #34 by moving all evaluation of inputs to make_eml() (and associated warning and error handling) from various locations in the code base to validate_templates(). With this implementation comes a new approach to communicating input issues to the user via template_issues, an object written to the global environment and formatted into a human readable report (message) when passed through issues().
UTF-8 character encoding: EMLassemblyline extracts metadata from data objects and may malform this content if the character encoding is not supported. In an attempt to minimize this issue and convert metadata into the UTF-8 encoding expected by EML, the base R function enc2utf8() has been implemented anywhere metadata is extracted from data objects and written to file (i.e. templating functions) and anywhere template content is added to the EML (i.e. make_eml()). Because this may create EML that inaccurately represents the data object it describes (e.g. categorical variables encoded in UTF-8 but the data encoded in something else), warnings are now issued when the input data object is not UTF-8 (or ASCII) encoded as estimated by readr::guess_encoding(). Additionally, EMLassemblyline documentation now emphasizes the importance of encoding data objects in UTF-8 first and then beginning the metadata creation process. An encoding conversion of TextType metadata (i.e. abstract, methods, additional_info) has not yet been implemented.
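The conversion step described in the UTF-8 entry can be illustrated with base R alone. This is a minimal sketch, not the package's implementation: `to_utf8()` is a hypothetical helper, and `validUTF8()` is used here as a simple stand-in for the `readr::guess_encoding()` estimate mentioned above.

```r
# Hypothetical helper (not part of EMLassemblyline): warn when the input
# bytes are not valid UTF-8, then convert with enc2utf8().
to_utf8 <- function(x) {
  if (!all(validUTF8(x))) {
    warning("Input does not appear to be UTF-8 encoded; converting with enc2utf8()")
  }
  # enc2utf8() relies on the declared encoding of each string; strings of
  # unknown encoding are assumed to be in the native encoding.
  enc2utf8(x)
}

# Build a Latin-1 string byte by byte ("caf" followed by 0xE9, e-acute in
# Latin-1) so the example does not depend on the encoding of this file.
latin1 <- rawToChar(as.raw(c(0x63, 0x61, 0x66, 0xe9)))
Encoding(latin1) <- "latin1"

converted <- suppressWarnings(to_utf8(latin1))
validUTF8(latin1)     # the original bytes are not valid UTF-8
validUTF8(converted)  # the converted bytes are
```

As the entry notes, converting the metadata does not change the encoding of the data object itself, which is why encoding the data in UTF-8 before templating remains the recommended order.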
template_core_metadata() and template_table_attributes().
template_categorical_variables().
template_geographic_coverage().
user.domain
data.table
data.table.description
data.table.quote.character
data.table.url
other.entity
other.entity.description
template_table_attributes() errors. This fix is an addendum to the prior fix (2.18.1).
template_table_attributes() errors. This has been fixed.
make_eml() (often communicating best practice recommendations) have been refactored to return warnings rather than errors. Fixes issue #34.
template_geographic_coverage() lat.col and lon.col arguments expect numeric inputs and error if non-NA missing value codes are present. The values are now coerced to numeric, only complete cases are returned in the geographic coverage template, and no errors occur.
make_eml() function arguments user.id, user.domain, and package.id. Details are listed in the function documentation.
maintenance.description of make_eml() is no longer required; however, a missing maintenance.description will return a warning with the recommended best practice.
mime library. Undetected MIME Types are listed as “Unknown”. Fixes issue #68.
make_eml() arguments data.table.url and other.entity.url. Some use cases require assignment of a URL to only one in a list of two or more. This constraint has been relaxed so if a data object doesn’t have a corresponding URL then use the values "" or NA (e.g. if in make_eml() the argument data.table = c("nitrogen.csv", "decomp.csv"), and a URL only exists for the second object, then data.table.url = c("", "/url/to/decomp.csv")).
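The positional pairing of data.table and data.table.url described in the entry above can be pictured with a few lines of base R. This is only a sketch of the idea: `pair_urls()` is a hypothetical helper, not an EMLassemblyline function, and the file names and URL are the ones used in the entry.

```r
# Values taken from the changelog entry
data.table     <- c("nitrogen.csv", "decomp.csv")
data.table.url <- c("", "/url/to/decomp.csv")

# Hypothetical helper: pair each data object with its URL by position,
# treating "" and NA as "no URL for this object".
pair_urls <- function(objects, urls) {
  stopifnot(length(objects) == length(urls))
  urls[is.na(urls) | urls == ""] <- NA_character_
  stats::setNames(urls, objects)
}

pairs <- pair_urls(data.table, data.table.url)
pairs
```

The point of the sketch is that both vectors keep the same length and order, so a placeholder is needed for every object that lacks a URL.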
Installation: Simplified instructions so dependencies will be installed but users will not be asked to upgrade installed packages (a point of confusion among many).
Default false numeric attributes to character: Default user specified numeric attributes to character class when the attribute contains character values outside of that specified under the missingValueCode field of the attributes.txt template. A warning alerts the user of the issue and preserves the original data by not coercing to numeric.
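The rule in the entry above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: `resolve_numeric_class()` is a hypothetical helper, and treating empty strings as missing is an assumption made here for the example.

```r
# Hypothetical helper: decide the class of an attribute the user declared
# numeric. Values equal to the missing value code (and, by assumption,
# NA or empty strings) are ignored; any other value that cannot be parsed
# as a number triggers a warning and a fall back to "character".
resolve_numeric_class <- function(values, missing_value_code = NA_character_) {
  values <- as.character(values)
  is_missing <- is.na(values) | values == "" | values %in% missing_value_code
  present <- values[!is_missing]
  parsed <- suppressWarnings(as.numeric(present))
  bad <- unique(present[is.na(parsed)])
  if (length(bad) > 0) {
    warning(
      "Non-numeric values found (", paste(bad, collapse = ", "),
      "); defaulting to character"
    )
    return("character")
  }
  "numeric"
}

# Only the declared missing value code is non-numeric: stays numeric
resolve_numeric_class(c("1.5", "NA_code", "2"), missing_value_code = "NA_code")

# "trace" is not a declared missing value code: falls back to character
suppressWarnings(
  resolve_numeric_class(c("1.5", "trace", "2"), missing_value_code = "NA_code")
)
```

Because the helper only reports a class and never coerces the input, the original values are left untouched, which mirrors the behavior the entry describes.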
template_directories(). The boilerplate is meant to be a reminder and save the user a little time. Fixes issue #36.
make_eml() but no errors were returned when missing from personnel.txt. The logic of validate_templates() has been updated to fix this issue.
template_categorical_variables() has been updated to recognize more missing value code types.
make_eml() code: (For developers) The underlying code of make_eml() is now more concise and understandable.
make_eml() via files or the input argument x.
data.table::fread() to guessing a delimiter other than “\t”. This issue has been fixed by explicitly stating the expected field delimiter.
template_categorical_variables() to crash: Errors occurred when input file names contained spaces. Using spaces is still a common practice among users. To accommodate this while continuing to promote best practices, the naming restriction has been relaxed and the best practices have been made a warning. The function checking file presence and naming conventions is EDIutils::validate_file_name(). An explicit file name specification (i.e. including extension) is now required, which precludes errors when the same file name is used among different file types in the same directory. Fixes issue #25.
template_taxonomic_coverage(): If the taxa of a dataset are in more than one table, then a user would want to extract the unique taxa from all the tables and compile them into the taxonomic_coverage.txt template. Multiple inputs to the taxa.table and taxa.col arguments are now supported.
make_eml() began failing with release of the dependency library EML 2.0.2. This has been fixed.
; delimiters: Data tables with semi-colon delimiters were not supported. This was fixed by updating EDIutils::detect_delimiter() (issue #6 of the EDIutils package).
data.table.description in make_eml() was used to fill in the entity name. However, they are not the same and the entity name should be specified separately. This was fixed by adding data.table.name and other.entity.name as arguments to make_eml(). The fix defaults data.table.name to data.table and other.entity.name to other.entity with a warning message. Fixes issue #24.
template_geographic_coverage(): NULL was output from this function when empty = TRUE, which is mostly a cosmetic issue. This was fixed by implementing some simple logic. Fixes issue #32.
utils::read.table(). To fix this, data.table::fread(), a more autonomous and robust reader, replaced read.table() for reading both data and metadata templates. Fixes issue #41.
element thereby invalidating the EML. This was fixed by adding escape characters to the quotes.
template_taxonomic_coverage(): Travis CI has been failing because of long responses from API calls made by template_taxonomic_coverage(). To expedite tests and reduce errors, the example data now contains substantially fewer taxa to be resolved against authority systems.
view_unit_dictionary() function was opening the unit dictionary in a separate non-searchable window. By removing the utils namespace from the function call, the unit dictionary now opens within the RStudio source pane where searching is supported.
EML v2.0.0 refactor resulted in changes to how missing value codes are handled. This fix restores the original functionality where empty character strings in the missing value code and explanation fields don’t result in validation errors.
make_eml() function at a time. Valid sources are the geographic_coverage.txt template, the geographic.coordinates and geographic.description arguments of make_eml(), and the deprecated bounding_boxes.txt template.
template_categorical_variables(). This issue has been fixed.
taxonomyCleanr::make_taxonomicCoverage(). This issue has been fixed in that project’s GitHub master branch, and the necessary adjustments have been made to EMLassemblyline::make_eml().
EMLassemblyline has been refactored to run with the EML v2.0.0 dependency.
markdown > Pandoc > docbook.
template_geographic_coverage() argument empty = TRUE to create an empty geographic_coverage.txt template.
template_table_attributes() instead of with template_core_metadata(). This is a more logical pairing.
make_eml().
EMLassemblyline 2.4.6 functions that should be otherwise accessible for backwards compatibility.
pkgdown.
template_* to simplify user understanding.
template_arguments() Create template for all user inputs to EMLassemblyline (i.e. metadata template content and function arguments) to enable entirely programmatic workflows, with a focus on supporting content ingestion from upstream metadata sources.
template_categorical_variables() Create categorical variables template (previously named define_catvars()).
template_core_metadata() Create core metadata templates (previously part of import_templates()).
template_directories() Create a simple and effective directory structure for EMLassemblyline files and data package contents.
template_geographic_coverage() Create geographic coverage template (a refactor of extract_geocoverage()).
template_table_attributes() Create table attributes templates (previously part of import_templates()).
template_taxonomic_coverage() Create the taxonomic coverage template for resolving taxa to one or more authority systems and supporting creation of the hierarchical rank specific EML taxonomicCoverage element by make_eml().
user.id, user.domain, package.id) have been relaxed to enable creation of EML for other data repositories.
data.table.description and other.entity.description for //dataTable/entityName and //otherEntity/entityName, respectively. This provides a more meaningful file description than the file name itself.
Several templating functions, templates, and arguments have been deprecated. Full backwards compatibility of these functions, templates, and arguments will be supported for the next year (i.e. until May 1, 2020).
Functions:
import_templates() is deprecated in favor of template_core_metadata() (i.e. metadata required by all data packages) and template_table_attributes() (i.e. metadata for data tables).
define_catvars() is deprecated in favor of template_categorical_variables().
extract_geocoverage() is deprecated in favor of template_geographic_coverage().
Templates:
Arguments:
data.files is deprecated in favor of data.table
data.files.description is deprecated in favor of data.table.description
data.files.quote.character is deprecated in favor of data.table.quote.character
data.files.url is deprecated in favor of data.url
zip.dir is deprecated in favor of other.entity
zip.dir.description is deprecated in favor of other.entity.description
affiliation is deprecated in favor of user.domain
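For scripts written against the deprecated argument names, the renames listed above can be captured in a lookup vector. The following is a rough sketch only: `rename_deprecated_args()` is a hypothetical helper, not an EMLassemblyline function, and the argument values in the usage example are illustrative.

```r
# Deprecated argument name -> replacement, exactly as listed above
deprecated_args <- c(
  data.files                 = "data.table",
  data.files.description     = "data.table.description",
  data.files.quote.character = "data.table.quote.character",
  data.files.url             = "data.url",
  zip.dir                    = "other.entity",
  zip.dir.description        = "other.entity.description",
  affiliation                = "user.domain"
)

# Hypothetical helper: rename deprecated entries in a named argument list,
# warning once per renamed argument. Other arguments pass through unchanged.
rename_deprecated_args <- function(args) {
  if (is.null(names(args))) {
    return(args)
  }
  old <- names(args) %in% names(deprecated_args)
  for (nm in names(args)[old]) {
    warning("Argument '", nm, "' is deprecated; use '", deprecated_args[[nm]], "'")
  }
  names(args)[old] <- unname(deprecated_args[names(args)[old]])
  args
}

# Illustrative values only
old_call <- list(data.files = "nitrogen.csv", affiliation = "EDI", package.id = "edi.1.1")
new_call <- suppressWarnings(rename_deprecated_args(old_call))
names(new_call)
```

A list prepared this way could then be passed on with `do.call()`, though whether that suits a given workflow depends on the installed package version.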
make_eml(). This element was missing though documentation implied its existence.