[entity] = dataTable, spatialRaster, spatialVector, storedProcedure, view, otherEntity

This element is found at this location (XPath): /eml:eml/dataset/[entity]

General information: If at all possible, do not publish data in dated, proprietary, or binary formats such as MS Excel; instead, export to plain-text representations such as CSV. The entity types <dataTable>, <otherEntity> and <view> cover many commonly encountered data structures and are covered here. The remaining three (<spatialRaster>, <spatialVector> and <storedProcedure>) will be addressed in more depth in a future version of this document. Table 1 gives the general features of EML’s six entity types, to assist in selection.

Table 1. Summary of the six entity types in EML 2, including the type of data typically described with each element, how the data are created, and a brief description of the element's metadata.

Element name | Used for | Created from | Metadata features
dataTable | Static ASCII tables | export from code, RDBMS or spreadsheets | columns/rows named and defined, e.g., measurement and storage typing
otherEntity | Binary files, images, maps, KML, KMZ, code | applications | type of entity
spatialRaster | grid, raster cell data, remote sensing data | applications, stylesheet conversions. See "Other Resources" | spatial organization of the raster cells, their data values, and if derived via imaging sensors, characteristics about the image and its individual bands
spatialVector | lines, points, polygons, KML (if converted), ESRI shape files | applications, stylesheet conversions. See "Other Resources" | information about the vector's geometry type, count and topology level
view | Data returned from a database query | RDBMS | similar to dataTable, plus description of the query
storedProcedure | Data returned from a stored procedure in a database | RDBMS | similar to dataTable, plus procedure’s parameters

Every EML data entity has a set of elements in common, called the EntityGroup tree, which describe general information about any data resource. Other elements are provided which are unique to each entity type. The elements in the EntityGroup appear first, and are: <alternateIdentifier>, <entityName>, <entityDescription>, <physical> (including optional <access>), <coverage>, <methods>, and <additionalInfo>.

<alternateIdentifier> (optional): The primary identifier belongs in the id attribute of the entity element (e.g., <dataTable id="xxx">), but this element can accommodate additional identifiers that might be used, possibly from different data management systems. It is used similarly to the <alternateIdentifier> element at the dataset level, above.

<entityName> (required): the name of the table, file or database table. In the early phases of EML adoption, this was often the original ASCII file name. However, a better analogy is that the <entityName> is a class, e.g., “FLS time series of air temperature at field station,” with its instantiation (filename) in the <objectName> element (see below).

Context: The EDI repository requires that each <entityName> be unique within the dataset.
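The uniqueness requirement can be checked before submission. The following is a minimal sketch, not the repository's own check: it assumes that uniqueness is meant across all entities of one dataset, and it uses a heavily trimmed, hypothetical dataset fragment (the entity names and ids are invented for illustration and the fragment is not a valid EML document).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Entity element names, as listed at the top of this section.
ENTITY_TYPES = (
    "dataTable", "spatialRaster", "spatialVector",
    "storedProcedure", "view", "otherEntity",
)

# Hypothetical, trimmed fragment for illustration only (not valid EML).
SAMPLE = """
<dataset>
  <dataTable id="t1"><entityName>Air temperature</entityName></dataTable>
  <dataTable id="t2"><entityName>Air temperature</entityName></dataTable>
  <otherEntity id="o1"><entityName>Site map</entityName></otherEntity>
</dataset>
"""


def duplicate_entity_names(dataset_xml: str) -> list[str]:
    """Return the entityName values that occur more than once."""
    dataset = ET.fromstring(dataset_xml)
    names = [
        (entity.findtext("entityName") or "").strip()
        for tag in ENTITY_TYPES
        for entity in dataset.findall(tag)
    ]
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


print(duplicate_entity_names(SAMPLE))  # prints ['Air temperature']
```

An empty list means every entity name in the fragment is distinct. A real document would be parsed from the eml:eml root and the <dataset> element located first.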

<entityDescription> should be a longer, more descriptive explanation of the data in the entity. Like all descriptions, it is human-readable, and should help a user determine whether the data are appropriate for a particular use.

The <physical> tree (/eml:eml/dataset/[entity]/physical) further describes the physical format of the data.

<objectName> should be the name of the file when downloaded, or when exported as text from a database. It is often the name of a file in a file system, or of a file that is accessible on the network.

<externallyDefinedFormat> For data entities in prescribed formats (e.g., NetCDF, KML, Excel), name that format in externallyDefinedFormat/formatName. It is recommended that, where possible, format names be drawn from the formatNames in DataONE’s objectFormatList. Descriptions that are software-specific should include manufacturer, program, and version, e.g., “Microsoft Excel OpenXML.”
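To show where <objectName> and the format name sit relative to each other, here is a minimal sketch that builds a <physical> fragment with Python's standard library. In the EML schema, <externallyDefinedFormat> is nested inside a <dataFormat> element. The helper name build_physical, the file name example.nc, and the format string "NetCDF" are placeholders chosen for illustration; an actual formatName should be checked against DataONE's list. Other children of <physical> (for example <distribution>) are omitted.

```python
import xml.etree.ElementTree as ET


def build_physical(object_name: str, format_name: str) -> ET.Element:
    """Build a trimmed <physical> element for a file in a prescribed format."""
    physical = ET.Element("physical")
    ET.SubElement(physical, "objectName").text = object_name
    # externallyDefinedFormat is a child of dataFormat in the EML schema.
    data_format = ET.SubElement(physical, "dataFormat")
    external = ET.SubElement(data_format, "externallyDefinedFormat")
    ET.SubElement(external, "formatName").text = format_name
    return physical


# Placeholder values, not taken from any real dataset.
physical = build_physical("example.nc", "NetCDF")
ET.indent(physical)  # pretty-print; requires Python 3.9 or later
print(ET.tostring(physical, encoding="unicode"))
```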

<distribution> provides information on how the resource is distributed. The contents of this tree were generally covered at the dataset level; however, a few points are reiterated here.

The content of a <url> element at the entity level should deliver data, and not point to another application or use page. The <url>’s attribute, “function,” should have the value “download.” This is implied if the “function” attribute is omitted.
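The rule for the "function" attribute can be expressed as a small check. This is a sketch under stated assumptions: the URLs are hypothetical example.org addresses, the fragment is trimmed to the elements needed, and "information" is used as the non-download value of the attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed <physical> fragment for illustration only.
SAMPLE = """
<physical>
  <distribution>
    <online>
      <url>https://example.org/data/table1.csv</url>
    </online>
  </distribution>
  <distribution>
    <online>
      <url function="information">https://example.org/landing-page</url>
    </online>
  </distribution>
</physical>
"""


def url_functions(physical_xml: str) -> list[tuple[str, str]]:
    """Return (url, function) pairs; a missing attribute counts as 'download'."""
    physical = ET.fromstring(physical_xml)
    return [
        ((url.text or "").strip(), url.get("function", "download"))
        for url in physical.iter("url")
    ]


def non_download_urls(physical_xml: str) -> list[str]:
    """Return entity-level URLs whose function is not 'download'."""
    return [u for u, func in url_functions(physical_xml) if func != "download"]


print(non_download_urls(SAMPLE))  # prints ['https://example.org/landing-page']
```

Any URL reported by non_download_urls is a candidate for review, since an entity-level <url> is expected to deliver the data itself.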

As of EML 2.1, there is also an optional <access> element in a <distribution> tree at the entity level. This element is intended specifically for controlling access to the data entity separately from the metadata. For more information on using the <access> tree, refer to the general access discussion above.

<coverage> provides information on the geographic, temporal and taxonomic coverages that apply to this [entity]. See the discussion at the dataset level for more information.

<methods> provides information on the specific methods used to collect information in this [entity]. Please see the discussion at the dataset level for more information.

<additionalInfo> is a text field for any material that cannot be characterized by the other elements for the data type.

Example 20: The elements in the EntityGroup, showing the entity.

      habitat description for the sampling locations
            <onlineDescription>fls-1 Data File</onlineDescription>
            <url function="download">http://www.fsu.edu/lter/data/fls-1.csv</url>

Each entity type has a specific set of elements that follow the common elements. Table 2 shows the specific trees that apply to each entity type.

Table 2. Elements specific to each of the six entity types.

Entity Type | Typical Uses | Elements following EntityGroup
<dataTable> | Static ASCII tables | <attributeList>
<view> | Data returned from a database query | <attributeList>
<storedProcedure> | Data returned from a stored procedure in a database | <attributeList>
<otherEntity> | Binary files, images, maps, KML, KMZ, code | <attributeList>
<spatialRaster> | Grid, raster cell data, remote sensing data | <attributeList>
<spatialVector> | Lines, points, polygons, KML (if converted), ESRI shape files | <attributeList>