Selecting a Standard
Metadata standards are formal specifications of how metadata should be expressed, and following an accepted metadata standard helps ensure that data are appropriately described for later discovery and reuse. But there are many standards, and not all standards are appropriate to a particular project. In choosing a standard from the hundreds available, it is important to evaluate options based on project needs with the goal of creating an interoperable system.
Some criteria for selecting an appropriate standard are presented below, divided into essential criteria and optional additional considerations.
Essential Criteria
Who developed the standard?
The identity of the developer gives an indication of the standard's authority. The most compelling standards tend to be developed by people knowledgeable in a particular discipline or technology. The more collaborative the development process, the greater the breadth of understanding that went into the standard's initial formation.
Who has implemented and currently uses the standard?
Most standards are implemented by a variety of users. The wider the community of users, the broader the applications for which the standard is likely to be suitable. A broadly used standard is also likely to have additional tools and resources available.
It is not always the case that the most broadly implemented standards are the best for an individual project. From an interoperability standpoint, the most compelling standards are those that are implemented by a variety of users within your scientific area. If multiple organizations have implemented the same standard, then communicating the metadata between organizations in your area can take place with less semantic or syntactic mediation.
Who currently maintains or sponsors the standard?
The sponsor of a standard (that is, the developer or maintainer) affects its authority, relative importance, and acceptance. Many standards are initiated because of a well-documented need within a particular community. These standards tend to be maintained by multiple organizations working collaboratively. For some standards, it is very difficult or impossible to identify who is responsible for maintenance. If there is no clear maintainer, it is likely that the standard will not evolve with the field and may become less and less applicable and useful.
Where is this standard in the development process?
The stage of maturity of a standard is an indication of its development level. Stages of maturity are categorized in the following ways:
- Missing: The maturity cannot be determined.
- Emerging: The standard is actively being developed—it is in draft, under community review, discussed on mailing lists or forums, mentioned in abstracts, etc.—but has not been formally released.
- Existing: The standard is available for public use, has been released or widely adopted, and is sponsored or maintained.
- Declining: The standard has less use and is either no longer maintained on a regular basis or is routinely superseded by another emerging or existing standard in the community.
Emerging standards are those undergoing review and first-generation implementation. Projects that implement emerging standards are at the cutting edge. They tend to provide feedback to the developer or maintainer that will result in further development. Well-established, existing standards are ideally accompanied by a dynamic community of users and a variety of resources that can be used in implementation, such as profiles, extensions, or vocabularies.
Ideally, the metadata manager will choose a standard that is emerging or existing. Declining standards should be avoided because they are losing users and are no longer regularly maintained, and standards for which the maturity level cannot be assessed are unlikely to be appropriately documented and maintained. Using an emerging standard may require adjustment over time as it evolves, but if there are no appropriate existing standards, an emerging standard may be the best choice.
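The maturity stages and the recommendation above can be expressed as a small filter. This is only a sketch in Python: the `Maturity` enum, the `shortlist` function, and the candidate names are hypothetical, and ranking existing standards ahead of emerging ones is one reading of the guidance rather than a fixed rule.

```python
from enum import Enum


class Maturity(Enum):
    """The four maturity stages listed in this section."""
    MISSING = "missing"
    EMERGING = "emerging"
    EXISTING = "existing"
    DECLINING = "declining"


# Stages worth considering, in assumed order of preference: an existing
# standard first, an emerging one when no suitable existing standard is found.
PREFERRED = (Maturity.EXISTING, Maturity.EMERGING)


def shortlist(candidates):
    """Return candidate names at a recommended stage, existing ones first.

    `candidates` maps a standard's name to its assessed Maturity.
    """
    kept = [(name, stage) for name, stage in candidates.items() if stage in PREFERRED]
    kept.sort(key=lambda item: PREFERRED.index(item[1]))
    return [name for name, _ in kept]


# Hypothetical candidates; the names are placeholders, not real standards.
candidates = {
    "standard-a": Maturity.DECLINING,
    "standard-b": Maturity.EMERGING,
    "standard-c": Maturity.EXISTING,
    "standard-d": Maturity.MISSING,
}
print(shortlist(candidates))  # ['standard-c', 'standard-b']
```

Declining standards and standards of unknown maturity drop out of the shortlist entirely. Assessing the stage itself remains a manual judgment based on the questions in this section.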
Optional Criteria
The optional criteria are much more focused on how implementation of the standard will affect an individual project. These questions represent additional things to consider before implementation.
What is the purpose of this standard?
Each standard is developed for a particular reason. Understanding the reason for the standard's development will provide an indication of how it will benefit, or what it will cost, a particular project. A particular standard may have been created to resolve issues in areas such as metadata format, transmission protocol, limited metadata elements, or multiple project objectives. The scenarios below illustrate some of these issues and provide some examples of standards that were created to resolve them.
- Some communities have many metadata formats. While the metadata include the same concepts and terminology, it is nearly impossible to sift through the varying formats. A standard could prescribe a specific syntax for all metadata files within that specific community (for example, NSDL, the National Science Digital Library).
- In some cases, the metadata formats are standardized, but there is not one accepted transmission protocol. One project might submit its metadata via nightly database dumps, while another might use an established Web service to submit metadata in near-real-time. In this case, a technical standard might prescribe a specific methodology for transmitting metadata (for example, Z39.50).
- Some datasets, in their native format, present limited metadata (filename and date, for example). The community of data producers might collaborate on a standard that stipulates required metadata (for example, Marine Geophysical Data Exchange Format - MGD77).
- A project with multiple types of data may have overlapping needs that no single standard fulfills. Implementing a standard that serves several of a project's objectives at once calls for a more involved cost-benefit evaluation.
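The third scenario, a standard that stipulates required metadata, amounts to checking each record against a list of mandatory elements. The sketch below is illustrative only: the labels in `REQUIRED_ELEMENTS`, the `missing_elements` function, and the sample record are hypothetical and are not drawn from MGD77 or any other real standard.

```python
# Hypothetical required labels for illustration; these are not taken from
# MGD77 or any other real standard.
REQUIRED_ELEMENTS = ("title", "creator", "date")


def missing_elements(record, required=REQUIRED_ELEMENTS):
    """Return the required labels that are absent or empty in `record`.

    `record` maps a metadata label to its value, e.g. {"creator": "John Doe"}.
    """
    return [label for label in required if not str(record.get(label) or "").strip()]


# A dataset whose native format exposes little more than a filename and a date.
native = {"filename": "survey_01.dat", "date": "2009-06-01"}
print(missing_elements(native))  # ['title', 'creator']
```

A real standard would typically also constrain the format and allowed values of each element; this check covers presence only.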
What are the consequences of implementation?
Implementing a standard has consequences for a project. In some cases, the effects will be positive. In others, the negative effects may outweigh the benefits. Some of these consequences might include the following:
- Compliance with funding agency requirements (positive consequence)
- Interoperability with other projects (positive consequence)
- Need for extensive reorganization and republication (possible negative consequence)
What resources are available for implementation?
While valuable in many ways, standards can be difficult to implement, especially for new users. A standard that is presented with a suite of well-designed tools and resources available for implementation is more compelling than a standard without them.
Types of resources might include:
- Instructional material
- Human support
- Domain-specific profiles
- Software packages to create and/or publish standards-compliant metadata
- Well-developed protocols
- Established controlled vocabularies
Not every data manager will need or want the same set of resources. It is important to know a project’s needs and evaluate a standard based on those specific needs.