Definition of Metadata
Metadata are used to describe data or information. Metadata can describe just about anything you find on a computer, and the term is often used to refer to information about things that aren't on the computer. In environmental sciences like oceanography, metadata describe the information that scientists collect, telling users about the "who, what, when, where, why, and how" of a data set or data item.
We'll take some examples in a moment, but first let's look at how other organizations describe metadata. The National Information Standards Organization (NISO) defines metadata as "structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource." The World Wide Web Consortium (W3C) defines metadata as "machine understandable information for the web." The Federal Geographic Data Committee (FGDC) defines metadata as describing "the content, quality, condition, and other characteristics of data." Put simply, metadata is data about data. It provides a context for research findings, ideally in a machine-readable format. Once published, metadata can enable discovery of data via an electronic interface, and correct use and attribution of your findings.
The results of data collection are generally some kind of object – a photo, map, description, data file, etc. These findings are useful, but generally do not contain information about how or where the data was collected, or by whom. The data object alone cannot be interpreted or used. However, if you provide the right descriptive information, your data can become incredibly useful. The additional information might include things like latitude and longitude, date collected, chief scientist, type of equipment used – the list can go on indefinitely. This context, or descriptive data, is the metadata.
The Difference Between Metadata and Data
Metadata describe a data set sufficiently to permit searching and using the data. However, it is not always clear if a particular piece of information should be classified as data or metadata. Some information, such as geographic coordinates of observations, can clearly be classified as both data and metadata. The distinction between metadata and data essentially depends on the context and the needs of a given application or user.
Briefly stated, any data that is required to make other data useful or searchable can be called metadata. Again, quoting the NISO guide, Understanding Metadata, "Metadata is key to ensuring that resources will survive and continue to be accessible into the future." Perhaps this statement provides the best explanation of the difference between data and metadata. In a conductivity-temperature-depth (CTD) profile, for example, a single temperature measurement may be lost without significantly degrading the value of the profile. The loss of positional information from a metadata record, however, renders the data almost useless.
Examples
For a simple, non-oceanographic example of metadata, consider your television. When you turn on a television and want information about the next show, you will probably go to an index of television shows - either in the form of a written summary, like the TV listings in a newspaper or a TV Guide, or more recently, as on-screen program information. The listings you will look at contain data (like show title, type of show, and a plot summary) about other data (the television shows themselves). The information contained in the TV Guide is metadata.
Scientific examples are more complex, but they rest on essentially the same concept. The notes written by scientists about their experiment, in lab notebooks, log books, or other documents, are metadata (information) about data (the results of the experiment). The notes describe when, where, how, and other characteristics of the experiment. This information becomes far more valuable when it is stored in a computer, so it can be easily searched and distributed to others. This information becomes even more valuable if it is maintained in standard ways, using standard terms; now machines can "understand" it (make use of it) and automatic processes can be developed to make the information more useful.
In the most advanced example, metadata can be created automatically by computer systems to describe when data is collected, what the data looks like, and where it came from. Other systems that process that data can characterize its quality automatically and record the results as metadata. For example, do the data pass certain tests, or are they consistent with related data? Systems can also transform it into more sophisticated forms, such as averaged values, or other derived values, automatically. Provenance metadata then need to be created to describe the transformation processes that were applied. This level of sophisticated, automated metadata collection and use is typical of advanced analytical systems, and necessary to implement ocean observing networks like those proposed in the United States and Europe.
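As a rough illustration of the automated case described above, the following Python sketch generates descriptive metadata from a set of measurements, records the result of a simple quality test, and attaches provenance metadata to a derived (averaged) value. The function names, the element names, the source identifier "hypothetical-ctd-01", the range limits, and the sample values are all assumptions made for this example; they do not come from any particular observing system or metadata standard.

```python
from datetime import datetime, timezone
from statistics import mean


def describe(values, source, collected_at):
    """Create descriptive metadata automatically from the data itself."""
    return {
        "source": source,
        "collected_at": collected_at.isoformat(),
        "count": len(values),
        "minimum": min(values),
        "maximum": max(values),
    }


def range_check(values, low, high):
    """Record the outcome of a simple quality test as metadata."""
    return {
        "test": "range_check",
        "limits": (low, high),
        "passed": all(low <= v <= high for v in values),
    }


def average_with_provenance(values, source):
    """Derive an averaged value and the provenance metadata that describes it."""
    provenance = {
        "operation": "mean",
        "input_count": len(values),
        "derived_from": source,
    }
    return mean(values), provenance


# Made-up water temperatures (degrees Celsius) and a made-up timestamp.
temperatures = [10.0, 12.0, 14.0]
collected = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)

description = describe(temperatures, "hypothetical-ctd-01", collected)
quality = range_check(temperatures, -2.0, 40.0)
average, provenance = average_with_provenance(temperatures, description["source"])
```

In this sketch the averaged value never travels without its provenance record, so a later user can see which operation produced it and from how many inputs.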
Here are a few oceanographic examples:
Data: Photo of a newly discovered species of fish
Metadata: Location of discovery (latitude, longitude and depth), other fish in the area, salinity of the water, quantity discovered (school, single fish, two or three individuals), etc.
Data: Meteorological Measurements
Metadata: Location of readings (latitude, longitude, and height), instrumentation used to collect data, units, processing done to measurements
Data: Sediment Core Record
Metadata: Location of discovery (latitude, longitude and depth), description of stratigraphy, length, type of coring device
Note that different types of data require different types of descriptive information. However, some standard fields should always be included (e.g., location and date collected).
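One way to enforce a minimum set of fields is a simple completeness check. The Python sketch below assumes that the required elements are just the two mentioned above, location and date collected; the element names, the helper function, and the sample sediment core record are hypothetical and would be replaced by whatever your metadata standard specifies.

```python
# Assumed minimum set of elements, based on the two fields named above.
REQUIRED_ELEMENTS = ("location", "date_collected")


def missing_elements(record):
    """Return the required elements that are absent or empty in a record."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name)]


# Hypothetical metadata for a sediment core; all values are invented.
core_metadata = {
    "location": {"latitude": 32.5, "longitude": -117.3, "depth_m": 900},
    "coring_device": "piston core",
    "length_m": 4.2,
}

missing = missing_elements(core_metadata)
```

Here `missing` lists "date_collected", which tells the data provider what must be filled in before the record is published.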
Elements and Values
The terms "element" and "value" are often used to describe the structure of metadata. "Elements" refer to the categories of metadata used. They can also be referred to as properties, or more informally, as fields. "Values" refer to the actual information filled into an element. Using the above examples, "latitude" would be an element, and "+32.5" might be a value for that element from a particular data record; "core type" would be a property, and "piston core" and "megacore" would be values for that property.
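The element and value structure maps naturally onto key and value pairs. The following Python sketch is only an illustration: the records, the underscore spelling of "core_type", and the `find_records` helper are invented for this example, and real metadata standards define their own element names and allowed values.

```python
# Two hypothetical metadata records. Each key is an element; each entry is a value.
records = [
    {"latitude": 32.5, "core_type": "piston core"},
    {"latitude": 41.0, "core_type": "megacore"},
]

# The elements used by the first record (the categories of metadata).
elements = sorted(records[0])

# The values filled into those elements for that particular record.
values = [records[0][name] for name in elements]


def find_records(all_records, element, value):
    """Return the records whose given element holds the given value."""
    return [r for r in all_records if r.get(element) == value]


piston_cores = find_records(records, "core_type", "piston core")
```

Because every record uses the same element names, a search such as `find_records(records, "core_type", "piston core")` works across the whole collection; this is one reason consistent elements make data easier to discover.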