Machine Readability

A description of machine readability, one of the highly desirable criteria for metadata

Metadata provide important information about a data resource. In theory, this information can be provided in many forms. For instance, the methods section of a scientific journal paper can be considered metadata.

However, a long text description is usually a format that only humans can read, not one that a machine can efficiently parse and understand. Machine-readable descriptions have a consistent and known structure in which specific items of information are labeled and appropriately separated, allowing discovery by electronic systems. A variety of formats can provide appropriate demarcation and separation of metadata elements and values, including tab-delimited or comma-delimited text and Extensible Markup Language (XML). Once a computer system is given the key to your metadata (a machine-compatible description of the format that is used), it can point users to your data via the metadata.

Here’s an example of metadata from the computer science world:

Say that you want to describe a laptop computer that you own: a PC that runs the Windows Vista operating system and has Microsoft Word, Microsoft Excel, and Mozilla Firefox installed.

A machine-readable version of this information is:

"Laptop Computer", "Operating System", "Windows Vista"
"Laptop Computer", "Platform", "PC"
"Laptop Computer", "Software", "Microsoft Word"
"Laptop Computer", "Software", "Microsoft Excel"
"Laptop Computer", "Software", "Mozilla Firefox"

Notice that a resource (Laptop Computer) can have multiple elements (Operating System, Platform, Software), and an element can have multiple values (Microsoft Word, Microsoft Excel, Mozilla Firefox).
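
As a minimal sketch of what a machine can do with lines like these, the following Python snippet reads the laptop example with the standard csv module and groups the values by resource and element. The function name group_triples and the nested-dictionary layout are illustrative assumptions, not part of any MMI tool, and the lines are written here with plain ASCII quotation marks so that a parser recognizes them as field delimiters.

```python
import csv
import io
from collections import defaultdict

# The laptop example, one resource-element-value combination per line.
RAW = '''"Laptop Computer", "Operating System", "Windows Vista"
"Laptop Computer", "Platform", "PC"
"Laptop Computer", "Software", "Microsoft Word"
"Laptop Computer", "Software", "Microsoft Excel"
"Laptop Computer", "Software", "Mozilla Firefox"
'''


def group_triples(text):
    """Group lines into {resource: {element: [values]}} (hypothetical helper)."""
    grouped = defaultdict(lambda: defaultdict(list))
    # skipinitialspace lets the reader accept the space after each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        resource, element, value = row
        grouped[resource][element].append(value)
    # Convert to plain dictionaries for easier inspection.
    return {res: dict(elems) for res, elems in grouped.items()}


metadata = group_triples(RAW)
print(metadata["Laptop Computer"]["Software"])
```

Because every item is labeled and separated in a known way, a question such as "which software is installed?" becomes a simple lookup rather than a text-interpretation problem.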

Here’s an example of metadata from the marine science world:

Say there is a multibeam dataset that contains information from several multibeam instruments (SeaBeam 2000, SeaBeam 2112, and EM120) and provides data in two formats (MB32 and MB57).

Here’s what it could look like in a machine-readable format:

"Multibeam Data", "Sensor System", "SeaBeam 2000"
"Multibeam Data", "Sensor System", "SeaBeam 2112"
"Multibeam Data", "Sensor System", "EM120"
"Multibeam Data", "Format", "MB32"
"Multibeam Data", "Format", "MB57"

Here too, a resource can have multiple elements, but when the information is represented in a machine-readable format, each line represents exactly one resource-element-value combination. Both examples use comma-separated text (with quotation marks to distinguish text fields from number fields), but, as mentioned before, there are multiple ways to format machine-readable text.
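
To illustrate that the same combinations can be carried by other formats, here is a rough Python sketch that writes the multibeam example as tab-delimited text and as XML using only the standard library. The function names and the XML tag and attribute names (metadata, record, resource, element, name) are made up for this illustration and do not follow any particular metadata standard.

```python
import csv
import io
import xml.etree.ElementTree as ET

# The multibeam example as (resource, element, value) combinations.
TRIPLES = [
    ("Multibeam Data", "Sensor System", "SeaBeam 2000"),
    ("Multibeam Data", "Sensor System", "SeaBeam 2112"),
    ("Multibeam Data", "Sensor System", "EM120"),
    ("Multibeam Data", "Format", "MB32"),
    ("Multibeam Data", "Format", "MB57"),
]


def to_tab_delimited(triples):
    """Write one combination per line, with fields separated by tabs."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(triples)
    return buffer.getvalue()


def to_xml(triples):
    """Write one <record> per combination (hypothetical tag names)."""
    root = ET.Element("metadata")
    for resource, element, value in triples:
        record = ET.SubElement(root, "record", resource=resource)
        ET.SubElement(record, "element", name=element).text = value
    return ET.tostring(root, encoding="unicode")


tab_text = to_tab_delimited(TRIPLES)
xml_text = to_xml(TRIPLES)
print(tab_text)
print(xml_text)
```

In both outputs the labels and separators differ, but each record still holds exactly one resource-element-value combination, so a system that knows the format can recover the same information.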

 

Suggested Citation

, 2011. "Machine Readability." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdataintro/machinereadability. Accessed February 9, 2012.