Machine Readability

Metadata provide important information about a data resource. In theory, this information can be provided in many forms. For instance, the methods section of a scientific journal paper can be considered metadata.

However, a long text description is usually readable only by humans; a machine cannot efficiently parse and understand it. Machine-readable descriptions have a consistent, known structure in which specific items of information are labeled and clearly separated, allowing electronic systems to discover them. A variety of formats can provide appropriate demarcation and separation of metadata elements and values, including tab-delimited or comma-delimited text and Extensible Markup Language (XML). Once a computer system is given the key to your metadata (a machine-compatible description of the format that is used), it can point users to your data via the metadata.

Here’s an example of metadata from the computer science world:

Say that you want to describe a laptop computer that you own: a PC that runs the Windows Vista operating system and has Microsoft Word, Microsoft Excel, and Mozilla Firefox installed.

A machine-readable version of this information is:

"Laptop Computer", "Operating System", "Windows Vista"
"Laptop Computer", "Platform", "PC"
"Laptop Computer", "Software", "Microsoft Word"
"Laptop Computer", "Software", "Microsoft Excel"
"Laptop Computer", "Software", "Mozilla Firefox"

Notice that a resource (Laptop Computer) can have multiple elements (Operating System, Platform, Software), and an element can have multiple values (Microsoft Word, Microsoft Excel, Mozilla Firefox).
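
To make the resource-element-value structure concrete, here is a minimal Python sketch of how a program might read such lines and group them. It assumes the records are written with plain straight double quotes and a space after each comma; the function name parse_triples and the nested dictionary layout are illustrative choices, not part of any metadata standard.

```python
import csv
import io
from collections import defaultdict

# The laptop records from the example, written with plain double quotes
# (an assumption: typographic quotes would not be recognized by a CSV parser).
RECORDS = '''"Laptop Computer", "Operating System", "Windows Vista"
"Laptop Computer", "Platform", "PC"
"Laptop Computer", "Software", "Microsoft Word"
"Laptop Computer", "Software", "Microsoft Excel"
"Laptop Computer", "Software", "Mozilla Firefox"
'''


def parse_triples(text):
    """Group resource-element-value lines into {resource: {element: [values]}}.

    parse_triples is a hypothetical helper written for this illustration.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    # skipinitialspace=True ignores the space that follows each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for row in reader:
        if not row:
            continue  # skip blank lines
        resource, element, value = row
        grouped[resource][element].append(value)
    # Convert the nested defaultdicts to plain dictionaries.
    return {res: dict(elems) for res, elems in grouped.items()}


metadata = parse_triples(RECORDS)
print(metadata["Laptop Computer"]["Software"])
```

Because every line has the same three labeled parts, the program needs no knowledge of laptops to collect all the values of the Software element; it only needs to know the format.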

Here’s an example of metadata from the marine science world:

Say there is a multibeam dataset that contains information from several multibeam instruments (SeaBeam 2000, SeaBeam 2112, and EM120) and provides data in two formats (MB32 and MB57).

Here’s what it could look like in a machine-readable format:

"Multibeam Data", "Sensor System", "SeaBeam 2000"
"Multibeam Data", "Sensor System", "SeaBeam 2112"
"Multibeam Data", "Sensor System", "EM120"
"Multibeam Data", "Format", "MB32"
"Multibeam Data", "Format", "MB57"

Here too, a resource can have multiple elements and an element can have multiple values, but in the machine-readable format each line represents exactly one resource-element-value combination. Both examples use comma-separated text (with quotation marks to distinguish text fields from number fields), but, as mentioned before, there are multiple ways to format machine-readable text.
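
As one illustration of an alternative format, the following Python sketch writes the same multibeam information as XML. The tag and attribute names (metadata, resource, element, name) are made up for this example and do not follow any particular metadata standard; triples_to_xml is likewise a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# The multibeam records from the example, as (resource, element, value) tuples.
TRIPLES = [
    ("Multibeam Data", "Sensor System", "SeaBeam 2000"),
    ("Multibeam Data", "Sensor System", "SeaBeam 2112"),
    ("Multibeam Data", "Sensor System", "EM120"),
    ("Multibeam Data", "Format", "MB32"),
    ("Multibeam Data", "Format", "MB57"),
]


def triples_to_xml(triples):
    """Build an XML tree with one <resource> per resource name.

    Tag and attribute names are assumptions chosen for illustration.
    """
    root = ET.Element("metadata")
    resources = {}  # resource name -> its <resource> node
    for resource, element, value in triples:
        node = resources.get(resource)
        if node is None:
            node = ET.SubElement(root, "resource", name=resource)
            resources[resource] = node
        # Each element-value pair becomes one labeled child node.
        item = ET.SubElement(node, "element", name=element)
        item.text = value
    return root


root = triples_to_xml(TRIPLES)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The content is unchanged; only the demarcation differs. Here tags and attributes, rather than commas and quotation marks, label and separate each item.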


Suggested Citation

Neiswender, C., Stocks, K. 2011. "Machine Readability." In The MMI Guides: Navigating the World of Marine Metadata. Accessed October 26, 2020.