The Importance of Controlled Vocabularies

Controlled vocabularies are important to researchers for many reasons:

  • Consistency
  • Accuracy
  • Automation
  • Simplification of input
  • Interoperability
  • Enhancement of searches and discovery
  • Completeness
  • Long- and short-term management
  • Efficient use of time

In many cases, controlled vocabulary terms completely define the allowable content for a particular metadata element.
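
As a rough illustration of a vocabulary defining the allowable content of an element, the sketch below checks a value against a fixed term list. The element name (platform type), the terms, and the function name are hypothetical and chosen only for this example; they are not drawn from any published vocabulary.

```python
# Hypothetical controlled vocabulary for a "platform type" metadata element.
PLATFORM_TYPES = frozenset({"buoy", "ship", "glider", "mooring"})


def validate_element(value, vocabulary):
    """Return the normalized value if it is an allowed term, else raise ValueError."""
    normalized = value.strip().lower()
    if normalized not in vocabulary:
        raise ValueError(
            f"{value!r} is not an allowed term; expected one of {sorted(vocabulary)}"
        )
    return normalized


print(validate_element("Buoy ", PLATFORM_TYPES))
```

Normalizing case and whitespace before the lookup is a design choice of this sketch; a stricter system might reject any value that does not match a term exactly.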

Discipline-Specific Standards for the Marine Community

This section describes some standards created for specific topics relevant to the marine community. We use the term “standard” here in its general sense, covering full standards as well as extensions and profiles.

Metadata and Vocabularies

Metadata describes the characteristics of something. In the MMI community, the item being described could be almost anything related to the marine community, such as a data set or a marine service.

Usage vs Discovery Vocabularies

In the guide Vocabularies: Dictionaries, Ontologies, and More, we used the term “altitude” to describe part of the spatial position of something. We may complete the spatial description by including the terms “latitude” and “longitude.” “Latitude” typically refers to a value that describes the north-south placement (or y-coordinate) of something on the earth (more generally, on a rotational ellipsoid). Used together with “longitude” (which describes the east-west placement, or x-coordinate), it lets us fully specify the position of something on the earth.

Semantic vs. Syntactic Vocabularies

Semantics is the meaning of words. Semantic vocabularies provide meaning to the terms used in metadata in a way that is understandable to a human being. For example, the semantic vocabulary description for altitude might be, "the vertical position of a flying object."

A Last Resort: Developing a Local Vocabulary

Controlled Vocabulary Management

First, you must choose whether to use existing controlled vocabularies, or to implement and manage your own vocabulary. Management tasks can be avoided if you use the vocabularies managed by another organization. This is usually a good idea, as it will save time and effort and maximize sharing of terminology (see Choosing and Implementing a Controlled Vocabulary).

Developing Controlled Vocabularies

The following should help you define the terms in your new vocabulary. Remember that the vocabulary must be expandable (scalable), because there will likely be additions. There are also a few tricks worth considering before you start.

Tips and Tricks

Don’t Have Vocabulary Terms with Embedded Information

Don’t encode information within the vocabulary values. A value with encoded information uses particular characters to signal particular facts about the value. For example, a single value like XT07aa might indicate an XBT temperature from a T-7 computed using coefficient set aa. Such a value carries information on the type of sensor, the model of sensor, the parameter being measured, and the processing applied. This information should be split out of the single value into multiple values.
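
One way to split such a value is to give each fact its own field, each with its own small vocabulary. The following is a minimal sketch under stated assumptions: the class name, field names, and term lists are hypothetical, and the coefficient set is left as free text because nothing here defines its allowed values.

```python
from dataclasses import dataclass

# Hypothetical per-field vocabularies; the terms are illustrative only.
SENSOR_TYPES = frozenset({"XBT", "CTD"})
SENSOR_MODELS = frozenset({"T-5", "T-7"})
PARAMETERS = frozenset({"temperature", "salinity"})


@dataclass(frozen=True)
class MeasurementDescription:
    sensor_type: str      # type of sensor, e.g. "XBT"
    sensor_model: str     # model of sensor, e.g. "T-7"
    parameter: str        # parameter being measured, e.g. "temperature"
    coefficient_set: str  # processing information, e.g. "aa" (not validated here)

    def __post_init__(self):
        # Check each controlled field against its own vocabulary.
        for label, value, allowed in (
            ("sensor_type", self.sensor_type, SENSOR_TYPES),
            ("sensor_model", self.sensor_model, SENSOR_MODELS),
            ("parameter", self.parameter, PARAMETERS),
        ):
            if value not in allowed:
                raise ValueError(f"{label}={value!r} is not an allowed term")


# The facts packed into "XT07aa" become four separate values.
record = MeasurementDescription("XBT", "T-7", "temperature", "aa")
print(record)
```

With separate fields, a new sensor model or parameter becomes a new term in one small list rather than a new family of combined codes.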

How to Determine the Terms

To identify the terms of the vocabulary, first examine the descriptions of your assets, looking for discrete (i.e., non-continuous) content. Things that are measured are usually continuous, while things that have specific descriptions are usually discrete. Also, if you can count the total number of possible descriptions for an item, its content is likely to be discrete.
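
The counting heuristic above can be sketched in code. This is only a first pass under assumptions of our own: the records, field names, and the `max_distinct` threshold are hypothetical, and treating text fields with few distinct values as discrete is a rule of thumb that still needs human review.

```python
from collections import defaultdict


def candidate_terms(records, max_distinct=10):
    """Map each discrete-looking field to its sorted distinct values.

    A field is treated as discrete when all of its values are text and the
    number of distinct values does not exceed max_distinct.
    """
    seen = defaultdict(set)
    for record in records:
        for field_name, value in record.items():
            seen[field_name].add(value)
    return {
        field_name: sorted(values)
        for field_name, values in seen.items()
        if len(values) <= max_distinct
        and all(isinstance(v, str) for v in values)
    }


# Hypothetical asset descriptions: "platform" is descriptive, "depth_m" is measured.
records = [
    {"platform": "buoy", "depth_m": 10.5},
    {"platform": "ship", "depth_m": 3.2},
    {"platform": "buoy", "depth_m": 7.9},
]
print(candidate_terms(records))
```

Here the measured depth is skipped because its values are numeric, while the platform field yields two candidate terms.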

How to Create a Scalable Controlled Vocabulary - Allowing for Additions

Scalability is an important aspect of a vocabulary: the vocabulary should not be limited by the initial terms in the list. To avoid that limitation, examine the terms and think about the general class of things that all the terms describe. Don’t think about an individual term (or an individual car, to extend the vehicle example). Rather, think about the general class of things, and then attempt to define the attributes of that class. This may not be an easy process.
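
To make the idea concrete, the sketch below describes the general class by a set of required attributes, so that new terms can be added later without changing the structure. It assumes a vehicle class with two made-up attributes (wheels and power source); the class, method, and attribute names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Vocabulary:
    """A term list whose entries must all describe the same general class."""
    required_attributes: tuple
    terms: dict = field(default_factory=dict)

    def add_term(self, name, **attributes):
        # Every term must supply the attributes that define the class.
        missing = set(self.required_attributes) - set(attributes)
        if missing:
            raise ValueError(f"term {name!r} is missing attributes: {sorted(missing)}")
        if name in self.terms:
            raise ValueError(f"term {name!r} is already defined")
        self.terms[name] = attributes


# The class is "vehicle", not "car": attributes belong to the class as a whole.
vehicles = Vocabulary(required_attributes=("wheels", "power_source"))
vehicles.add_term("car", wheels=4, power_source="engine")
# A later addition fits without any change to the vocabulary's structure.
vehicles.add_term("bicycle", wheels=2, power_source="human")
print(sorted(vehicles.terms))
```

Because the attributes belong to the class rather than to any one term, adding a term is a matter of supplying values, not of redesigning the list.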


