Vocabularies: Dictionaries, Ontologies, and More

Every discipline has its own terminology. Consider the terms used to describe vertical distances. The word “altitude” refers to the distance of an object, such as an airplane in flight, above a reference point like ground level. If we were examining a set of blueprints for a building, we would not use the word “altitude” to describe the level of the rooftop, even though it is also a vertical distance above ground level. Instead, we would use the word “height.” Similarly, if we were in a boat looking down into the water, we would use “depth” rather than “height” to describe the vertical distance.

A single term may also be used in multiple communities with different connotations. For example, an oceanographer operating a Remotely Operated Vehicle (ROV) may use the term “altitude” to mean the distance of the vehicle above the ocean floor.

In the context of metadata, having multiple terms with the same meaning—and terms that can have different meanings in different contexts—can make it harder for people to find and understand data. Using controlled vocabularies within metadata (instead of freely allowing any terminology to be used) can reduce confusion and improve data accessibility.

The Controlled Vocabularies section of the guides describes the importance of controlled vocabularies, the different kinds and their uses, how to implement existing controlled vocabularies, and some considerations for developing new ones.

What is a Controlled Vocabulary?

A vocabulary is a set of terms (words, codes, etc.) that are used in a specific community. In the example above, “altitude,” “depth,” and “height” are all part of the vocabularies that scientists and engineers use to talk about vertical distances. It is common for terms to have different connotations in different communities.

A controlled vocabulary is a managed set of terms. The management can take different forms, but in controlled vocabularies the allowed terminology is restricted in some way. Within a metadata standard or specification, controlled vocabularies are often used to describe the allowed content within a metadata element. This is in contrast to a free-text metadata element. As in the example above, in a free-text element, users may choose to use height, altitude, or depth to describe a dataset containing vertical distances. A controlled vocabulary might limit the user's choices—and ensure consistent use of terminology—by specifying that only the term “depth” be used to describe the distance from the ocean’s surface to the seafloor.
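The contrast between a free-text element and a controlled element can be sketched in a few lines of Python. This is only an illustration: the names ALLOWED_VERTICAL_TERMS and validate_term are hypothetical and do not come from any metadata standard, and the one-term vocabulary simply mirrors the “depth” example above.

```python
# Hypothetical controlled vocabulary for a single metadata element.
# The contents mirror the "depth" example in the text; a real vocabulary
# would be published and maintained by a community.
ALLOWED_VERTICAL_TERMS = frozenset({"depth"})


def validate_term(value, allowed=ALLOWED_VERTICAL_TERMS):
    """Return the normalized term if it is in the vocabulary, else raise ValueError."""
    term = value.strip().lower()
    if term not in allowed:
        raise ValueError(
            f"{value!r} is not an allowed term; choose one of {sorted(allowed)}"
        )
    return term


# A controlled element accepts only the managed term.
accepted = validate_term(" Depth ")

# A free-text element would have accepted "altitude"; the controlled one rejects it.
try:
    validate_term("altitude")
    rejected = False
except ValueError:
    rejected = True
```

Normalizing case and whitespace before the lookup is a design choice made for this sketch; a stricter implementation might require an exact match.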

For brevity throughout these guides, when we use the term “vocabulary,” we are usually referring to a controlled vocabulary.

Characteristics of a Good Controlled Vocabulary

At a minimum, a controlled vocabulary only needs to manage a set of terms in some way. However, a good controlled vocabulary (one that is easily understood and applied, is likely to be widely adopted, and improves the clarity of metadata) is one in which the controlled terms are:

  • Accepted: the terms adhere to community practices.
  • Defined: the terms are precisely characterized; typically, this means the terms have rigorous definitions.
  • Managed: experts create, store, and maintain the controlled vocabulary according to agreed-upon procedures. Maintenance involves periodic review, addition of new terms, modification of terms, and occasionally deprecation of terms.
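The “Defined” and “Managed” characteristics above can be pictured as a small data structure. The following is a minimal sketch under stated assumptions, not a description of how any particular vocabulary is actually maintained: the Term and ControlledVocabulary classes, their fields, and the sample definitions are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    """One vocabulary entry (hypothetical structure)."""
    label: str
    definition: str
    deprecated: bool = False
    replaced_by: Optional[str] = None


@dataclass
class ControlledVocabulary:
    """A managed set of terms with add and deprecate operations."""
    terms: Dict[str, Term] = field(default_factory=dict)

    def add(self, label: str, definition: str) -> None:
        # "Defined": refuse terms that arrive without a definition.
        if not definition.strip():
            raise ValueError("every term needs a definition")
        if label in self.terms:
            raise ValueError(f"term {label!r} already exists")
        self.terms[label] = Term(label, definition)

    def deprecate(self, label: str, replaced_by: Optional[str] = None) -> None:
        # "Managed": deprecated terms stay on record so older metadata
        # that uses them can still be interpreted.
        if replaced_by is not None and replaced_by not in self.terms:
            raise KeyError(replaced_by)
        term = self.terms[label]
        term.deprecated = True
        term.replaced_by = replaced_by

    def active_labels(self) -> List[str]:
        return sorted(l for l, t in self.terms.items() if not t.deprecated)


# Illustrative definitions only; they are not taken from a published vocabulary.
vocab = ControlledVocabulary()
vocab.add("depth", "Vertical distance below the ocean surface.")
vocab.add("deepness", "Informal synonym for depth.")
vocab.deprecate("deepness", replaced_by="depth")
```

Keeping a deprecated term with a pointer to its replacement, rather than deleting it, is one possible policy; the agreed-upon procedures of a given community may differ.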

Note that this definition of a controlled vocabulary does not specify a particular scope of usage. Controlled vocabularies could be developed for a local project, for a broader community, or as part of a widely used standard or tool (for example, ISO 19115).

Suggested Citation

Neiswender, C., Isenor, A., Montgomery, E., Bermudez, L., Miller, S.P. 2011. "Vocabularies: Dictionaries, Ontologies, and More." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/vocabs. Accessed September 16, 2019.