Core Concepts

Controlled Vocabulary

A controlled vocabulary is a restricted set of words used by an information community when describing resources or discovering data. A controlled vocabulary prevents misspellings and avoids the use of arbitrary, duplicative, or confusing words that cause inconsistencies when cataloging data. By creating a controlled vocabulary that follows a minimum set of good practices, system developers dramatically increase the interoperability of both their vocabulary terms and the data they describe.
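As a minimal sketch of how a controlled vocabulary catches misspellings and arbitrary variants at cataloging time, the following Python checks candidate terms against a fixed set. The term names and the `validate_terms` helper are hypothetical, chosen only for illustration.

```python
# Hypothetical controlled vocabulary (illustrative terms, not from any real standard).
CONTROLLED_TERMS = {"sea_surface_temperature", "salinity", "wind_speed"}


def validate_terms(terms, vocabulary=CONTROLLED_TERMS):
    """Split candidate terms into those in the vocabulary and those that are not."""
    accepted = [t for t in terms if t in vocabulary]
    rejected = [t for t in terms if t not in vocabulary]
    return accepted, rejected


# "salinty" is a misspelling and "Wind Speed" is an uncontrolled variant,
# so both are rejected by the exact-match check.
accepted, rejected = validate_terms(["salinity", "salinty", "Wind Speed"])
print("accepted:", accepted)
print("rejected:", rejected)
```

An exact-match check like this is deliberately strict; a real system might also suggest the closest valid term for each rejected one.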

While controlled vocabularies can be maintained in any format, such as plain text files, we strongly encourage maintaining them in a well-known ontology format. (These are described in the next subsection.) Using one of these well-defined formats makes the vocabulary usable by a large number of more advanced tools and systems.

RDF, SKOS, or OWL

A controlled vocabulary can be published in plain RDF (Resource Description Framework), SKOS (Simple Knowledge Organization System) [9], or OWL (Web Ontology Language) [1]. RDF expresses relationships between terms in a relatively simple way, using subject-predicate-object expressions (triples). SKOS and OWL are based on RDF and define additional types of resources. Which of these formats to choose for a controlled vocabulary depends on how the vocabulary will be used.
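To make the subject-predicate-object idea concrete, here is a small sketch that models triples as plain Python tuples instead of using an RDF library. The `rdf:` and `skos:` namespace URIs are the standard W3C ones; the `http://example.org/vocab#` namespace, the term names, and the `objects` helper are assumptions made for this example.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/vocab#"  # hypothetical namespace for the example terms

# Each statement is one (subject, predicate, object) triple.
triples = [
    (EX + "salinity", RDF + "type", SKOS + "Concept"),
    (EX + "salinity", SKOS + "prefLabel", "salinity"),
    (EX + "salinity", SKOS + "broader", EX + "water_property"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


labels = objects(triples, EX + "salinity", SKOS + "prefLabel")
parents = objects(triples, EX + "salinity", SKOS + "broader")
print(labels, parents)
```

In practice a vocabulary would be stored in a serialization such as Turtle or RDF/XML and loaded with an RDF toolkit; the tuple list only illustrates the data model.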

SKOS extends RDF Schema to provide a model for expressing the basic structure and content of concept schemes, including thesauri, classification schemes, subject heading lists, taxonomies, terminologies, glossaries, and other types of controlled vocabularies. For example, it defines Concept, broader, narrower, and other types of resources that facilitate the formalization of thesauri.
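The broader and narrower relations can be sketched in a few lines of Python. This is a simplified illustration, not a SKOS implementation: it assumes each term has at most one broader term (SKOS itself allows several) and that the hierarchy has no cycles, and the term names and helper functions are hypothetical.

```python
# Maps each term to its single broader term (a simplifying assumption).
broader = {
    "salinity": "water_property",
    "water_temperature": "water_property",
    "water_property": "physical_property",
}


def narrower_index(broader):
    """Invert the broader map: narrower is the inverse relation of broader."""
    index = {}
    for child, parent in broader.items():
        index.setdefault(parent, []).append(child)
    return index


def ancestors(term, broader):
    """Follow broader links upward. In SKOS, broader itself is not transitive;
    this chain corresponds to the broaderTransitive property."""
    chain = []
    while term in broader:
        term = broader[term]
        chain.append(term)
    return chain


narrower = narrower_index(broader)
print(narrower["water_property"])
print(ancestors("salinity", broader))
```

Deriving the narrower index from the broader map, rather than maintaining both by hand, keeps the two directions consistent.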

OWL extends RDF Schema to allow representation of more complex relationships and more precise constraints on classes and properties. Capturing these relationships allows more advanced semantic engines to perform sophisticated reasoning over them. For example, OWL defines types like Class, DatatypeProperty, and Subclass, and constraints on values for properties. Conceptually, SKOS is tuned for describing vocabularies, while OWL is tuned for describing models (e.g., of the real world).

RDF and SKOS files can be read by most ontology tools. They are a good way to document a vocabulary, and they provide a reasonably rich set of documentation options. We encourage the use of any of these formats, and suggest you use the most advanced format with which you are comfortable.

Note that the SKOS standard may evolve to be fully compliant with OWL, which would blur the distinction between them but not fundamentally change this recommendation.