Template Guidance on OWL and Creating Ontologies

OWL

OWL (the Web Ontology Language) is a core component of the Semantic Web: a universe of metadata and ontologies expressed in a machine-readable format, together with software tools that make it possible to interpret the semantic relations among heterogeneous, distributed resources on the Web.

OWL is based on technologies recommended by the World Wide Web Consortium (W3C), such as the Extensible Markup Language (XML), the Resource Description Framework (RDF), and the Uniform Resource Identifier (URI). It combines these technologies to specify how a document can describe terms and the relationships between them.

A URI lets a Web user uniquely reference a page or other resource. This enables capabilities such as displaying a page or downloading a file when a user clicks on a link, and it gives every resource on the Web a distinct name. RDF and OWL use URIs to link, talk about, complement, use, and extend distributed resources. (MMI has collected more information about URIs.)

Creating Ontologies

Motivation

To make it easier to work with marine science data, the MMI project wants to standardize the way we work with that data. MMI strongly encourages the effective use of interoperable metadata, including well-defined vocabularies. Well-defined vocabularies simplify data publishing, discovery, documentation, and accessibility. Working with vocabularies is complex because they are found in different organization systems and formats. A common model or format for expressing controlled vocabularies facilitates interoperability among information systems (see more about harmonization here). The common format selected by MMI is OWL.

Ontology creation guidelines

As noted above, vocabularies are found in many different formats. Sometimes their terms are categorized (grouped or organized, perhaps in a hierarchy) and sometimes they are not. Deciding what should be a class (a general group of terms) and what should be an instance (a specific term) is sometimes not trivial. Here we present the technique we are using for the MMI project.

An OWL ontology is composed basically of classes, properties, and individuals. (Here is a classic tutorial about ontologies.) The question answered in the following sections is what should be a class, a property, and an individual when constructing marine ontologies from existing vocabularies.

Selection of classes

A class is a term that represents a category of individuals. For example, Marine-Variables is a class that can help categorize marine terms. The original terms are available in some kind of format, and the table below shows where the class name comes from in each encoding. For example, in GCMD the term 'variable' is a category, while in BODC the term 'parameter' is a category. These group names are expressed as classes in an OWL ontology.

Encoding      Source for class
Plain list    Title (heading) of the list
Table         Title of the table
XML file      Element tag
RDBMS         Name of the table (if the term and its attributes are stored in a table)
UML           Name of the class (if the term and its attributes are stored in a class)
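
As a rough sketch of the table above, the following Python snippet derives a class name from two of the encodings (a plain list and an XML file). The sample data, the helper names, and the CamelCase naming convention are assumptions made for illustration; they are not part of the MMI guidelines.

```python
import xml.etree.ElementTree as ET


def to_class_name(title):
    """Turn a free-text title into a CamelCase name (illustrative convention)."""
    words = "".join(c if c.isalnum() else " " for c in title).split()
    return "".join(w[:1].upper() + w[1:] for w in words)


def class_name_from_plain_list(text):
    """Plain list: use the title (first non-empty line) as the class name."""
    for line in text.splitlines():
        if line.strip():
            return to_class_name(line)
    raise ValueError("empty list")


def class_name_from_xml(xml_text):
    """XML file: use the tag of the first child element as the class name."""
    root = ET.fromstring(xml_text)
    first_child = next(iter(root))
    return to_class_name(first_child.tag)


# Hypothetical sample inputs.
plain_list = "Marine variables\nsalinity\ntemperature\n"
xml_file = "<parameters><parameter>salinity</parameter></parameters>"

print(class_name_from_plain_list(plain_list))
print(class_name_from_xml(xml_file))
```

The same idea would apply to a relational table or a UML class, where the table or class name plays the role of the title.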

Selection of individuals (instances)

The terms in a vocabulary that are not the main categories are said to be individuals. You can ask the following question to determine whether an individual is the appropriate type: Is the proposed instance a member of the class? For example, we can say that "iron is a member of the class elements", so iron is an instance, while elements is a class.
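
The membership question can be illustrated with a minimal sketch in Python that stores RDF statements as plain tuples. The namespace http://example.org/vocab# and the helper names are hypothetical; only the rdf:type and owl:Class URIs are standard.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
EX = "http://example.org/vocab#"  # hypothetical namespace

# Each statement is a (subject, predicate, object) tuple.
triples = {
    (EX + "Elements", RDF_TYPE, OWL_CLASS),    # Elements is a class
    (EX + "iron", RDF_TYPE, EX + "Elements"),  # iron is a member of Elements
}


def is_class(term):
    """True if the term is declared as an owl:Class."""
    return (term, RDF_TYPE, OWL_CLASS) in triples


def is_member(individual, cls):
    """True if the individual is declared as a member of the class."""
    return (individual, RDF_TYPE, cls) in triples


def to_ntriples(statements):
    """Serialize the statements as N-Triples, one per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(statements))


print(to_ntriples(triples))
```

Here "iron is a member of the class elements" becomes a single rdf:type statement, which is what distinguishes the individual from the class.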

Is it a class or an individual?

One of the major design issues in developing an ontology is determining whether to categorize something as a class or an instance. Although there is no perfect answer to this challenge, the following considerations can help when making this decision (some of them are adapted from Hull, Duncan and Drummond, Nick. 2005. A Practical Introduction to Ontologies and OWL):

  • If a concept can have kinds, then it is a class (e.g., kinds of sensors). If not, it is an individual (e.g., a downward-looking ADCP). Classes usually correspond to naturally occurring sets. A concept should be identified as an individual only if it can never have kinds.
  • A class describes all the features that define the concept in the domain. Meanwhile, an instance represents a specific object whose type is characterized by the concept in the domain.
  • If you define a new concept about something, it is a class (e.g., moored buoy). If you just state a fact about it, then it is an individual (e.g., M1 is a moored buoy).
  • Proper nouns are almost always individuals.
  • Articles and singular forms usually indicate an individual (e.g., the M1 buoy). Plurals usually indicate classes (e.g., buoys). However, this can be a difficult guide to follow if you are using a convention that names classes in the singular.
  • A "that" clause usually indicates a class (e.g., the sensors that measure wind).
  • The decision to categorize something as a class or individual is largely a function of what level you need to be able to query the ontology (how you want to use it, what information you want to find).

Selection of individuals' properties

Each individual has its own collection of properties in the ontology. For example, an individual will have a short-name, id, definition, date of creation, author, and similarly descriptive characteristics. These properties could be data-type properties or object properties.

data-type property (owl:DatatypeProperty)
Use when the value the property can take (the range of the property) is a literal, such as a number or a string.
object property (owl:ObjectProperty)
Use when the value the property can take is another resource (identified by a URI).
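
The distinction can be sketched as a small decision function in Python. The Resource wrapper type and the function name are assumptions introduced here to mark values that are URIs; they do not come from any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical wrapper marking a value as a reference to another resource."""
    uri: str


def property_kind(value):
    """Return the OWL property type suited to the given value."""
    if isinstance(value, Resource):
        return "owl:ObjectProperty"    # value is another resource (a URI)
    if isinstance(value, (str, int, float)):
        return "owl:DatatypeProperty"  # value is a literal: string or number
    raise TypeError(f"unsupported value type: {type(value).__name__}")


print(property_kind("practical salinity units"))
print(property_kind(Resource("http://example.org/units#psu")))
```

A plain string holding a URI is still treated as a literal here; the caller has to say explicitly that a value refers to another resource.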

Attributes of the vocabularies are mapped to properties in OWL. To select the properties in OWL, the following sources can be identified: 1) OWL built-in properties, 2) Dublin Core properties, 3) other properties.

OWL built-in properties (which also include RDF properties) are the first choice when selecting a label attribute of a term. For example, rdfs:comment is one such built-in property. For a complete list, see Appendix C of the OWL reference. Dublin Core properties are good because of their wide acceptance, tool support (e.g., OWL and Jena), and compatibility with RDF. The third source is other properties, which includes any arbitrarily chosen label. If using OWL, these properties will have a unique namespace to avoid semantic conflicts.
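
To illustrate how namespaces avoid conflicts, the sketch below expands prefixed names into full URIs. The rdfs and dc namespace URIs are the standard ones; the x and y prefixes and their example.org namespaces are hypothetical stand-ins for two vocabularies that both define a property called units.

```python
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "x": "http://example.org/vocabA#",  # hypothetical vocabulary A
    "y": "http://example.org/vocabB#",  # hypothetical vocabulary B
}


def expand(qname):
    """Expand a prefixed name such as 'dc:description' into a full URI."""
    prefix, local = qname.split(":", 1)
    return PREFIXES[prefix] + local


# Same local name, different namespaces: two distinct properties.
print(expand("x:units"))
print(expand("y:units"))
print(expand("x:units") == expand("y:units"))
```

Because each property is identified by its full URI, two vocabularies can use the same label without their meanings being confused.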

A fast way to create an ontology from an available vocabulary is to mimic the attributes of each term in the original source as data-type properties in OWL (the third source above). At MMI, however, we want to map terms across ontologies and do something useful with the mappings, such as finding out what units the terms have and trying to plot them. For this reason we have selected a minimum set of properties that a vocabulary should have.

The properties, and the names used for them in the ontology, are explained below. In the following list, Def stands for the definition and Map stands for the property name in the ontology. An x before a colon denotes that any namespace can be used, or in other words that the property is arbitrary. The minimum suggested set of properties is:

Unique identifier
Def: Unique identifier conforming to the XML name specification.
Map: rdf:ID
Original Unique identifier
Def: Original unique identifier of the term in the vocabulary. This is the original label that will be used to query the system. It does not need to be a valid XML name.
Map: rdfs:label
Definition
Def: Definition of the term.
Map: Two strategies:
  • dc:description (preferred)
  • rdfs:comment
Units
Def: Units of the term
Map: Two strategies:
  • Use units only as a string value of an owl:DatatypeProperty.
  • Create a class to store the units as individuals and link each term of the vocabulary to a unit using an owl:ObjectProperty, as follows:
    1. Create a class (owl:Class), named here x:unitTypes.
    2. Create individuals for this class, based on the original units of the vocabulary. For this, the units should be transformed into valid XML names.
    3. Store the original unit values in a field. This is an owl:DatatypeProperty, named here x:originalUnits.
    4. Create an owl:ObjectProperty named x:hasUnit, whose domain is the class of the vocabulary terms and whose range is x:unitTypes.
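
The minimum property set and the second units strategy can be sketched with Python's standard xml.etree.ElementTree module, producing RDF/XML. This is an illustration under stated assumptions, not MMI's implementation: the namespace http://example.org/vocab#, the unit_ prefix on unit identifiers, the simplified ASCII-only xml_name helper, the sample terms, and the choice of the term class as the domain of x:hasUnit are all assumptions.

```python
import re
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
DC = "http://purl.org/dc/elements/1.1/"
X = "http://example.org/vocab#"  # hypothetical namespace for the x: prefix
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

for prefix, uri in [("rdf", RDF), ("rdfs", RDFS), ("owl", OWL), ("dc", DC), ("x", X)]:
    ET.register_namespace(prefix, uri)


def xml_name(text):
    """Turn a label into a valid XML name (simplified, ASCII only).

    Distinct labels can collapse to the same name; a real converter
    would need to detect and resolve such collisions.
    """
    name = re.sub(r"[^A-Za-z0-9_.-]+", "_", text.strip()).strip("_")
    if not name or not (name[0].isalpha() or name[0] == "_"):
        name = "_" + name
    return name


def build_ontology(class_name, terms):
    """Build RDF/XML for terms given as dicts with label, definition, units.

    class_name is assumed to already be a valid XML name.
    """
    root = ET.Element(f"{{{RDF}}}RDF", {XML_BASE: X.rstrip("#")})
    rdf_id, rdf_res = f"{{{RDF}}}ID", f"{{{RDF}}}resource"

    # Class for the vocabulary terms, plus step 1: the class for units.
    ET.SubElement(root, f"{{{OWL}}}Class", {rdf_id: class_name})
    ET.SubElement(root, f"{{{OWL}}}Class", {rdf_id: "unitTypes"})

    # Step 3: datatype property that keeps the original unit string.
    ET.SubElement(root, f"{{{OWL}}}DatatypeProperty", {rdf_id: "originalUnits"})

    # Step 4: object property from a term to its unit.
    # Assumption: the domain is the class of vocabulary terms.
    has_unit = ET.SubElement(root, f"{{{OWL}}}ObjectProperty", {rdf_id: "hasUnit"})
    ET.SubElement(has_unit, f"{{{RDFS}}}domain", {rdf_res: "#" + class_name})
    ET.SubElement(has_unit, f"{{{RDFS}}}range", {rdf_res: "#unitTypes"})

    # Step 2: one individual per distinct unit, with a valid XML name.
    for unit in sorted({t["units"] for t in terms}):
        ind = ET.SubElement(root, f"{{{X}}}unitTypes", {rdf_id: "unit_" + xml_name(unit)})
        ET.SubElement(ind, f"{{{X}}}originalUnits").text = unit

    # Terms: rdf:ID, rdfs:label, dc:description, and a link to the unit.
    for t in terms:
        ind = ET.SubElement(root, f"{{{X}}}{class_name}", {rdf_id: xml_name(t["label"])})
        ET.SubElement(ind, f"{{{RDFS}}}label").text = t["label"]
        ET.SubElement(ind, f"{{{DC}}}description").text = t["definition"]
        ET.SubElement(ind, f"{{{X}}}hasUnit", {rdf_res: "#unit_" + xml_name(t["units"])})
    return root


# Hypothetical sample vocabulary.
sample_terms = [
    {"label": "sea water temperature", "definition": "Temperature of sea water.", "units": "deg C"},
    {"label": "current speed", "definition": "Speed of the water current.", "units": "m/s"},
]
print(ET.tostring(build_ontology("MarineVariables", sample_terms), encoding="unicode"))
```

Keeping the original label in rdfs:label and the original unit string in x:originalUnits means nothing is lost when the identifiers are rewritten as valid XML names.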