Workshop Report on Ocean Observing Semantic Interoperability Planning

Marine Metadata Interoperability (MMI) Project

August 26-28, 2008
US National Space Science and Technology Center, Huntsville, Alabama

by
Anthony W. Isenor and John Graybeal

Final
10 October 2008

Background

The concept of the global ocean observing system is founded on the premise that local observing systems will act as contributors to the larger global system. Though local systems are typically constructed to meet local or state needs, they can contribute to national systems, and these national systems to a global system. Designing local systems to address their particular needs, while acting as a cohesive part of the larger system, introduces the problem of interoperability between systems. In today's network-enabled age, such interoperability is essential because it provides economies of scale: the power of collective data acquisition, data fusion from multiple data sources, and even distributed response to crises.

The MMI Ocean Observing System Interoperability workshop was specifically designed to formulate potential solutions to perhaps the single greatest obstacle to a fully interoperable solution: the diverse vocabularies1 within the individual systems. The collective terminology of a system is referred to as a vocabulary; often this usage emphasizes the terms used to describe the data values, but it can apply to all the labels used in the system. For the interoperable solution to be realized, the vocabularies of one system must be understood by the other systems.

Workshop Summary

The Workshop brought together 40 specialists from across the US, Canada and France to discuss the issue of marine vocabularies within the context of an ocean observing system. Many participants represented larger efforts or organizations including the Quality Assurance of Real Time Oceanographic Data (QARTOD2) project, the QARTOD-to-OGC (Open Geospatial Consortium3) (Q2O) project, Global Earth Observation System of Systems (GEOSS4), European Sea Floor Observatory Network (ESONET5), and the US Integrated Ocean Observing System (IOOS6).

The workshop introduced the concept of a semantic framework7 to support the interoperability required for the ocean observing system. The description of the framework outlines a methodology that would allow independent systems, using different vocabularies, to combine their data into a product that uses a single vocabulary. The framework utilizes and supports numerous specifications that have been developed by standards organizations such as the World Wide Web Consortium (W3C8) and the Open Geospatial Consortium (OGC).

These specifications, the framework, and supporting software technologies allow the vocabulary terms from multiple systems to be linked, or mapped, to vocabularies from other systems. This explicit operation provides the framework with a knowledge base of terminology, the definitions that support the terms, and the relationships between terms. Every term, and its parent ontology, can be explicitly and uniquely named using standard web identification strings. This means the usage and definition of terms can be precisely specified in any computational context.

The workshop also introduced the participants to the technologies that will support this framework. Several of these technologies are being implemented for the community's use by the MMI project. The tools include:

Voc2RDF9:  This tool runs in an internet browser window, and is available on the MMI website. The tool converts an ASCII comma-delimited set of terms and definitions into a Resource Description Framework (RDF10) document. (RDF is the basis of the Web Ontology language (OWL11), itself used as a harmonizing format for controlled vocabularies.) The tool establishes a namespace for the converted vocabulary, and optionally submits the RDF vocabulary to the MMI Ontology Registry.

MMI Ontology Registry and Repository:  The MMI Ontology Registry and Repository ('the Registry' for short) is a service-based storage location for ontologies (i.e., vocabularies in RDF or OWL). The Registry allows user searches of the terms within the ontology (i.e., the vocabulary terms), and of their attributes. The Registry also supports inference on the relationships, meaning that the Registry can chain together terms and relationships from multiple ontologies to create new logically derived relationships. Computer-based searches of the registered ontologies are also possible using services based on the SPARQL12 W3C specification.

VINE13:  The Vocabulary Integration Environment is a tool created by the MMI Project to map vocabulary terms represented in OWL. The tool allows the user to identify terms from two vocabularies and map a relationship between the terms. As an example, relationships between terms may be defined as "same-as" or "broader-than", indicating an equivalence between terms or the broader scope of one term compared to the other, respectively. A confidence level on the mapping may also be included. VINE stores the results of the mapping in a separate OWL (ontology) document.
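As an illustration of the kind of conversion described for Voc2RDF above, the following minimal sketch turns comma-delimited terms and definitions into RDF statements written in N-Triples syntax. It is not the Voc2RDF implementation: the namespace, the column names `term` and `definition`, the sample rows, and the choice of `rdfs:label` and `skos:definition` as properties are all assumptions made for this example.

```python
import csv
import io

# Hypothetical namespace for illustration; not one issued by the MMI tools.
NAMESPACE = "http://example.org/vocab/demo#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
SKOS_DEFINITION = "http://www.w3.org/2004/02/skos/core#definition"

# Hypothetical input: a header row followed by one term per line.
SAMPLE_CSV = (
    "term,definition\n"
    "sea water temperature,Temperature of the sea water\n"
    "salinity,Salt content of sea water\n"
)


def escape_literal(text):
    """Escape backslashes, double quotes and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def csv_to_ntriples(csv_text, namespace=NAMESPACE):
    """Convert 'term,definition' rows into a list of N-Triples statements."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        term = row["term"].strip()
        definition = row["definition"].strip()
        # Assumption: spaces are the only characters that need replacing
        # to form a usable identifier for the term.
        subject = "<" + namespace + term.replace(" ", "_") + ">"
        triples.append(f'{subject} <{RDFS_LABEL}> "{escape_literal(term)}" .')
        triples.append(
            f'{subject} <{SKOS_DEFINITION}> "{escape_literal(definition)}" .'
        )
    return triples


if __name__ == "__main__":
    for line in csv_to_ntriples(SAMPLE_CSV):
        print(line)
```

Each term yields one identifier within the namespace plus two statements, a label and a definition, which is the minimum a registry would need to make the term searchable.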

The combined capabilities of Voc2RDF, the Registry and VINE allow the creation of ontologies and the mapping of terms between them. The functionality of the Registry enables automated systems to query the ontologies and identify every linkage between terms within the mapped ontologies. The ontologies and mappings collected in the MMI Registry become an open, common resource for the marine community to use, and the first such large collection of environmental science vocabularies.
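A term mapping of the kind produced with VINE can be pictured as a small record: two terms, a relationship, and a confidence level. The sketch below is only one possible in-memory representation, not the OWL document VINE writes. The class name, the term identifiers, and the use of a 0 to 1 scale for confidence are assumptions for illustration.

```python
from dataclasses import dataclass

# The two example relationships named in the text.
ALLOWED_RELATIONS = {"same-as", "broader-than"}


@dataclass(frozen=True)
class TermMapping:
    """One mapping between two vocabulary terms (hypothetical structure)."""
    source: str
    relation: str
    target: str
    confidence: float = 1.0  # Assumed scale: 0.0 (no trust) to 1.0 (certain).

    def __post_init__(self):
        if self.relation not in ALLOWED_RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must lie between 0 and 1")


def at_least(mappings, threshold):
    """Keep only mappings whose confidence meets the threshold."""
    return [m for m in mappings if m.confidence >= threshold]


if __name__ == "__main__":
    # Hypothetical term identifiers written as "vocabulary:term".
    exact = TermMapping("sysA:temp", "same-as", "cf:sea_water_temperature", 0.95)
    loose = TermMapping("sysA:temp", "broader-than", "sysB:surface_temp", 0.4)
    print(at_least([exact, loose], 0.5))
```

Keeping the confidence alongside each mapping lets a consuming application decide for itself how much uncertainty it is willing to accept.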

This ability to understand terms from multiple vocabularies is referred to as semantic mediation. Essentially, a system operating with a particular vocabulary can utilize the Registry's ontologies and mappings to automatically translate its vocabulary to any other system vocabulary for which a mapping exists or for which a logical pathway of mappings exists. This logical pathway refers to the all-important fact that mappings can be used in series or in part, thereby removing the requirement to map all ontologies to all other ontologies. This set of tools and capabilities represents the basis of the conceptual semantic framework.
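The idea of a logical pathway of mappings can be sketched as a graph search. In the minimal example below, "same-as" mappings are treated as two-way links and a breadth-first search finds the shortest chain from a term to any term in the target vocabulary. This is an illustration of the principle only, not the Registry's inference mechanism. The vocabulary prefixes `sysA`, `sysB` and `cf`, and the term names, are hypothetical.

```python
from collections import deque

# Hypothetical "same-as" mappings between terms, written as "vocabulary:term".
# Note that sysA and sysB are never mapped to each other directly.
SAME_AS = [
    ("sysA:temp", "cf:sea_water_temperature"),
    ("cf:sea_water_temperature", "sysB:water_temp"),
    ("sysA:sal", "cf:sea_water_salinity"),
]


def build_graph(pairs):
    """Build an undirected graph, since same-as holds in both directions."""
    graph = {}
    for left, right in pairs:
        graph.setdefault(left, set()).add(right)
        graph.setdefault(right, set()).add(left)
    return graph


def translate(term, target_vocabulary, graph):
    """Return the chain of terms leading to the target vocabulary, or None."""
    prefix = target_vocabulary + ":"
    queue = deque([[term]])
    seen = {term}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current.startswith(prefix):
            return path
        # Sorting keeps the search order, and so the result, deterministic.
        for neighbour in sorted(graph.get(current, ())):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


if __name__ == "__main__":
    graph = build_graph(SAME_AS)
    print(translate("sysA:temp", "sysB", graph))
    print(translate("sysA:sal", "sysB", graph))
```

The first lookup succeeds by passing through the shared intermediate vocabulary, while the second returns `None` because no chain of mappings reaches the target. This is why mapping each vocabulary to one well-connected vocabulary can be enough, rather than mapping every pair.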

After demonstrating prototype versions of all the necessary technologies, as well as a demonstration application using some of the framework, the workshop presenters declared that the full semantic framework (including tools, services, vocabularies, mappings, and applications to use all of these resources) could be operational in time for demonstrations at an already-planned November workshop, which will deal with construction of the semantic mediation layer. The workshop organizers proposed that attendees could sign up for projects, execute them in time for the November workshop, and report on the results at the American Geophysical Union Fall Meeting14 in December 2008.

Results and Conclusions

Numerous workshop participants indicated their support of the conceptual framework by committing to provide various elements towards an operational semantic framework. These contributions include creating the ontologies of vocabulary terms, creating mappings pertinent to their activities, and using the resulting framework in their applications. Several participants agreed to help create and review documentation, and provide testing of the developed applications.

The workshop organizers committed to providing the necessary software and upgrades to enable the participants' success, and to giving guidance and assistance to interested participants in all phases of the project. The organizers will also provide a Registry loaded with key ontologies from the marine community (based on vocabularies collected at previous workshops, and added for this one), to enable effective term searches and mappings.

The organizers and attendees agreed to a schedule and collaboration plan for the work to be accomplished. A large number of the attendees agreed to return to the November workshop, to report on their results and further the progress on the semantic framework. Several participants anticipated submitting abstracts to the Fall AGU Meeting describing their work on the project.

The participants concluded by agreeing this framework held strong promise for addressing semantic interoperability issues among diverse data sets and data systems. They encouraged its continued promotion, once it is closer to being fully operational.