OOS Semantic Interoperability Workshop Report

The 2008 OOS Semantic Interoperability Workshop (OOSSI) was attended by 46 participants. The outcome of the workshop's lead-in, the OOS Interoperability Planning workshop, is described in the OOSIP formal workshop report.

In this OOSSI Workshop Report we summarize the reports of the individual workshop teams, who each worked on their own vocabulary definitions. Those teams' reports can be found at the following links:

Goals

Goals for the teams were fairly diverse, comprising the following:

  • Entrain the IT staff into the community, culture, and vocabulary of those working to build automated and largely unattended data systems in oceanography.
  • Have our IT staff take this knowledge and newly formed common understanding and jointly build an interoperable regional data system.
  • Review and comment on topic-specific vocabularies and discuss the process of developing, implementing and registering said vocabularies.
  • Practice mapping terms between different vocabularies.
  • Review an existing project vocabulary, classify its terms, register it with the MMI Ontology Registry and Repository, and map it to the corresponding MMI vocabulary.
  • Review and comment on a project's SensorML instance showing use of project-specific vocabularies, and demonstrate coding of steps for science data processing, QC flagging, and provenance.
  • Introduce participants to OOSTethys and demonstrate project and OGC SOS methods (GetCapabilities, DescribeSensor, and GetObservation), which use SensorML instances.
  • Begin the process of mapping vocabulary terms in two existing project parameter name dictionaries, which were developed somewhat independently of each other and designed to support different research programs. The parameters overlap considerably, but many terms that describe identical observations refer to parameters reported in different units (e.g. gravimetric as opposed to volumetric).
  • Longer term, continue the mapping process with these two dictionaries, working to form a single controlled vocabulary to support a project database. Ultimately, we hope to integrate the parameter dictionaries from all programs whose data we manage, and map them to a single parameter vocabulary for the project.
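
As an illustration of the three SOS operations named in the goals above, the following minimal sketch builds key-value-pair GET URLs of the kind many SOS 1.0 servers accept. The endpoint, sensor identifier, offering name, and observed property are hypothetical, and the parameter names follow common SOS 1.0 usage rather than anything specified in this report; a given server may expect different values.

```python
from urllib.parse import urlencode

# Hypothetical endpoint (assumption); substitute a real SOS service URL.
SOS_ENDPOINT = "http://example.org/sos"


def sos_url(request, **params):
    """Build a key-value-pair GET URL for one SOS operation."""
    query = {"service": "SOS", "request": request}
    # GetCapabilities is commonly sent without a version; the other
    # operations are assumed here to target SOS 1.0.0.
    if request != "GetCapabilities":
        query["version"] = "1.0.0"
    query.update(params)
    return SOS_ENDPOINT + "?" + urlencode(query)


# Ask the service what offerings and sensors it advertises.
capabilities = sos_url("GetCapabilities")

# Ask for the SensorML description of one (hypothetical) sensor.
sensor = sos_url(
    "DescribeSensor",
    procedure="urn:example:sensor:ctd-01",
    outputFormat='text/xml;subtype="sensorML/1.0.1"',
)

# Ask for observations of one property from one (hypothetical) offering.
observation = sos_url(
    "GetObservation",
    offering="ctd-01",
    observedProperty="sea_water_temperature",
    responseFormat='text/xml;subtype="om/1.0.0"',
)

for url in (capabilities, sensor, observation):
    print(url)
```

The observed property in the last request is where a controlled vocabulary matters: client and server must agree on the term used to name the parameter.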

Preparation and logistics

Many comments praised the materials provided in advance of the workshop, and several projects leveraged the outcomes of the predecessor workshop (OOSIP) to prepare materials for this one. In several cases vocabularies were produced in advance of the OOSSI workshop; some of these vocabularies were registered with the MMI repository before the workshop began.

Advance preparation for this workshop also spurred development or publication of some vocabularies and services at the local project level.

Workshop Presentations

The workshop content and speakers were well reviewed:

The preparatory talks and discussions were very useful in bringing all team members up to speed and helped us to achieve a shared understanding of the mapping tasks. The technical support provided during the workshop was especially helpful. Our questions were answered in a timely manner, and our observations and suggestions for improvement to the VINE tool were often implemented in near real time. This enabled us to focus on the challenge of mapping terms.

[The invited speakers'] talks overlapped in a way that gave the audience several different views on the same subject material -- a boon to learning.

Process

We felt that we had a fair understanding of the task at hand but even so we underestimated the magnitude.

The processes largely began well before the workshop, collected information from multiple sources, and became highly interactive during the workshop. The most commonly cited benefits were improvements in community understanding and engagement.

Prior to the workshop we collected vocabulary lists from each of the regional data providers. Most parameter names they use locally are either non-standard (used in their databases) or generic (used on their websites). We brought a candidate vocabulary to the workshop which is based on the Climate and Forecast (CF) metadata convention augmented for oceanography.

We started with an existing version of the vocabulary developed over the past summer as part of a community exercise, converted it to RDF using "voc2rdf", and registered the result in mmisw.org. We then reviewed the result, modified and re-classified some terms, and registered the revised version at mmisw.org. Finally, we created a mapping from the vocabulary to an MMI vocabulary (already registered) using "VINE", and registered the result.
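
The convert, register, and map sequence described above can be sketched in a few lines. This is only an illustration under stated assumptions: it does not reproduce the actual output of "voc2rdf" or "VINE", the namespaces and terms are hypothetical, and the use of SKOS properties to describe terms and mappings is an assumption of this sketch.

```python
import csv
import io

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical namespace and term list (assumptions, for illustration only).
BASE = "http://example.org/vocab/parameter/"
TERMS_CSV = """name,definition
sea_water_temperature,Temperature of sea water
sea_water_salinity,Salinity of sea water
"""


def literal(text):
    """Quote a string as an N-Triples literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def vocabulary_to_ntriples(csv_text, base):
    """Turn each row of a term list into three N-Triples statements."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base}{row['name']}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .")
        lines.append(f"{subject} <{SKOS}prefLabel> {literal(row['name'])} .")
        lines.append(f"{subject} <{SKOS}definition> {literal(row['definition'])} .")
    return lines


def mapping_triple(source_uri, relation, target_uri):
    """State a mapping between two terms, e.g. exactMatch or closeMatch."""
    return f"<{source_uri}> <{SKOS}{relation}> <{target_uri}> ."


triples = vocabulary_to_ntriples(TERMS_CSV, BASE)
mapping = mapping_triple(
    BASE + "sea_water_temperature",
    "exactMatch",
    "http://example.org/other/temperature",  # hypothetical target term
)
print("\n".join(triples + [mapping]))
```

Keeping the mapping as a separate statement, rather than editing either vocabulary, mirrors the report's practice of registering the mapping as its own result.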

At the workshop, two of our team members presented their work and demonstrated how their vocabularies were registered using voc2owl. Using two QC flagging vocabularies, we demonstrated mappings and very quickly learned that it is very important to have the participation of domain experts; we suggest that the mapping(s) be reviewed by the term author(s).

During this initial attempt to map terms used in the data sets we manage, we decided to use two term dictionaries already in use by our group, ones with which we were familiar. Once we understand our own term dictionary, we plan to map as many of these terms as possible to namespaces associated with registered ontologies. One member of our group had worked previously with other MMI community members on vocabulary mapping, but the process was new to the other two team members. All team members found the mapping exercise informative, a worthwhile endeavor, and a good use of time.
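
When two dictionaries describe the same observation in different units, as noted in the goals (gravimetric as opposed to volumetric), a mapping entry may need to carry a conversion as well as a relation. The sketch below shows one possible way to record that. All term names and the sample density are hypothetical, and labeling the unit-differing pair a "closeMatch" is a choice made for this sketch, not a rule from the workshop.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TermMapping:
    source: str
    target: str
    relation: str  # e.g. "exactMatch" or "closeMatch"
    # Optional conversion taking (value, density in kg per litre).
    convert: Optional[Callable[[float, float], float]] = None


def per_kg_to_per_litre(value_per_kg, density_kg_per_litre):
    """Gravimetric to volumetric: (amount/kg) * (kg/L) = amount/L."""
    return value_per_kg * density_kg_per_litre


# Hypothetical entries from two parameter dictionaries (assumptions).
MAPPINGS = [
    TermMapping("dictA:temperature", "dictB:temp", "exactMatch"),
    TermMapping(
        "dictA:nitrate_umol_per_kg",
        "dictB:nitrate_umol_per_l",
        "closeMatch",
        per_kg_to_per_litre,
    ),
]


def translate(mapping, value, density_kg_per_litre):
    """Express a source value in the target term's units."""
    if mapping.convert is None:
        return value
    return mapping.convert(value, density_kg_per_litre)


# The sample density of 1.025 kg/L is an assumed value for illustration.
for m in MAPPINGS:
    print(m.source, "->", m.target, m.relation, translate(m, 10.0, 1.025))
```

Because the conversion depends on density, which itself varies with the sample, deciding whether such a pair is an exact or a close match is the kind of question that calls for a domain expert.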

Vocabulary Creation

The metrics for vocabularies submitted to the MMI Registry and Repository varied widely. Typically one to two vocabularies were registered, in just one or two versions; the maximum number of registered vocabularies was 4. One team created over 100 mappings and indicated 700 mappings would be needed; others did not report on their mappings.

Our group used two pre-existing vocabularies associated with two of our legacy databases and converted them to RDF using the Voc2RDF tool provided by the workshop coordinators.

One team was not sure whether its vocabularies were in fact registered or saved, indicating a problem with the interface and explanatory material (or process).

Conclusions

The following comment neatly summarizes the clearest outcome of the workshop:

Few of the project staff had any prior experience registering or mapping vocabularies, or any appreciation for the subtleties of mappings that were not exact matches. These two workshops placed interoperability in context for our group and clearly demonstrated the importance of controlled vocabularies and ontologies to achieving semantic interoperability. This goes a long way towards breaking down resistance to change in the community.

The workshop also resulted in many lessons and comments about the software and semantic framework. These lessons are listed in this semantic framework notes page. While many of the notes are general reflections, they represent a good starting point for use cases and changes, not just for the software tools but for the entire semantic framework MMI is presenting.

Workshop Lessons Learned (and tips for others)

A common thread, consistent with the results of the previous mapping workshop, was the need for domain expertise to be readily available.

The VINE tool is very effective at facilitating mapping between vocabularies. It is essential to have access to domain expertise to support the mapping process and provide accurate answers to questions as they arise. If those doing the mappings do not possess domain knowledge, then that knowledge must be made available in some way during the mapping exercise (e.g. via speaker phone, email or Internet chat tools).

It is very important to have the participation of domain experts, and we suggest that the mapping(s) be reviewed by the term author(s).

Make prior arrangements to provide access to domain expertise to support the mapping process and to provide accurate answers to questions as they arise.

Other lessons focused on the preparation of the workshop itself. Although the logistics received good marks overall (above), it was clear that the software posed some challenges at first. The ability of the workshop team to update and enhance the software 'on the fly' was critical to the success of the workshop.

It was essential to have technical support during the breakout group mapping exercise.

Finally, there was some concern about how the work of the workshop could be effectively pursued after the workshop ended. (This comment applied equally to the workshop organizers, and to the activities of some of the teams.)

Have a plan in place following the workshop to keep the momentum going.

Summary assessment (of value for our team)

The workshop was a good way to get our group started on the vocabulary mapping process. We gained a shared understanding of the process itself, and an enlightened respect for the magnitude of the task. The presentations early in the workshop provided insight into enhanced data discovery systems powered by semantically enabled interfaces. We are continuing the vocabulary mapping process and expanding the task to include schema mapping of other elements in our metadata database. We intend to use the mapping results to add semantic capabilities to the project's database interface.