Vocabulary Mapping Session AOSN - SSDS at MBARI

First vocabulary mapping session, held at MBARI on June 9th, 2005, with AOSN and SSDS participants.

Introduction

Semantic agreements are necessary to achieve interoperability between two or more information systems. Vocabularies need to be harmonized (expressed in a similar format), and tools are needed to facilitate mappings among harmonized vocabularies.

For this vocabulary mapping exercise, two information systems were used: Autonomous Ocean Sampling Network (AOSN) and Shore Side Data System (SSDS), both from MBARI. The vocabulary of each system was converted into an ontology, and a tool created by MMI was used to expedite mappings and alignments and to express them in a new ontology.

The mapping session took place on Thursday, June 9th, from 10 to 11 AM at MBARI. The participants were Kevin Gomes (SSDS), Mike Godin (AOSN) and Luis Bermudez (MMI). The first 10 minutes were used to explain the tool. Luis moderated the session and was in charge of operating the mapping tool, while Kevin and Mike told him which relations to create.

The goals of this exercise were:

  • Create explicit mappings between these two systems to be used for the MMI interoperability demo.
  • Test the tool.
  • Create a list of improvements and functional requirements.
  • Gain a first experience of the social side of the exercise.

Ontologies

The following ontologies were used in the mapping process. They can be found here: /examples/mmihostedwork/ontologieswork/ontologies/

  • AOSN ontology
  • SSDS ontology
  • GCMD ontology
  • Mapper-Schema ontology: defines the relations
  • Mapping Ontology AOSN - SSDS and GCMD
  • Inference Model Ontology (created in the cache memory during the exercise)

Mapping Tool

The tool created at MMI is a Java application, built as an Eclipse 3.1M6 plugin. The initial requirements were:

  • Display ontologies in a friendly, human-readable way.
  • Facilitate searching for resources and filtering ontologies.
  • Allow mapping among multiple ontologies using three relations: broader-Than, narrower-Than and same-As.
  • View the mappings.
  • Save the mappings in a new ontology.
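
The three relations in the requirements above can be pictured as simple subject-relation-object statements. The following is a minimal Python sketch, not the tool's actual Java code. The names `Relation`, `Mapping` and `to_lines`, the one-line text format, and the choice of same-As for the nitrate example are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Relation(Enum):
    """The three mapping relations named in the tool requirements."""
    BROADER_THAN = "broader-Than"
    NARROWER_THAN = "narrower-Than"
    SAME_AS = "same-As"


@dataclass(frozen=True)
class Mapping:
    """One direct mapping; terms are written as name:vocabulary (illustrative)."""
    subject: str
    relation: Relation
    obj: str


def to_lines(mappings):
    """Serialize each mapping as one 'subject relation object' line.

    This text format is hypothetical; the real tool saves mappings
    in a new ontology.
    """
    return [f"{m.subject} {m.relation.value} {m.obj}" for m in mappings]


# Assumed relation: the report names this term pair but not the relation used.
mappings = [Mapping("nitrate:ssds", Relation.SAME_AS, "Nitrate:gcmd")]
print("\n".join(to_lines(mappings)))
```

Keeping each mapping as an immutable record makes it straightforward to collect mappings in a set, view them, and write them out separately from the source ontologies.
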
A snapshot of the tool is presented in the figure below:

Results

70 direct relations were created in 50 minutes using one computer, which is about 1.4 relations per minute per computer. From these direct relations, 336 further relations can be inferred. (See all relations here.) Examples of the mappings are shown in the figure below, a snapshot in which both direct and inferred relations can be seen (e.g. between Nitrate:gcmd and nitrate:ssds).
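
To illustrate how a small set of direct relations can yield a larger set of inferred ones, here is a minimal Python sketch. The rules it applies are assumptions, since the report does not state the tool's actual inference rules: same-As is treated as symmetric and transitive, broader-Than as transitive and the inverse of narrower-Than, and same-As as composable with the other two. The term `nitrate:aosn` and the same-As relations in the example are hypothetical, and the sketch does not attempt to reproduce the figure of 336.

```python
SAME, BROADER, NARROWER = "same-As", "broader-Than", "narrower-Than"

# Assumed inverse of each relation.
INVERSE = {SAME: SAME, BROADER: NARROWER, NARROWER: BROADER}

# Assumed composition rules: (a r1 b) and (b r2 c) imply (a r c).
COMPOSE = {
    (SAME, SAME): SAME,
    (SAME, BROADER): BROADER,
    (BROADER, SAME): BROADER,
    (BROADER, BROADER): BROADER,
    (SAME, NARROWER): NARROWER,
    (NARROWER, SAME): NARROWER,
    (NARROWER, NARROWER): NARROWER,
}


def infer(direct):
    """Return relations implied by `direct` that are not in `direct` itself."""
    known = set(direct)
    changed = True
    while changed:  # repeat until no new relation appears (fixed point)
        changed = False
        new = set()
        for (a, r, b) in known:
            new.add((b, INVERSE[r], a))
        for (a, r1, b) in known:
            for (b2, r2, c) in known:
                # Chain two relations that share the middle term; skip self-relations.
                if b == b2 and a != c and (r1, r2) in COMPOSE:
                    new.add((a, COMPOSE[(r1, r2)], c))
        if not new <= known:
            known |= new
            changed = True
    return known - set(direct)


direct = {
    ("nitrate:ssds", SAME, "Nitrate:gcmd"),  # relation type assumed
    ("nitrate:aosn", SAME, "Nitrate:gcmd"),  # hypothetical term and relation
}
inferred = infer(direct)
print(len(inferred))  # prints 4
```

In this example two direct relations against a shared GCMD term produce four inferred ones, including a link between the SSDS and AOSN terms that nobody entered by hand. This is the sense in which mapping both vocabularies to a common vocabulary can multiply the usable relations.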

Feedback

Tool related

  • Multi-mappings should be allowed (for example, one-to-many).
  • Instead of creating relations using the drop-down box, mapping buttons could be available in the middle of the screen.
  • Allow selecting "all" or "none" when choosing which ontologies to search for terms.
  • Allow undo and redo (history of actions).
  • Provide feedback when a mapping is created.

Human behavior related

  • A domain expert can be associated with a controlled vocabulary. The expert need not be the creator of the vocabulary but, if not, should be very familiar with it.
  • Because only technical developers, not scientists, were involved in the exercise, the relations were created by best guess, always checking the units associated with the terms. Scientists are very much needed in this type of exercise.
  • When a vocabulary (AOSN or SSDS) was mapped to GCMD, one participant was active while the other was inactive. We should aim to make the best use of the experts' time in mapping sessions: concentrate on sub-domains, and within each sub-domain on sub-vocabularies.
  • It was suggested that mappers (the people who perform the mapping) create mappings in small chunks. For example, work on one variable (20-40 minutes) and then take a break.
  • People need quick rewards to keep their excitement high. After a mapping is created, the mappers need feedback, such as seeing the mappings in action: for example, finding more data in data repositories thanks to the semantic relations.