Devices Ontology Working Group Meeting 2007.12.04

Agenda for the 2007.12.04 (1600 GMT) meeting of the Devices Ontology working group. (Minutes follow the agenda.)

A. This and That

  1. Note Taker.
  2. Comments/corrections of last minutes.
  3. Comments/corrections on Agenda.
  4. Roll Call.

B. Status of Action Items

See the TRAC page for the latest list of action items (due by current telecon) and list of action items from most recent telecon.

C. Questions from Reviewing Sensor Variable Lists

The following questions arose when trying to normalize the sensor variable lists into a single list.

  1. Parameter Names: We have several different kinds of parameter names. Examples are:
    • CNDC: very short and semantic code
    • CNDCZZ01: long, nominally not semantic(?) code with very detailed properties associated
    • electrical conductivity: a measurable phenomenon with no domain context, not usable as a variable
    • sea_water_electrical_conductivity: semantic, unique, variable-capable term with metric and context
    • speed: semantic description of a metric, with no domain context
    • maxgust: abbreviated semantic, unique, variable-capable term with implied domain context
    • bathymetry: descriptive semantic concept, seemingly for more than a single variable
    There is no obvious commonality among these terms, and I suspect many of them would be useless to the average person wanting to learn which instrument types could produce the variable they cared about. So I think the question becomes, what kind of parameter names do we want to represent in our ontology? (To those familiar with the Advancing Domain Ontologies workshop, a similar question came up in the form "What does it mean to say term A is the 'same as' term B?" See the related guidance (will download PDF).)
  2. Information to Track: I'm trying to create a single spreadsheet with everything in it, and I am still struggling to track the right information about each variable. To get to the ontology construction stage, I think at a minimum we need:
    • Manufacturer
    • Model
    • Semantic Variable Name
    • Connection to Controlled Vocabulary Term
    To make it possible to recognize similarities and differences between terms, and to trace terms back to their providers, I think we also need:
    • Provider Identifier (your Institution and/or Name)
    • Your Local Unique ID for the device
    • A globally unique ID for the device (built from previous 2 items)
    • Description
    To make the use case reasonably useful, I think the search also needs to distinguish between typical variables from that device and other variables that the device can be configured to produce. This suggests a flag for typical variables:
    • Variable Group ('typical' or any other string)
    This is quite a lot of data to collect, and I don't have it all in hand.
  3. Extra Mappings: I'm seeing multiple mappings for several sets of parameters. What do I do with these additional mappings? (This can only be acted upon after we decide which list is primary, I expect.)
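The fields listed under item 2 (Information to Track) could be captured in a record like the one below. This is only a minimal sketch: the class name, field names, the colon-joined form of the global ID, and the sample manufacturer, model, and provider values are all hypothetical and not anything the group has agreed on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableRecord:
    """One row of the proposed spreadsheet (hypothetical field names)."""
    manufacturer: str
    model: str
    semantic_variable_name: str
    controlled_vocabulary_term: str
    provider_id: str            # institution and/or name of the provider
    local_device_id: str        # the provider's own unique ID for the device
    description: str = ""
    variable_group: str = "typical"   # 'typical' or any other string

    @property
    def global_device_id(self) -> str:
        # Assumption: the global ID is the provider ID and the local ID
        # joined with a colon; the agenda only says it is built from both.
        return f"{self.provider_id}:{self.local_device_id}"


def typical_variables(records, manufacturer, model):
    """Return the sorted names of 'typical' variables for one device type."""
    return sorted({
        r.semantic_variable_name
        for r in records
        if r.manufacturer == manufacturer
        and r.model == model
        and r.variable_group == "typical"
    })


# Hypothetical sample rows, for illustration only.
records = [
    VariableRecord("ExampleCo", "CTD-1", "sea_water_electrical_conductivity",
                   "CNDC", "example_institution", "dev42"),
    VariableRecord("ExampleCo", "CTD-1", "bathymetry",
                   "bathymetry", "example_institution", "dev42",
                   variable_group="configurable"),
]
```

With these sample rows, `typical_variables(records, "ExampleCo", "CTD-1")` returns only the conductivity variable, which is the distinction the 'typical' flag is meant to support.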

D. Getting to an Ontology


This and That

Attendees:

  • Bruce Andrews
  • Bob Arko
  • Derik Barseghian
  • Luis Bermudez
  • Surya Durbha
  • Nan Galbraith
  • Willem van der Hoeven
  • Jesper Zedlitz

No corrections to minutes or changes to agenda.

Status of Action Items

We have installed TRAC and are using it to track the action items for this project. You can subscribe to updates on any of the issues in the TRAC system, if there is one you want to keep track of. We have also installed a subversion (SVN) repository for use in this work.

Results from Reviewing Sensor Variable Lists

Parameter Names

What to do about using common vocabularies so that the ontology enables responding to common user requests?

Suggested to allow a choice of vocabularies, for example CF. Another option is to have software between the user and the vocabulary, so users use their own vocabularies, which are then translated to the common vocabulary. Expectation would be that a process of elimination could be used to weed out data you don't want, but this implies a human operator.

To make mapping approaches work, you need a bulletproof crosswalk. Maintaining the crosswalk is an issue going forward. Surya suggests a user-driven process to develop the ontology -- they have had some success with that (example: landform classification). Involves encoding differences using relations, then using user-provided differences to infer other relations.

(Surya takes action item to provide documentation on this approach.)
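The crosswalk idea discussed above (users keep their own terms, and software translates them to a common vocabulary) might look roughly like the sketch below. The mapping entries are illustrative only and have not been vetted; the function names are hypothetical. Terms with no mapping are returned separately so that a human operator can review them, since the crosswalk is only useful if its gaps are visible.

```python
# Illustrative, unvetted crosswalk from local terms to a common vocabulary.
CROSSWALK = {
    "CNDC": "sea_water_electrical_conductivity",
    "CNDCZZ01": "sea_water_electrical_conductivity",
}


def translate(term, crosswalk=CROSSWALK):
    """Return the common-vocabulary term for a local term, or None."""
    return crosswalk.get(term.strip())


def unmapped(terms, crosswalk=CROSSWALK):
    """Return the local terms that have no crosswalk entry, in input order."""
    return [t for t in terms if translate(t, crosswalk) is None]
```

For example, `unmapped(["CNDC", "maxgust", "speed"])` lists the two terms that would still need a mapping decision.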

Much of our working concept presupposes the existence of a top-level shared vocabulary (or two mapped vocabularies). What can serve as that vocabulary? Roy would recommend BODC, but without the instrument and other concepts (everything in the 'by' clause). It looks like this would boil the 6000 or so non-biology terms down to 350. Not clear yet what the gaps will be; a comparison with the CF vocabulary will be instructive. Existing mappings could gracefully fold into this 'truncated BODC'. How often do these terms change? Every two weeks or so, per request of members. How much time to provide this redaction service? Roy estimates just a few hours, likely to happen over the holiday period as it is an interesting way forward.

Roy to create a visible implementation of the BODC vocabulary without the device or process complexities.

Note that the VINE tool is about to be released in a new version, and is likely to be useful for viewing and comparing the vocabularies.

Information to Track

Not discussed (but see Nan's email to the list on this topic.)

Extra Mappings

John will work out his own way to deal with this.

Getting to an Ontology

Not discussed, but depends on outcome of the vocabulary harmonization process.

Logistics

The next telecon is 18 December, and the one after is 1 January. Assume the latter is cancelled. Do we want one on 18 December? Many people will not be available. Is everyone satisfied with progress? Yes.

So, no telecon until 15 January.