Devices Ontology WG Mtg 2008.03.18
Schedule
This telecon will be at 1600 GMT, which is 9 AM Pacific Time, 11 AM Central Time, and noon Eastern Time.
Agenda
A. This and That
- Comments/corrections of last minutes.
- Comments/corrections on Agenda.
- Roll Call.
A. Review of Instructions for Providing Device Data
Review of the new instructions, to be provided (hopefully) before the meeting.
B. Review of Vocabulary Lists
Review of the status of the new vocabulary list(s), and progress with CF standard names.
C. Update on GCMD Device Vocabulary Work
Additional information on this activity has been obtained.
D. Status of Action Items
See the TRAC page for the latest list of action items (due by current telecon) and list of action items from most recent telecon.
Minutes
This and That
No corrections to minutes.
Agenda Update
An update to the agenda was sent in email, and we agreed to follow that updated agenda, so the minutes are organized accordingly.
Instructions
A step-by-step review of the new instructions, along with the template and example.
- Directions
- Template (also pointed to by directions)
- Example (also pointed to by directions)
Discussion focused on measuring and reporting concepts. Some changes were made to how this is presented:
- Only one file (or Excel sheet) per instrument. Both measuring and reporting variables are in that file.
- Add 'both' as an accepted term for the mode, making the choices 'measuring', 'reporting', and 'both'.
- If the mode is not specified, 'reporting' is assumed by default.
- Mode is reported as the next to last field in each variable entry; it is not reported in the header line.
Changes were also made to how the separator is handled:
- Separator is comma, not tab, for readability.
- Embedded commas may be escaped by quoting (in double-quotes like ") the entire value.
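The mode and separator conventions listed above can be sketched in a few lines of Python. This is only an illustration, not the group's tooling: the function name, the sample field layout (name, units, mode, note), and the sample values are assumptions made for the example, and the header line is assumed to have been removed already since it does not carry the mode.

```python
import csv
import io

VALID_MODES = {"measuring", "reporting", "both"}
DEFAULT_MODE = "reporting"  # assumed when the mode field is left empty


def parse_variable_entries(text):
    """Parse comma-separated variable entries into (fields, mode) pairs.

    Hypothetical helper: only the rules from the minutes are enforced
    (comma separator, double-quoted values may contain commas, mode is
    the next-to-last field, an empty mode means 'reporting').
    """
    entries = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        fields = [value.strip() for value in row]
        if len(fields) < 2:
            raise ValueError(f"entry has no mode field: {fields!r}")
        mode = fields[-2].lower() or DEFAULT_MODE
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode {mode!r} in entry {fields!r}")
        entries.append((fields, mode))
    return entries


# Made-up sample entries; the second quotes a value with an embedded
# comma and leaves the mode empty.
SAMPLE = '''\
water velocity,m/s,measuring,from the current sensor
"rotation around X, device",degrees,,called tilt X by the vendor
'''

if __name__ == "__main__":
    for fields, mode in parse_variable_entries(SAMPLE):
        print(mode, fields)
```

The standard `csv` module already treats a double-quoted value as a single field, so no custom escaping logic is needed for embedded commas.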
Other topics came up but did not result in changes:
- Should we report the source of the variable, e.g., individual sensor, or calculated? No, not for this exercise.
- What about other types of sources, like software, or complex systems of systems? They are perhaps worth keeping in the back of our minds, to see if our concepts work well in the general case. But they are not central to our use case of developing a device ontology.
- Do we describe individual instruments, or a class of instruments? It doesn't matter -- give it a unique ID that makes sense. The lessons we'll learn and the ontology that results will be similar either way.
- It also brings up issues like configurations, and change over time (what is the status of the instrument at a particular time?). Yes, we may want to come back to this at a later stage.
- Some systems are more complicated than others. A moral of the story is that the world is messy, and people will have to choose which part of the messy world to call a 'device' for the purposes of creating, and using, this ontology.
[John] Update the instructions, template, and example to reflect the above.
Vocabulary Rationale (Strategy for Terms)
We will walk through a proposal of the vocabulary rationale, to see if the proposed philosophy for defining vocabulary terms is credible.
Review section A of last meeting's minutes for a preview of this discussion.
- What does it mean to say two variables are the same? Consider the "Same As" MMI white paper (PDF download)
- Creating effective terms: Look at the example in 1C above. Will this process work?
John pointed to the first paragraph under Guidance in the white paper: we recommend using the measurable property, together with the domain (or measurable particular) being measured. John thinks of a measurable property as a substance, like water or air, or a physical object, like the vane of the sensor.
But what does this mean -- what is a domain or measurable particular? Is it an earth realm? No, more than that, and maybe only a subset of the earth realms. Perhaps the term 'feature of interest', and a 'property' of that feature of interest, will be more clear to some people. (At least, those that don't think a 'feature of interest' is a place name.)
It would be really useful to have a list of examples -- maybe a fairly comprehensive list, maybe a whole vocabulary -- of the features of interest that would be reasonable components of these variable names.
We had an extended discussion about whether and why we need to define a rule for the variable names: why can't we just collect some of them and see how it goes? John observed that we've done that once, and the names that resulted weren't comparable in any way -- some included all sorts of information, some included none, and all we had was a long list of unrelated, and unrelatable, instances. So the claim is that some order must constrain the chaos, or you won't have any usable results.
To take a concrete example from the example device file, the original names from the manufacturer didn't say 'device rotation around X', but 'tilt X'. This is both meaningless to the average reader, and not as accurate or generic as it needs to be. So translating the names into more generally applicable terms makes it possible to compare concepts from one device to another.
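The 'tilt X' translation can be pictured as a small lookup from manufacturer names to terms built from a feature of interest plus a property. The sketch below is only an illustration under stated assumptions: the class, the mapping tables, the device labels, and the second manufacturer name ('TiltX_deg') are hypothetical, and splitting 'device rotation around X' into 'device' and 'rotation around X' is one possible reading, not an agreed rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableTerm:
    """A generic term: the feature of interest plus the property measured."""
    feature_of_interest: str
    measured_property: str

    def label(self):
        return f"{self.feature_of_interest} {self.measured_property}"


ROTATION_X = VariableTerm("device", "rotation around X")

# Hypothetical per-device tables from manufacturer names to generic terms.
DEVICE_A = {"tilt X": ROTATION_X}
DEVICE_B = {"TiltX_deg": ROTATION_X}  # made-up name for a second device


def translate(mapping, vendor_name):
    """Return the generic label for a manufacturer name, or None if unmapped."""
    term = mapping.get(vendor_name)
    return term.label() if term is not None else None


def shared_terms(mapping_a, mapping_b):
    """Generic labels that two devices have in common."""
    labels_a = {term.label() for term in mapping_a.values()}
    labels_b = {term.label() for term in mapping_b.values()}
    return labels_a & labels_b


if __name__ == "__main__":
    print(translate(DEVICE_A, "tilt X"))
    print(shared_terms(DEVICE_A, DEVICE_B))
```

Once both devices map to the same generic term, the comparison reduces to a set intersection, which is the kind of cross-device comparison the raw manufacturer names did not allow.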
But what about fitness for purpose? To take the example of an ocean current sensor: yes, we know that it measures 'water velocity', but some of them would only do that satisfactorily at the ocean surface -- they'd be crushed in deeper waters. The same goes for salinity systems: some might work only in ocean waters, others only in fresh water. But it isn't like there's a hard line between 'it works here, it fails here' -- there are usually gradations (works pretty well, works a little bit) -- so how would our vocabulary convey all that? And in the end, can it cover all the different criteria we use for evaluating fitness for purpose? If it did, it would be like putting a database system into the variable names.
Given these complexities, is it possible to do something useful here? Is there a sweet spot where the effort of creating lists of measured variables, and ontologies to reflect that, might pay off in a useful product? Or do we have to make it so dumb as to be useless, or so complicated it can't work?