nextsteps - Marine Metadata Wiki

Next steps the group would like to take (live document).

Standard Naming Exercise

Next Steps

This page should, at any given moment, reflect where we could go next in this process.

Possible next steps (in favored order, more or less) include:

  • Action Items from last meeting

  • standardizing/normalizing another column in the Variables List spreadsheet

    • elimination of duplication in the parameter/domain/units column

      • this could facilitate normalizing the Qualifier column

    • maybe considering the Qualifier column, since it will affect variable names a lot

  • standardizing the variable names themselves (Name In Use column)

  • combination of the first two

  • figuring out how to put this semantic information to practical use

    • list the Use Cases for the PotentialGoals?

      • this may affect what we do next and how we do it...

    • a review/presentation of technologies could be helpful here

      • OWL/RDF

      • Protege

  • assessing other relevant standards

  • continuing to add data

Here are a few thoughts on these options.

The Qualifier Column

It's interesting that this column has a lot of potential boolean fields (qc'd or not, flag or not, processed or not, "best" or not, primary/secondary or not, ...). This suggests that this column is an element which could be the union of a lot of other concepts. (What does that imply?)

Creating and keeping a dictionary of Qualifiers that we've identified could be a useful step toward standardizing terminology. Dorota has put together the AOSN qualifiers into a list -- this could be a great start. See ourdocuments (look under AOSN).

Note that any such dictionary can't be comprehensive, because there will always be idiosyncratic qualifiers. But a 90% solution could be quite useful.
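As a rough illustration of the "union of boolean concepts" idea, here is a minimal Python sketch of what a qualifier dictionary might look like. The flag names, the dictionary entries, and the `parse_qualifier` helper are all made up for illustration; they are not taken from the AOSN list or from the spreadsheet.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Qualifier:
    """Hypothetical breakdown of one Qualifier cell into boolean flags."""
    quality_controlled: bool = False
    is_flag: bool = False
    processed: bool = False
    best: bool = False
    primary: bool = False


# Hypothetical dictionary: known qualifier terms -> the flags they imply.
QUALIFIER_DICTIONARY = {
    "qc": Qualifier(quality_controlled=True),
    "flag": Qualifier(is_flag=True),
    "processed": Qualifier(processed=True),
    "best": Qualifier(best=True),
    "primary": Qualifier(primary=True),
}


def parse_qualifier(cell):
    """Return (combined flags, unrecognized terms) for one qualifier cell."""
    combined = {f.name: False for f in fields(Qualifier)}
    unknown = []
    for term in cell.lower().replace(",", " ").split():
        entry = QUALIFIER_DICTIONARY.get(term)
        if entry is None:
            # Idiosyncratic qualifier: keep it so nothing is silently lost.
            unknown.append(term)
            continue
        for f in fields(Qualifier):
            combined[f.name] = combined[f.name] or getattr(entry, f.name)
    return Qualifier(**combined), unknown
```

Calling `parse_qualifier("QC, best, shipboard")` would set the `quality_controlled` and `best` flags and hand back `["shipboard"]` as an unrecognized term. Returning the leftovers separately is one way to live with the "90% solution": the dictionary covers the common cases and the idiosyncratic ones stay visible.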

The Name In Use Column

If we start working on this column, we'll quickly decide (a) the Qualifier column is an important contributor, and (b) the AOSN choices for "Name" won't quite answer the mail for, say, CIMT and OASIS. (Example: The chosen names will depend on the data set. A data set which merges temperatures from various platforms will have to use Platform as a means to qualify names.) It may be useful to work this column so as to learn more about how the other columns relate, but what we put in will be very dependent on the end-user domain, it seems to me.

It seems to me that the Domain + Parameter don't come very close yet to Name In Use. For example, "Platform Velocity" lacks a certain precision when referring to the ROV. Does that mean ROV should be a domain, or a qualifier, or what?
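To make the question concrete, here is a small Python sketch of building a name from the other columns. The `compose_name` function, the underscore-joined format, and the sample values are assumptions for illustration only; nothing here is an agreed naming rule.

```python
def compose_name(domain, parameter, qualifiers=(), platform=None):
    """Build a candidate name from column values (hypothetical convention).

    The platform, when given, goes first so that merged data sets can tell
    apart the same parameter measured on different platforms.
    """
    parts = []
    if platform:
        parts.append(platform)
    parts.extend([domain, parameter])
    parts.extend(qualifiers)
    # Normalize: trim, lower-case, and replace inner spaces with underscores.
    return "_".join(p.strip().lower().replace(" ", "_") for p in parts if p)


# Domain + Parameter alone, as in the "Platform Velocity" example.
print(compose_name("Platform", "Velocity"))
# The same columns, with the platform treated as a separate name component.
print(compose_name("Platform", "Velocity", platform="ROV"))
```

Here the platform is treated as its own component rather than as a domain or a qualifier, which is only one of the options raised above. Swapping that choice means changing a single function, so it may be a cheap way to try the alternatives against real rows.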

Use Cases

We may reach a point where the name standardization useful for one use case, or potential goal, conflicts with the approach most useful for another use case.

Soon it may be valuable to agree on a list of use cases, against which the work we've done can be validated.

Stephanie, from her perch in CenCOOS, may be able to use Use Cases to connect the needs of the greater world to the concrete tasks we're performing.

On the other side of the equation, there will be tasks for SSDS and AOSN which can take operational advantage of some of these products.

Assessing other relevant standards

This will be far more compelling when we are trying to apply standards to particular problems.

The nominal standards haven't seemed very impressive in their consistency. They are great resources and first efforts at the problem(s), but they seem a little sketchy. (How can we contribute our own work to improve those efforts?)

Practical Outcomes

I'm thinking about a notional solution, based on semantic web technologies, crosswalks, and MBARI-developed and external dictionaries. It seems like we'd need to understand those technologies, perhaps as well as others, and maybe have someone describe their application on a real project, to assess our direction. Would it be useful to bring in someone with more experience to talk us through this?

Note that a large Protege conference is happening in July.
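For anyone wondering what a crosswalk between dictionaries might look like in practice, here is a minimal Python sketch using plain subject/predicate/object tuples. This is not real RDF or OWL, and every name in it (the `local:` and `ext:` prefixes, the terms, the `equivalent_to` predicate) is a made-up placeholder rather than an entry from any actual dictionary.

```python
# Hypothetical crosswalk, stored as (subject, predicate, object) statements.
CROSSWALK = [
    ("local:water_temp", "equivalent_to", "ext:sea_water_temperature"),
    ("local:salinity", "equivalent_to", "ext:sea_water_salinity"),
]


def translate(name, statements):
    """Return the external term mapped to a local name, or None if unmapped."""
    for subject, predicate, obj in statements:
        if subject == name and predicate == "equivalent_to":
            return obj
    return None


def unmapped(names, statements):
    """List the local names that the crosswalk does not cover yet."""
    return sorted(n for n in names if translate(n, statements) is None)


local_names = ["local:water_temp", "local:salinity", "local:rov_velocity"]
print(translate("local:water_temp", CROSSWALK))
print(unmapped(local_names, CROSSWALK))
```

The triple shape is chosen only because it is close to how semantic web tools represent statements, so a list like this could probably be moved into such a tool later. The `unmapped` check is the part that might pay off soonest, since it shows which of our names have no counterpart in an external dictionary.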

Adding more data

I hope to have the CIMT data sets included in the spreadsheet soon.... Anyone else want to volunteer a data set? (Hey Rich, if we documented the core data set items using XML, I could use the same XSL to create more data for the Variables List....)