meetingnotes20040607 - Marine Metadata Wiki


Standard Naming Exercise

Meeting Notes 2004.06.07

Throughout this discussion, we referenced the list supplied by Dorota (see ourdocuments page). An updated version of that list is at SnUnitsReferences. The EPIC list of keywords was most often consulted, but almost all the documents were used at some point. We looked at the GCD (?) standard (affiliated with SensorML/OpenGIS work), but didn't see that it would help us.

Minutes

What next from list of nextsteps?

  • Units column.

We might not agree on standard units per se, but on naming conventions for the units.

We have to recognize there are 2 approaches.

  • One is additive with respect to data submission, where people can keep adding more units as more data is submitted.

  • The other is exclusive, where only specified units are accepted.

The latter tends to be the case for controlled data sets (whereas ours is fairly uncontrolled).

The SI system is restrictive, but our data systems must accept what is offered. We should strike a balance between the two -- some control is appropriate.

We can accept any units, so long as they are convertible, but they must be specified in a consistent syntax. So we should agree here on the syntax of the Units terms.

If we produce a pick list, each type of measure (e.g., length) has a set of associated units, and a relatively standard or preferred unit. (This could be the SI unit, or something else if we'd rather.) Can we maintain those relationships in our lists? This is done in some models, for example the Unidata model.
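One way such relationships might be kept is a small lookup keyed by measure type. The sketch below is illustrative only: the measure names, unit strings, and the names `PICK_LIST`, `preferred_unit`, and `is_accepted` are assumptions, not an agreed list or an existing model.

```python
# Hypothetical pick list: each measure type maps to its accepted unit
# strings plus one preferred unit. Entries are illustrative, not agreed.
PICK_LIST = {
    "length": {"preferred": "m", "units": ["m", "cm", "km"]},
    "temperature": {"preferred": "degC", "units": ["degC", "degF", "K"]},
}


def preferred_unit(measure):
    """Return the preferred unit string for a measure type."""
    return PICK_LIST[measure]["preferred"]


def is_accepted(measure, unit):
    """Check (case-sensitively) whether a unit is listed for a measure."""
    return unit in PICK_LIST[measure]["units"]
```

Keeping the preferred unit next to the full set means the pick list and the conversion target come from the same table.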

Note that udunits handles only a scale factor or an additive offset, and can't combine the two. So conversions between degF and degC, which need both, can't be done.
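For reference, the combined case is an affine conversion: a scale factor and an offset applied together. A minimal sketch, independent of udunits, with hypothetical function names:

```python
def convert_affine(value, scale, offset):
    """Apply a scale factor and then an offset: value * scale + offset."""
    return value * scale + offset


def degf_to_degc(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius.

    degC = (degF - 32) * 5/9, which is scale 5/9 and offset -160/9.
    """
    return convert_affine(deg_f, 5.0 / 9.0, -160.0 / 9.0)
```

Storing one (scale, offset) pair per unit, relative to the preferred unit for its measure type, would be enough to make every listed unit convertible.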

It may be that Bluefin has a good way of handling units -- we should contact them to see the status.

NASA's strategy is to require that variables and their units conform to an existing standard. (Note: I'm not sure which program this refers to; NASA has a lot of different programs, but their bigger ones tend to be more monolithic and absolute in this regard. --jbg) GCMD is a little more free-form.

Need to capture reference scale also, when that is significant.

Case sensitivity is a concern. Do we need to be case sensitive?

  • We do need to allow for case sensitivity (db/dB is one example; m/M and k/K are others)

Allow multiple strings for the same unit reference? Yes, but we want to be able to specify a preferred name (e.g., in the pick list). This helps people get nicely plotted unit names, although we don't (necessarily) intend to specify the displayed units string in this column; a second table/lookup could map each unit to a display string.
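The alias idea above could be sketched as two lookups: one from any accepted string to the preferred name, and a separate one from the preferred name to a display string. All entries and names here (`ALIASES`, `DISPLAY`, `preferred_name`) are hypothetical examples, not decisions from the meeting.

```python
# Hypothetical alias table: several accepted strings map to one preferred
# name. Keys are case-sensitive, and spaces are allowed inside a unit string.
ALIASES = {
    "m": "m",
    "meter": "m",
    "meters": "m",
    "dB": "dB",
    "decibel": "dB",
    "degC": "degC",
    "deg C": "degC",
}

# Hypothetical second lookup: preferred name -> string to show on plots.
DISPLAY = {"m": "m", "dB": "dB", "degC": "°C"}


def preferred_name(unit_string):
    """Return the preferred name, or None if the string is not recognized."""
    return ALIASES.get(unit_string)
```

Because the lookup is an exact match, "db" is not accepted as a spelling of "dB" unless someone adds it as an alias on purpose.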

What should be used for no units? unitless, dimensionless, empty string?

  • Consensus was that 'unitless' was clearest.

Is it practical to overload these unit variables too much?

  • We do for lat/lon direction

  • But we shouldn't for boolean variables (too idiosyncratic, too many definitions to create)

  • We don't want to make degrees incorporate all the knowledge of origin/reference position.

  • Put clarifying information in units only when it is generic to that unit type, and widely used.

What is the purpose of units? How will we be using this column?

  • (not explicitly addressed)

Can I add units for my own application?

  • You can't create new arbitrary units here in this table, because this is _our_ namespace for units.

  • You or others can create their own namespace.

What do we call "seconds" when it is seconds since a particular time?

  • 'epoch seconds' is useful because it indicates all the time information is in this item.

  • But it's really just a number of seconds, like hh:mm:ss of the day.

  • It would be really nice if this unit reflected the usage of netCDF COARDS convention

    • "seconds since YYYY-MM-DD hh:mm:ss 0000" is parsed and used correctly

    • Do we want to mess with our Units standard because of what one application does?

    • If not here, then reference frame.

  • Units for this type of seconds should use netCDF COARDS convention, for clarity and programming convenience.
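To illustrate the programming convenience, here is a minimal sketch of reading a "seconds since ..." units string with the Python standard library only. It is not the netCDF or udunits parser; the function names are hypothetical, and the sketch assumes UTC and ignores any trailing time zone field.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches the leading "seconds since YYYY-MM-DD hh:mm:ss" part of the string.
_PATTERN = re.compile(
    r"^seconds since (\d{4})-(\d{1,2})-(\d{1,2}) (\d{1,2}):(\d{2}):(\d{2})"
)


def parse_reference_time(units):
    """Extract the reference time from a 'seconds since ...' units string.

    Assumes UTC; anything after the seconds field is ignored in this sketch.
    """
    match = _PATTERN.match(units)
    if match is None:
        raise ValueError("not a 'seconds since' units string: %r" % units)
    year, month, day, hour, minute, second = (int(g) for g in match.groups())
    return datetime(year, month, day, hour, minute, second, tzinfo=timezone.utc)


def to_datetime(value, units):
    """Convert a number of seconds plus its units string to a datetime."""
    return parse_reference_time(units) + timedelta(seconds=value)
```

With the reference time carried in the units string, a plain number of seconds is enough to recover an absolute time.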

Notes about notation of units:

  • We picked the 2 or 3 notations which are most common and useful.

    • The first notation was generally most favored.

  • We did not try to be complete in specifying alternatives.

  • Other notations for the same units could be added as needed/desired.

  • Spaces were deemed acceptable within units.

  • Capitalization counts.

Finally, we discussed nextsteps. The consensus order of things was:

  • finish off Units

  • eliminate "duplicates" (where Parameter, Domain, and Units are the same)

  • consider Qualifiers next

    • can be defined as "what's needed to resolve duplicates within a data set"
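The duplicate check described above could be sketched as a non-destructive grouping on the three columns. The column names come from the list above; representing each spreadsheet row as a dict, and the name `find_duplicates`, are assumptions for illustration.

```python
from collections import defaultdict


def find_duplicates(rows):
    """Group row indices by (Parameter, Domain, Units).

    Returns only the groups with more than one row. The input is not
    modified, so this identifies duplicates without removing anything.
    """
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        key = (row["Parameter"], row["Domain"], row["Units"])
        groups[key].append(index)
    return {key: indices for key, indices in groups.items() if len(indices) > 1}
```

Each group returned this way is a set of rows that would need a Qualifier to tell them apart within a data set.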

References

S Watson's notes from meeting

Attending

  • Fred Bahr

  • Rich Schramm

  • Dorota Kolber

  • Mike Godin

  • Stephanie Watson

  • John Graybeal

  • Mike McCann

Action Items

Action Item (John): Point to these pages from MBARI public web site.

  • In progress. Not sure if we can arrange public site reference, but am talking to Nancy about Canyon Head.

Action Item (John): get Excel on bob

  • In progress, request in to Todd.

Action Item (John): fix the Excel spreadsheet to transfer this data to the current spreadsheet (ahem)

  • Also take a (non-destructive) cut at identifying duplicates

Action Item (Rich?): Check with Bluefin about their approach.