Let's say I want to find the best vocabulary out there for describing things in the environment that can be observed—variables, parameters, phenomena, observable properties, whatever you might call them. If you are doing earth science on your laptop, these might be the labels you put in the top row of your spreadsheet.

How can I find the best vocabulary for a given domain and situation, such as environmental science (especially oceanography) observations in an infrastructure project? Shall we try?

The Criteria

First, I need to define what I mean by 'best'. These are the top-level characteristics I would choose:

  • comprehensive in my domain—I should be able to find most, or even all, terms that I need
  • used by the relevant science community
  • maintained by a community using an open (public) process
  • rigorous, by which I mean
    • good (clear, unambiguous) definitions for all terms
    • non-overlapping concepts (let's not have water, h2o, moving_water, and falling_water, unless the distinctions are absolute)
    • normalized to a known quantity (is each term based on medium + property, medium + property + process, or some random combination of these and other factors?)
  • extensible as new terms are needed
  • combinable with other terms (so if my vocabulary incorporates medium and property, I could blend it with another vocabulary that describes process, if I wanted to)
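
To make the 'non-overlapping', 'normalized', and 'combinable' criteria a bit more concrete, here is a minimal sketch in Python. It assumes a vocabulary where every term is built from medium + property, with an optional process. The names Term, combine, and find_overlaps, and the example values, are hypothetical illustrations, not part of any existing vocabulary or tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    """Hypothetical normalized term: always medium + property, optionally + process."""
    medium: str
    prop: str
    process: Optional[str] = None

    def label(self) -> str:
        # Build the label from the parts, in a fixed order.
        parts = [self.medium, self.prop]
        if self.process is not None:
            parts.append(self.process)
        return "_".join(parts)


def combine(term: Term, process: str) -> Term:
    """Blend a medium + property term with a process taken from another vocabulary."""
    if term.process is not None:
        raise ValueError(f"{term.label()} already has a process")
    return Term(term.medium, term.prop, process)


def find_overlaps(terms):
    """Return labels of terms that repeat an earlier (medium, property, process) combination."""
    seen = set()
    overlaps = []
    for term in terms:
        if term in seen:
            overlaps.append(term.label())
        seen.add(term)
    return overlaps


# Illustrative values only.
temperature = Term("sea_water", "temperature")
advected = combine(temperature, "advection")
print(temperature.label())  # sea_water_temperature
print(advected.label())     # sea_water_temperature_advection
```

Because every term has the same shape, an overlap check is a simple comparison of parts. In a vocabulary built from 'some random combination' of factors, that check would need a human to read each definition.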

There are other characteristics that will be valuable eventually (well-mapped to other vocabularies, uniquely referenced by URLs, etc.), but with these criteria I think step 1 is complete, at a draft stage.

The Search

Now I have to go find these vocabularies that might fit the criteria. Where do I look?

MMI, for sure. It is still one of the best lists, if not the best, of vocabularies in environmental science. (I have to look at the MMI vocabularies and ontologies on the site, and any that might be on the Ontology repository.)

We also ask around, via this post and some lists, for vocabularies that are at least as good as those posted. Maybe we even post a news item on sites like MMI. We can do that!

For now, we'll use this post as a collection point. At the end of this post are some examples of vocabularies that might be the best. Please add any vocabulary you think is better in a comment at the end of this page. If you aren't a member and can't access comments on this page (consider becoming a member!), you can email me with the information.

The Selection

Now we have to choose from those vocabularies. What criteria do we choose by? We have the criteria listed above, plus some more (which we will reference later). But we still have to evaluate the vocabularies against the criteria. What's the best way to do that? Maybe a poll of users, who fill out a spreadsheet matrix of vocabularies against the criteria we've identified.
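
If we did run such a poll, the filled-in matrices could be tallied along these lines. This is only a sketch under my own assumptions: each respondent scores each criterion from 0 to 5, the criterion names are shorthand for the list above, the weights are placeholders, and the vocabulary names and scores are invented.

```python
from statistics import mean

# Hypothetical short names for the top-level criteria listed earlier in this post.
CRITERIA = [
    "comprehensive", "community_use", "open_process",
    "rigorous", "extensible", "combinable",
]

# Assumed weights: everything equal, except rigor counts double.
# Change these to reflect your own priorities.
WEIGHTS = {criterion: 1.0 for criterion in CRITERIA}
WEIGHTS["rigorous"] = 2.0


def score_vocabulary(responses, weights=WEIGHTS):
    """Weighted average of per-criterion mean scores.

    responses: list of dicts, one per respondent, mapping criterion -> score (0 to 5).
    Criteria that nobody rated are left out of the average.
    """
    weighted_sum = 0.0
    weight_used = 0.0
    for criterion, weight in weights.items():
        scores = [r[criterion] for r in responses if criterion in r]
        if not scores:
            continue
        weighted_sum += weight * mean(scores)
        weight_used += weight
    return weighted_sum / weight_used if weight_used else 0.0


def rank_vocabularies(poll, weights=WEIGHTS):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, score_vocabulary(resp, weights)) for name, resp in poll.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Invented poll data for two placeholder vocabularies.
poll = {
    "vocab_a": [{c: 3 for c in CRITERIA}],
    "vocab_b": [{**{c: 5 for c in CRITERIA}, "rigorous": 1}],
}
for name, score in rank_vocabularies(poll):
    print(f"{name}: {score:.2f}")
```

The weights are where priorities show up: someone who values community use over rigor would raise the community_use weight instead, and the ranking could change accordingly.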

Appendix: Proposed Best Vocabularies for Environmental Parameters

These are the best we have. Can you do better?

Which is more important, rigor or amount of use?

I received a comment off-line to the effect that high usage in a community is more important than the rigor of the vocabulary. For an individual user this is arguably true, but for an infrastructure project I claim that rigor is essential to good engineering within the project (and community adoption can be achieved through mappings).

What do you think should be the priority?

Vocabulary usage

NERC/BODC/SeaDataNet Parameters

Requires a login to access - probably precludes meeting the 'amount of use' criterion

