Meeting Notes 2004.05.18
Standard Naming Exercise
MeetingNotes20040518 (meeting 2)
What do we want to do today?
-
fill in blanks
-
standardize domain column items
-
-
this is useful regardless of how the results will be used (e.g., ocean/river/lake vs water)
-
could construct a class hierarchy (but how useful is that?)
-
will get gnarly when we get to current
-
-
discuss how units column should be used
-
standardize most basic variable names (large percentage of file contents)
-
-
how to indicate quality flag, e.g., _QF?
-
-
identify references
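A rough sketch of two of the agenda items above, for discussion only: the ocean/river/lake vs water grouping as a small parent lookup, and the `_QF` suffix as a naming helper. Only the ocean/river/lake/water terms and the `_QF` suffix come from these notes; the table name, function names, and the idea of a single-parent hierarchy are assumptions, not anything the group has agreed on.

```python
# Hypothetical domain hierarchy: child term -> parent term.
# Only the ocean/river/lake -> water example comes from the notes.
DOMAIN_PARENT = {
    "ocean": "water",
    "river": "water",
    "lake": "water",
}

# Suffix floated in the notes as one possibility; not a settled convention.
QF_SUFFIX = "_QF"


def generic_domain(term: str) -> str:
    """Walk up the hierarchy until a term with no parent is reached."""
    seen = set()
    while term in DOMAIN_PARENT and term not in seen:
        seen.add(term)  # guard against accidental cycles in the table
        term = DOMAIN_PARENT[term]
    return term


def quality_flag_name(variable_name: str) -> str:
    """Derive the quality-flag variable name for a data variable."""
    return variable_name + QF_SUFFIX


def is_quality_flag(name: str) -> bool:
    """True if a name follows the suffix convention."""
    return name.endswith(QF_SUFFIX)
```

Terms that are not in the table (for example "air") are returned unchanged, so the lookup can be filled in gradually as the domain column is standardized.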
Review of Element Issues
-
how is domain applied?
-
-
use the most generic term for it
-
-
should Domain always be a part of name?
-
-
appears to be a user- and use-specific issue
-
keeping domain and parameter names separate is very useful (sayeth the DB pros)
-
-
distinction between bottles and in-situ?
-
-
not at this level, one may be treated as the other
-
-
columns 'instrument' and 'sensor' confusing (which is which?)
-
-
could improve definitions in the spreadsheet
-
how primary is this information?
-
-
units have been the subject of a lot of work
-
-
don't store speeds in knots, since the knot isn't a metric unit (or at least be sure to convert for science use)
-
have precise terminology, so we know degC = degrees C = centigrade
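The two units points above (agreeing that degC = degrees C = centigrade, and converting knots for science use) might be sketched as an alias table plus one conversion. This is an illustration under assumptions: the table contents, the canonical spellings, and the function names are hypothetical. The conversion factor follows from the definition of the knot as one nautical mile (1852 m) per hour.

```python
KNOT_IN_M_PER_S = 1852.0 / 3600.0  # one nautical mile (1852 m) per hour

# Hypothetical alias table: each spelling on the left maps to one agreed label.
UNIT_ALIASES = {
    "degc": "degC",
    "degrees c": "degC",
    "centigrade": "degC",
    "knot": "knot",
    "knots": "knot",
    "m/s": "m/s",
}


def canonical_unit(label: str) -> str:
    """Map a free-text unit label to its agreed spelling."""
    # Lowercase and collapse runs of whitespace before the lookup.
    key = " ".join(label.strip().lower().split())
    try:
        return UNIT_ALIASES[key]
    except KeyError:
        raise ValueError(f"unrecognized unit label: {label!r}") from None


def to_metric_speed(value: float, unit_label: str) -> float:
    """Return a speed in m/s, converting from knots if needed."""
    unit = canonical_unit(unit_label)
    if unit == "knot":
        return value * KNOT_IN_M_PER_S
    if unit == "m/s":
        return value
    raise ValueError(f"not a speed unit: {unit_label!r}")
```

Raising on unknown labels, rather than passing them through, means a new spelling gets noticed and added to the table deliberately.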
-
Discussion of Preferred External Standards
McCann: [http://sweet.jpl.nasa.gov/sweet/esmf.xls esmf.xls-ESMF/CF Taxonomy] [http://sweet.jpl.nasa.gov/sweet/gcmd.xls gcmd.xls-GCMD Keywords Taxonomy]
-
extended presentation of variables vs Realm, Phenomena, Property, Substance, Space, Numerics, Time, Biosphere, Services, Data
-
lots of measurements exist, across a wide range of domains
-
includes domain in some of the names
-
this ontology is used in the registration of data sets (via form) at GCMD
-
form exists to enter new terms for describing their data
-
-
indication is NASA is quick to respond to new requests
-
Gene Major is implementing the GCMD; the process for adding names is not rigid
-
we could use these for comparison as we're filling out our variables
-
how mature is this? can we use it as a sole basis?
-
-
interesting that Earth Realm doesn't include atmosphere in several places that it might
-
using this model, "first submission wins", so it may not be totally systematic
-
-
what distinguishes between CTD profile and measurement at given depth?
-
-
this may be a different kind of information, not captured in this information space
-
-
-
we're looking at automating processes, rather than providing human-searchable data
Kolber: [http:// ARGO Data Management User Manual] (.pdf)
-
contains a very well described structure with detailed info on lots of variables
-
variable names are short and do not include domain, which is expressed in a long name
-
how to handle different domains in short name?
-
-
short name tends to be meaningless, since long name is what appears on plots and contains domains
-
interoperability affected by repeated use of same names
-
-
ARGO is dealing with limited data types -- this simplifies the issues (e.g., they are not dealing with water temperature and air temperature)
-
might be interesting to take some of these common names (lat, long, jday, depth) and map the information they provide to our spreadsheet
-
-
ARGO vs GCMD? neither one necessarily what *we* wanted to do
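One way to picture the mapping suggested above, with short names reused across data sources and the domain kept separate from the parameter, is a lookup keyed by source and short name. Everything here is a placeholder for discussion: the source labels (`argo_like`, `met_station`), the `temp` short name, the expansions of `lat`, `long`, and `jday`, and the choice to leave their domain empty are assumptions, not entries taken from the ARGO manual or from our spreadsheet.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One spreadsheet row: domain and parameter kept as separate fields."""
    domain: Optional[str]  # None where no domain has been assigned yet
    parameter: str


# Hypothetical mapping keyed by (source, short_name). The same short name
# can appear under several sources without colliding.
NAME_MAP = {
    ("argo_like", "temp"): Entry("water", "temperature"),
    ("met_station", "temp"): Entry("air", "temperature"),
    ("argo_like", "lat"): Entry(None, "latitude"),
    ("argo_like", "long"): Entry(None, "longitude"),
    ("argo_like", "jday"): Entry(None, "julian day"),
    ("argo_like", "depth"): Entry("water", "depth"),
}


def resolve(source: str, short_name: str) -> Entry:
    """Look up the domain/parameter pair behind a source's short name."""
    try:
        return NAME_MAP[(source, short_name)]
    except KeyError:
        raise KeyError(f"no mapping for {short_name!r} from {source!r}") from None


def same_quantity(a: Entry, b: Entry) -> bool:
    """Two entries match only if both domain and parameter agree."""
    return a == b
```

With this layout, `temp` from the two sources resolves to the same parameter but different domains, which is the water temperature vs air temperature case that ARGO itself does not have to face.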
-
McCann: [http://sweet.jpl.nasa.gov SWEET Ontology (with Protege)]
-
download Protege from the Stanford site, and install the Web Ontology Language (OWL) plugin
-
can also view on web (but not with Safari)
-
tool can be used to create new ontologies or additions to existing ontologies
-
large user and developer community for this tool
-
new ontologies can be submitted to a web library
-
could take SWEET ontologies (which are based on GCMD) and add our own information
-
putting cart before horse?
-
-
take the spreadsheet and group things together
-
Others For Future Discussion
-
Schramm: EPIC Key File Variables
-
-
contains a long list of variables and associated information
-
variables are organized by numeric indices
-
-
Graybeal: SensorML
-
-
an ontology for the data that describe sensors (their location, etc.)
-
recent release just came out
-
-
MarineXML
-
-
complicated histories here: European and Australian versions
-
BODC reconciled European version with their own? Australian based on BODC originally?
-
BODC has old data sets that have to be compatible
-
very hard to keep up with this work or know where it's going
-
-
as we start to identify discrepancies with other references, we could start feeding them back to those references
Completing the Parameter Column of the Compiled Spreadsheets
-
lots of classes (for an ontology) becoming apparent
-
starting to normalize some names
-
time could be challenging to represent
-
almost through this column
The Plan
Get together next Tuesday at 8:30 AM to continue this process.
-
Anyone who wants to fill out the easy Parameter entries for the rest of their data, send results to John
-
Next steps include:
-
-
Finish the Parameter column
-
Do the same for Domain as we did for Parameter
-
Go down the Name In Use column and propose better names, given what we now know
-
