CCDS: Canonical Cruise Data Structure
SIOExplorer: Overview, initial results and next steps
Stephen Miller, John Helly, Marty Africa, Uta Peckman, Deborah Day and Dru Clark
The new SIOExplorer project is a collection in the NSF-funded National Science Digital Library (www.nsdl.org). The collaborative effort includes researchers at the Scripps Institution of Oceanography (SIO), computer scientists from the San Diego Supercomputer Center (SDSC), and archivists and librarians from the UCSD Library.
The co-authors of this paper tested a shipboard prototype during a Floating Digital Library Workshop from New Zealand to Samoa on R/V Melville in March 2002. General-purpose tools have been developed to automate collection development, manage metadata, and search the library geographically, as discussed in other presentations.
In the initial year of operation, the biggest challenge has been wrestling with the volume and variability of data and documents. Shipboard sensors, data volumes, and organizational structures have evolved greatly over the decades, particularly with 244 multibeam expeditions since 1982.
Considerable success came after introducing the concept of a Canonical Cruise Data Structure (CCDS) with nine basic categories that seem to capture the essential characteristics of data practices since the 1960s. Automated software pulls data into the CCDS from diverse source directories and media, guided by a template with rules for priority and filenames. Almost all metadata are harvested automatically into simple “metadata interchange format” (.mif) files, one for each “arbitrary digital object” (ADO) in the CCDS. The metadata are placed in an Oracle database, and the associated data are managed by the SDSC Storage Resource Broker on various disk and automatic tape silo systems.
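The template-guided harvesting step can be illustrated with a minimal Python sketch. It is not the SIOExplorer software: the category names, filename patterns, priority scheme, and the key/value layout of the .mif text are all assumptions made for illustration, since the abstract does not list the nine categories or define the .mif format. The sketch shows only the general idea of matching source files against filename rules, resolving conflicts by priority, and emitting one metadata record per object.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Rule:
    category: str   # hypothetical CCDS category name
    pattern: str    # glob matched against the full source path
    priority: int   # lower number wins when sources compete


# Hypothetical template: the real categories and rules are not given in the text.
TEMPLATE = [
    Rule("multibeam", "*/processed/*.dat", 1),
    Rule("multibeam", "*/raw/*.dat", 2),
    Rule("reports", "*.pdf", 1),
]


def best_rule(path, template=TEMPLATE):
    """Return the highest-priority rule matching a source path, or None."""
    matches = [r for r in template if fnmatchcase(path, r.pattern)]
    return min(matches, key=lambda r: r.priority) if matches else None


def build_ccds(source_paths, template=TEMPLATE):
    """Map CCDS destinations to source files, keeping the best source per object."""
    chosen = {}  # destination -> (priority, source path)
    for src in source_paths:
        rule = best_rule(src, template)
        if rule is None:
            continue  # no rule matches, so the file is not harvested
        dest = f"{rule.category}/{PurePosixPath(src).name}"
        if dest not in chosen or rule.priority < chosen[dest][0]:
            chosen[dest] = (rule.priority, src)
    return {dest: src for dest, (_, src) in chosen.items()}


def mif_text(dest, src):
    """Render one metadata record per object (assumed key/value layout)."""
    fields = {"object": dest, "source": src, "category": dest.split("/")[0]}
    return "".join(f"{key}: {value}\n" for key, value in fields.items())


sources = [
    "cruise/raw/line1.dat",
    "cruise/processed/line1.dat",
    "cruise/docs/report.pdf",
    "cruise/misc/notes.xyz",
]
ccds = build_ccds(sources)
records = {dest: mif_text(dest, src) for dest, src in ccds.items()}
```

In this example the processed copy of line1.dat displaces the raw copy because its rule has the higher priority, and notes.xyz is skipped because no rule matches it. A production harvester would also need to extract real metadata (time, position, instrument) from the file contents, which is omitted here.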
The system is extensible to various domains and data types, including geochemistry, image archives, multibeam bathymetry, reports and publications. A Java Metadata Object Browser and Editor (MOBE) expands or hides the complexity for each domain, as needed. A prototype interactive CruiseViewer with both Java and HTML approaches will be demonstrated.
As the second year of the project begins, greater emphasis will be placed on search and display tools. At-risk data on shipboard magnetic tapes will be migrated to RAID systems and tape silos. Public outreach will begin at the Birch Aquarium and other locations.

