3. Requirements for the Semantic Framework

Extended requirements lists for related semantic frameworks have been developed by the Open Ontology Repository project and the NeOn Lifecycle Support for Networked Ontologies project. The first focuses on the ontology repository, a major capability of the infrastructure, while the second addresses an entire community semantic framework. (These projects are described in more detail in the last section.)

The outline in this section provides a first-order grouping of the requirements that must be considered. The functional requirement categories are infrastructure operations, end-user tools, and interfaces. (Each of these terms is defined in its subsection below.) Non-functional requirements (the 'ilities') are also briefly discussed at the end of the section.

3A. Infrastructure Operations

The infrastructure operations provide key capabilities to the other elements of the system.

Ontology Registration and Version Management

Capabilities required for ontology registration and versioning include:

  • register and track ontologies
  • generate URLs (more generally, URIs) for terms corresponding to entries in ontology documents
  • provide services associated with ontologies (examples: querying concepts, capturing associated metadata)
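
As a minimal sketch of the registration and versioning capabilities listed above, the following keeps an in-memory record of ontology versions and builds URIs for ontologies and terms. The base URI, the authority/name/version path layout, and all class and method names are assumptions made for illustration; they are not part of the framework described here.

```python
from dataclasses import dataclass, field
from urllib.parse import quote

BASE_URI = "http://example.org/ont"  # hypothetical base URI, not from the source


@dataclass
class OntologyRegistry:
    """Tracks registered ontologies and their versions in memory."""

    # Maps (authority, name) -> ordered list of version labels.
    versions: dict = field(default_factory=dict)

    def register(self, authority: str, name: str, version: str) -> str:
        """Record a new version and return the URI of that version."""
        history = self.versions.setdefault((authority, name), [])
        if version in history:
            raise ValueError(f"version {version} is already registered")
        history.append(version)
        return f"{BASE_URI}/{authority}/{name}/{version}"

    def latest(self, authority: str, name: str) -> str:
        """Return the most recently registered version label."""
        return self.versions[(authority, name)][-1]

    def term_uri(self, authority: str, name: str, term: str) -> str:
        """Build an unversioned URI for a term in a registered ontology."""
        if (authority, name) not in self.versions:
            raise KeyError(f"{authority}/{name} is not registered")
        # Percent-encode the term so spaces and slashes cannot break the path.
        return f"{BASE_URI}/{authority}/{name}/{quote(term, safe='')}"
```

A real registry would also persist the ontology documents and their metadata; this sketch tracks only identifiers.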

Term Services

The infrastructure must be able to perform many basic operations with the terms in its controlled vocabularies. These include:

  • find terms using restricted or unrestricted searches, where a restricted search limits the search domain to ontologies or terms with certain characteristics
  • provide information about terms in response to queries
  • resolve URLs (more generally, URIs) that reference specific terms or ontologies (responding with term information or the entire ontology, as appropriate)
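
The term services above can be illustrated with a small sketch over an in-memory term store. The URIs, record fields, and function names are hypothetical, chosen only to show a search that can be limited to certain ontologies and a resolver that returns either a single term or a whole ontology.

```python
# Hypothetical in-memory store: term URI -> term record.
TERMS = {
    "http://example.org/ont/demo/params/sea_water_temperature": {
        "label": "sea water temperature",
        "ontology": "params",
    },
    "http://example.org/ont/demo/params/air_temperature": {
        "label": "air temperature",
        "ontology": "params",
    },
    "http://example.org/ont/demo/platforms/mooring": {
        "label": "mooring",
        "ontology": "platforms",
    },
}


def find_terms(text, ontologies=None):
    """Return sorted URIs whose label contains `text`.

    If `ontologies` is given, only terms from those ontologies are searched.
    """
    needle = text.lower()
    return sorted(
        uri
        for uri, record in TERMS.items()
        if needle in record["label"].lower()
        and (ontologies is None or record["ontology"] in ontologies)
    )


def resolve(uri):
    """Return one term record, all terms of an ontology, or None."""
    if uri in TERMS:
        return TERMS[uri]
    # Treat the URI as an ontology URI and collect the terms beneath it.
    prefix = uri.rstrip("/") + "/"
    members = {u: r for u, r in TERMS.items() if u.startswith(prefix)}
    return members or None
```

For example, find_terms("temperature") searches every ontology, while find_terms("temperature", ontologies={"platforms"}) limits the search domain and finds nothing in this sample data.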

Term Relation Services

The infrastructure facilitates semantic mediation by using defined relationships in its ontologies to infer associations among terms. This requires that the infrastructure be able to:

  • given a term, identify related terms
  • given a term, a target vocabulary, and a desired relationship (e.g., 'sameAs'), find terms in the target vocabulary that satisfy the relationship
  • given two terms, provide a description of any relationships between the terms
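
The three operations above can be sketched over a list of (subject, relation, object) statements. The sample terms, the "vocabulary:term" naming, and the function names are assumptions for illustration. The sketch only looks up directly asserted relationships; it performs no inference, which a full implementation would need.

```python
# Hypothetical asserted statements: (subject, relation, object).
TRIPLES = [
    ("a:water_temp", "sameAs", "b:sea_temperature"),
    ("a:water_temp", "narrowerThan", "c:ocean_property"),
    ("a:air_temp", "sameAs", "b:atmosphere_temperature"),
]


def vocabulary_of(term):
    """Return the vocabulary prefix of a 'vocabulary:term' identifier."""
    return term.split(":", 1)[0]


def related_terms(term):
    """Given a term, return every term directly linked to it."""
    related = set()
    for subject, _, obj in TRIPLES:
        if subject == term:
            related.add(obj)
        elif obj == term:
            related.add(subject)
    return related


def find_in_vocabulary(term, target_vocab, relation):
    """Find terms in `target_vocab` that `term` is linked to by `relation`."""
    return sorted(
        obj
        for subject, rel, obj in TRIPLES
        if subject == term and rel == relation
        and vocabulary_of(obj) == target_vocab
    )


def relationships_between(term_a, term_b):
    """Return all asserted statements that connect the two terms."""
    return [t for t in TRIPLES if {t[0], t[2]} == {term_a, term_b}]
```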

3B. End-User Tools

End-user tools are software applications or services that provide capabilities directly to end users. They may be tightly integrated with the infrastructure components, but add user-friendly interfaces and functions.

Term Selection

It must be as easy as possible for users to discover and choose the most appropriate terms for their use as they build a vocabulary. The user needs optimized interfaces for searching, reviewing, and understanding a large collection of similar terms from other controlled vocabularies.

Implicitly, term selection (like term creation, described next) includes the addition of the selected terms to a controlled vocabulary.

Term Creation

When creating a vocabulary that includes terms not previously defined, the interface must support efficient creation of new terms. (Defining new terms will be necessary to a surprising degree, as even with hundreds of vocabularies the marine science domain has barely scratched the surface of the required set of terms.) Efficient and effective creation of new terms in a community vocabulary requires features such as:

  • rapid entry of terms and definitions
  • automated addition of all relevant metadata to a newly created term
  • access to existing definitions of analogous terms, e.g., in Wikipedia
  • ability to track changes to term definitions
  • support for community interaction in defining and approving terms
  • ability to efficiently cite sources for term definitions

Not all of these requirements are immediately needed for a basic vocabulary creation capability, but the first two are essential.

Controlled Vocabulary Registration

New controlled vocabularies, even those still in the process of development, must be registered in a common repository for them to be widely useful and accessible. The registration process must be as simple and transparent as possible for the user. Registration should be tightly integrated with the process of building the terms of a controlled vocabulary.

Term Mapping

Users must be able to relate terms from different ontologies. This function requires the selection of one or more terms as subjects, one or more terms as objects, and a relation to be made between the subjects and objects. The relationship is stored for each subject-object pair, as another entry in an ontology.

An implementation of the functions required for this service is provided by the VINE (Vocabulary Integration Environment) tool. See http://marinemetadata.org/vine for more information.
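
The pairing rule described above (one stored relationship per subject-object pair) can be sketched as follows. This is an illustration only and is not drawn from the VINE tool; the function name and the tuple representation of a mapping are assumptions.

```python
from itertools import product


def map_terms(subjects, objects, relation):
    """Return one (subject, relation, object) statement per pair.

    Every subject is paired with every object, so two subjects and
    three objects produce six statements.
    """
    if not subjects or not objects:
        raise ValueError("at least one subject and one object are required")
    return [(s, relation, o) for s, o in product(subjects, objects)]
```

Each resulting statement would then be stored as an entry in a mapping ontology; how that ontology is serialized is left open here.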

3C. Interface Requirements

The infrastructure must provide services compatible with typical ontology and knowledge management systems. These are the services that tool developers and other service providers will use to interact with the semantic framework.

The formats for these interfaces are still being defined, but will include these protocols:

  • possibly SOAP, or an additional protocol

At least one query language must be supported. The currently planned query language is SPARQL.
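
To give a sense of the kind of request a SPARQL interface might receive, the sketch below builds the text of a query that lists everything a given term is related to through a given relation. It only constructs the query string and does not contact any endpoint; the function name and the example term URI are hypothetical.

```python
def related_query(term_uri: str, relation_uri: str) -> str:
    """Build a SPARQL SELECT listing objects linked to `term_uri`."""
    for uri in (term_uri, relation_uri):
        # Reject characters that are not allowed inside a SPARQL IRI reference.
        if any(ch in uri for ch in '<>" {}|\\^`'):
            raise ValueError(f"not a safe IRI: {uri!r}")
    return (
        "SELECT ?related WHERE { "
        f"<{term_uri}> <{relation_uri}> ?related . "
        "}"
    )


# Example: ask for terms declared equivalent with owl:sameAs.
query = related_query(
    "http://example.org/ont/demo/params/sea_water_temperature",
    "http://www.w3.org/2002/07/owl#sameAs",
)
```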

3D. Non-Functional Requirements

Non-functional requirements occur in the following categories (among others), and are briefly described here:

  • usability: the system must be easy for developers to interface with; user functions must be straightforward to use and easy to understand
  • reliability: once declared available for operational use, core components must operate with high reliability, with no downtime required for maintenance and minimal downtime due to failures; initial capabilities must also be reliable, but system operations may be briefly halted for maintenance, installations, or troubleshooting
  • maintainability: software must be developed in a way that can support troubleshooting and allow modifications
  • expandability: the system must support increases over time in usage, storage, and feature set