Cyberinfrastructure: Information Management in eScience (CIMS)
DEADLINE FOR PAPER SUBMISSION EXTENDED TO AUGUST 5, 2007
For paper submission guidelines, please see the CIMS Web site.
We expect researchers to interact and advance the state of the art in algorithms, system architectures, and exchange platforms for scientific information, especially (but not only) in the form of data and digital documents. The focus will be on scientific domains that employ automated or semi-automated tools to share and disseminate data in the public domain. We welcome research papers on individual components of a cyberinfrastructure, but we also note that (a) a systematic integration architecture is still missing, and (b) several challenge problems still require substantial research effort. We list a sample of the research challenges below:
Interoperability: Interoperability remains a major impediment to building a robust cyberinfrastructure. In the sciences, the problem is even more acute because communities that could benefit from exchanging data use a variety of data formats and structures. The lack of standard data formats aggravates the problem: different tools used by different practitioners produce data in different formats. Data exchange is also hampered by differing protocols and incompatible interfaces. We encourage submissions from researchers proposing solutions to a wide range of issues related to data interoperation and information sharing.
Security: Although data in some sciences is often made freely available, adoption and use of the cyberinfrastructure will be more widespread if it can ensure the security of data and documents. Publishers can then retain control of their data and decide with whom to share it. Secure sharing of data and information is an important priority for eScience projects.
Metadata Extraction: To enable meaningful searches over the data, and to help machines process the data and documents residing in the repositories connected by the cyberinfrastructure, one must deploy metadata. Automatic metadata extraction and creation is an important problem that must be addressed to improve the quality of the information served to the end user. Good metadata is also a vital cog in enabling interoperation. Researchers have used techniques from machine learning, pattern recognition, data mining, text mining, information theory, and other fields to derive meaningful metadata for data and documents. Extracting metadata from large-scale scientific data is a challenging problem that must be resolved to enable progress in the eSciences. To augment automatic harvesting, efforts are under way to design schemes that collect metadata directly from the publishers of the data and through user collaboration. Communities are also building domain-specific ontologies to specify the semantics of concepts and their relationships. Advances in the creation and maintenance of metadata for cyberinfrastructure will be discussed in the workshop.
The topics of interest for CIMS include (but are certainly not limited to):
- Federated databases
- Data interoperation
- Scientific data warehouses
- Access control and security
- Knowledge representation and management
- Ontology creation and reuse
- Ontology mapping and management
- Automatic metadata extraction
- Information extraction
- Scientific data mining
- Creation and maintenance of digital libraries
- Crawling, indexing and search
- Digital preservation
- Web services and workflows
- Simulations and analytic tools
- Scientific data visualization
- Evaluation metrics for CI
The workshop is collocated with the ACM Conference on Information and Knowledge Management (CIKM 2007).
