Open Access News

News from the open access movement

Thursday, December 13, 2007

Alpha version of the OAI-ORE spec and user guide

The Open Archives Initiative has released the alpha version of the ORE Specification and User Guide (December 10, 2007). 

Abstract: Open Archives Initiative Object Reuse and Exchange (OAI-ORE) defines standards for the description and exchange of aggregations of Web resources.  This document provides an introduction and lists the specifications and user guide documents that make up the OAI-ORE standards.

From the Introduction:

The World Wide Web is built upon the notion of atomic units of information called resources that are identified with URIs such as [this page]. In addition to these atomic units, aggregations of resources are often units of information in their own right.  Examples of these aggregations are:

  • A simple unordered set, or bag, of resources, such as a collection of favorite images from various Web sites.
  • A multi-page HTML document where the pages are linked together by hyperlinks that provide "previous page" and "next page" access....
  • A scholarly publication stored in an ePrint repository such as arXiv or in a DSpace, ePrints, or Fedora repository.  Such a publication may appear on the Web as multiple resources, each with an individual URI.  The set of resources typically consists of a human-readable "splash page" that links to the body of the publication in multiple formats such as LaTeX, PDF, and HTML.  In addition, the publication may have citation links to other publications, each existing as one or more resources.
  • An overlay journal issue that aggregates multiple scholarly publications as described above, each located in its origin repository, into an issue.  Issues may themselves be recursively aggregated into volumes, and then into the journal itself.
  • A semantically-linked group of cellular images - each available as a Resource resident in repositories from research laboratories, museums, libraries, and the like - in the manner implemented in the ImageWeb Project.
  • Published scientific results such as those envisioned by [Lynch CTWatch] that, in addition to the features of the scholarly publication described above, incorporate data plus the tools to visualize and analyze that data.
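
As an illustration only (this is not taken from the OAI-ORE alpha spec, and it is not the ORE data model), the examples above can be sketched as a small recursive data structure: an aggregation has its own URI and contains resources or other aggregations. The class names `Resource` and `Aggregation`, the method `leaf_uris`, and all `example.org` URIs are assumptions made up for this sketch.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Resource:
    """An atomic Web resource, identified by a URI."""
    uri: str


@dataclass
class Aggregation:
    """A group of resources with its own URI; members may themselves be aggregations."""
    uri: str
    members: List[Union[Resource, "Aggregation"]] = field(default_factory=list)

    def leaf_uris(self) -> Iterator[str]:
        """Yield the URIs of all atomic resources, descending into nested aggregations."""
        for member in self.members:
            if isinstance(member, Aggregation):
                yield from member.leaf_uris()
            else:
                yield member.uri


# Hypothetical URIs: a publication with a splash page and two formats.
article = Aggregation(
    "http://example.org/article/1",
    [
        Resource("http://example.org/article/1/splash.html"),
        Resource("http://example.org/article/1/paper.pdf"),
        Resource("http://example.org/article/1/paper.tex"),
    ],
)

# An overlay journal: the article sits in an issue, the issue in a volume.
issue = Aggregation("http://example.org/journal/v1/i1", [article])
volume = Aggregation("http://example.org/journal/v1", [issue])

print(list(volume.leaf_uris()))
```

Giving the aggregation its own URI is what lets the issue refer to the article as a unit, rather than to its splash page or PDF individually.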

A mechanism to associate identities with these aggregations and describe them in a machine-readable manner would make them visible to Web agents, both humans and machines.  This could be useful for a number of applications and contexts.  For example:

  • Crawler-based search engines could use such descriptions to index information and provide search result sets at the granularity of the aggregations rather than their individual parts.
  • Browsers could leverage them to provide users with navigation aids for the aggregated resources, in the same manner that machine-readable site maps provide navigation clues for crawlers.
  • Other automated agents such as preservation systems could use these descriptions as guides to understand a "whole document" and determine the best preservation strategy.
  • Systems that mine and analyze networked information for citation analysis/bibliometrics could achieve better accuracy with knowledge of aggregation structure contained in these descriptions.
  • These machine-readable descriptions could provide the foundation for advanced scholarly communication systems that allow the flexible reuse and refactoring of rich scholarly artifacts and their components [Value Chains]....
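
To make the search-engine case above concrete, here is a minimal sketch, assuming a crawler has already read some machine-readable description and built a mapping from each resource URI to the URI of the aggregation it belongs to. The function name `group_hits`, the `part_of` mapping, and the URIs are hypothetical; nothing here reflects an actual ORE serialization.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_hits(hits: Iterable[str], part_of: Dict[str, str]) -> Dict[str, List[str]]:
    """Group matching resource URIs under the aggregation that contains them.

    A hit with no known aggregation is reported under its own URI.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for uri in hits:
        grouped[part_of.get(uri, uri)].append(uri)
    return dict(grouped)


# Hypothetical mapping, as might be derived from an aggregation description.
part_of = {
    "http://example.org/article/1/splash.html": "http://example.org/article/1",
    "http://example.org/article/1/paper.pdf": "http://example.org/article/1",
}

hits = [
    "http://example.org/article/1/splash.html",
    "http://example.org/article/1/paper.pdf",
    "http://example.org/unrelated.html",
]

results = group_hits(hits, part_of)
for key, parts in results.items():
    print(key, len(parts))
```

With this grouping, the splash page and the PDF appear as one result for the publication instead of two separate hits.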