Saturday, July 19, 2008

More on disciplinary v. institutional repositories

Stevan Harnad, The OA Deposit-Fee Kerfuffle: APA's Not Responsible; NIH Is. PART II, Open Access Archivangelism, July 19, 2008.

See also PART I and PART 0.

Summary:  The concept underlying the OAI metadata harvesting protocol is that local, distributed, content-provider sites each provide their own content and global service-provider sites harvest that content and provide global services over it, such as indexing, search, and other added values. (This is not a symmetric process. It does not make sense to think of the individual content-providers as "harvesting" their own content (back) from global service-providers.)
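The asymmetry described above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions: the class names `Repository` and `HarvestingService`, the helper `list_records_url`, and the `.example` URLs are all hypothetical and are not taken from Harnad's post or from any real repository software. The only protocol details used are the OAI-PMH `ListRecords` verb and the `oai_dc` metadata prefix.

```python
from dataclasses import dataclass, field
from typing import Dict, List
from urllib.parse import urlencode


@dataclass(frozen=True)
class Record:
    identifier: str
    title: str
    url: str  # the full text stays at this address, inside the repository


@dataclass
class Repository:
    """Hypothetical local content provider, e.g. an institutional repository."""
    base_url: str
    records: List[Record] = field(default_factory=list)

    def deposit(self, identifier: str, title: str) -> Record:
        # The author deposits once, locally.
        record = Record(identifier, title, f"{self.base_url}/{identifier}")
        self.records.append(record)
        return record

    def list_records(self) -> List[Record]:
        # Stand-in for answering an OAI-PMH ListRecords request.
        return list(self.records)


def list_records_url(oai_endpoint: str) -> str:
    """Build the query a harvester would send to a repository's OAI endpoint."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    return f"{oai_endpoint}?{query}"


@dataclass
class HarvestingService:
    """Hypothetical global service provider that indexes harvested metadata."""
    index: Dict[str, Record] = field(default_factory=dict)

    def harvest(self, repositories: List[Repository]) -> None:
        # Metadata flows one way only: from repositories to the service.
        for repo in repositories:
            for record in repo.list_records():
                self.index[record.identifier] = record

    def search(self, term: str) -> List[Record]:
        term = term.lower()
        return [r for r in self.index.values() if term in r.title.lower()]


# Two repositories, one deposit each, harvested by a single global service.
ir_a = Repository("https://ir.university-a.example")
ir_b = Repository("https://ir.university-b.example")
ir_a.deposit("a-1", "Protein folding in yeast")
ir_b.deposit("b-1", "Yeast genome annotation")

service = HarvestingService()
service.harvest([ir_a, ir_b])
hits = service.search("yeast")
for hit in hits:
    print(hit.title, "->", hit.url)
```

Each search hit still points back to the repository that holds the deposit, which is the sense in which a global service can offer search over content it does not itself host.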

The question is accordingly whether OA deposit mandates should be (1) convergent, with both institutional and funder mandates requiring deposit in the author's own OA Institutional Repository (IR), for harvesting by global overlay OA services and collections (such as PubMed Central, PMC) or (2) divergent, requiring authors to deposit all over the map, locally or distally, possibly multiple times, depending on field and funding. It seems obvious that coordinated, convergent IR deposit mandates from both institutions and funders will bring universal OA far more surely and swiftly than needless and counterproductive divergence.

In the interests of a swift, seamless, systematic, global transition to universal OA, NIH should accordingly make one tiny change (entailing no loss at all in content or functionality) in its otherwise invaluable, historic, and much-imitated mandate: NIH should mandate IR deposit and harvest to PMC from there.

The spirit of the Congressional directive that publicly funded research should be made publicly accessible online, free for all, is fully met once everyone, webwide, can click on the link to an item whose metadata they have found in PMC, and the article instantly appears, just as if they had retrieved it via Google, regardless of whether the item's URL happens to be in an IR or in PMC itself.

One reason the NIH mandate took the divergent form it did may have been a conflation of access archiving with preservation archiving. But the version that NIH has (rightly) stipulated for OA deposit (each "investigator's... electronic version of their final, peer-reviewed manuscripts upon acceptance for publication") is not even the draft that is in real need of preservation; it is just a supplementary copy, provided for access purposes. The definitive version, the one that really stands in need of preservation, is not this author-copy but the publisher's official proprietary version of record.

For preservation, the definitive document needs to be deposited in an archival depository (preferably several, for safety), not an OA collection like PMC. But that essential archival deposit/preservation function has absolutely nothing to do with either the author or OA.

PS:  This is just a summary.  The rest of the post responds to my responses to Stevan's earlier posts in this series.  I'll let him have the last word.