Open Access News

News from the open access movement


Thursday, January 24, 2008

Compare and contrast, OA and OD

Stevan Harnad, Open Access and Open Data, Open Access Archivangelism, January 23, 2008.  Excerpt:

...Here are a few comments on some important differences between Open Access (OA) and Open Data (OD).

The explicit, primary target content of OA is the full-texts of all the articles published in the world's 25,000 peer-reviewed scholarly and scientific journals. This is a special case, among all texts, partly because (i) research depends critically on access to those journal articles, because (ii) journals are expensive, because (iii) authors don't seek or get revenue from the sale of their articles, and hence have always given them away to any would-be user, and because (iv) lost access means lost research impact.

Research data are also critical to research progress, of course, but the universal practice of publishing research findings in refereed journal articles has not extended to the publication of the raw data on which the articles are based. There have been two main reasons for this. One was the capacity of the paper medium: There was no affordable way that data could be published alongside articles in paper journals. The other was that not all authors wanted to publish their data, or at least not right away....

The online era has now made it possible to publish all data affordably online. That removed the first barrier (although there are still technical problems, which Peter Murray-Rust and others discuss and are working to overcome). But the question of whether and when an author makes his data open is still a matter for the author to decide. Perhaps it ought not to be the author's choice -- but that is a much bigger and more complicated question than OA....

That difference in scope and universality is one of the reasons the OA and OD movements are distinct ones: OD has both technical and political problems that OA does not have, and it is important that OA should not be slowed down by inheriting these extraneous problems -- just as it is important that OD should not be weighed down by the publisher copyright problems of OA (which do not apply to OD for the simple reason that the authors do not publish their data, hence do not transfer copyright to a publisher)....

But an interesting overlap region is thereby created between OA and OD: for article texts are themselves data! ...Data-mining can be done not only on raw research data, but on article texts too, treating them as data: text-mining.

Here too, the interests of OA and OD are perfectly compatible and complementary -- except for one thing: If text-minability and 3rd-party re-publication were indeed to be made part of the definition of OA (i.e., not just removing price barriers to access by making research free for all online, but also "removing permissions barriers" by renegotiating copyright) then this would at the same time radically raise the barriers to achieving OA itself (just as insisting on making the paper edition free would), making it contingent on authors' willingness and success in renegotiating copyright with their publishers....

[Green OA deposit and IDOA (Immediate-Deposit/Optional-Access) mandates work for OA but do] not work for OD, because (a) depositing data cannot be mandated, it can only be encouraged and because (b) making article-texts re-usable by 3rd-party text-miners and re-publishers as data requires permission from the copyright holder....

So the strategic issue is whether to insist on something stronger than IDOA -- at the risk of not reaching consensus on any mandate at all -- or to wait patiently a little while longer, to allow IDOA mandates to become universal, generating toll-free online access (OA), with its immediate resultant benefits to research and researchers -- and to trust that the pressure exerted by those very benefits will lead to the demise of embargoes as well as to OD (for both data and texts) in due course....

Comments.  I agree with nearly all that Stevan says here, and will just make a couple of short points on where we may diverge. 

  • First, I reiterate my oft-stated position that OA does and ought to remove permission barriers, not just price barriers.  However, I believe this is a minor point in the differentiation of OA and OD, since it affects both in nearly identical ways.  At the same time, I repeat my related position that removing price barriers is urgently needed and should not be delayed while we work for the additional benefits of removing permission barriers.  For some recent thoughts on this, see the Richard Poynder interview (esp. pp. 36-39, where I discuss the importance of removing permission barriers and the points where Stevan and I may differ).
  • Funding agencies could easily mandate OD, and I've commended the ERC for doing so.  Moreover, the question whether to adopt this kind of green OD mandate shouldn't be tangled up with the question whether to remove permission barriers.  Just as funding agencies can mandate the removal of price barriers for peer-reviewed manuscripts, and neglect or defer the removal of permission barriers, they can do the same for data files.  And despite the significant, missing layer of utility that would come from removing permission barriers, both moves would greatly accelerate research.