Welcome to the SPARC Open Access Newsletter, issue #85
May 2, 2005


NIH policy launches today

Today the NIH starts accepting publications based on NIH-funded research for public release through PubMed Central. 

Readers of this newsletter know my views.  On the one hand, the policy is a significant precedent establishing that U.S. federal funding agencies can provide free online access to the results of publicly-funded research.  Moreover, because the NIH is the world's largest funder of medical research, the policy is bound to free up a large amount of high-quality research and accelerate the pace of discovery for the public good.  On the other hand, the NIH stumbled toward the end of the process and weakened the policy in three important ways.  First, it reduced the requirement to a request, even while insisting that it wanted to "maximize participation".  Second, it extended the permissible delay after publication from six months to 12 months or more, without waiting for evidence that the shorter embargo would hurt publishers.  Third, it decided not to use its government-purpose license to disseminate grantee research but to rely instead on copyright-holder consent, which invites journals to block the implementation of the policy. 

It will take us a while to learn how many grantees are participating in the program, how soon after publication grantees are requesting public access, and how aggressively journals will try to influence author decisions.  The early results, however, from journals that have announced policies for NIH-funded authors, show that journals are insisting on embargoes of 6 or 12 months.  There are members of Congress from both parties willing to monitor how well the policy meets its goals and, if necessary, to strengthen the policy.

Manuscript submission home page

New FAQ specifically on the manuscript-submission process

Email list specifically on the manuscript-submission process

Helpdesk web form specifically on the manuscript-submission process

April 29 statement on implementing the public-access policy


Archived postprints should identify themselves

Some publishers worry that OA archiving tends to erase their brand.  I'd like to suggest that authors, readers, and the friends of OA should worry about this too.

Most postprints in most OA repositories use metadata to identify the journal in which an article was published.  This is very helpful but not sufficient.  While these metadata are readable by special tools, they are not always readable by the human readers who open the files. 

Authors who archive the texts of their published articles (postprints) should identify somewhere on the postprint, in a clear and conspicuous way for human readers, the journal in which the article was published.  This practice would help *all* the stakeholders.  In particular, it would help the following:

(1) Authors.  When an article has been accepted by a peer-reviewed journal, then authors benefit from revealing this fact.  This is elementary:  peer-reviewed articles have more credibility than unrefereed preprints.  That's why authors submit their work to peer-reviewed journals instead of posting it directly online and bypassing peer review.  Readers who are too busy to look at everything on a subject have a justified preference for peer-reviewed literature.  Authors deserve the boost in credibility and the boost in audience that come from identifying their work as peer-reviewed.  If the journal is esteemed, then identifying it gives the author an extra boost on both fronts.

(2) Readers.  Readers do more than prefer refereed articles to unrefereed ones when they are in a hurry.  They also use a journal's name, orientation, and reputation as clues to an article's quality or methodology.  This isn't the place to debate whether these clues support accurate inferences more often than marketing programs and self-fulfilling opinions already circulating within the discipline.  What matters here is that readers would rather have these clues than not have them, and authors benefit by attracting readers.

(3) Journals and publishers.  This is where we started.  Journals and publishers want to preserve their brand.  What they have to offer above everything else is their selectivity, quality, and editorial standards.  Journals work hard on these and deserve to be identified.  Journals that perform copy editing, fact checking, and other services, deserve credit for these forms of added value.

(If you tuned in late, I acknowledge that journals add value.  It's a myth that OA wants to dispense with these valuable services, although sometimes OA journals must choose between the more essential and the less essential services.  The true bone of contention is not whether these services are valuable but how to pay for the most essential services without creating access barriers for readers.)

(4) OA proponents in general.   About 80% of surveyed journals already permit postprint archiving.  We don't really know what's holding back the remaining 20%, and there are bound to be many variables in the mix.  One variable, surely, is branding.  If journals in the 20% knew that postprint archiving would preserve their brand, then they would be more likely to permit postprint archiving (i.e. more likely to shift from gray to green).  Conversely, if postprint archiving routinely preserved the journal's brand, then journals in the 80% would be less likely to rescind their permission (i.e. less likely to shift from green to gray). 

Journals know or should know that OA increases an article's citation count by 50-300%, even after we restrict the comparison to non-OA articles from the same journal and year.  This boost in citation impact benefits journals as much as it benefits authors.  All journals have this interest in encouraging OA archiving, not merely permitting it.  However, non-OA journals have countervailing interests as well, which explains why very few go beyond permission to encouragement.  We don't know how close the balance of pros and cons is at a given journal, but every added benefit can help tilt the balance toward active encouragement.  Preserving the journal's identification can provide this kind of help.

For the evidence that OA increases citation impact, see the studies collected in Steve Hitchcock's bibliography, The effect of open access and downloads ('hits') on citation impact.

(5) All who want to solve the version control problem.  If a work has been accepted by the peer-reviewed Journal of Yada Yada and says so in some evident way, then readers will know they are reading a postprint, not a preprint --apart from what they infer about the paper from JYY's reputation for quality.  If the identification is more expansive, then readers might be able to tell whether they are reading a postprint that has gone through peer review but not yet copy editing or one that has gone through both. 

In my view, the version control problem is troublesome but not urgent.  I'd like to find a solution, but I'm only interested in solutions that don't deter or delay self-archiving.  Branding alone won't solve all aspects of the version control problem, but as long as authors have already decided to preserve branding, from self-interest, then it meets my criterion that version-control solutions should not hinder OA.

Since self-archiving is done by authors (or by helpers acting on behalf of authors), the bottom-line recommendation has to go to authors.  Identifying the journals in which your articles are published helps you, not just your readers and publishers.  It's not just another burden on you or another gift to publishers.  It is true common ground.  It's already beneficial and easy.  Let's make it the norm.

*How* should archived postprints identify the journals in which they were published?  I don't want to drift toward false precision or a premature standard, and in fact I'd be happy with just about any kind of identification that effectively conveyed this information to readers who are looking for it.  But even this informal standard requires that the identification should be visible text, not invisible metadata.  All the benefits I spelled out above are enhanced if the journal identification is human-readable.  (Of course, they are also enhanced if the identification is machine-readable.  So I'm not at all arguing against embedded metadata.) 

Authors who take this recommendation will have to add one more small extra step to the steps already required for deposit in an OA repository.  In general, I believe it's harmful to make the self-archiving process more time-consuming or complicated.  But this step is justified because it helps authors realize the goals that led them to self-archive in the first place.  It attracts readers and therefore increases impact.

Les Carr and Stevan Harnad have found that for the average author self-archiving one article takes 6-10 minutes.  If authors took the additional step I'm recommending, then self-archiving might take 6.1-10.1 minutes.  If that starts to deter authors, then I'd regret it as much as anyone and consider withdrawing my recommendation.  One alternative is for archiving software to eliminate even this pebble in the road by taking the journal identification from the metadata and copying it to the top of the text file, perhaps after the user clicks "yes, do that".
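To make the software alternative concrete, here is a minimal sketch of how a repository might copy the journal identification from a deposit's metadata to the top of the text.  It is only an illustration, not a description of any existing archiving package.  The function names, the metadata field names ("journal", "volume", "year"), and the wording of the label are all assumptions of mine.

```python
def format_journal_banner(metadata):
    """Build a one-line, human-readable label from a metadata record.

    The field names used here are hypothetical; a real repository
    would map them from whatever metadata schema it actually uses.
    """
    journal = metadata.get("journal")
    if not journal:
        return None  # no journal recorded, so there is nothing to show
    parts = ["Published in " + journal]
    if metadata.get("volume"):
        parts.append("vol. " + str(metadata["volume"]))
    if metadata.get("year"):
        parts.append(str(metadata["year"]))
    return ", ".join(parts) + "."


def prepend_banner(text, metadata, confirmed=True):
    """Return the text with the journal label as its first visible line.

    `confirmed` stands in for the author clicking "yes, do that".
    The text is returned unchanged if the author declines, if no
    journal is recorded, or if the label is already present.
    """
    banner = format_journal_banner(metadata)
    if banner is None or not confirmed:
        return text
    if text.startswith(banner):
        return text  # already labeled; avoid adding it twice
    return banner + "\n\n" + text
```

The design choice worth noting is that the label is visible text generated from the metadata, so the machine-readable and human-readable identifications cannot drift apart, and the author's only added effort is one click.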

As long as authors are adding citations, should they add anything else?  Here we risk enlarging the task until authors decide they can't be bothered.  But if we're thinking about what could be useful, there are at least two possibilities.

(1) A version number.  This would take us even further toward solving the version-control problem.  If there were a convention for identifying postprints that have been peer-reviewed but not copy-edited, postprints that have been both peer-reviewed and copy-edited, and postprints that have been revised or corrected since publication, then version numbers would be even more useful.  There is much potential utility here to offset the negligible cost in effort.  But for now this is potential ahead of the demand, or at least ahead of the convention.

(2) A link to the publisher's edition.  On the plus side, this would encourage even more publishers to support postprint archiving and do more to solve the version-control problem.  On the minus side, it does nothing to boost author credibility or readership, it may suggest that the OA edition is inferior to the publisher edition, and it adds another step to the self-archiving process.  It's not clear how these considerations net out.  But I can observe that linking to the toll-access publisher copy doesn't help authors, and that complicating the self-archiving process without compensation for authors is not a good recipe for promoting OA.  Publishers who want to make this step as easy for self-archiving authors as possible might want to make sure that authors have the URL or, even better, the published file with the URL already embedded.  But since authors want to self-archive their postprints immediately upon publication, journals would have to provide these without an embargo.

* Postscript.  I've been talking about postprints because they already have publishers.  But the same argument applies to preprints that have already been accepted for publication.  Think about the difference between your response to an unlabeled preprint and one that says it's forthcoming from the Journal of Yada Yada.


Trojan horse eprints

Some publishers worry that self-archiving will create copies whose download counts they can no longer monitor.  We could increase publisher support for self-archiving, or reduce publisher opposition, if we could solve this problem. 

Unfortunately, the problem may be intrinsic to OA. 

Or at least the only solution on the horizon is a bad one.

A Canadian company called Remote Approach is working on executable scripts embedded in PDF files that will report back to their creators whenever the files are opened, even after they have been copied and redistributed.  That would help publishers keep accurate traffic data, whether the copies in circulation were authorized or unauthorized.  You can tell what demand Remote Approach is trying to meet.

Even though this technology would likely increase publisher support for postprint archiving, I am very suspicious of executable scripts in PDF files.  The problem is not just preserving reader privacy.  If that were all, we might be able to ensure that the scripts only collected anonymized traffic data.  The deeper problem is that once we allow scripts in text files for benign purposes, it will be very hard to block, let alone detect, scripts for malign purposes.  Malign scripts could subvert fair-use rights and open access.

The Associated Press reported on March 31 that "Remote Approach is also working on a feature that would let a company block a document from being read if there's no Internet connection."

Imagine downloading a copy of a self-archived PDF to your personal hard drive on Monday.  On Tuesday, when you want to read it offline, you discover that it is unreadable.  The publisher had remotely set it to deactivate when taken offline. 

Imagine a new and "improved" script that can deactivate a file even for online reading.  Imagine self-archiving such a PDF file on Wednesday.  On Thursday, the publisher remotely deactivates it so that nobody can read it, even though it is still online.

As soon as publishers can remotely disable PDFs so that users can't read them offline or from certain addresses online, PDFs will be unsuitable for disseminating science and scholarship, especially in OA repositories. They won't be suitable again until we have trustworthy tools for scrubbing them clean of the remote activation code.

As soon as Remote Approach delivers the remote-deactivation scripts, authors will have to make a choice.  The publisher's PDF is normally the preferred edition for self-archiving.  Should authors archive the publisher's PDF, if they have permission to do so?  Or should they reduce risk and archive a different format that cannot contain mischievous codes? 

For me, the answer is clear.  Even when I'd like to archive the published edition, I would not knowingly archive any article in a package that could render it useless to me and my readers.  If the purpose of self-archiving is to regain control of scholarly communication and provide open access, then it's perverse to archive a version that puts the access decision back in the hands of a publisher who might choose to turn it off.

Unfortunately, it wouldn't be enough for publishers to disavow the Remote Approach technology.  For true peace of mind, we need to know the state of a file, not the state of a publisher's scripting policy.  For this, we'll need tools to scrub files clean.  It's possible that scrubbing utilities could distinguish good scripts from bad, but I doubt it.  To be assured of safety we may have to scrub out all executable scripts.
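As a rough illustration of what the first step of such a tool might look like, here is a minimal sketch that checks the state of a file by searching its raw bytes for the PDF dictionary keys that normally introduce scripts or automatic actions (/JavaScript, /JS, /OpenAction, /AA, /Launch).  It is a detector, not a scrubber, and it rests on a loud assumption:  that the keys appear in plain, uncompressed form.  Keys hidden inside compressed object streams or written with hex escapes would slip past it, so a clean result here is not proof of a clean file.  The function name and the list of keys to check are my own choices.

```python
import re

# PDF name objects that commonly introduce scripts or automatic actions.
# This list is an assumption of the sketch, not an exhaustive inventory.
SUSPECT_NAMES = (b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch")


def find_script_markers(pdf_bytes):
    """Count occurrences of suspect names in the raw bytes of a PDF.

    Returns a dict mapping each name found to its count.  An empty
    dict means none of the names appeared in plain form.
    """
    found = {}
    for name in SUSPECT_NAMES:
        # Require that the name is not followed by another letter or
        # digit, so that a longer name such as /AAPL is not counted as /AA.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        count = len(re.findall(pattern, pdf_bytes))
        if count:
            found[name.decode("ascii")] = count
    return found
```

A repository could run a check like this at deposit time, for example on the bytes returned by open(path, "rb").read(), and warn the author when anything is found.  Actually removing the scripts, and doing so without damaging the document, is the harder problem and is not attempted here.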

Today most publishers that allow postprint archiving forbid authors to archive the published PDFs.  But a significant minority, including the New England Journal of Medicine and California Law Review, forbid authors to archive anything else.  When PDFs can contain malign scripts, publishers in the latter category will be in a hard spot.  Users won't know whether the publishers have a hidden agenda for pushing the PDFs. 

Publishers may think that using Trojan Horse PDFs will deter self-archiving, an outcome that many would welcome.  But in practice it will only deter authors from archiving the PDF edition, aggravating the version-control problem, an outcome that most publishers would regret.  Publishers who see this far ahead may start to shun the PDF format, especially if they cannot assure users that clean files are really clean.

Scholarly publishing is a small part of the overall publishing industry, and it's probably a small part of Adobe's PDF business.  But if other users object as much as scholarly users to the prospect of malign scripts in PDFs, then this prospect could kill the format.  Adobe can avert this risk by giving users an effective OFF switch.

Associated Press, Company develops system to track PDF documents, March 31, 2005

Joe Brockmeier, Unexpected features in Acrobat 7  LWN.net, March 30, 2005.

Robyn Weisman, Remote Approach Launches PDF Tracking Service, PDF Zone, March 15, 2005.

Remote Approach (the company)


Top stories from April 2005

This is a selection of open-access developments since the last issue of the newsletter, taken from the Open Access News blog, which I write with other contributors and update daily.  I give both the item URL and blog posting URL so that you can read the original story as well as what I or another blog contributor had to say about it.  For other developments, the blog archive is browseable and searchable.

Here are the top stories from April:

* Two OA leaders move on.
* More private-sector companies worry about "competition" from government OA.
* More universities adopt resolutions for open access or against high journal prices.
* Discussion widens for OA in the humanities.
* The family of eprint repository software grows.

* Two OA leaders move on.

On April 20, Jan Velterop announced his departure as publisher of BioMed Central.  The next day, Rick Johnson announced his resignation as director of SPARC (effective July 1).  The two decisions are independent, of course, and it's just a coincidence --even if an eerie coincidence-- that one came on the heels of the other.  In response to a question that I've heard more than once already, I don't interpret these two decisions as evidence of disenchantment with OA or the effort to promote it.  To me they are signs that OA is maturing. OA has been around long enough that its leaders can start to step down.  More, they can start to step down not in frustration but after a string of significant successes.

I've worked with Jan and Rick since 2001, when we were on the team (with a dozen others) that drafted the Budapest Open Access Initiative.  I'm personally grateful for their leadership, vision, and energy.  Jan is unmatched in his promotion of the cause of OA journals, and Rick is unmatched in his coalition building and strategic thinking.  But if I dwell on that, I sound like I'm writing an obituary for their OA labors, which would be very premature.  Both plan to continue their work for OA in new and different ways.  I wish them the best.

Jan will be replaced at BMC by Matthew Cockerill and Anne Greenwood.  Rick will be replaced at SPARC by Heather Joseph.

Press release on Jan Velterop's resignation

Press release on Rick Johnson's resignation

(SPARC will continue to publish this newsletter.)

* More private-sector companies worry about "competition" from government OA.

Senator Rick Santorum (R-PA) has introduced a bill in the U.S. Senate which would largely eliminate open access to the publicly-funded meteorological data collected and disseminated by the National Weather Service.  Santorum doesn't want the government to "compete" with private industries.  His rationale suggests that private weather forecasting companies collect the same data at their own expense and should have a fair chance to sell it to consumers.  But in fact the private forecasting companies make use of publicly-funded data and want to become its exclusive distributors.

The National Weather Services Duties Act of 2005 (S.786)

Robert P. King, Feds' weather information could go dark, Palm Beach Post, April 21, 2005.

OA government geo data may be removed from web  
James Fallows, An Update on Stuff That's Cool (Like Google's Photo Maps), New York Times, April 17, 2005

The Carpetbagger Report notes that AccuWeather, one of the private, for-profit weather forecasters actively lobbying for the bill, is based in Santorum's state of Pennsylvania. Ed Bott has discovered that the family of Joel Myers, founder and President of AccuWeather, has donated money to Sen. Santorum.

The EFF has posted a web form for U.S. citizens to ask their Senators to oppose the Santorum bill.

Just as Santorum was introducing his bill in the Senate, the U.S. National Science Board was calling for public comments on its recommendation that the U.S. government "encourage free and open access wherever feasible" to publicly-funded data.  (The deadline for public comments was yesterday.)

And of course in January 2004, the U.S. government signed the OECD Declaration on Access to Research Data From Public Funding.

While the Santorum bill was stirring controversy, the CAS (formerly, Chemical Abstracts Service) was complaining about unfair competition from PubChem, an OA service from the NIH.  Unlike the weather data case, here there is a serious disagreement about how much duplication there is between the two services and whether they are complementary or  competitive. 

Associated Press, Company says free government information threatens its business, The State, April 15, 2005. 

Susan Morrisey, Database Debate, Chemical and Engineering News, April 25, 2005.  On the CAS complaint against PubChem.

Don't underestimate corporate complaints about unfair competition from publicly-funded OA.  Publishers used this argument effectively to kill PubScience in 2002.

* More universities adopt resolutions for open access or against high journal prices.

The last big wave of university resolutions in support of OA or in opposition to high journal prices came in the fall and winter of 2003-2004.  A new wave started this spring.  Some of these resolutions were adopted in March but only publicized in April.  In chronological order by date of adoption:

University of North Carolina, March 4, 2005.  Two resolutions.

Article about the UNC resolutions.

University of California, March 10, 2005.

University of Kansas, March 10, 2005.

In addition to its resolution, Kansas also signed the Registry of Institutional OA Self-Archiving Policies, becoming the first U.S. institution and the first AAU institution to do so.

Columbia University, April 1, 2005.

Article about the Columbia resolution.

See my list of university resolutions, including quoted excerpts.

* Discussion widens for OA in the humanities.

Roy Rosenzweig, Should Historical Scholarship Be Free? Perspectives, April 2005.  An exemplary argument for OA in history. I wish every discipline had a high-profile essay of this cogency to kick the ball forward.

Alun Salt, I'm a hypocrite (of sorts), April 25, 2005. A blog posting on OA publishing in archaeology.

Alun Salt, Isn't ArXiv Wonderful?  April 10, 2005.  A blog posting on the need for an arXiv in archaeology.

Antonella D'Ascoli, Open Access Archaeology, Journal of Intercultural and Interdisciplinary Archaeology, March 30, 2005.  Although the title is in English, the article is in Italian.

For background, see my 2004 essay, Promoting Open Access in the Humanities

* The family of eprint repository software grows.

The well-populated field of eprint repository software will soon have two new members: 

Symposia, from Innovative Interfaces (announced but still forthcoming)

teiPublisher, from the University of Maryland Libraries (in beta)

There were several other developments in this field as well:

Two of the largest and most-used eprint repositories, ADS and arXiv, announced a partnership to offer a joint, customizable current-awareness service.

METALIS is a new search engine from AePIC for OAI-compliant repositories in library and information science.

My Meta Maker by Thomas Severiens is a tool for creating harvestable metadata for articles not on deposit in OAI-compliant archives.  (It's not new but it came up in an April AmSci OA Forum discussion.)

DSpace released version 1.3 alpha.

Marion Prudlo wrote a review of three archiving tools (LOCKSS, Eprints, and DSpace) for the April 2005 issue of Ariadne.


Coming up later this month

Here are some important OA-related events coming up in May.

* May 10.  The Dutch DARE program will launch its Cream of Science (Keurderwetenschap) project, a national web site showcasing the best recent Dutch scholarship.  Most but not all of it will be free online through the DARE network of OA repositories.

* Mid-May.  The Research Councils UK will release their policy on open access.

* Sometime in May, Science.gov 3.0 is supposed to launch.  It will support Boolean searches and stored searches with current awareness.

* Notable conferences this month

Open Waters - Open Sources: 11th Biennial Conference of the European Association of Aquatic Sciences Libraries and Information Centres (OA is among the topics)
Split, Croatia, May 4-6, 2005

Making the strategic case for institutional repositories (sponsored by CNI, JISC, and SURF) (by invitation only)
Amsterdam, May 10-11, 2005

Set research free: The open access publishing movement
Seattle, May 11, 2005

Digital Repositories: Interoperability and Common Services (sponsored by DELOS) (OA is among the topics)
Heraklion, Crete, May 11-13, 2005

Everything you always wanted to know about e-journals but were afraid to ask... (sponsored by the UK Serials Group)
Sheffield, England, May 12, 2005

Open Access Conference
Braga, Portugal (University of Minho), May 12-13, 2005

Open Access und rechtliche Rahmenbedingungen
Göttingen, May 13, 2005

Fedora Users' Conference
New Brunswick, New Jersey, May 13-14, 2005

Globalization of Information (XIth IAALD World Congress)
Lexington, Kentucky, May 14-20, 2005
--There are several preconference programs, all on May 14-15. One will focus on OA, one on OAI, and one on OAIS.

UNESCO Between Two Phases of the World Summit on the Information Society (OA is among the topics)
St. Petersburg, May 17-19, 2005

Authors' "Copy Rights" and Open Access Publishing (a public lecture by Mary Jackson)
Philadelphia (Drexel University), May 19, 2005

NASIG 20th Annual Conference (OA is among the topics)
Portland, Oregon, May 19-20, 2005

Scholarly Publishing and Open Access: Payers and Players (sponsored by the Library Association of the City University of New York)
New York, May 20, 2005

Council of Science Editors Annual Meeting
Atlanta, May 20-24, 2005
--Task Force on Science Journals, Poverty, and Human Development, a session on Monday, May 23, 11:30 a.m.-1:00 p.m. and 5:00-7:00 p.m. (OA is among the topics.)

Wissenschaftliches Publizieren der Zukunft - Open Access (sponsored by DINI and SPARC)
Göttingen, May 23-24, 2005

XML, the Web and Beyond
Amsterdam, May 24-27, 2005
--One track of the conference will be on "Open Data"

Commons-sense: Towards an African Digital Commons
Johannesburg, May 25-27, 2005

AARHUS Convention: Convention on Access to Information, Public Participation in Decision-making and Access to Justice in Environmental Matters
Almaty, Kazakhstan, May 25-27, 2005

Free Software, Free Society (OA is among the topics)
Thiruvananthapuram, Kerala, India, May 28-30, 2005

Information and Innovation (2005 annual conference of the International Association of Technological University Libraries) (OA is among the topics)
Quebec, May 29 - June 3, 2005

* Other OA-related conferences



* I've added 19 new conferences to the conference page since the last issue.  In the next few days I'll delete the second asterisk marking them and the new entries will blend into the rest of the collection.

* Charlie Lowe at Cyberdash found that neither the Atom feed nor the Feedburner RSS feed for Open Access News worked for his Drupal news aggregator. So he commissioned a plain-vanilla RSS 2.0 feed for OAN from Feedburner and has agreed to make it public.

Lowe's new RSS 2.0 feed URL for Open Access News (no need to change if one of the existing feeds works for you)

Lowe's blog posting about it

* In mid-April I had more continuous downtime with Open Access News than I've ever had before.  For nearly three days I was unable to upload new postings and sometimes unable to finish uploading half-loaded pages.  The result was that the blog was often frozen and sometimes both frozen and fragmented.  The problems seemed to arise from Blogger, although I never got an authoritative diagnosis.

I thank the many friends who offered to help me move OAN to another site or another platform.  So far I haven't had to take these steps.

If I encounter similar problems again, I'll use the SPARC Open Access Forum (SOAF) to tell you about them.  If the problems persist, I'll use SOAF to post news items until the blog is fixed.  You don't have to subscribe to SOAF to read it when you suspect the blog is down.

I can solve most Blogger and RSS problems myself.  But when I can't, I'd appreciate having an expert or two whom I could consult.  If you're willing to be on my emergency, pro bono help list, for either Blogger or RSS, please let me know.  Thanks.


This is the SPARC Open Access Newsletter (ISSN 1546-7821), written by Peter Suber and published by SPARC.  The views I express in this newsletter are my own and do not necessarily reflect those of SPARC.

To unsubscribe, send any message to <SPARC-OANews-off@arl.org>.

Please feel free to forward any issue of the newsletter to interested colleagues.  If you are reading a forwarded copy of this issue, see the instructions for subscribing at either of the first two sites below.

SPARC home page for the Open Access Newsletter and Open Access Forum

Peter Suber's page of related information, including the newsletter editorial position

Newsletter, archived back issues

Forum, archived postings

Conferences Related to the Open Access Movement

Timeline of the Open Access Movement

Open Access Overview

Open Access News blog

Peter Suber

SOAN is an open-access publication under the terms of the Creative Commons Attribution License.  Users may freely copy, distribute, and display its contents, but must give credit to the author.  To read the full license, visit
