Open Access News

News from the open access movement

Saturday, January 05, 2008

Misleading questionnaire on scholarly publishing

The American Society for Information Science and Technology (ASIS&T) has released a Scholarly Publishing Questionnaire.  It includes several questions about OA.  (Thanks to Robin Peek.)

One question asks whether you believe that OA journals are "radical". 

Another asks whether OA entails that "archiving will suffer".

Another asks whether you'd like to "personally deal with any permission requests" and "deal personally with any legal disputes when copyright is infringed", leaving the impression that OA would impose these burdens.  Of course, OA makes permission requests unnecessary and permits rather than restricts most acts that normally count as copyright infringement, such as copying and redistribution.

Granted, these are questions, not assertions.  But they're loaded questions with misleading assumptions. 

There's one more reason to distrust the results:  The questionnaire doesn't ask for an email address or assign a validation code.  There's apparently nothing to stop anyone from filling it out more than once.

Intro to open data

Peter Murray-Rust, Open Data In Science, a preprint forthcoming from Serials Review.  Self-archived January 5, 2008.

Abstract:   Open Data (OD) is an emerging term in the process of defining how scientific data may be published and re-used without price or permission barriers. Scientists generally see published data as belonging to the scientific community, but many publishers claim copyright over data and will not allow its re-use without permission. This is a major impediment to the progress of scholarship in the digital age. This article reviews the need for Open Data, shows examples of why Open Data are valuable and summarises some early initiatives in formalising the right of access to and re-use of scientific data.

Also see Peter's blog post about this article.

155 years of cell science now OA

The Journal of Cell Science has provided free online access to its entire 155-year backfile.  (Thanks to Black Knight via McBlawg.)  JCS is published by the Company of Biologists.

STM response to the OA mandate at NIH

STM comments on U. S. National Institutes of Health Unfunded Mandate, a press release from the International Association of Scientific, Technical & Medical Publishers (STM), January 4, 2008.  Here it is in its entirety:

STM today expressed disappointment with the recent passage of legislation in the United States. This legislation (the Consolidated Appropriations Act of 2007 (H.R. 2764)) includes provisions directing the National Institutes of Health to mandate that investigators who are supported by grants from the National Institutes of Health must deposit their manuscripts directly into the National Library of Medicine's PubMed Central database no later than 12 months after the official date of publication.

The legislation neither provides compensation for the added-value of services that these manuscripts have received from publishers nor does it earmark funds to ensure the economic sustainability of the broad and systematic archiving this sort of project requires. It also undermines a key intellectual property right known as copyright - long a cornerstone used to foster creativity and innovation.

STM believes that this legislation establishes an unfunded government mandate with an unknown impact on the advancement of science and puts at risk a system which has enabled more research to be available to more scientists in more countries than at any point in the history of science.

STM CEO Michael Mabe commented, "Other governmental bodies, such as the European Commission, have recognized the unique role and extensive investments made by scientific publishers in the organization of peer review, the management of publication processes, the production, access, distribution, preservation and digitization of scientific knowledge. They have called for an evidence-based approach toward questions like the broad and systematic archiving of scientific manuscripts to ensure that the current system of scientific publishing is not destabilized without reason. Regrettably, neither the acknowledgement of the key role that publishers play in the advancement of science, nor the commitment toward an evidence-based approach, nor the funding to support this broad mandate seems present in the current U.S. legislation."

Mabe continued: "STM publishers will, of course, comply with the laws of the nations in which they operate. At the same time, in order to fulfill their primary mission of maximizing the dissemination of knowledge through economically self-sustaining business models, they will continue a vigorous engagement with appropriate stakeholders on issues such as this where legislative change seems necessary or desirable."


  • The STM paraphrase of the new legislation is inaccurate on one point.  The new law doesn't require deposit of manuscripts within 12 months of publication; it requires deposit immediately upon acceptance and free online release within 12 months of publication.
  • The claim that the NIH mandate "undermines...copyright" is unargued.  It also flies in the face of the facts.  The STM (like the AAP/PSP in its response) overlooks a key clause in the new legislation:  "Provided, That the NIH shall implement the public access policy in a manner consistent with copyright law."  As I pointed out in my comment yesterday on the AAP/PSP response, "It's simply mistaken to say that the OA policy demanded by Congress requires violation of copyright.  It requires compliance with copyright."
  • "STM believes that this legislation establishes an unfunded government mandate...."  It's true that the new legislation gives the NIH no new money specifically to implement this policy, although it does increase the NIH budget by 0.5% to $29.2 billion.  The NIH has already estimated to Congress that implementing the new policy, at 100% compliance from grantees, will cost $3.5 million/year.  That comes to about 0.01% of the NIH budget.  I call that a bargain, especially in light of the ways in which OA multiplies the value of valuable research.  Studies by John Houghton and others have shown that diverting a bit from the research budget in order to make all funded research OA significantly amplifies the return on investment: "With the United Kingdom's GERD [Gross Expenditure on Research and Development] at USD 33.7 billion and assuming social returns to R&D of 50%, a 5% increase in access and efficiency [their conservative estimate] would have been worth USD 1.7 billion; and...With the United States' GERD at USD 312.5 billion and assuming social returns to R&D of 50%, a 5% increase in access and efficiency would have been worth USD 16 billion."
  • BTW, the $3.5 million/year that the NIH will spend to implement the OA policy is dwarfed by the $30 million/year it spends on page charges and other subsidies to toll-access journals.  The $30 million now pocketed by publishers like those represented by STM is also "unfunded".
  • "Other governmental bodies, such as the European Commission, have recognized the unique role and extensive investments made by scientific publishers in the organization of peer review...."  This is true.  It's also true that other governmental bodies have already acted to mandate OA to publicly-funded research.  There are OA mandates now in place at public funding agencies in Australia, Austria, Belgium, Canada, France, Germany, Scotland, Switzerland, and the UK.  In many other countries there are policies in place that encourage without requiring OA to publicly-funded research.
  • "[Other governments] have called for an evidence-based approach toward questions like the broad and systematic archiving of scientific manuscripts to ensure that the current system of scientific publishing is not destabilized without reason."  The most extensive government investigation into the costs and benefits of OA was undertaken by the UK House of Commons Select Committee on Science and Technology, and resulted (July 2004) in the strongest OA recommendations to date from any government panel, including a recommendation for an OA mandate to publicly-funded research.  See the voluminous oral and written evidence collected by the committee.  When publishers take on the question, they typically overlook the evidence that high-volume OA archiving in physics has not caused journal cancellations in the 15+ year life of arXiv, the evidence that high journal prices cause many more cancellations than OA archiving, and the evidence that libraries will have many incentives to continue their subscriptions even after funding agencies adopt strong OA policies.  They also tend to make evidence-free statements about the dire consequences of OA for peer review and copyright --just as the STM is doing here.  (For more detail on the evidence, including the evidence of publisher disregard for the evidence, see my article from September 2007.)
  • "STM publishers will, of course, comply with the laws of the nations in which they operate."  I commend this statement, but it shows another misunderstanding of the new law.  The NIH policy will regulate NIH grantees, not publishers.  The question for publishers is not whether they will comply with a law addressing other actors, but whether they will continue to publish work by NIH-funded researchers, knowing that those researchers will be bound by a prior funding agreement to deposit their peer-reviewed manuscripts in PubMed Central for eventual OA dissemination.
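Since the budget figures in the bullets above are simple ratios, a quick back-of-the-envelope check is easy to sketch.  The dollar amounts below are the ones quoted in the post, not independently verified:

```python
# Sanity-check the budget arithmetic quoted above.
# Figures come straight from the post; treat them as approximations.

nih_budget = 29.2e9      # FY2008 NIH budget, USD
policy_cost = 3.5e6      # NIH's own estimate of annual OA-policy cost, USD
page_charges = 30e6      # NIH's annual page charges and other journal subsidies, USD

share = policy_cost / nih_budget
print(f"OA policy cost as share of NIH budget: {share:.3%}")   # about 0.012%

ratio = page_charges / policy_cost
print(f"Journal subsidies vs. OA policy cost: {ratio:.1f}x")   # about 8.6x
```

In other words, the mandate's estimated cost is on the order of one ten-thousandth of the agency's budget, and roughly an eighth of what the NIH already spends subsidizing toll-access journals.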

Journals should make it easier to learn their access policies

Peter Murray-Rust, Why getting information from publishers is soul-destroying, A Scientist and the Web, January 5, 2008.  Peter starts from Bill Hooker's post on the AAP/PSP (blogged here just below), and then expands upon a more general problem:

...I know exactly what Bill has gone through because I’ve done a lot of this myself. It might seem simple to find information from publishers. It’s not. My generalisations below extend a little into Open Access publishers, but it’s mainly aimed at Closed Access publishers.

A little while ago I thought it would be useful to see what degree of compliance Open Access publishers of chemistry had with the BBB declarations.  Should be easy - there’s only about 60 titles listed. So I mailed the Blue Obelisk and the Open Knowledge Foundation and suggested that if we divided the work - each took a few publishers - we could do this in a relatively short time. And maybe publish it.

Oh dear. The publisher websites were awful. It’s practically impossible to find out anything from most publishers (of any ROMEO/HARNAD colour). It’s spread over several pages, perhaps for authors, perhaps general blurb, wherever (and this is true for all publishers). We created a spreadsheet of what we wanted to record but found that the practice was so variable that we couldn’t systematize it.

So we gave up. The effort of finding out policies, even for Open Access publishers was too great. (But, closed access publishers, do not feel this is a defeat - we shall return).

The thing that really upsets me about closed access publishers is how profoundly unhelpful they are. They don’t want to communicate with the general authorship and readership. Each thinks it’s the centre of the world. Despite their acclaimed publisher organisations (AAP, ALPSP, STM, etc.) it is one of the technically worst industries I have encountered. There are no standards. No attempt to adjust to the modern world (I shall revisit this later). Here are some examples:

  • Many don’t reply to courteous requests for information. I admit that this blog is sometimes a bit brusque, but it’s come that way because of the unhelpfulness of publishers. Every publisher should have a page on which it lists its policies. And there should be open forums for discussion on these policies. Some repository managers spend large amounts of time trying to work out whether articles can be put in a repository - and I guess the publisher gets asked frequently. Wouldn’t it be easy to add a label to each journal saying whether manuscripts can be put in a repository? I suppose not, it would require agreement across the industry.
  • They work on Jurassic timescales. In the modern age people expect replies by return. It’s taken months to get answers for my latest manuscript - and I’m an author. The ACS is taking a minimum of FIVE MONTHS to respond to Antony Williams’ courteous request as to the copyright position of our abstracting of factual data.
  • Requests, discussion, etc are all fragmented. I suspect the same questions get asked again and again. If these were listed on a policy FAQ as they were asked and answered it would save everybody’s time....
  • The technical business model is slow to adjust to changing demands. So when publishers adopted their “hybrid” policy (a different one for each publisher of course) they generally failed to tell the technical department that they needed to adjust their labelling and their policies and permissions for individual articles. With the result that I spent a number of gloomy days on this blog pointing out to publishers how little effort they had put into this....

But the really sad thing is that publishing (unlike making toothpaste, or bicycles) is based on communication....

Comment.  Hear, hear.  Three years ago I wrote an article entitled, Journals:  please post your access policies.  Here's a snippet:

Journals should post the details of their current access policies on their web sites.  Today some do and some don't.  Some are thorough and some are skimpy.  Some are current and some are way out of date.  Because policies differ from journal to journal, and sometimes from issue to issue of the same journal, potential readers, authors, and subscribers are more confused than they --we-- have to be.  We shouldn't have to undertake a research project, make phone calls, send off emails, or conduct listserv colloquies (themselves confusing and inconclusive) just to learn these basic facts of life in the digital age....

Most journals do a very good job spelling out the details of their submission policies online.  Self-interest requires it.  The time has come to do the same with their access policies.  If assisting potential readers, authors, and subscribers counts as self-interest, then self-interest requires this step as well....

Nearly all [the] benefits would be even greater if journals would post their policy details to a central database, or post them on their own web sites with standardized terminology or tags.  Detail-harvesting, searching, and comparison could then be automated.  But for now this is too much to ask.  At least journals should put their policies on their own sites in their own words and keep them up to date....

Does the AAP/PSP criticism of the NIH policy reflect its members?

Bill Hooker, Does the AAP/PSP really represent its members? Open Reading Frame, January 4, 2008.  Excerpt:

The PSP lists its members here; it didn't take long to compare that list with the list of publishers indexed by SHERPA/RoMEO. Of the 355 publishers in the RoMEO database, 46 are members of PSP; of these, 16 are listed as "grey" (won't allow archiving), 23 are "green" (allow refereed postprint archiving -- NIH mandate compliant) and 7 "pale green" (allow preprint archiving; many "pale green" publishers actually allow postprint archiving and are NIH compliant, but are not listed as green because of various restrictions).

It's not possible to do what I wanted here -- which was to answer the title question. The problem is that the PSP lists 102 members that aren't indexed by RoMEO. I found that somewhat surprising, particularly since the list includes names I'd have expected to find in RoMEO: FASEB, Stanford U Press, Yale U Press, Cold Spring Harbor Lab Press, NEJM, Highwire Press and others.

Nonetheless, we can say that if the RoMEO-indexed sample (46 of 148, 31%) is representative, then at least 50% of PSP members are already complying with the NIH mandate, and a further 15% at least allow preprint archiving and may even be NIH-compliant.

It's even more unbalanced if we compare the numbers of journals published by each company. Those 46 publishers account for 5901 journals; the grey publishers put out 222 (4%), the green publishers 4243 (72%) and the pale green publishers 1436 (24%).

If the PSP were honest and interested in fairly representing its members, I'd think they would find out (and make public) whether the remaining, non-RoMEO indexed members follow the same pattern. I won't hold my breath....

Bill adds this clarification in a subsequent post:

The publisher list I've been using in the last few posts actually comes from EPrints, using information from SHERPA/RoMEO....This wouldn't cause any confusion and I wouldn't bother to point it out, except that RoMEO actually uses a four-colour scheme (green, blue, yellow, white) which EPrints has squished into three (green, pale green, grey).

Comment.  Bill asks a good question and makes a good start toward an answer.  To summarize his data:  Counting AAP/PSP members as publishers, about half already have policies in place compatible with the coming NIH mandate.  Counting AAP/PSP members according to the journals they publish, the vast majority (roughly 72%) already have such policies in place.  When the AAP/PSP Executive Council launched PRISM in August 2007, it tried to give the impression that the new organization was a coalition and represented the AAP/PSP membership.  But PRISM never publicly identified any members of the coalition, and nine major publishers soon disavowed or distanced themselves from it.  Two members of the AAP/PSP Executive Council even resigned in protest:  James Jordan, director of Columbia University Press, and Ellen Faran, director of the MIT Press.  Has AAP/PSP ever consulted its members about its position on the NIH policy?  Are AAP/PSP members willing to see their dues spent on a lawsuit to delay it? 
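The percentages in this summary can be recomputed from Bill's raw counts.  Here is a small sketch (the category counts are the ones quoted in his post):

```python
# Recompute Bill Hooker's percentages from the counts quoted above.

publishers = {"grey": 16, "green": 23, "pale green": 7}        # RoMEO-indexed PSP members
journals   = {"grey": 222, "green": 4243, "pale green": 1436}  # journals they publish

total_pubs = sum(publishers.values())   # 46
total_jnls = sum(journals.values())     # 5901

for colour in publishers:
    print(f"{colour:>10}: {publishers[colour] / total_pubs:.0%} of publishers, "
          f"{journals[colour] / total_jnls:.0%} of journals")
# green comes out at 50% of publishers and 72% of journals, matching the summary
```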

Update (1/5/08). Bill Hooker is drafting a letter to the green publishers who belong to the AAP/PSP, asking whether the association's position on the NIH policy reflects their views and whether it consulted them before making its position public. Let him know if you are willing to sign his letter (by email or by commenting on his blog). I am.

Making 2008 the year of open data

2008 — year of open data, Open Data Commons, January 3, 2008.  Excerpt:

2008 is looking like it will be the year of open data. With the release of the Science Commons protocol, the announcement of CCZero, and of course our project, it looks like there will be quite a few options on the table for licensing data in an open way this year. This is after a long time where there were no good options for those looking at licensing data.

Hopefully we will soon release the draft Public Domain Dedication & Licence for use and then we can start getting some feedback from projects making use of the licence and their experiences. With some early adopters, we can quickly start to see some of the benefits of the public domain approach, and maybe some variations on the Community Norms (you are after all free to roll your own)....

Friday, January 04, 2008

Florida museum to make marine taxonomy data OA

Florida Museum receives $186,000 for DNA bar-coding project, a press release from the University of Florida, January 4, 2008.  (Thanks to Gavin Baker.)  Excerpt:

The Florida Museum of Natural History received $186,000 from the Alfred P. Sloan Foundation Tuesday to identify and prepare 25,000 marine specimens as part of a new international DNA barcoding project....

Florida Museum Malacology Curator Gustav Paulay expects the project to eventually yield public, online databases for species identification that also will create evolutionary tree diagrams with the click of a button.

“The point of this is to make the taxonomic information as available as possible,” said Paulay, a world-renowned coral reef expert and co-principal investigator on the two-year project known as the Marine Barcode of Life....

US National Academy of Sciences defends evolution in OA book

I no longer blog individual OA books, since there are now so many.  But I'll make an exception for this exceptionally important one: 

Science, Evolution, and Creationism, by a special committee of the US National Academy of Sciences and the Institute of Medicine, National Academies Press, January 3, 2008.  From the blurb:

How did life evolve on Earth? The answer to this question can help us understand our past and prepare for our future. Although evolution provides credible and reliable answers, polls show that many people turn away from science, seeking other explanations with which they are more comfortable.

In the book, Science, Evolution, and Creationism, a group of experts assembled by the National Academy of Sciences and the Institute of Medicine explain the fundamental methods of science, document the overwhelming evidence in support of biological evolution, and evaluate the alternative perspectives offered by advocates of various kinds of creationism, including "intelligent design." The book explores the many fascinating inquiries being pursued that put the science of evolution to work in preventing and treating human disease, developing new agricultural products, and fostering industrial innovations. The book also presents the scientific and legal reasons for not teaching creationist ideas in public school science classes....

One important way in which this book is not exceptional is that the full text is free online.  All monographs from the National Academies Press are published in dual OA/non-OA editions, and have been since March 1994.

Update.  Also see the National Academies' press release, January 3, 2008.  It names the 16 members of the authoring committee and makes this statement:

...Despite the overwhelming evidence supporting evolution, opponents have repeatedly tried to introduce nonscientific views into public school science classes through the teaching of various forms of creationism or intelligent design.  In 2005, a federal judge in Dover, Pennsylvania, concluded that the teaching of intelligent design is unconstitutional because it is based on religious conviction, not science (Kitzmiller et al. v. Dover Area School District).  NAS and IOM strongly maintain that only scientifically based explanations and evidence for the diversity of life should be included in public school science courses.  "Teaching creationist ideas in science class confuses students about what constitutes science and what does not," the committee stated....

OA + translation

Isaac C-H Fung, Open access for the non-English-speaking world: overcoming the language barrier, Emerging Themes in Epidemiology, January 4, 2008.

Abstract:   This editorial highlights the problem of language barrier in scientific communication in spite of the recent success of Open Access Movement. Four options for English-language journals to overcome the language barrier are suggested: 1) abstracts in alternative languages provided by authors, 2) Wiki open translation, 3) international board of translator-editors, and 4) alternative language version of the journal. The Emerging Themes in Epidemiology announces that with immediate effect, it will accept translations of abstracts or full texts by authors as Additional files. Editorial note: In an effort towards overcoming the language barrier in scientific publication, ETE will accept translations of abstracts or the full text of published articles. Each translation should be submitted separately as an Additional File in PDF format. ETE will only peer review English-language versions. Therefore, translations will not be scrutinized in the review-process and the responsibility for accurate translation rests with the authors.

AAP/PSP response to the OA mandate at NIH

Publishers Say Enactment of NIH Mandate on Journal Articles Undermines Intellectual Property Rights Essential to Science Publishing, a press release from the Professional/Scholarly Publishing division of the Association of American Publishers (AAP/PSP), January 3, 2008.  I'm happy to quote the press release at length so that I can respond to it at length:

The Association of American Publishers today criticized a controversial new NIH research publication policy that was enacted as part of the omnibus appropriations package for 2008, and reaffirmed that journal publishers who have opposed the policy will continue to pursue their concerns with Congress regarding the policy’s negative impact on science publishing and the protection of related intellectual property rights. Publishers will also urge NIH to conduct a rulemaking proceeding, with opportunity for public comment, before implementing the new policy.

Allan Adler, AAP’s Vice President for Legal and Government Affairs, said the new policy is “unprecedented and inconsistent with important U.S. laws and policies regarding the conduct of scientific research and the protection of intellectual property rights.”

“These issues were never examined by Congress because the statutory authority for the new policy was enacted as a rider on appropriations legislation, without hearings or studies to assess its merits and without scrutiny from the Congressional committees that have expertise and legislative jurisdiction regarding laws governing federal scientific research programs and intellectual property rights,” Adler added.

Under the previous voluntary NIH policy, NIH-funded researchers who wrote articles for publication in scientific journals were “requested” to submit an electronic version of their final, peer-reviewed manuscripts to NIH immediately upon acceptance by a journal for publication, so that the agency could make it freely available to the international online world through its PubMedCentral web site no more than 12 months after the date of journal publication.

“But,” Adler noted, “changing to a new mandatory policy that will ‘require’ such submission eliminates the concept of permission, and effectively allows the agency to take important publisher property interests without compensation, including the value added to the article by the publishers’ investments in the peer review process and other quality-assurance aspects of journal publication. It undermines publishers’ ability to exercise their copyrights in the published articles, which is the means by which they support their investments in such value-adding operations. The NIH policy also threatens the intellectual freedom of authors, including their choice to seek publication in journals that may refuse to accept proposed articles that would be subject to the new mandate.”

“Journals published in the U.S. have strong markets abroad; indeed, in some fields of research, most sales are to institutions and individuals outside the United States,” Adler said. “A government policy requiring these works to be made freely available for international distribution is inherently incompatible with the maintenance of global markets for these highly successful U.S. exports. Smaller and non-profit scientific societies and their scholarly missions will be particularly at risk as their journal subscribers around the world turn to NIH for free access to the same content for which they would otherwise pay.”

Adler pointed out that Congress took a very different approach to ensuring public access to the results of government-funded scientific research when it reauthorized activities of the National Science Foundation in the “America COMPETES Act” enacted last August. “By addressing the issue through the regular legislative process, Congress not only avoided controversies over intellectual property interests in science publishing, but also recognized the value of publication in peer-reviewed science journals and the increasing availability of journal articles from a variety of sources. Instead of mandating free public access to articles published by private sector journals, Congress instructed the NSF ‘to provide the public a readily accessible summary of the outcomes of NSF-sponsored projects,’ along with ‘citations to journal publications’ in which funded researchers have published articles regarding such research.” (emphasis added)

“In the face of such a recent, relevant and rational precedent,” Adler concluded, “there was simply no sound reason for Congress to subsequently allow an appropriations rider to take an inconsistent and more controversial route toward achieving the same policy goal of enhancing public access to the results of scientific research funded by a federal agency.” ...


  1. In December I predicted a publisher lawsuit to halt or delay the OA mandate at the NIH, and this press release suggests that we'll see one.  It's the public version of a legal brief.  Here's a point by point response.
  2. "Publishers will also urge NIH to conduct a rulemaking proceeding, with opportunity for public comment, before implementing the new policy...."  The NIH released its draft policy for a 60-day period of public comments, ending on November 2, 2004.  It later extended the comment period to end on November 16, 2004.  More than 6,000 comments were submitted, which NIH Director Elias Zerhouni described as "overwhelmingly supportive."  The recent mandate language was subject to amendment on six occasions from June to October 2007.  Senator James Inhofe (R-OK) actually filed two amendments in October, one to weaken the language and one to delete it, but he withdrew them both when he couldn't drum up enough support for them.  The mandate language was subject to amendment again in December, after the Bush veto, when Congress had to cut provisions to make the bill acceptable to the President.
  3. "Changing to a new mandatory policy that will ‘require’ such submission eliminates the concept of permission, and effectively allows the agency to take important publisher property interests without compensation...."  I respond to this objection in the next seven bullets (4-10).
  4. Any discussion of the copyright objection has to start with the fact that the NIH policy doesn't apply to the published articles, on which authors transfer copyright.  It only applies to the authors' peer-reviewed manuscripts.  This is true of the current, voluntary policy and it's true of the new bill strengthening the policy to a mandate.
  5. The NIH mandate might or might not eliminate the concept of publisher permission.  Some funder mandates (like CIHR's) still require publisher permission and some (like the Wellcome Trust's) do not.  I hope the NIH mandate falls into the second category, but it's still too early to say.
  6. One way that the NIH could dispense with publisher permission without violating copyright law is to rely on regulations, already adopted in the US, granting federal agencies a license to disseminate the results of the research they fund.  There are two versions of this license, the first (45 CFR 74.36(a), adopted in 2003) for all the agencies within the Department of Health and Human Services, including the NIH, and the second (2 CFR 215.36(a), adopted in 2005) for all federal agencies. 
  7. There are at least two other ways in which the NIH could dispense with publisher permission without violating copyright.  I described them in the August issue of SOAN:  "(1) it could use the fact that the OA editions will be the author's peer-reviewed but unedited manuscripts, not the published editions on which authors transfer copyright.  (2) It could make the new OA requirement an explicit term in the funding contract and require grantees to make any subsequent copyright transfer agreements with publishers subject to the terms of the prior funding contract."
  8. The AAP/PSP is forgetting that the bill enacted by Congress includes an all-important proviso:  "Provided, That the NIH shall implement the public access policy in a manner consistent with copyright law."  It's simply mistaken to say that the OA policy demanded by Congress requires violation of copyright.  It requires compliance with copyright.  If the AAP/PSP is worried that NIH will disregard this part of the instruction from Congress, then it's simply asserting a tautology:  if the NIH violates the law, then it will violate the law.
  9. The AAP/PSP doesn't mention that it has actively worked to boost compliance with the voluntary policy (in order to head off pressure to convert it to a mandate).  This doesn't show that it welcomes the removal of publisher permission from the policy.  But it does show (1) that it presupposes publisher permission and (2) that it believes compliance will not harm publisher interests. 
  10. More directly, the NIH policy is not a "taking" in the legal sense because it doesn't revise copyright law and relies entirely on contracts.  As SPARC, ARL, and the ALA put it in a September 2007 memorandum, "The [new] legislation concerns contract terms, not copyright exceptions....[T]he copyright owner retains complete control of his work, unless he chooses to accept NIH funding. The proposed provision simply provides that, in exchange for public funding, the investigator must deposit a copy of the articles stemming from that funding with PMC so that it can make it publicly available."
  11. "The NIH policy also threatens the intellectual freedom of authors, including their choice to seek publication in journals that may refuse to accept proposed articles that would be subject to the new mandate."  Again the publishers pretend to speak for authors when it is their own policies to lock up content that harm authors.  If some publishers hate the NIH policy so much that they refuse to publish the work of NIH-funded authors, then author freedom will be limited by the publisher decision, not by the NIH policy, which is compatible with the participation of all publishers.  But in fact, no publishers of biomedical journals will refuse to publish work by NIH-funded researchers; the quality and quantity of that research are too great.  The AAP/PSP argument here amounts to this:  Please let us opt out without suffering the consequences of letting our competitors publish all that first-rate research.
  12. "A government policy requiring these works to be made freely available for international distribution is inherently incompatible with the maintenance of global markets for these highly successful U.S. exports. Smaller and non-profit scientific societies and their scholarly missions will be particularly at risk as their journal subscribers around the world turn to NIH for free access to the same content for which they would otherwise pay."  This is irrelevant to the copyright argument, but it deserves an answer anyway.  Congress understood that conventional publishers depend on subscription revenue, which is why it gave them up to a year to recover their costs before PubMed Central released the peer-reviewed manuscripts to the public.  It's also why Congress only applied the policy to the author manuscripts, not to the published articles.  For the rest, the AAP/PSP is simply repeating the argument that green OA will undermine subscriptions, which I've answered at length in an article from September 2007 (see esp. Sections 4-10).  Finally, the appeal to small, non-profit publishers is deeply misleading.  Caroline Sutton and I have found 427 societies publishing 496 full OA journals, and 19 societies publishing 74 hybrid OA journals (to use our updated numbers).  These are more society publishers than have ever joined the AAP/PSP in opposing OA mandates.
  13. "[A better policy would] recognize[] the value of publication in peer-reviewed science journals and the increasing availability of journal articles from a variety of sources...."  The OA mandate at the NIH will not bypass peer-reviewed journals.  On the contrary, it will only apply to articles already published in peer-reviewed journals.  It will enhance, not diminish, the value of "increasing availability of journal articles from a variety of sources".
  14. "Instead of mandating free public access to articles published by private sector journals, Congress instructed the NSF ‘to provide the public a readily accessible summary of the outcomes of NSF-sponsored projects,’ along with ‘citations to journal publications’....In the face of such a recent, relevant and rational precedent...there was simply no sound reason for Congress to...take an inconsistent and more controversial route toward achieving the same policy goal of enhancing public access to the results of scientific research funded by a federal agency...."  It's true that Congress made a different recommendation in the COMPETES Act.  But why should policy A trump policy B instead of B trumping A?  If the criterion is recency, then the OA mandate at NIH is the most recent result of Congressional deliberation.  If it's rationality, then we have to look at the rationale.  It's not true that "there was simply no sound reason" for Congress to demand an OA mandate from the NIH.  The reasons were overwhelming.  Congress wanted to accelerate research and share knowledge; to give taxpayers (including professional researchers) access to the research for which they have already paid; to increase the return on the government's enormous investment in research; and to remedy the well-documented failure of the voluntary policy.
  15. Finally, since AAP/PSP brings up the subject of taking without compensation, I can add this.  If the AAP/PSP had its way, then it would take something of value from U.S. taxpayers without compensation:  access to the results of research for which they have already paid in three ways, namely, through the NIH research grant, through researcher salaries at public universities, and through subscription fees at public universities. Private-sector scientific publishers have been huge beneficiaries of public investment and the NIH policy is one small step to give the public something to show for that investment.

Update.  See Andrea Gawrylewski's article in The Scientist (free registration required).  Excerpt:

...Adler told The Scientist that it's too early to say whether this mandate will prompt publisher lawsuits, but he "wouldn't rule out the possibility that publishers might seek judicial review." It depends on how the NIH chooses to implement this policy, he added, given that the general language of the mandate does not specify how it will be implemented in light of copyright laws.

Update. Also see the STM response to the NIH mandate, and my comments on it.

More on the OA mandate at the NIH

Stevan Harnad, Optimize the NIH Mandate Now: Deposit Institutionally, Harvest Centrally, Open Access Archivangelism, January 2, 2008.  Excerpt:

The January issue of Peter Suber's SPARC Open Access Newsletter is superb, and I recommend it highly as a historical record of the milestone reached by the OA movement at this pivotal moment. There is no question but that the NIH Green OA self-archiving mandate is the biggest OA development to date, and heralds much more.

There remains, however, an important point that does need to be brought out, because it's not over till we reach 100% OA, because mistakes have been made before, because those mistakes took longer than necessary to correct, and because a big mistake (concerning the locus of the deposit) still continues to be made.
First, a slight correction on the chronometric facts:

Peter Suber wrote:  "If NIH had adopted an OA mandate in 2004 when Congress originally asked it to do so, it would have been the first anywhere. Now it will be the 21st."
Actually, if the NIH OA mandate had been adopted when the House Appropriations Committee originally recommended it in September 2004, it would have been the world's third Green OA self-archiving mandate, not the first. And Congress's recommendation in September 2004 was the second governmental recommendation to mandate Green OA self-archiving: The first had been the UK Parliamentary Select Committee's recommendation in July 2004.

(1) The Southampton ECS departmental mandate was (as far as I know) the very first Green OA self-archiving mandate of all; it was announced in January 2003 (but actually adopted even earlier). QUT's was the second OA mandate, but the first university-wide one, and was announced in February 2004. (See ROARMAP.)

(2) The UK Parliament's Science and Technology Committee Recommendation to mandate Green OA self-archiving was made in session 2003-04 and published in July 2004 (i.e., before September 2004, when the US House Appropriations Committee made its recommendation)....

Now the NIH's mandate has indeed instantly become by far the most important of the Green OA self-archiving mandates to date in virtue of its size and scope alone, but it still hasn't got it right!

The upgrade from a mere request to an Immediate-Deposit/Optional-Access (ID/OA) mandate was indeed an enormous improvement, but there still remains the extremely counterproductive and unnecessary insistence on direct deposit in PubMed Central. This is still a big defect in the NIH mandate, effectively preventing it from strengthening, building upon and complementing direct deposit in Institutional Repositories, and thereby losing the golden (or rather green!) opportunity to scale up to cover all of research output, in all fields, from all institutions, worldwide, rather than just NIH-funded biomedical research: an altogether unnecessary, dysfunctional, self-imposed constraint (in much the same spirit as having requested self-archiving instead of mandating it for the past three lost years)....

[W]ith direct IR deposit mandated by NIH, each of the world's universities and research institutions can go on to complement the NIH self-archiving mandate for the NIH-funded fraction of its research output with an institutional mandate to deposit the rest of its research output, likewise to be deposited in its own IR. This will systematically scale up to 100% OA....

"Optimizing OA Self-Archiving Mandates: What? Where? When? Why? How?"

Comment.  Just two quick notes on the history:

  1. Stevan is right that OA mandates at Southampton ECS and Queensland U of Technology preceded the first Congressional call for an OA mandate at the NIH, and those universities deserve immense credit.  I meant that the NIH would have been the first funder mandate.  Apologies for not making that clearer.  For details on the early history of OA mandates, see my Timeline.
  2. The House Appropriations Committee first called for an OA mandate at the NIH on July 14, 2004.  But that call was not packaged into a report from the full House until September of that year.  The U.K. House of Commons Science and Technology Committee issued its groundbreaking OA recommendations on July 20, 2004.

Thursday, January 03, 2008

"OA must remain a priority for the library community...."

ACRL Research Committee, Environmental Scan 2007, Association of College and Research Libraries, January 2008.  Excerpt:

...Over the past decade, the Association of College and Research Libraries (ACRL) has undertaken an ongoing environmental scan to identify the trends that will define the future of academic librarianship....

Top Ten Assumptions for the Future of Academic Libraries and Librarians....

1. There will be an increased emphasis on digitizing collections, preserving digital archives, and improving methods of data storage, retrieval, curation, and service....

Institutional repositories provide access to one of the most dynamic venues for digital content creation, curation, and service. The literature on repositories is broad, and touches upon (among other things): auditing trusted repositories and metadata; creative approaches to developing institutional repository services; managing new data types; and discussions on future institutional repository development paths....

4. Debates about intellectual property will become increasingly common in higher education....

The OA movement and Web 2.0 applications such as Wikis promote information sharing, while at the same time information is becoming a valuable commercial commodity. Scholarly communication is being reshaped by advances in technology and by the growing realization on campuses that the high cost that libraries must pay for journals is to a significant extent the result of faculty members giving up intellectual property rights to their research....

Efforts such as the Creative Commons, MIT Open Courseware, SPARC, and various OA publishing initiatives that facilitate the sharing of intellectual content and permit scholars to retain certain rights to intellectual property are becoming more popular....

Librarians must continue to work with faculty and professional organizations to persuade them to use OA to scholarly works. Efforts such as the Creative Commons, SPARC, institutional repositories, and other OA publishing initiatives that facilitate free access to scholarly material should be supported.

OA to federally funded research must remain a priority for the library community....

9. Demands for free, public access to data collected, and research completed, as part of publicly-funded research programs will continue to grow.

Recent literature on Open Access reflects the extensive growth of this relatively new movement to make publicly funded scientific research freely available to the public. High profile OA initiatives like Highwire Press, Public Library of Science (PLoS), BioMed Central, and others have attracted the attention of scholars interested in supporting improved publishing models (Walters, 2007; Park and Qin, 2007). In the past few years the promotion of OA has expanded beyond libraries and has gained the support of many governments, the United States and the European Union in particular, the scientific community, publishers, funding agencies, and the general public (Albert, 2006). The National Institutes of Health have supported legislation requiring that the results of government-funded research be made freely available to the public online (Engelward and Roberts, 2007). As this report goes to press, the U.S. Congress continues to debate this issue. Similar legislation was also proposed in the European Union (EU), but ultimately lost support due to pressure from the publishing industry (Enserink, 2007). Funding agencies, such as the Howard Hughes Medical Institute (HHMI), have implemented or are considering policies that encourage those scientists they fund to self-archive in open repositories or to publish in OA journals (“Funding Agencies Toughen Stance on Open Access,” 2007).

A battle between OA proponents and the publishing industry is escalating. The Association of American Publishers recently hired a public relations consultant, who is famous for using “media messaging” to shape the climate change debate, to assist it in shaping the debate on OA (Giles, 2007). On the other hand, many publishers are supporting OA in one form or another and are experimenting with a variety of business models to respond proactively. Some publishers have hybrid programs that give authors the option of paying to make their articles freely accessible. Others are altering subscription models to give free access to older journal content (Suber, 2007). This multifaceted and contentious issue will likely continue to get coverage in the professional literature over the next several years (Albert, 2006)....

30,000 books on a keychain, 7.5 million books on your desk

Michael S. Hart, The Top Inventions of 2008, Global Politician, January 3, 2008.  Hart is the founder of Project Gutenberg.  Excerpt:

What will be the top inventions in 2008?

1. ...The advent of USB 3.0 will combine with inexpensive terabyte drives....

3. Virtual Libraries (Taken One At A Time)

By the end of 2008 the Project Gutenberg Library will be as large -- or larger -- than the average United States Public Library.

30,000+ volumes originating from Project Gutenberg....

Claims of over a million eBooks from some sources notwithstanding -- Project Gutenberg's library stands alone in that the volumes should each have been proofread by at least two human beings, along with a wide variety of software proofreading programs, and in the fact that the eBooks take only one file per volume and are very small files.

As time goes on, more and more virtual libraries of this size will become available in these small files that allow an entire library, 30,000 books of a million characters each, to be worn on keychains, necklaces, bracelets, etc.

These small text files also work very well with compression programs such as .zip, which allow 5 books to be stored in an alternate .zip file in the space 2 books took previously.

5 books in 2 megabytes....

30,000 books in 12 gigabytes.

That's all the words in the books of an average US Public Library.

2008 will see 12 gigabyte USB flash drives for under $100.

$100 to carry every word in 30,000 books. . . .

In less space and weight than your average wristwatch....

Before Gutenberg the average person owned zero books.

Before Project Gutenberg the average person owned zero libraries.

4. Virtual Libraries (Taken As A Whole)

...[B]y the end of 2008, there are going to be 7 million eBooks in the world . . . and someone somewhere is going to download all of them....

Still presuming one million character per volume:...

7.5 million volumes could be stored in .zip format in 2.6 terabytes.

The average computer today sells for under $500.

It comes with about 120 gigabytes of hard drive.

Adding in five drives at half a terabyte each totals under $500.

There is your potential world class library for under $1,000!

Yes, there are problems.

First is the unwillingness of people such as Google to make it easy to download their books in .txt format.

I understand this is changing, but changing a million books takes a bit of time, and I worry that Google is more concerned with SAYING THEY HAVE MILLIONS OF BOOKS than actually making them available as actual text files the likes of which you are reading right now.

This is not just a concern with Google, I have the same concern via the work of The Million Book Project, The Open Content Alliance and all of the rest who speak in terms of millions of online books from the perspective of "instant gratification." ...
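Hart's storage arithmetic is easy to sanity-check. The sketch below uses his own assumptions (one million characters, i.e. roughly 1 MB, per plain-text volume, and the 5-books-in-the-space-of-2 zip ratio he cites); the function name is purely illustrative:

```python
# Rough check of Hart's storage arithmetic for plain-text e-book libraries.
# Assumes ~1,000,000 characters (~1 MB) per volume, and the ~2.5:1 zip
# compression ratio implied by "5 books in 2 megabytes".

MB_PER_BOOK = 1.0      # uncompressed plain text, one file per volume
COMPRESSION = 5 / 2    # 5 books now fit in the space 2 took before

def library_size_gb(volumes: int) -> float:
    """Compressed size in gibibytes for a library of plain-text volumes."""
    return volumes * MB_PER_BOOK / COMPRESSION / 1024

print(f"30,000 volumes: {library_size_gb(30_000):.1f} GB")
print(f"7,500,000 volumes: {library_size_gb(7_500_000) / 1024:.2f} TB")
```

This lands close to Hart's figures: about 12 GB for a 30,000-volume library (his flash-drive number) and just under 3 TB for 7.5 million volumes (in the same ballpark as his 2.6 TB, which assumes a slightly more aggressive compression ratio).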

Australian cultural heritage and web 2.0

Michael Middleton and Julie M. Lee, Cultural Institutions and Web 2.0, in Proceedings Fourth Seminar on Research Applications in Information and Library Studies (RAILS 4), RMIT University, Melbourne, November 2007. 

Abstract:   The document reports upon an exploratory survey of the approaches that Australian cultural institutions are implementing to meet Web 2.0 challenges. It is given context by a review that is made of Web developments in order to characterize Web 2.0 applications. A sample of applications that have been undertaken internationally and locally are described under the headings ranging from business resources through to exhibitions, professional development and youth outreach in order to explore strategies for implementation. The applications serve to introduce business and technical issues that have arisen, including those involved in forming partnerships with peer institutions and with major Internet services. A discussion section follows in which challenges and opportunities relating to management and software support are identified under the headings: Access, Audiences, Authority, Collaboration, Current Awareness, Metadata, Policy, Publishing, Records retention, Rich Web applications, Seeding, Skills and Statistics. These are seen as common to those convergent areas of application where large repositories are endeavouring to enhance the digital access to their records and information artefacts, and engage patrons further. Each area represents a prospective domain of investigation for research institutions or the cultural institutions themselves. A conclusion summarises these findings with respect to the role that cultural institutions can play in improving access to and involvement with the cultural heritage.

From the body of the report:

...Cultural institutions have the opportunity to foster Web 2.0 applications by improving access to their resources by: 

  1. Developments in access or resource unification. These may be achieved by standardizing search protocols across databases, or by grouping or clustering intermediate metadata for distributed databases.
  2. Seeding of non-repository systems such as online encyclopedias to provide links into their own databases.
  3. Contributing to integrated use of resources through distributed databases or mashups that add value to the databases.
  4. Dissemination of information through facilities provided and maintained by an organization such as podcasts, blogs, wikis, and RSS feeds....

Internet sites that have attracted major use – what Dempsey (Dempsey, 2006, July) terms ‘gravitational pull’ - have effectively catered for seamless discovery of accumulated information resources. This may be whether the resources are file types as provided for by services like Google or Yahoo, or predominantly physical materials as is the case with Amazon. These services have led users to higher expectations of accessibility....

This access has been extended further within a European model ("MICHAEL - Multilingual Inventory of Cultural Heritage in Europe," 2007) initially a partnership between France, Italy and the UK, but with European Commission funding being extended to a further 11 European countries. In addition to subject entry points there are entry points by audience, time period and spatially to allow its users to search, browse and examine descriptions of resources held in institutions [Figure 3]. There is a focus on interoperability between national cultural portals to promote access to digital content from museums, libraries and archives....

Cultural institutions may well be required to play a much greater role in the organization of data that supports e-science or cyberinfrastructure. There is a growing awareness of the worth of data sets in scientific research areas, and the need to manage them for effective re-utilization and persistent availability. With respect to scholarly communication in general, Lynch (Lynch, 2007) has pointed out that there are social and political factors moving us towards open access and the development of technical and social models to assure the persistence and integrity of important digital data over time, or as he sees e-science ‘the investment it represents can be amplified by disclosure, curation and facilitation of reuse’....

[Summary of the discussion of access:]  Improvements in access will depend upon improving retrieval capabilities in repository software by applying ranking and relevance feedback capabilities, or using markup metadata to report contents of repositories into search engines that provide such facilities. Allied with this must be a rationalization of descriptive metadata to permit unification of different types of information repositories....

OA clinical drug trial database for the UK

UK Commission Calls for Open-Access Clinical Trial Database, The Food & Drug Letter, January 4, 2008.  Only the first sentence is free for non-subscribers:

A governmental commission in the UK has called for the establishment of an open-access database of information for certain high-risk Phase I clinical trials along with 21 other recommendations that are intended to prevent any repeat of the catastrophic Phase I clinical trial of the gene therapy TGN1412....

Getting specific about the OA mandate at the NIH

In my newsletter article yesterday on the OA mandate at the NIH, I pointed out six policy details on which Congress was silent and on which the NIH will be free to follow its discretion. 

In a blog post in response, Gavin Baker takes up all six and offers predictions on what NIH will do and suggestions on what it ought to do.

Questionnaire on OA

E-Conservation Online has posted a questionnaire on open access.  (Thanks to Klaus Graf.)  From the preface:

Open Access (OA) is an important concept that we would like to discuss with our authors and readers. The following survey aims to investigate their awareness of Open Access possibilities, their experiences, and their concerns about the implications OA may have for their training and careers.

Everybody is welcome to participate....

The results of this poll will be carefully analyzed and summarized into a complete report. The results will be published on our website....

OA Catalan journals

Revistes catalanes amb Accés Obert (RACO) is a collection of OA Catalan journals.  (Thanks to Jan Szczepanski.)  Here's the English-language description from Intute:

Revistes catalanes amb Accés Obert (RACO), or Open Access Catalan Journals, is an online repository of scholarly journals from Catalonia. Here users may access a broad range of full-text articles from diverse publications, which may be browsed by subject. These include: philosophy and psychology; religion and theology; social sciences; the arts and entertainment; language, linguistics and literature; and history and geography. Relevant articles may be quickly located using the site's search facility. The site is equally navigable in Spanish, Catalan and English although users should note that some of the journals will be available in Catalan only. New additions to the repository are listed, and users can receive a free email alert when new issues from their chosen journal are published. All in all, this laudable project has created an easy-to-use website and unprecedented access to diverse research.

Wednesday, January 02, 2008

More on SCImago Journal Rank v. Impact Factors

Declan Butler, Free journal-ranking tool enters citation market, Nature News, January 2, 2008.  (Thanks to Garrett Eastman.)  Excerpt:

A new [OA] Internet database lets users generate on-the-fly citation statistics of published research papers for free. The tool also calculates papers' impact factors using a new algorithm similar to PageRank, the algorithm Google uses to rank web pages. The open-access database is collaborating with Elsevier, the giant Amsterdam-based science publisher, and its underlying data come from Scopus, a subscription abstracts database created by Elsevier in 2004.

The SCImago Journal & Country Rank database was launched in December by SCImago, a data-mining and visualization group at the universities of Granada, Extremadura, Carlos III and Alcalá de Henares, all in Spain....

The new rankings are welcomed by Carl Bergstrom of the University of Washington in Seattle, who works on a similar citation index, the Eigenfactor, using Thomson data. “It's yet one more confirmation of the importance and timeliness of a new generation of journal ranking systems to take us beyond the impact factor,” says Bergstrom....

Thomson is also under fire from researchers who want greater transparency over how citation metrics are calculated and the data sets used. In a hard-hitting editorial published in Journal of Cell Biology in December, Mike Rossner, head of Rockefeller University Press, and colleagues say their analyses of databases supplied by Thomson yielded different values for metrics from those published by the company (M. Rossner et al . J. Cell Biol. 179, 1091–1092 ; 2007).

Moreover, Thomson, they claim, was unable to supply data to support its published impact factors. “Just as scientists would not accept the findings in a scientific paper without seeing the primary data,” states the editorial, “so should they not rely on Thomson Scientific's impact factor, which is based on hidden data.”

Citation metrics produced by both academics and companies are often challenged, says Pringle. The editorial, he claims, “misunderstands much, and misstates several matters”, including the authors' exchanges with Thomson on the affair. On 1 January, the company launched a web forum to formally respond to the editorial.
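For readers curious how a PageRank-style journal metric differs from a raw citation count, here is a toy power-iteration sketch over a small citation matrix. This is a generic eigenvector-style ranking in the spirit the Nature article describes, not SCImago's actual SJR formula; the journal names, citation counts, and damping factor are all invented for illustration:

```python
# Toy eigenvector ranking of journals from a citation matrix, in the spirit
# of PageRank-style metrics (NOT the actual SJR algorithm).
# cites[i][j] = citations from journal i to journal j (illustrative numbers).

cites = [
    [0, 8, 2],   # journal A cites B eight times, C twice
    [3, 0, 1],   # journal B
    [9, 4, 0],   # journal C
]
DAMPING = 0.85   # illustrative, as in classic PageRank
N = len(cites)

rank = [1.0 / N] * N
for _ in range(100):  # power iteration until (approximate) convergence
    new = []
    for j in range(N):
        # Prestige flows from citing journals, weighted by their own rank:
        # a citation from a highly ranked journal counts for more.
        inflow = sum(
            rank[i] * cites[i][j] / sum(cites[i])
            for i in range(N) if sum(cites[i]) > 0
        )
        new.append((1 - DAMPING) / N + DAMPING * inflow)
    rank = new

for name, r in zip("ABC", rank):
    print(f"journal {name}: {r:.3f}")
```

The key difference from the impact factor is visible in the `inflow` line: citations are weighted by the rank of the citing journal, so two journals with identical citation counts can end up ranked differently depending on who cites them.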

An OA repository to save Afghani literature

An announcement from the DSpace Federation:

The earliest publications appearing in Afghanistan are extremely rare and, judging by their absence from library collections around the world, are to be found now almost exclusively in private collections, where public access is limited or non-existent. Decades of war in Afghanistan have further dispersed and destroyed holdings of books within the country itself.

The immediate objective of the Afghanistan Digital Library is to retrieve and restore the first sixty years of Afghanistan’s published cultural heritage. The project is collecting, cataloging, digitizing, and making available over the Internet as many Afghan publications from the period 1871–1930 as it is possible to identify and locate. In addition to books, this will eventually include all published serials, documents, pamphlets, and manuals. The Afghanistan Digital Library site is still in a pilot phase. If a search retrieves no results, users should browse the collection to see what is available instead....

From the ADL about page:

...Phase 1 of the project, undertaken in 2005, has drawn materials from the collections of several private collectors as well as from the holdings of New York University Library and the British Library. Phase 2, undertaken in 2006, has trained a staff at the National Archives in Kabul in conservation and digitization and is engaged in the cataloging and digitization of materials held in various public and private collections inside Afghanistan. In time the project plans to carry the dissemination of Afghan publications through the period between 1931 and 1950. Providing universal availability to this broad historical span of Afghanistan’s published history, and in the process constructing a national bibliography for the country, the Afghanistan Digital Library will reconstruct an essential part of Afghanistan’s cultural heritage....

The ADL is a project of the NYU Libraries, with funding from the National Endowment for the Humanities, the Reed Foundation, and the W.L.S. Spencer Foundation. 

Comment.  This is another good example of how digitization enhances rather than jeopardizes preservation.  Kudos to NYU and its partners.

New German copyright law is confusing scholars

Authors Object Under Copyright Act, German American Law Journal, January 1, 2008.  Excerpt:

On January 1, 2008, German authors are confused. Revisions to the Copyright Act, Urheberrechtsgesetz, enter into force today. A new provision, in § 137(l), appears to require their affirmative action to prevent the publication of old works in new media without their express consent.

The archivist blog Archivalia published various opinions on the due date for objections to be filed by authors with publishers and now understands the law to mean that the due date does not fall on December 31, 2007, as widely reported, but a year later.

Meanwhile, authors who objected in order to preserve their works for publication through open access media run into the dismissal of their objections by publishers who argue that the waivers are either formally improper or substantively inapplicable. The latter category includes contributions to journals and collective works. The publishers' association, Börsenverein, spearheads the effort and published a manual on new media law.

PS:  For readers of German, the best source of information and pro-OA advice is Klaus Graf's Archivalia.

RePEc in 2007

RePEc in December 2007, and what we have done over Year 2007, The RePEc blog, January 2, 2008.  (Thanks to Gavin Baker.)  Excerpt:

Every month, a short summary of what happened with RePEc is sent to the RePEc-announce mailing list. I will also put that message, slightly adapted, on this blog.

The major event this month is that we passed three important thresholds: 15,000 authors, 80% of the material now online, and 1/8 billion abstract views. For some hints at what 15,000 authors represent in the Economics profession, see elsewhere on the blog. Also, we have now released rankings for the most cited recent papers and articles.

As year 2007 is now over, we can reflect on what RePEc has achieved over that year. 158 archives were added, and the total of currently 844 archives have added 108,000 bibliographic items to RePEc, a 24% growth, with 240 new working paper series and 130 new journals. 105,000 new items are online, a 31% growth. 3,500 authors registered, almost ten a day, a 30% growth. Citation analysis coverage increased by 39%.

In 2007, we also added a few new features....

Finally, RePEc celebrated its 10th year in its current form....

[T]he thresholds we have passed this month:

125,000,000 cumulative abstract views
275,000 online articles
130,000 items with references
15,000 registered authors
1,900 working paper series
80% of all items available online
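The round numbers in the excerpt hang together. A quick, purely illustrative sanity check in Python (all figures are from the post; the implied start-of-year total is my own back-of-the-envelope inference, not RePEc's):

```python
# "1/8 billion cumulative abstract views" = 125,000,000
abstract_views = 125_000_000
assert abstract_views * 8 == 1_000_000_000

# "3,500 authors registered, almost ten a day"
authors_2007 = 3_500
per_day = authors_2007 / 365
assert 9 < per_day < 10

# "108,000 bibliographic items ... a 24% growth" implies roughly
# 450,000 items at the start of 2007 (inference, not a quoted figure)
new_items, growth = 108_000, 0.24
implied_start = round(new_items / growth)
```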

Update (1/8/08). 75% of the top 1,000 economists are registered with RePEc.

Comparative book-scanning

Beth Ashmore and Jill E. Grogg, The Race to the Shelf Continues:  The Open Content Alliance and Amazon, Searcher, January 2, 2008.  Excerpt:

Internet giants such as Google, Yahoo!, Microsoft, and Amazon are in the middle of nothing short of a modern-day space race: Who can scan the most and the best books in alliance with the biggest and brightest libraries in the U.S. — nay, the world! — while simultaneously providing print on demand, “find in a library,” and “buy the book” links as well? ...

[T]he Open Content Alliance, or OCA, is giving Google a run for its money. OCA comes armed with an open access philosophy and its own impressive stable of partners, including Yahoo! and, at least initially, Microsoft. Amazon, the dark horse in the race, as scanning and making books available for free online would seem antithetical to its book-selling roots, has gotten into the act, offering to partner with libraries to help scan and sell rare and hard-to-find books from library collections. Under Amazon’s model, the libraries retain their own digital copies along with a portion of any print-on-demand profits. Ultimately, librarians now have choices when it comes to large-scale digitization partnerships....

OCA’s approach to this process has two major differences that set it apart from the Google Book Search Library Project: no scanning of in-copyright materials from library collections (at least not yet) and open access is the guiding principle — meaning that even Google itself could (and does) crawl titles from the OCA repository....

A relatively new and unique [OCA] partner is the Biodiversity Heritage Library, a cooperative project of the American Museum of Natural History, Harvard University Botany Libraries, Ernst Mayr Library of the Museum of Comparative Zoology, Missouri Botanical Garden, Natural History Museum–London, The New York Botanical Garden, Royal Botanic Gardens in Kew, and Smithsonian Institution Libraries. Kahle is particularly proud of this partnership as it represents a trend that could see other disciplines banding together to bring a wealth of knowledge on a particular topic to the open access world. As Kahle explains: “This is a whole branch of science deciding to go open … it is a massive program to digitize tens of millions of pages, basically all of the literature about species. This is important to have in the open because it can be repatriated to the developing countries that actually have these organisms, as well as making it possible to do data mining research on it … It is a commitment of the major natural history museums, natural history libraries and botanical gardens to go and make the information about species public.”

So, why would a librarian choose to go with the OCA over the other partners currently available? Two words: open access....

Another selling point of OCA is its affiliation with the Internet Archive....

OCA may not have the speed or financial resources of Google Book Search to whisk away a library’s holdings and scan them. Nor can OCA scan collections for free, like Google, and we all know how seductive free can be to budget-stretched libraries. OCA is a decidedly community-based effort. It represents a model for the future of digitization efforts that appears viable, provided libraries can cover the associated costs....

Kahle even sees a future for OCA in copyrighted works: “Our approach at the Internet Archive is to start with out of copyright and then move into orphan works, then out-of-print and then in-print. I’m hoping that by the time we get to in-print commercial publishers, we’ll have moved along to help promote their books online and allow them to be downloaded.” ...

In the end, Kahle believes that the OCA’s survival and attraction may lie in its ability to provide the service layers that users require. “This is public domain material. Have the public domain material stay in the public domain and have organizations compete on the service layers. This is the architecture of the World Wide Web.” ...

In 2006, Amazon’s Back in Print initiative demonstrated how rights owners of out-of-print titles could get their titles available through print on demand (POD) via Amazon’s BookSurge division, acquired in 2005. However, it was not until Amazon announced that it would be working through its BookSurge division with Kirtas Technologies and libraries to identify these out-of-print, out-of-copyright titles and add them to BookSurge’s POD service that the library community became active partners....

Linda Becker, vice president for sales and marketing at Kirtas Technologies, Inc., further explains Kirtas’ role: “Customers have two choices. One, they could send us their books and we can digitize the books for them and put them on Amazon. This is what we are doing for New York Botanical Gardens and Cincinnati Public Library. Or, they could purchase a system to digitize materials themselves and send us the work. Then, we do the backend work to get it ready for print on demand and we send it on to Amazon.” This second option is the method by which Emory University, University of Maine, and Toronto Public Library are participating. Becker notes that the project was launched as a pilot in June 2007 with the five libraries mentioned above, but Kirtas is currently talking to approximately 20 more libraries.

In either option, the library is in control of what gets scanned. Beidler points out that “the libraries maintain complete control and ownership of the entire process and also the end files that result from the digitization.” ...[L]ibraries put these titles in the Amazon POD program and those books are then available for Amazon customers to purchase directly through Amazon....

Libraries can choose to participate in either the POD or SearchInside! the Book programs on a title-by-title basis, but, according to Beidler, the most common scenario is for participating libraries to place titles in both of these Amazon-provided programs....

In the Amazon/BookSurge/Kirtas model, the libraries function as the publishers, so they create an imprint of sorts that identifies the contributing library as the owner of the material. This means that the library carries the burden for copyright compliance, making sure that the library either owns the copyright or the material is in the public domain. The library also sets the list price for a given title, which varies based on its value, meaning its size, rarity, and other criteria....

Beidler said that no one in this partnership has stipulated that titles must be rare, but many librarians choose to digitize those materials first, as these are the most difficult to access and at the highest risk for damage and deterioration....

Which Project to Pick? ...

Financial concerns certainly must be considered, but there are also some weighty philosophical issues that emerge. The titles included in the Google Book Search program are unavailable to other Web services. Is this a real problem or does Google’s search engine supremacy make this a nonissue? Does OCA have a sustainable model of open access in place and can it continue to scale? Would selling print-on-demand copies of your rare books through Amazon make your digitization project financially feasible? And what do we do about copyright? Some libraries have taken a stance of sorts on these types of issues, as reported in an Oct. 22, 2007, New York Times article, “Libraries Shun Deals to Place Books on Web.” In this article, the author explains the resistance of some libraries, such as the Boston Public Library and the Smithsonian Libraries, to sign up with Google....

New OA journal on language contact

The Journal of Language Contact is a new peer-reviewed journal on the "evolution of languages, contact and discourse" published by the Dynamique du langage et contact des langues at the Institut Universitaire de France.  The inaugural issue is now online.  (Thanks to Adrianne's favorites.)

January SOAN

I just mailed the January issue of the SPARC Open Access Newsletter.  This issue takes a close look at the long-sought Congressional victory mandating OA at the NIH.  It also contains my annual look back at OA developments from the previous year.  The round-up section briefly notes 85 OA developments from December.

Tuesday, January 01, 2008

2007: the year of openness

Glyn Moody, Word of the Year: Open, Linux Journal, January 1, 2008.  Excerpt:

The beginning of the year is traditionally a time to look back, and, for the brave of heart, to make a few predictions looking forward. Lacking the requisite bravery, I'll just quote something that the Economist wrote recently:

Rejoice: the embrace of “openness” by firms that have grown fat on closed, proprietary technology is something we’ll see more of in 2008.

Now, had this "fearless prediction" been made a year ago, I would have been impressed, because 2007 has turned out to be the year when everyone, it seems, wants to be open.

For example, hard as it might be to believe, Microsoft actually became an open source company in October last year, when two of its licences were accepted by the OSI as meeting the necessary criteria to be blessed with its approval. But the high-tech company that has beaten the “openness” drum more than any has been Google....

First we had Open Social....

Then we had Google's Android: “the first truly open and comprehensive platform for mobile devices” ....[Then there was open access to wireless spectrum, first from Google and then Verizon.]....

Some have argued that Verizon's apparent conversion to wireless open access is more apparent than real, but only time will tell. Happily, the same cannot be said about one of the last – and most important – acts of openness that 2007 brought us: news that all research funded by the US National Institutes of Health would finally be made available as real open access:

The Director of the National Institutes of Health shall require that all investigators funded by the NIH submit or have submitted for them to the National Library of Medicine's PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication to be made publicly available no later than 12 months after the official date of publication: Provided, That the NIH shall implement the public access policy in a manner consistent with copyright law.

Although the battle for open access to this US research has had a rather low profile in the world of open source, this new legislation mandating it is as important to its own field – one, be it noted, hugely inspired by open source – as anything that's happened in free software this year.

Given this crescendo of openness during 2007, I think that the Economist's expectation that we will see a further “embrace” of it in 2008 is not so much a daring prediction as a dead certainty.

How to celebrate public domain day

John Mark Ockerbloom, Public Domain Day gifts, Everybody's Libraries, January 1, 2008.  Excerpt:

...Much of the world gets to celebrate today as Public Domain Day as well, the day when a whole year’s worth of copyrights enter the public domain for anyone to copy or reuse as they like.

In countries that use the “life plus 50 years” minimum standard of the Berne Convention, works by authors who died in 1957 enter the public domain today. That includes writers, artists, and composers like Nikos Kazantzakis, Diego Rivera, Dorothy L. Sayers, Jean Sibelius, and Laura Ingalls Wilder.

In countries that use the “life plus 70 years” term, works by authors who died in 1937 enter the public domain, including works by J. M. Barrie, Jean de Brunhoff, H. P. Lovecraft, Maurice Ravel, and Edith Wharton. Since many countries with this term recently extended it due to trade agreements, they’re often seeing these works re-enter the public domain after being removed from it, but their return to the public is still appreciated.
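The two paragraphs above follow the same arithmetic: under a “life plus N years” regime, protection runs through the end of the author's death year plus N, so works enter the public domain on the following January 1. A minimal sketch (the helper function is my own illustration, not from the post):

```python
def public_domain_year(death_year: int, term: int) -> int:
    """First January 1 on which works enter the public domain under a
    'life + term' regime where terms run to the end of the calendar year."""
    return death_year + term + 1

# Authors who died in 1957, life+50 (Berne minimum): PD on Jan 1, 2008
assert public_domain_year(1957, 50) == 2008
# Authors who died in 1937, life+70: also PD on Jan 1, 2008
assert public_domain_year(1937, 70) == 2008
```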

In countries like the US and Australia, which are under 20-year freezes of all or most of the public domain, it’s not quite as momentous a day. Here in the US, like Bill Murray in Groundhog Day, we’re once again waking up to a public domain 1922, as we have since 1998. Our next mass expiration of copyrighted published material is scheduled for New Year’s Day 2019, 11 years from now. That’s assuming that copyright isn’t again extended before then. Recent extensions here and abroad have often been pushed through in the name of “harmonization” (which seems to always lengthen rather than shorten copyrights), and with Mexico now having a life+100 years term, I would not at all be surprised to see that as the pretext for the next round of attempts to further extend copyright.

But this is not a foregone conclusion. Canada, notably, has held the line at life+50 years for its copyright terms, despite many of its trading partners extending their terms further....

Let’s not just ask what the public domain can do for us; let’s ask what we can do for the public domain. In particular, as of this year more than 14 years have passed since the Web started to explode into public consciousness, with NCSA’s release of the Mosaic web browser in 1993. Many of us older Net users started creating web sites that year. And 14 years was the original term of copyright specified in the UK’s Statute of Anne, and the US’s first copyright law (with an optional renewal term).

As an advocate of more reasonable copyright terms, like those envisioned by our country’s founders, I am therefore today dedicating the copyrights of all 1993 versions of my web sites into the public domain. These sites include The Online Books Page, which is still in operation, and Catholic Resources on the Net, which I stopped maintaining in 1999....

I’m very interested in hearing about things that people are giving to or receiving from the public domain this year. Happy Public Domain Day!

PS:  Kudos to John for his info, his gift, and his example.  I didn't start making web pages until about 1995, but I'll start to look for ways to follow his example on or before Public Domain Day 2010.

OA Japanese literature in western languages

Klaus Graf is collecting links to Japanese digitization projects making their works OA in western languages.

January Cites & Insights

The January 2008 issue of Walt Crawford's Cites & Insights is now online.  This issue contains a lengthy retrospective on the Open Content Alliance and Google Book Search, in which Walt reviews major comments on the two projects over the past 25 months (disclosure: including some of my comments).

ARL task force endorses open access and open data

ARL Joint Task Force on Library Support for E-Science, Agenda for Developing E-Science in Research Libraries, Association of Research Libraries, November 2007.  (Thanks to Clifford Lynch.)  Excerpt:

E-science has the potential to be transformational within research libraries by impacting their operations, functions, and possibly even their mission. Recognizing this potential, the ARL [Association of Research Libraries] Steering Committees for Scholarly Communication and for Research, Teaching, and Learning jointly appointed a task force in 2006 to address the emergent domain of e-science....

Government agencies such as NSF and NIH play a key role in setting policy. One area of particular resonance for the research library community relates to data policies. For example, NSF’s current position on data indicates “all science and engineering data generated with NSF funding must be made broadly accessible and usable, while being suitably protected and preserved. Through a suite of coherent policies designed to recognize different data needs and requirements within communities, NSF will promote open access to well-managed data…. In addition to addressing the technological challenges inherent in the creation of a national data framework, NSF’s data policies will be designed as necessary to mitigate existing sociological and cultural barriers to data sharing and access….” (NSF 2007).

Regarding the data-intensive and data-driven aspects of e-science, NIH policy supports the concept of sharing data that is produced as a result of NIH-funded projects....

Into this mix, the US House of Representatives in July 2007 approved language supporting public access to the results of research funded by NIH.... [PS: This language became law on December 26, 2007.]

Just as the open access movement has prompted libraries to engage in policy discussions about open and sustainable access to scientific journal literature, so too the open data movement will prompt libraries to understand the implications and advantages of models that encourage unfettered access to data, where appropriate. SPARC has been a leader in raising awareness of the need for open access to support the sharing, review, and publication of research results. Efforts are also taking shape through programs such as the Science Commons to provide licensing models that remove barriers to the sharing of information, tools, and data within the scientific research cycle....

See especially Appendix B: Model Principles for Research Library Roles in E-Science (pp. 21-22, drafted by Chuck Humphrey, "with edits from task force members"):


1. Open Access: Research libraries will support open access policies and practices regarding scientific knowledge and e-science. Barriers will be removed that impede or prevent open access to research outputs, and consequently that restrict the potential linkage of outputs to the data upon which research findings are based.

2. Open Data: Access to open data is a movement supported by research libraries, taking into consideration the ethical treatment of human-subject data....

Update.  See Dorothea Salo's comments on the report's position on institutional repositories.

Charities for OS and OA

The Top 80 Charities for Open Source and Open Access Advocates, Virtual Hosting, December 31, 2007.  (Thanks to Amy Quinn.)

PS:  I'd add at least these two, the two most active non-profits working for OA to publicly-funded research in the US:

New year's resolutions

From Alex Halavais at Thaumaturgical compendium:

...I’m...planning for my two courses this semester, both of which will be distance courses. Almost all the materials —at least those I create— will be open access....

From Cameron Neylon at Science in the open:

This promises to be a year in which Open issues move much further up the agenda. These things are little ways that we can take this forward and help to build the momentum.

  1. I will adopt the NIH Open Access Mandate as a minimum standard for papers submitted in 2008. Where possible we will submit to fully Open Access journals but where there is not an appropriate journal in terms of subject area or status we will only submit to journals that allow us to submit a complete version of the paper to PubMed Central within 12 months.
  2. I will get more of our existing (non-ONS [non-Open Notebook Science]) data online and freely available.
  3. Going forward all members of my group will be committed to an Open Notebook Science approach unless this is prohibited or made impractical by the research funders. Where this is the case these projects will be publicly flagged as non-ONS and I will apply the principle of the NIH OA Mandate (12 months maximum embargo) wherever possible.
  4. I will do more to publicise Open Notebook Science. Specifically I will give ONS a mention in every scientific talk and presentation I give.
  5. Regardless of the outcome of the funding application I will attempt to get funding to support an international meeting focussed on developing Open Approaches in Research.

OA advocate honored "for services to geography"

Robert Barr has been made an Officer of the British Empire (OBE) by Queen Elizabeth "for services to geography".  Barr is a professor of geography at Manchester University, managing director of Manchester Geomatics, and advocate for OA to public geodata in the UK.  (Thanks to Free Our Data.)

Presentations on OA from UUK meeting

Stevan Harnad, Universities UK: Open Access Mandates, Metrics and Management -- PPTs now online, Open Access Archivangelism, January 1, 2008.

Universities UK Research Information and Management Workshop [December 5, 2007]

End-of-year update on growth of OA

Heather Morrison, Dramatic Growth of Open Access: Dec. 31, 2007 update, Imaginary Journal of Poetic Economics, December 31, 2007.  Excerpt:

My Dramatic Growth of Open Access: Open Data Edition has been updated with today's numbers. For analysis of growth of OA in 2007 and predictions for 2008, see my Dramatic Growth of Open Access: 2007 (Interim) and Predictions for 2008, and my Minor Update.

One notable story for the latter half of December is the continuing very strong recent growth in the Directory of Open Access Journals, with 64 new titles added in the past 30 days; an average of more than 2 per calendar day, higher than their 2007 overall growth rate of 1.4 per calendar day....
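Heather's per-day figure checks out; a trivial check (numbers from the excerpt):

```python
# 64 new DOAJ titles in the past 30 days vs. a 2007 average of 1.4/day
titles_added, days = 64, 30
recent_rate = titles_added / days   # about 2.13 titles per calendar day
rate_2007 = 1.4
assert recent_rate > 2 > rate_2007  # "more than 2 per calendar day"
```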

Monday, December 31, 2007

New free journal from Springer

Neuroethics is a new peer-reviewed journal from Springer.  Instead of using Springer's Open Choice hybrid model, it will offer free online access to all its articles, at least for 2008 and 2009.  (Thanks to Adam Kolber.)

The page on instructions for authors says nothing about publication fees.  It does, however, require authors to transfer copyright to Springer, which it justifies by saying, "This will ensure the widest possible dissemination of information under copyright laws."  For the moment I'm less interested in the incorrectness of this statement than in the fact that Springer's hybrid journals use an equivalent of the CC-BY license.  It looks like Springer is experimenting with a new access model:  free online access for all articles in a journal (hence, not hybrid); no publication fees; but no reuse rights beyond fair use.  The copyright transfer agreement permits self-archiving of the published version of the text but not the published PDF.

Also see my post last week on Springer's new Evolution: Education and Outreach, with a similar access policy but a few confusing wrinkles of its own. 

Sunday, December 30, 2007

Selling what's free

David Gallagher, On eBay, Some Profit by Selling What’s Free, New York Times, December 28, 2007.  Excerpt:

While scouring eBay for interesting Christmas presents a while back, I found and bought a DVD of a film made in 1954 about my home town of Doylestown, Pa. After it arrived I went searching for more information about it — and found the entire film, available as a free download from the nonprofit Internet Archive.

It turned out that the eBay seller had simply downloaded the movie file, burned it onto a DVD and stuck it in the mail. And he was doing the same with a wide range of other public-domain material: military truck manuals from World War II, PowerPoint presentations on health matters from government doctors, vaudeville shorts from the late 1800’s.

The seller’s name is Jeffrey....In an interview, Jeffrey said that he spends 20 to 30 hours a week working on his eBay business....He wouldn’t say how much money he makes, but indicated that it was worth the time he was putting into it.

Jeffrey’s auction listings do say the material is in the public domain, and he acknowledges that it is all out there on the Web for those who know where to find it. But he said some of his customers were people who might not know how to turn a downloaded file into something they could watch on a TV or play on a CD player. Some have dial-up Internet connections that would choke on a 600-megabyte compilation of technical manuals. Others don’t have the time or expertise to search for specific information....

Brewster Kahle, the digital librarian of the Internet Archive and a co-founder of the organization, said his group had no problem with people selling material from its online collection in this way....

Also see the reader comments at the end of the story, especially this one from Rick Prelinger, founder of the Prelinger Archives:

The Doylestown film is from our archives, which we support by selling stock footage. Though I’d prefer that people put out higher-quality DVDs than they generally do, and fervently wish they’d be open about where their source material came from (most of the cheap DVD vendors get hazy when describing sources), the public domain is the public domain. If you have a legally acquired copy of a public domain work, you can do with it what you please; this freedom makes possible quotation, anthologies, mashups and cultural innovation.

By the way, paying for public domain works isn’t so unusual — don’t we still pay for editions of Dickens, Mark Twain and Flaubert?

OA database of FOIA documents

GovernmentDocs is an OA database of documents released under the US Freedom of Information Act (FOIA).  (Thanks to Free Government Information.)  From the site:  GovernmentDocs was created to advance the values of open and accountable government. This site gives the public an unprecedented level of access to government documents by allowing users to browse, search, and review hundreds of thousands of pages acquired through the Freedom of Information Act (FOIA) and other public disclosure, or “sunshine,” laws.

With the system, citizen reviewers can engage in the government accountability process like never before. Registered users can review and comment on documents, adding their insights and expertise to the work of the national nonprofit organizations which are partnering on this project.

For more detail, see the November press release.

Comment.  GovernmentDocs is well-implemented.  I expected to see scanned images of the original docs, but I didn't expect to see my search terms highlighted in the images or to see a first-draft OCR of the scan on the same page.  What I like best, though, is the very idea:  sharing these docs with everyone, multiplying the benefits that flow from the trouble of requesting them in the first place.

Update. Also see GovernmentAttic, another OA collection of FOIA documents. (Thanks to Michael Ravnitzky.)

OA portal of public records databases

PublicRecordsWire is "an open system for cataloging, sharing and discovering new public records databases."  Users can search, tag, and rate the databases, browse by tags, category, and popularity, and add new databases.  (Thanks to Abbie Mulvihill.)