Open Access News

News from the open access movement


Saturday, May 31, 2008

Help Harvard implement its OA policies

Harvard's Office of Scholarly Communication (newly headed by Stuart Shieber) is looking for a program manager.  The person filling the post will play a large role in the "implementation of new initiatives relevant to open access" such as the OA mandate at the Faculty of Arts and Sciences, the OA mandate at the law school, and others still to come.

More misunderstanding of the NIH policy

Joan Leach, Hurdles for free debate charter, ScienceAlert, May 29, 2008.  Excerpt:

...[Australia's Senator Kim Carr] seems to be favouring a policy of “open access” to scientific publications for Australia [in addition to “outreach” to the lay public]....This sounds sensible but there’s a sting in the tail that is illustrated by US experience. Under direction from Congress, the US National Institutes of Health (NIH) requires all grantees to engage in “outreach” – they can’t use the money for, say, better lasers.

NIH grantees must make the papers they publish freely available to the public. The NIH runs Pub Med Central (PMC), a free digital archive of journal literature from the biomedical and life sciences. All articles in PMC are free. PMC says it “aims to fill the role of a world-class library in the digital age. It is not a journal publisher.”

Prof Mary Ganguli, a psychiatrist who researches dementia at the University of Pittsburgh, has related her experience. Despite being a world-class researcher at a world-class institution, she has been the victim of the unsupported communication mandate that tells researchers to do something without monetary support.

For Ganguli this means putting her papers in prestigious journals on PMC, but the journals dislike this intensely. Why would other researchers looking for Ganguli’s work pay the journal for an article that they can already get for free?

To counter this, the journals levy a fee on researchers. Ganguli reports: “My tier one journal charges 1500 pounds sterling; I have not seen a figure lower than US$600 at any journal yet”.

Researchers eventually charge such fees to their grants, so eventually the taxpayer will pay twice for the research. Then, there are questions about protection of copyright and intellectual property, the foundation for potential commercialisation....

Comments.  This may be the highest concentration of misunderstandings about the NIH policy I've ever seen.

  • "US National Institutes of Health (NIH) requires all grantees to engage in “outreach” – they can’t use the money for, say, better lasers."  Both assertions are untrue.  The NIH requires OA to the results of NIH-funded research, not "outreach".  It also requires grantees to spend their research funds on research, including equipment, even if it allows them, optionally, to use grant funds for publication fees at fee-based OA journals.
  • "Mary Ganguli...has been the victim of the unsupported communication mandate that tells researchers to do something without monetary support."  This seems to refer to the NIH policy requiring deposit in PubMed Central.  But the deposit is a 5-10 minute operation which doesn't require monetary support, any more than submitting a report at the end of the grant period (a much more time-consuming process) requires monetary support.  Moreover, a growing number of journals will deposit the article on the author's behalf, lifting this chore from the author's shoulders altogether.  But for grantees who don't publish in those journals, and who demand monetary support for this little chore, we should remember that they got a large research grant for their trouble. 
  • "For Ganguli this means putting her papers in prestigious journals on PMC, but the journals dislike this intensely."  Some do, some don't.  Publishers who do dislike it pretend to speak for all publishers, but that pretense is self-serving and untrue.
  • "Why would other researchers looking for Ganguli’s work pay the journal for an article that they can already get for free?"  We should be clear:  researchers looking for someone's work can find and retrieve it more easily when there's an OA copy in a repository like PubMed Central.  So Leach's objection here is about publisher revenue, not about researcher productivity.  Now if journals dislike the NIH policy enough, they can refuse to accept work by NIH-funded authors.  But so far, none have done so.  The NIH policy doesn't affect the freedom of authors to submit work to the journals of their choice and doesn't affect the willingness of those journals to publish their work.  And it hugely increases the visibility and impact of their work.  On the larger, implied objection that the NIH policy will undermine journal revenue, see my detailed response in an article from September 2007 (esp. Sections 4-10).
  • "To counter this, the journals levy a fee on researchers."  This sentence creates two false impressions:  that journals are converting to OA in response to the hated NIH policy, and that OA journals always charge author-side fees.  I don't know of a single journal that disliked the NIH policy so much that it converted to OA in response.  But journals do convert to OA all the time, for other reasons.  When they do, some charge author-side publication fees and some don't.  Leach doesn't seem to realize that OA journals use many different business models, and that charging a publication fee is just one of them.  Nor does she seem to realize that the majority of OA journals charge no fees at all or that more subscription-based journals (both by numbers and percentages) charge author-side fees than OA journals.
  • "Researchers eventually charge such fees to their grants, so eventually the taxpayer will pay twice for the research."  This sentence creates four false impressions:  that all NIH-funded authors publish in OA journals, that all OA journals charge author-side fees, that all fees are paid out of research grants, and that it is OA (rather than the absence of OA) which causes taxpayers to pay twice.  On the last of these:  When authors of articles based on publicly-funded research publish in conventional, subscription-based journals, then taxpayers must pay twice to see them --once for the research grant and once for the subscription or pay-per-view fee.  When the same authors comply with the NIH policy, taxpayers only pay once, since those taxpayers can see the authors' peer-reviewed manuscripts in PubMed Central without charge.  If some of those authors publish in fee-based OA journals and NIH pays the fee, then taxpayers are still paying just once, since the NIH does not allocate new funds to pay the publication fee, but merely allows grantees to use already-allocated grant funds for the purpose.
  • "Then, there are questions about protection of copyright and intellectual property, the foundation for potential commercialisation...."  Here Leach is replacing an objection with a vague allusion to an objection.  What copyright questions concern her, exactly?  Under the NIH policy, authors must retain the right to comply with the policy and may transfer all other rights to the publisher.  Publishers may exploit those rights, including commercial exploitation, to the fullest.
  • NB:  I haven't seen the statement by Mary Ganguli which Joan Leach summarizes here.  All my responses are directed to Leach's summary.

Friday, May 30, 2008

Another society journal converts to OA

Uwe Dulleck, Benno Torgler, and Clevo Wilson, Change of Guard for Economic Analysis and Policy, Economic Analysis and Policy, March 2008.  Excerpt:

IMPORTANT CHANGES TO THE AIMS AND SCOPE OF THE JOURNAL...

EAP runs a strict open access policy. Therefore, the journal offers authors a unique and invaluable forum to disseminate their views on emerging and controversial topics.....

[W]e offer the interested readership free and full access to the content published in EAP.

As a novelty, EAP papers will also appear in RePEC, an international public-access database that promotes scholarly communication in economics and related disciplines, covering more than 144,000 articles from leading journals in the discipline....

OPEN ACCESS TO THE JOURNAL

It is clear that authors wish that their papers be as widely disseminated as possible and readers wish to access papers as soon as possible without inconvenience, financial and non financial. An open access policy addresses these important issues while at the same time giving much needed publicity to the journal and its authors. We observe that in 2003 approximately 21,000 refereed journals were published worldwide (Getz 2004). Of these, nearly 57 percent were made available online reaching a global electronic audience. However, not all of them were free. We believe that an open-source method will help to distribute and increase the readership of EAP worldwide in a faster manner. A study published in Nature reports that there is a strong correlation between the number of times an article has been cited and the probability that the article was online. The study concludes that “to maximize impact, minimize redundancy and speed scientific progress, authors and publishers should aim to make research easy to access” (Lawrence 2001, p. 521). Thus, it is a timely policy strategy to make EAP easily accessible. The ‘current issue’ papers will appear in the home page of the journal. Others can be accessed via the journal archives.

Thanks to Christian Zimmermann on the RePEc blog for the alert and for this additional information:

Economic Analysis and Policy (EAP) is a 38 year old journal published by the Economic Society of Australia (Queensland branch) that has just adopted an open access policy. To celebrate this important step, EAP intends to publish in 2009 a special issue on the Economics of publishing, with special reference to different business models, like the commercial, university press, open access and pre-print models. Academic publishing is undergoing a profound transformation that we wish to better understand....

Update.  Klaus Graf reports by email that EAP now uses CC-BY licenses.

New OA journal of health research

The International Journal of Health Research is a new peer-reviewed OA journal from Poracom Academic Publishers.  (Thanks to Vikas Anand Saharan.)  The inaugural issue came out in March.

Poracom Academic Publishers doesn't seem to have its own web site.  If anyone has more information about it, please drop me a line.

Update.  Klaus Graf has found the web site for a Nigerian software company named Poracom.  This is clearly the Poracom behind the IJHR, although nothing at the web site refers to Poracom Academic Publishers.  Poracom has written some software (journal management software?) powering IJHR and another, older OA journal, Tropical Journal of Pharmaceutical Research, published by the Pharmacotherapy Group at the University of Benin.  Klaus has also found that IJHR charges no publication fees but requires authors to transfer copyright.  It removes price barriers, not permission barriers.  (Thanks, Klaus.)

New OA journal of research-creation

Inflexions: A Journal for Research-Creation is a new peer-reviewed OA journal sponsored by the Sense Lab.  The inaugural issue is undated but now online.

What is research-creation?  From the about page:

Inflexions...invite[s] writing and/or other forms of expression actively exploring such issues as: (inter/trans/non) disciplinarity; the emergence of new modes of collaboration; micropolitics and the life and death of institutions; creativity, subjectivity and collectivity in cultural production; the ethics of aesthetics; the aesthetic as ethics. The goal is to promote experimental practices combining research and creation in such a way as to foster symbiotic links between philosophical inquiry, technological innovation, artistic production, and social and political engagement. Of continuing concern will be how these efforts may renew and recast relations between the concrete and the abstract, perception and conception, the body and technology....

Radio interview on OA

Last week, Jesse Brown interviewed me for his radio show, Search Engine, on Canada's CBC Radio One.  It was broadcast yesterday, and the podcast is now online.  There are three stories on yesterday's show; the one with me starts at minute 13:30 and lasts about seven minutes. 

Most of the interview focuses on OA to research literature, but the blurb, the introduction, and the final question focus on open courseware.

New OA journal of hematology and oncology

The Journal of Hematology & Oncology is a new peer-reviewed OA journal from BioMed Central.  See Delong Liu's editorial in the inaugural issue:

Abstract:   Journal of Hematology & Oncology aims not to specialize, rather to broaden and provide a platform for information exchange for all studies related to blood and cancer. It aims to include, not to exclude, all studies from basic research, translational research, case reports, and clinical trials. This journal allows the authors to keep the copyright so they can freely use and disseminate their articles as they please. All articles published in this journal are also archived in PubMed, PubMed Central, and other repositories. Therefore, this journal aims not to restrict, rather to make all published articles free and open to all.

Microsoft will complete its book-scanning project with the British Library

Microsoft and the British Library have been collaborating for almost a year on a project to digitize 19th century books from the BL collection (see my past posts on this project, 1 and 2).  What will happen now that Microsoft is pulling the plug on its book-scanning operations?  The BL explains in a May 28 press release that Microsoft will carry out its contract.  Excerpt:

...The mass digitisation of 19th century literature in partnership with Microsoft is one of fifteen British Library-led digitisation initiatives, currently taking place.

As part of the 19th century book project, the British Library has now successfully digitised 40,000 out-of-copyright items from its collections....

It is our intention that the material will be made available on the Library’s catalogue after the completion of a pilot....

Approximately 75,000 pages are being scanned daily by the digitisation studios at the British Library. A further 40,000 out-of-copyright books will be scanned as agreed in the Library’s contract with Microsoft....

How bloggers can help the cause

Sukhdev Singh, What can Bloggers do for Open Access? Sukhdev in Web Land, May 29, 2008.  Excerpt:

Last Saturday I participated in a Bar Camp in Delhi. This was organized by IBNMS and was named as Blog Camp Delhi....I [gave] a presentation on my favorite topic i.e. Open Access....,Open Access: What it is and why it is required for scholarly community? ...

I don’t know how much people in blogging or the wider domain of New Media know about this strange model of academic publishing. However bloggers, once made aware of it, can help [the] open access movement in number of ways. One way is to blog on Open Access itself. There are few already there and well established like the one by Peter Suber. Some others which I know are OA Librarian; The Imaginary Journal of Poetic Economics and I will also mention the one from student community – Open Students: students for open access to research. Second way is to blog about various open access resources. Every day, number of resources including journals, repositories, open courseware etc are launched and announced. These could be evaluated, annotated and listed under well planned categories (or tags) in a blog. Links of such tags or categories automatically collate resources into listings of related posts. It could be very similar to Digital Scholarship. Third way has to do more with subject experts. Scientists and Scholars can blog on how to promote open access within their own subject domain. Open Access Anthropology: Promoting Open Access in Anthropology is beautiful example. Very similar concept has been highlighted in a presentation - Blogging Archaeology: creating an Open Access source for knowledge. Fourth way is to blog about Peer-Reviewed Research. All such blog posts can be aggregated at one place. There could be many more ways to promote Open Access through blogging.

Comment.  A very good set of ideas.  On the first item, blogging about OA itself, see the list of Blogs about OA at the Open Access Directory.  Because OAD is a wiki, you can help keep this list comprehensive and up to date.

New OA journal of psychoanalysis and critique

S (the "journal of the Jan van Eyck Circle for Lacanian Ideology Critique") is a new peer-reviewed OA journal sponsored by the Jan van Eyck Academy.  The inaugural issue is now online.

Update.  Klaus Graf reports by email that all the S articles he's checked use CC-BY-NC licenses.  (Thanks, Klaus.)


Thursday, May 29, 2008

Time is short to comment on the NIH policy

Public comments on the OA mandate at the NIH are due by 5:00 pm (Eastern Standard Time), Saturday, May 31, 2008, less than two days from now.

Submit your comments through the NIH web form.  But before you do, see some of the comments already submitted.  The pro-OA comments will give you ideas, and the anti-OA comments will show you what objections to answer and what perspective might predominate if you don't send in your own.

This time the NIH wants separate answers to four separate questions.  The web form has four separate spaces for them:

  1. Do you have recommendations for alternative implementation approaches to those already reflected in the NIH Public Access Policy?
  2. In light of the change in law that makes NIH’s public access policy mandatory, do you have recommendations for monitoring and ensuring compliance with the NIH Public Access Policy?
  3. In addition to the information already posted [here], what additional information, training or communications related to the NIH Public Access Policy would be helpful to you?
  4. Do you have other comments related to the NIH Public Access Policy?

If you're thinking that the NIH just concluded a round of public comments for its March 20 meeting, you're right.  See the comments generated by that round (and my blog post on them).  One persistent publisher objection is that the policy has not been sufficiently vetted and one purpose of the new round no doubt is to give the stakeholders one more chance to speak.  We must use it.  Publishers will.

Please submit a comment and spread the word.  Even if you have no suggestions to improve the policy, it's important to express your support.

Update (5/30/08, 1:15 pm).  I just submitted my own comment.  It's already up on the page of comments already submitted.  If you haven't submitted your own, feel free to use what you want from mine.  But for maximum impact, please customize it!  I haven't read all the comments already submitted, but I can strongly recommend the long, detailed comment submitted this morning by Heather Joseph on behalf of SPARC.  (Load the page of comments and search for "SPARC".)

Update.  Peter Murray-Rust wonders whether non-Americans may submit comments.  The answer is yes.  There are already comments online from Canada, Germany, India, and the UK.  The policy has international implications, most directly for readers outside the US, but also indirectly for authors, libraries, universities, societies, publishers, funding agencies, and governments outside the US.

One step forward, one back for UK PSI

Michael Cross, An Inspired debate on access, The Guardian, May 22, 2008. See also the background on the Free Our Data blog.

First, some very good news. Civil servants revealed last week that the British government has begun work on a system to make all the geospatial data it holds on the natural environment available for free inspection and re-use. Now the bad news. In this context, "free" means we will still have to pay to download much key data, especially if it is to be published or otherwise used commercially.

The proposed "national geoportal" would create a single point of entry on the web to data held by public bodies such as local councils, Ordnance Survey (OS), the British Geological Survey and the Environment Agency. It is being considered as Britain's contribution to a Europe-wide geospatial data infrastructure to be created by 2019 under the EU Inspire Directive. ....

Interview with Jean-Claude Bradley

Bora Zivkovic, Doing science publicly: Interview with Jean-Claude Bradley, A Blog Around The Clock, May 23, 2008.

... You are one of the pioneers of Open Notebook Science. Could you, please, explain to my readers what this is?

Open Notebook Science is simply the practice of making one's laboratory notebook completely public in as close to real time as possible. In organic chemistry this is pretty straightforward - researchers must keep a notebook where they record what they do and observe in an experiment, generally with the intent of making a specific compound. In other fields, records may be kept in different formats but the idea is that the research group doing ONS should strive to do research transparently with as little "insider information" as is reasonable. In organic chemistry this means providing access to all raw data files (spectra for example) so that another researcher can independently verify all observations and conclusions made. ...

OA archive of sports literature and research

The LA84 Foundation has launched an OA archive of "more than 300,000 pages" of sports literature and research, including "academic journals, scholarly books, popular sports magazines of the late nineteenth and early twentieth centuries, and an extensive offering of Olympic publications". (Thanks to OA Librarian.)

Oregon to re-consider its copyright policy on statutes

The State of Oregon has scheduled a hearing for June 19, 2008 to consider its policy of claiming copyright over its laws. See the page at Public.Resource.Org or the story from May 21 at Ars Technica.

See also past OAN coverage of the issue.

No AIR in DOAJ

The Annals of Improbable Research ("Research that makes people LAUGH and then THINK") converted to OA back in December 2007.  But it just ran into a barrier that other OA journals won't face:  The DOAJ decided not to index it.  From the AIR announcement:

This week we learned that open-access research is serious business.

Recently, our magazine — the Annals of Improbable Research — went “open access”. We now put all our content online free, and are gradually adding the content from past issues, too.

Librarians, ever more squeezed for funds, had been urging us to do this. And then, they said, be sure to tell the Directory of Open Access Journals.

We wrote to the DOAJ, asking to be included on their list. DOAJ’s motto is “free, full text, quality controlled scientific and scholarly journals. We aim to cover all subjects and languages.” But a DOAJ administrator wrote back, explaining:

“I do not think we will be able to include the Annals of Improbable Research, even if I am sure the magazine does make people both laugh and think.

It is not, however, scientific or scholarly in the way we expect journals in DOAJ to be, meaning making people think, only.”

Stuart Shieber reflects on his new position

In New Job, Harvard Professor Downplays the Role of “Revolutionary”, Library Journal Academic Newswire, May 29, 2008.  Excerpt:

Last week, Harvard University professor Stuart Shieber made history—he was named the first director of Harvard’s newly minted Office for Scholarly Communication (OSC). In his new role, Shieber will oversee the implementation of the university’s groundbreaking open access mandate, which he helped author, and which many suggest could have wide-ranging implications for the future of scholarly communication. “Let’s not go overboard,” Shieber says with a laugh and an audible wince when asked if he views his new role as a historic opportunity. “People like to extrapolate that [the mandate] will have a revolutionary effect. But you can’t make a policy based on that extrapolation. Sometimes there’s too much talk about momentous, revolutionary effects, it gets too far in front of what is really happening. There are lots of things going on, and there will be changes. We’re just trying to do our part.”

That sober approach should be heartening to observers concerned with getting the implementation rolling. In a conversation with the LJ Academic Newswire this week, Shieber embraced a straightforward mission “to support the efforts of the Harvard faculty to make their collective scholarly output as broadly available as possible.” It’s a big job, Shieber conceded, and one he didn’t necessarily expect to fall to him, despite his role in authoring the policy....

Among the first, and perhaps the most central of his initial tasks, will be to establish the online repository that will be the fulcrum of Harvard’s OA mandate. “In theory, it is as simple as downloading some open source software and turning it on,” he said, “in practice, many complexities come up, such as having it work well with other systems already in place at the university.” Nevertheless, work on the repository is progressing, he says, and a beta could be in place shortly. Another big part of his new role will be outreach to faculty.

Perhaps the most engaging —if still unformed— aspect of Shieber’s new job, however, will be developing how Harvard will support and work with open access journals. “The OSC pertains not just to the open access policy, it’s broader,” he explains. “The policy offers open access to the articles directly, essentially through author self-archiving. To the extent that we need to find alternative business models, it behooves Harvard to support those alternative business models.”

Shieber said he is looking at a few options to support open access journals. One is to work with the Harvard University Press (HUP), which he called an important ally. He said he has had many discussions with HUP about how the activities of the press fit with the OSC, noting that HUP will soon publish its first journal in years, the Journal of Legal Analysis —an open access, faculty edited journal. “My hope is that this will be the first of several OA journals HUP will start to run....”

More broadly, Shieber’s goal is to see OA journals exist on “equal footing” with subscription-based journals. As of now, he says, they do not, because much of the money that underwrites the services of subscription-based journals comes from libraries while the money that underwrites OA journals comes mostly from author charges. “Authors don’t get underwriting help from the library when they publish in OA journals, while they do from publishing in subscription-based journals,” he explains. To put OA and subscription journals on a “level playing field,” he suggests, “you’d want to underwrite OA journals just as you do subscription journals.”

Both Shieber, and his co-sponsor in the FAS mandate, university librarian Robert Darnton, say they are confident of a continuing, vital role for academic journals in an open access future, and note that there has been much discussion at Harvard over whether OA might induce “a blowback effect” on the stability of those journals or peer review. Both believe the evidence does not threaten either journals or peer review, but “that doesn’t mean there is no uncertainty there,” Shieber added. “You can’t just say ‘don’t worry.’ You have to look at these issues. What we do know is we couldn’t keep going the way we had been going. We know that is not sustainable.”

PS:  For background, see my post on Stuart's appointment as Director of Harvard's Office of Scholarly Communication, my post on the HUP roll-out of the Journal of Legal Analysis (both May 22, 2008), and my newsletter article on the Harvard OA mandate (March 2, 2008).

Ginsparg to investigate and build on OA at Radcliffe

Paul Ginsparg has been named a Science Fellow for 2008-2009 at the Radcliffe Institute for Advanced Study.  (Thanks to Garrett Eastman.)  From the Radcliffe announcement:

Paul Ginsparg is a professor in the physics and information science departments at Cornell University. He is well-known as the creator of the on-line system arXiv.org that distributes scientific research results. At Radcliffe, he will embark on a theoretical and experimental investigation into how researchers’ interactions change as a result of ever-growing open access. Ginsparg plans to create tools and resources for researchers to communicate more efficiently with one another.

The importance of OA for taxonomy research

Kevin Zelnio, PLoS ONE Publishes First Taxonomic Paper, The Other 95%, May 28, 2008.  Excerpt:

[PLoS ONE just published its first species description:] an excellent paper by Fisher and Smith on the ants of Malagasy region....

I will talk about the ant paper in a separate post. First, I would like to discuss further the role of open access publishing in taxonomy.

Why should one support open access publishing of taxonomic papers?

Visibility is important to the field of systematics, where the relevance is often lost amidst the taxonomic jargon. By removing the subscription barrier, taxonomists make their work accessible and noticeable to researchers all over the world. Increasingly, the need has never been greater for high quality taxonomy. The treatment of neglected tropical diseases relies on proper identification of the pathogen or parasite. Species form the fundamental unit of much of evolution and ecology. Sound knowledge of species and their attributes is basic to all other fields of biology ranging from the molecular to the metacommunity. While scientists might not agree on what a species is, there is no doubt about their importance and the necessity to identify and describe them.

The time is now for taxonomy and taxonomists to enter the digital age. New web technologies can prove effective at linking papers, potentially increasing readership and bringing disparate fields together. For instance, a paper describing a new species of pathogenic nematode can have hyperlinked keywords that summarize the findings, i.e. "Nematoda" "Genus species sp.nov." "Genus species (of host)" "Pathogenesis" "Endoparasite" "Locality Information", etc. Other articles of interest with hyperlinked keywords can be linked together for researchers to uncover. Species names themselves can be linked to the original paper, so one can find basic information about that species. This will make it easier to ground-truth simple observations about a species that can affect interpretations in other research, such as where it has been described from, variation in characteristics between sexes and sites, behavioral and diet observations and life history traits....

Should taxonomists forego traditional publishing outlets?

The better option would be for those outlets to go online and open access! If there is some success to PLoS ONE in their venture to publish papers of a taxonomic nature, hopefully it will inspire established journals to follow suit. If you believe strongly in the force of the digital age to implement positive change in science, support open access initiatives by publishing your articles there. One may posit that hybrid journals, where authors may elect to pay an additional fee to make their article accessible online for free, is a step forward in the right direction....Peter Suber notes one should proceed with caution when electing to publish in a hybrid journal for several reasons. In particular, hybrid journal options do not free up subscription money from libraries. Because it is a risk-free strategy for journals, there is not an incentive to get rid of subscription fees altogether, since most authors do not elect the free-access option. Many publishers still do not make their publishing model or data on the efficacy of the hybrid option available. This makes it difficult to police whether they are reducing subscription fees in relation to author uptake of the free-access option, where high fees are paid to offset subscription fees....

More on the Microsoft exit from book scanning

Andrea Foster, Microsoft's Book-Search Project Has a Surprise Ending, Chronicle of Higher Education, May 29, 2008.  Excerpt:

It is hard to imagine a Microsoft venture falling under the weight of a competitor. But that's the post-mortem offered by many academic librarians as they ponder the software giant's recent and sudden announcement that it is shutting down its book-digitization project. The librarians' conclusion: Google did it....

Microsoft entered the scholarly digitization arena in October 2005, 10 months after Google did, and has been playing catch-up ever since.

Microsoft was not as ambitious as Google in the volume of material it sought to digitize and was not willing to devote as much money to the endeavor, librarians say....

"Microsoft was a little slower off the mark than Google," says Anne R. Kenney, university librarian at Cornell University. Her library has supplied both Microsoft and Google with books and articles for digitization. "It would have meant an awful lot of additional investment in this area for Microsoft to be a real competitor."

Still, she and other librarians say Microsoft's retreat from book digitization is a setback for the preservation of books. Many academic libraries will have to scramble to find other sources of money to make their books available online, they add. And Google, which restricts the public institutions that can use its scans and has also been accused of copyright infringement for some of its scanning activity, may not be a satisfactory alternative....

Microsoft is still scanning parts of Cornell's collection. Between 90,000 and 100,000 books will be digitized by the time Microsoft ends the program, probably this summer, says Ms. Kenney.

An additional half-million books and journals at Cornell will be digitized under a newer agreement the university has with Google....

Ms. Kenney says Google's strategy [to include copyrighted works] is one reason Cornell decided to form a partnership with the company. Cornell has a large agricultural-life-sciences collection that is under copyright but needs to be digitized, and Microsoft would not work on copyrighted material, she said. Another reason: Microsoft focused only on English-language materials, whereas Google is digitizing works in other languages. And Microsoft focused on books, while Google works on more journals....

The University of Toronto, one Microsoft partner, is reluctant to sign on with Google. That company does not allow its scanned works to be shared freely among public libraries, and sharing with the public is one of the university's goals, says Carole R. Moore, the university's chief librarian.

Microsoft digitized about 120,000 volumes from Toronto's libraries.

"We hope to double that," she said. "We're looking for other funding sources."

Peter Brantley, executive director of the Digital Library Federation, says much of his group's membership will also be scurrying to find new partners to help pay for digitizing books....

Indeed, some librarians say Microsoft's decision could come with a silver lining, forcing a larger and more diverse group of players to get involved in digitization.

"We've got to get help from many different angles," said Brewster Kahle, who recruited Microsoft and Yahoo to support the Open Content Alliance, the nonprofit book-digitization project he leads. "Microsoft has given us a great kick-start."

New OA journal of public health

Global Health Action is a new peer-reviewed OA journal affiliated with the Centre for Global Health Research at Sweden's Umeå University and published by Co-Action.  From Stig Wall's editorial in the inaugural issue:

...Co-Action Publishing is a relatively new Open Access publisher based in Scandinavia and one of only a handful of publishing houses worldwide offering a true OA publishing model for scholarly journals. The content of a journal such as GHA begs for Open Access, and it is therefore only natural that CGH and Co-Action Publishing should team up to ensure a great impact for the journal in years to come.

All articles published in GHA will be freely accessible online immediately after they have been accepted for publication and can thereafter be linked, read, downloaded, stored, printed, used, and data-mined by anybody with a computer and access to the internet....Moreover, the Open Access model offers additional multimedia benefits such as videos, audios, links to full datasets, unlimited colour budgets and interactive features, all of which the printed medium cannot provide. Co-Action Publishing will ensure that the best web technology supports the editorial team at CGH as well as the contributing authors and thereby enhance the scholarly content of GHA....

New OA journal of communication

Stream:  Culture/Politics/Technology is a new peer-reviewed OA journal of communication published by the Communication Graduate Student Caucus at Simon Fraser University.  (Thanks to Kate Milberry.)  From Martin Laba's editorial in the inaugural issue (Spring 2008):

...In every respect, Stream embodies the principles and values of the creative commons, and the practices of current and ongoing copyleft movements. From the licence under which articles are published in Stream, to the Open Journal System software that provides the foundation of the web site, to the open-source layout software and public domain typefaces, the journal stands as an exemplar of democratic, open access publication....

Online scholarly journals have the capacity to enhance, extend and elaborate public intellectualism and help to bridge and sustain crossover between the academic and the popular. The need is ever increasing to create greater resonance with a wide range of researchers, and other audiences seeking relevant and reasonably accessible debate and analysis around urgent and vitally important issues in communication and media environments. Stream's approach to both content and production is driven by this need, and by the strongest commitment to thoroughly democratic principles of research dissemination....

OA enhances error correction

Jeffrey Young, Journals Find Fakery in Many Images Submitted to Support Research, Chronicle of Higher Education, May 29, 2008.  This article is primarily about image fraud, but I've omitted most of it in order to highlight the OA connection. 

...As computer programs make images easier than ever to manipulate, editors at a growing number of scientific publications are turning into image detectives, examining figures to test their authenticity.

And the level of tampering they find is alarming....

One new check on science images, though, is the blogosphere. As more papers are published in open-access journals, an informal group of watchdogs has emerged online.

"There's a lot of folks who in their idle moments just take a good look at some figures randomly," says John E. Dahlberg, director of the division of investigative oversight at the Office of Research Integrity [at the US Department of Health and Human Services, which includes the NIH]. "We get allegations almost weekly involving people picking up problems with figures in grant applications or papers."

Such online watchdogs were among those who first identified problems with images and other data in a cloning paper published in Science by Woo Suk Hwang, a South Korean researcher. The research was eventually found to be fraudulent, and the journal retracted the paper....


Wednesday, May 28, 2008

Four years of a Spanish OA journal

Pep Simo and Jose M. Sallan, Intangible Capital: Four years of growth as an open-access scientific publication, Intangible Capital, 4, 1 (2008) pp. 1-7.  An editorial. 

Abstract:   This issue opens the fourth volume of the Intangible Capital journal, which makes its way towards the fifth year of publication. As usual, we start this volume by evaluating the previous one and tracing new directions. Among the main contributions during the year 2007, we consider it important to highlight the following aspects: the renewal of the scientific indexation agreements, the platform change to OJS, the appointment of a new editor, new members included in the editorial board, the board of reviewers, the change towards a bilingual model, the new financing obtained and, last but not least, the work undertaken together with many scientific editors of open access Spanish journals for obtaining the positive evaluation of the CNEAI (National Commission for the Evaluation of the Research Activity) and thus, being a proof of scientific excellence.

Digitizing the Munich Hebraica Collection

Munich's Ludwig-Maximilians University and the University of Cologne are digitizing the Hebraica Collection of the Munich State Library, which includes 2,700 manuscripts from 1501 to 1933.  (Thanks to Welt Online via the Informationsplattform Open Access.)  Read the Welt Online article in the original German or in Google's English.

Glamorgan launches an IR

Wales' University of Glamorgan has launched an institutional repository.  From today's announcement:

An online tool which will allow people all over the world to access research at the click of a button has been launched by the University of Glamorgan.

Glamorgan's online Research Repository is the first in Wales to be launched as part of the Welsh Repositories Network programme, which began in April 2007.

The newly launched digital repository gives worldwide access to the University's published research and already the system has attracted some 450 visits from 30 countries across the world....

Dr Douglas Houston who manages the repository project at Glamorgan said, "Online research repositories are emerging as a worldwide phenomenon aimed at giving open Internet access to published research in order to accelerate the dissemination of information. As such, they extend the original purpose of the Internet, which was developed by Tim Berners-Lee in the early 1980s as a global information resource for researchers. The network of repositories in Universities throughout the world gives free access to research that would otherwise only be available in journals which in many cases have prohibitively high subscription rates." ...

The system is very easy to use and a research paper can be uploaded along with the bibliographical information search engines look for in as little as two minutes.

Update.  Also see the notes and photos of the launch ceremony from the Repositories Support Project.

More from the Belgian newspapers who don't want Google links to their headlines

Nate Anderson, Belgian papers demand huge fine from Google News, Ars Technica, May 28, 2008.  Excerpt:

...[T]he Belgian court ruling (PDF) against Google...last year...[held] that Google News (Google.Actualités) was a gross violator of copyrights owned by the French and German daily press in the country. The offense? Using headlines and a sentence or two from these articles for Google News. Now, the AP notes that the Belgian press group behind the case is back in court, seeking up to €49 million ($76.9 million) in damages for all those headlines.

The whole case might seem a bit absurd to readers from all the other countries where Google News is legal, but the Belgian Copiepresse trade group sees its business model at stake. Within a month of Google rolling out the service to Belgium in January 2006, the group began legal proceedings to have its materials removed. The argument was simple: the headlines and brief snippets that Google was using to highlight articles violated copyrights of daily papers in Belgium. The papers objected because they didn't want Google deep linking to their content; they wanted visitors to come to the site homepages and click around a bit, generating more page-views and keeping visitors from "bouncing" from one source to the next.

This was the basis for our own Ken Fisher's comment when the case was decided last year: "They actually resented the fact that Google News might direct readers to their content, because they feared that a search engine might do what it is designed to do: get people what they want the first time. Copiepresse's member companies would prefer that you hit their home pages and wander around aimlessly instead. No, I'm not joking."

The publishers also objected to some material remaining available in Google's cache after it had disappeared behind a paywall at the original site....

When Google lost the case, it not only pulled the stories from the News archive and stopped indexing them, but it yanked the papers from its main index. Apparently, this wasn't good for business, and the papers soon worked out a deal to get back in the index while remaining out of Google News....

[T]he Court went on to rule that Google did not qualify for the "citation and news reporting" exemptions in Belgian copyright law for a variety of reasons not particularly interesting to go into here.

So, Google lost the case and was ordered to remove articles from any publication that requested it within 24 hours of receiving such notice. Fines would be issued for any delay, but Google wasn't on the hook for massive damages. Now, however, Copiepresse wants those damages, and it wants a provisional fine of €4 million while the much larger fine is being worked out....

Comment.  See my post on the suit from February 2007.  In my newsletter the next month, thinking the case was over, I said, "The Belgian newspapers...are now vindicated and invisible."

Special issue of JIME on open education

The May issue of the Journal of Interactive Media in Education is devoted to Researching open content in education.  (Thanks to Jonathan Gray.)

Major report on author attitudes and experiences in Australia

Anthony Austin, Maree Heffernan, and Nikki David, Academic authorship, publishing agreements and open access: Survey Results, a new report from the OAK Law Project.  The report is dated April 2008 but was released today.  Excerpt:

5.1 Use of Online Repositories

Figure 19 illustrates that the majority of respondents (93%) are in favour of academics granting institutions a limited non-exclusive license to place items in a non-commercial, publicly accessible, online institutional repository.

Approximately half of the sample (53%) indicated that their university or institution promotes or facilitates Open Access. Almost half (45%) have deposited an item in an institutional or other repository to make it freely available online.  Less than one-in-four participants (22%) indicated that their institutional repository gives up-to-date information on how many times the item has been viewed or downloaded....

5.3 Elements of Open Access

Figure 20 depicts respondents’ views regarding the relevance of certain elements of Open Access.

The most relevant elements identified by respondents were that it results in a wide dissemination of knowledge (63% stating that this is ‘extremely important’; mean=4.53) and that it encourages scientific, social and cultural advancement (60% stating that this is ‘extremely important’; mean=4.43).

Over half of respondents stated that broader access to the results of publicly funded research, the distribution of information freely and without cost and the making of information available for re-use were ‘extremely important’.

Respondents thought that allowing a better understanding of how many people access their item and establishing institutional or other repositories were of lower priority (Although 58% thought that allowing a better understanding of how many people access their item was ‘very’ to ‘extremely important’ and 65% thought that establishing institutional or other repositories was ‘very’ to ‘extremely important’)....

5.5 Benefits of Open Access

Figure 21 presents respondents’ agreement with a range of statements regarding the benefits of Open Access.

The benefits that were of greatest relevance for respondents were: increased accessibility to research outputs (61% strongly agreeing; mean=4.48), easier access to material within specialized research field(s) (56% strongly agreeing; mean=4.39), and improved dissemination through broader circulation of research outputs (52% strongly agreeing; mean=4.37).

The benefits that attracted the lowest levels of agreement were: enhanced funding opportunities (17% strongly or somewhat disagreeing and 49% neutral; mean=3.29) or enhanced career advancement (14% strongly or somewhat disagreeing and 46% neutral; mean=3.40) and that it enables new forms of research (11% strongly or somewhat disagreeing and 35% neutral; mean=3.64)....

5.9 Reasons for Not Depositing into Repositories

Figure 23 demonstrates that the main reasons identified by participants for not depositing an item into an institutional or other repository were a lack of awareness regarding appropriate repositories for the depositing of items (29%) and uncertainty regarding their copyright position (17%). Only 2% of participants cited disagreeing with Open Access principles and 3% cited a preference for placing their items on their personal website as reasons for not depositing an item....

6.1 Reasons for Publishing in Open Access Journals

More than half (59%) of respondents (n= 302) have never published in an Open Access Journal. For those that have published an item in an Open Access Journal (n=207 or 41% of the sample), most indicated that they did so because they have an Open Access Journal in their disciplinary area (45%) or because they desire to promote Open Access principles and ideals (29%; see Figure 28). Thirty-five participants specified other reasons for their choice of publishing in an Open Access Journal....

6.3 Reasons for Not Publishing in Open Access Journals

Almost one-quarter (22%) of respondents indicated that they have not published in an Open Access Journal because they were either unfamiliar with the process or they have no motivation to do so or it is not adequately recognised or acknowledged for the purposes of promotion (see Figure 29)....

Comment.  This significant survey asked all the right questions.  I've caught what I think are the most important excerpts, but the report is long (129 pp.) and I'll need more time to read it with care. 

Update.  Also see Bernard Lane, Dons wary of open access, The Australian, May 28, 2008 (another in a series of misleading headlines).  Excerpt:

A new survey by the Open Access to Knowledge Project at the Queensland University of Technology also reports that more than half of academic authors are unsure whether their publishing agreements with journals allow them to put a copy of their articles in an open access repository.

"It is, I suppose, a little unsettling to see that a lot of people are saying, 'we're not really sure what we're signing and how that will affect dissemination of our research'," said Brian Fitzgerald, OAK law project leader, although the survey showed enthusiasm for open access, especially among early career academics.

Professor Fitzgerald said universities needed to give researchers more advice about how to "strategically manage their copyright" for private benefit and public good.

"They're saying, yes we see the value of open access but we still publish in the traditional area ... some (academics) would like to understand how to put the two models together," he said.

Most in the survey wanted copyright advice, including template publishing agreements, from their institutions. "As researchers we can't close our eyes to a mechanism (such as the internet) to disseminate our work, especially if it's publicly funded," Professor Fitzgerald said....

More than half of the 509 academics who took part in the survey thought it too much trouble to negotiate with publishers. "They see that as almost prejudicial to the likelihood of being published, they don't want to rock the boat," Professor Fitzgerald said.

However, 87 academics had proposed a change to a publishing agreement and in almost every case the publishers had agreed....

Professor Fitzgerald said there was a contradiction between the willingness of almost all academics to grant universities a non-exclusive licence, so that copies of their articles could be deposited in an open access archive, and the fact that 63 per cent of academics signed away their rights to traditional publishers. Yet many publishing agreements were silent on an academic's right to disseminate copies via the web or a repository.

Combining the strengths of institutional and disciplinary repositories

Ulrich Herb, Anja Kersting, and Tobias Leidinger, Vernetzung von fachlichen und institutionellen Open-Access-Repositorien, Bibliotheksdienst, 42, 5 (2008) pp. 550-555.  Self-archived May 27, 2008.  In German but with this English-language abstract:

Both Saarland University and State Library SULB (Germany) and Konstanz University Library (Germany) are running Open Access Repositories. SULB is running - among others - a disciplinary repository for the Psychologist from German-speaking countries called PsyDok, whereas Konstanz University Library is running an institutional repository called KOPS (Konstanz Online Publication Server). In order to combine the strengths of both types of repositories SULB and Konstanz University Library started a small project to implement an OAI-based harvesting routine. The metadata of psychological content in KOPS will be mirrored in PsyDok. This means that it also will be indexed by psychological databases and search engines focusing on a disciplinary repository like PsyDok but ignoring institutional repositories. Both repositories are taking benefit from this metadata-sharing: KOPS' repository managers have an additional marketing argument, PsyDok gains more content.
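
To make the harvesting idea concrete, here is a minimal sketch of the selection step: given an OAI-PMH ListRecords response in Dublin Core, keep only the records whose subject looks psychological. The abstract does not say how KOPS content is selected for PsyDok, so the keyword test on dc:subject, the function name select_records, and the sample identifiers are all assumptions made for illustration. Only the XML namespaces come from the OAI-PMH and Dublin Core standards.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH responses and Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical sample response; a real harvester would fetch this over HTTP.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Memory and attention</dc:title>
          <dc:subject>Psychology</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example:2</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Lake sediments</dc:title>
          <dc:subject>Limnology</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def select_records(list_records_xml, keyword="psycholog"):
    """Return (identifier, title) pairs for records with a matching dc:subject.

    The keyword filter is an assumed selection rule, not the project's own.
    """
    root = ET.fromstring(list_records_xml)
    selected = []
    for record in root.iterfind(".//oai:record", NS):
        subjects = [s.text or "" for s in record.iterfind(".//dc:subject", NS)]
        if any(keyword in s.lower() for s in subjects):
            identifier = record.findtext(
                "oai:header/oai:identifier", default="", namespaces=NS
            )
            title = record.findtext(".//dc:title", default="", namespaces=NS)
            selected.append((identifier, title))
    return selected


if __name__ == "__main__":
    for identifier, title in select_records(SAMPLE):
        print(identifier, title)
```

The stem "psycholog" is chosen so that both English "Psychology" and German "Psychologie" match. A production routine would more likely rely on OAI sets or a classification scheme than on free-text subjects.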

Three health librarians on OA

Dean Giustini interviewed three Canadian health librarians on the subject of OA:  Lindsay Glynn, Lorie Kloda and Denise Koufogiannakis.  (Thanks to Heather Morrison.)  Excerpt:

1. GS: What can librarians do to get involved in OA? What are the benefits?

Lindsay:  "Try to develop OA information pages at your library to improve general awareness of OA issues; increase the awareness of OA and competence among teaching staff, increase researcher knowledge about OA issues and increase amount of records in local and national repositories."

Denise:  "Show your solidarity with OA by featuring various open access journals on your library websites; do displays; have handouts ready; do presentations to the public, etc." (For more information, see Denise's powerpoint presentation.)

Lorie:  "Deposit any papers you have done for conferences into E-LIS so that other librarians can access the professional literature."

Denise offered these perspectives, also:

"Open access is a choice as to how you communicate with your peers. Why choose to limit communication when it is easy to make communication open? At EBLIP we are trying to bridge the gap between research and practice - if we do not make the content widely available to all, then we are not going to achieve that goal.

"The second is the librarian's role in supporting faculty/researchers and new methods of scholarly communication. This is an important role for librarians. Taking a leadership role to inform and support faculty re: open access enables us to be part of a changing system. For me this is simply about moving towards more equitable access to scholarly information."

"A good example for me has been the Canadian Journal of Sociology whose editor is at U of A. Over the past year or so Pam Ryan, Leah Vanderjagt and I worked with this editor to move his journal from a traditional print subscription model to an OA one. Without having the credibility of being editors ourselves and knowing the system/what was needed to make things work, it would have been much more difficult to facilitate the process. See Kevin's article about his process of moving to OA."  ...

Civil Rights Digital Library

The University of Georgia has launched the OA Civil Rights Digital Library.  (Thanks to Josh Fischman.)

Evaluating the new media work of new media scholars

Andrea Foster, New-Media Scholars' Place in 'the Pool' Could Lead to Tenure, Chronicle of Higher Education, May 30, 2008.  Excerpt:

Re:Poste is one of 600 creative works...by new-media students and faculty members...described in the Pool, which also contains about 2,000 reviews of those works. Starting in June, the Pool will have a much wider reach, as people in general will be invited to add material to the site, rate others' projects, build on their ideas, and find collaborators for their own projects.

The Pool, as yet little known, could provide a new avenue for new-media scholars to do their jobs. Eventually it could play a role in their tenure and promotion as well....

Here's how the Pool works:

Titles of new-media projects are plotted on a two-dimensional graph. People log in and post the reviews of projects, rating their appearance, function, and concept on a scale from 1 to 10. As works garner more reviews, they move from left to right on the graph. If reviews become more positive, the works move toward the top.

Accordingly, the most highly regarded and widely reviewed works migrate to the upper right corner of the graph.

The program calculates the ratings and takes into account the credibility of the reviewers....
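
The placement rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the article says the program takes reviewer credibility into account but not how, so the credibility-weighted mean below, the Review class, and the plot_position function are hypothetical and are not the Pool's actual code.

```python
from dataclasses import dataclass


@dataclass
class Review:
    appearance: int  # each criterion is rated on a scale from 1 to 10
    function: int
    concept: int
    credibility: float = 1.0  # hypothetical reviewer weight


def plot_position(reviews):
    """Return (x, y) for a work: x is the review count, y the weighted mean score."""
    if not reviews:
        return (0, 0.0)
    total_weight = sum(r.credibility for r in reviews)
    if total_weight == 0:
        return (len(reviews), 0.0)
    # Average the three criteria per review, then weight by reviewer credibility.
    weighted = sum(
        r.credibility * (r.appearance + r.function + r.concept) / 3
        for r in reviews
    )
    return (len(reviews), weighted / total_weight)


if __name__ == "__main__":
    reviews = [Review(9, 9, 9, credibility=2.0), Review(3, 3, 3, credibility=1.0)]
    print(plot_position(reviews))
```

Under this assumed rule, each new review moves a work one step to the right, and a more credible reviewer pulls its height further toward that reviewer's score.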

The Pool also allows visitors to bore deep into a project via hyperlinks, in many cases viewing its evolution from conception to finish. They can see its creator or creators and read how others rated the project. They can see the works that inspired it and the works it inspired. Basic information about a project is posted by the developers....

Even if the Pool won't be used for decisions on tenure and promotion, says [Jon Ippolito, associate professor of new media at the University of Maine at Orono and co-creator of The Pool], it will encourage collaboration among scholars.  "Instead of people toiling away at their own lab bench or scholarly archive," he says, "people begin to share ideas and work from each other."

One feature of the Pool allows users to view scholarly connections schematically....

[Richard J. Rinehart, digital-media director and adjunct curator at the University of California at Berkeley Art Museum] says he is considering using the Pool to develop an open-source museum of digital art....

The Pool is one of two projects to promote scholarly collaboration that Mr. Ippolito has created with colleagues at Still Water, a research arm of Maine's new-media department.

His other project, ThoughtMesh, was created with Craig Dietrich, a new-media researcher and artist who just earned a master's degree in "intermedia" at the University of Iowa.

ThoughtMesh is a Web site that tags open-access scholarly papers with key words. Visitors can jump to passages in papers that contain those words. And they can see others' papers, throughout academe, tagged with the same words. A "cloud" of tagged words hovers above each paper....

PS:  For background, see my post on The Pool from December 2003, and my post on ThoughtMesh from October 2007.

Combining OA, wikis, community annotation, semantic processing, and text mining

Barend Mons and 22 co-authors, Calling on a Million Minds for Community Annotation in WikiProteins, Genome Biology, May 28, 2008.

Abstract:   WikiProteins enables Community Annotation in an Open Access, Wiki-based system. Extracts of major data sources have been fused into an editable environment with a link out to the original sources. Data from Community edits take place on automatic copies of the original data. Semantic technology captures concepts co-occurring in one sentence and thus potential factual statements. The concepts are selected from authoritative ontologies or databases. In addition, indirect associations via concept profile matching have been calculated. We here call on a 'million minds' to annotate a 'million concepts' and collect new facts from full text literature with the immediate reward of collaborative knowledge discovery.

I've omitted the links from the abstract because they presuppose a technology I don't have on my blog, apparently the technology described in the article.  To see it in action, surf over to the article itself.  Keywords are highlighted in different colors:  blue for anatomy, yellow for genes and molecular sequences, green for living beings, and so on.  (Hover your mouse over a colored keyword to see its category.)  Clicking a keyword pops up a small window with a user-editable definition.  The window also offers the options to run a search on the term or to look up its entry in WikiProfessional or its "knowlet" in the Concept Web.  Unfortunately, users don't have the option to open the WikiPro or Concept Web entries in a new window, forcing us to leave the article we're trying to read.  My copy of Windows XP wanted to run Microsoft's MSXML 5.0 in order to read the article, and I refused, so I may be missing some of its functionality.

From the Rationale and overview section of the paper (again without links):

This paper aims to explain an experimental system for Community Annotation and collaborative knowledge discovery called WikiProteins. The exploding number of papers abstracted in PubMed has prompted many attempts to capture information automatically from the literature and from primary data into computer readable, unambiguous format. When done manually and by dedicated experts, this process is frequently referred to as curation. The automated computational approach is broadly referred to as text mining....We propose here that a combination of text mining and subsequent community annotation of relationships between concepts in a collaborative environment is the way forward. The future outlook to integrate data mining (for instance gene co-expression data) with literature mining, as formulated in the review by Jensen et al, is at the core of what we aim for at the text mining/data mining interface. To support the capturing of qualitative as well as quantitative data of different nature into a light, flexible, and dynamic ontology format we developed a software component called Knowlets. The Knowlets combine multiple attributes and values for relationships between concepts. Scientific publications contain many re-iterations of factual statements. The Knowlet records relationships between two concepts only once....This approach results in a minimal growth of the Concept Space as compared to the text space....PubMed grew beyond 14,000,000 abstracts in 2006 (by the end of 2007 the 17,000,000 mark was passed). In 2006, UMLS contained well over 1,300,000 concepts. Only 185,262 concepts from UMLS were actually mentioned in PubMed (2006 version) and therefore the concept space of the entire PubMed corpus could be captured in just over 185,000 Knowlets. The first section of this article describes the WikiProteins application and rationale in general terms. The second section describes three user scenarios enabled by the current status of the Knowlet-based Wiki system. In the third section (provided as supplementary data) a more detailed technical description of the system is given.
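
The idea of recording a relationship between two concepts only once, however often it is restated, can be illustrated with a short sketch. This is not the Knowlet implementation, which the excerpt does not describe in detail. The function build_concept_pairs, the plain substring matching, and the sample sentences are assumptions for illustration, and a real system would draw concepts from ontologies such as UMLS.

```python
from itertools import combinations


def build_concept_pairs(sentences, concepts):
    """Map each unordered concept pair to the number of sentences where both occur.

    Each pair is stored once as a key; repeated statements only raise its count.
    """
    pairs = {}
    for sentence in sentences:
        lowered = sentence.lower()
        # Naive matching: a concept counts as present if its name is a substring.
        found = sorted(c for c in concepts if c.lower() in lowered)
        for a, b in combinations(found, 2):
            pairs[(a, b)] = pairs.get((a, b), 0) + 1
    return pairs


if __name__ == "__main__":
    sentences = [
        "BRCA1 is linked to breast cancer.",
        "Mutations in BRCA1 raise breast cancer risk.",
        "p53 is a tumour suppressor.",
    ]
    concepts = {"BRCA1", "breast cancer", "p53"}
    print(build_concept_pairs(sentences, concepts))
```

Because the table is keyed by concept pair rather than by sentence, it grows with the number of distinct relationships and not with the volume of text, which is the point the excerpt makes about the Concept Space growing slowly relative to the text space.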

From today's press release:

Today sees the launch of a new collaborative website initially focusing on proteins and their role in biology and medicine. The WikiProfessional technology underlying the site has been developed based upon the collaborative Wikipedia approach. Described in BioMed Central’s open access journal Genome Biology, WikiProteins provides a method for community annotation on a huge scale.

The article is written by Barend Mons of the Erasmus Medical Center in Rotterdam, and the Leiden University Medical Center...and his co-authors...include Amos Bairoch of UniProt, Michael Ashburner of GO and Jimmy Wales, the co-founder of Wikipedia.

The source material for WikiProteins comes from a mixture of existing authoritative databases (such as the Unified Medical Language System, UniProtKB/Swiss-Prot, IntAct and GO), supplemented by concepts mined from scientific papers published in public literature databases. The automated data mining identifies ‘facts’ in these available resources, such as protein functions or protein-disease relationships. This process created over one million biomedical concept clouds – called ‘Knowlets’ – around each individual concept. The developers of the site now hope that many researchers will follow their call to annotate, via WikiProteins, the Knowlets for which they are leading experts. The method enables researchers to add data even from sources that are not openly available, such as from journals only accessible via publishers’ databases, immensely enhancing the potential for comprehensive coverage. Each page of text called up via the system is automatically indexed and concepts are connected to the WikiSpace, so that their definition comes up and the information can be edited directly from the page.

The resulting data in the Wiki is fully and freely accessible to the public, and entries can be annotated by any registered user. Mons said: “We here call on a million minds to annotate a million concepts and collect new facts from full-text literature with the immediate reward of collaborative knowledge discovery and recognition of Wiki-contributions to the scientific community.”

PS:  For background, see our earlier posts on WikiProteins, WikiProfessional, Knowlets, and Knewco (the company behind both WikiProfessional and Knowlets).

Update.  Also see Jan Velterop's post on WikiProfessional.  I'm new to the technology, but Jan is the CEO of KnewCo, the company behind it.  Excerpt:

...The idea is that the combined efforts of a ‘million minds’ would be able, in a collaborative intelligence exercise, to refine a system that 'distills' the essence of established knowledge as well as points to new knowledge that has a high likelihood of being established soon....

The concept (so to speak) is so far optimized for the life sciences and medicine, but there is no reason why it shouldn’t work in other areas as well. And in languages other than English. It is based on concepts, and those are of course valid in any language. It’s just the words or descriptions used for them are different....

Just imagine what that means. One of the beauties of the concept approach (as opposed to the keyword approach) is that search terms in one language could, for instance, yield search results in another. Think of Chinese researchers searching with Chinese terms for English literature (they can read English, but may find it more difficult to come up with search terms in English, in the same way that I find it sometimes easier to search with Dutch terms), yet getting served up with English search results. Things like that. Wonderful....

Update. See Euan Adie's critical comments on WikiProteins and Barend Mons' response.


Tuesday, May 27, 2008

Two preprint series hosted by Utrecht repository

The Logic Group Preprint Series and Artificial Intelligence Preprint Series have found a new home at Igitur, the institutional repository at the University of Utrecht.  From the announcement:

Two Philosophy preprint series, Artificial Intelligence Preprint Series (AIPS) and Logic Group Preprint Series (LGPS), are now digitally available via the Igitur Archive. For this project, all of the old issues of the series were scanned so that more than 300 preprints could be made freely available in digital form.

The LGPS regularly publishes articles and studies from researchers and teachers from the Logic Group of the Philosophy department. The Artificial Intelligence Preprint Series publishes articles and studies from researchers and teachers from the Cognitive Artificial Intelligence Group.

Another positive review of the E-LIS repository

Péter Jacsó, BOSS, E-LIS, and Haworth Press, Online Magazine, May/June 2008 (accessible only to subscribers).  An OA copy of the part on E-LIS has been posted to the AmSci OA Forum.  Excerpt:

The E-LIS database is close to my heart because it delivers for free what the publisher I pan later often does not deliver even for a fee --timely information about research in library and information science and technology. As of early 2008 it had 7,200 papers from about 700 journals.

As an open archive, it covers all fields of LIS from the theoretical to the highly practical, from school libraries to national libraries, from rare books to ebooks, all reflected in the excellent classified subject index. Sure, you can argue if there is a need for separate classes for use studies and for user studies, but the display of the number of postings for each class -to my delight- offers a rather convincing argument without saying a word.

Its coverage is highly international both in terms of the language and country of residence of the authors....

The strongest part of the software is its browsing feature, and E-LIS stands out with the number of indexes that can not only be searched but also browsed by subject, country, journal name, book name, author/editor name, and publication year. The only significant limitation is that there is no option to do exact phrase searching and differentiate between, say, information industry and industry information. For full-text databases this is important....

E-LIS is one of the many repositories that use the free and intuitive EPrint software, which is the brainchild of Stevan Harnad, the key figure of the open access movement. Harnad possesses an admirable combination of mental prowess and hyperactivity as the "archivengalist" of self-archiving who talks the talk and walks the walk tirelessly....

E-LIS is small, but it is growing. Its size doubled in the past 2 years; it could have tripled if we, the authors, would not just feel good after getting our papers published but would go that extra inch and legally deposit them in E-LIS, DLIST, and the slowly emerging institutional repositories, a process that takes just a few minutes.

PS:  Also see Jacsó's review of E-LIS from May 2007.

More on open-source drug discovery

Shambhu Ghatak has written a report on the Workshop on Knowledge Commons (New Delhi, January 18, 2008).  (Thanks to Subbiah Arunachalam.)  Excerpt:

On January 18th, 2008, Knowledge Commons, Delhi Science Forum, IIT Delhi, Red Hat and Sun organised a workshop on science policy for a very select group of 20 policy-makers....

The objective was to look at the Free and Open Source model of knowledge creation and examine the impact it can have on India. The highlight of the event was the session on Open Source Drug Discovery, a $34 million programme to fight diseases like tuberculosis, that are prevalent in India....

Milestone for QUT repository

QUT ePrints, the institutional repository at Queensland U of Technology, has passed the milestone of 10,000 items on deposit.

Update to biblio on Google Book Search

Charles Bailey has released version 2 of his Google Book Search Bibliography.  From his description:

This bibliography presents selected English-language articles and other works that are useful in understanding Google Book Search. It primarily focuses on the evolution of Google Book Search and the legal, library, and social issues associated with it. Where possible, links are provided to works that are freely available on the Internet, including e-prints in disciplinary archives and institutional repositories. Note that e-prints and published articles may not be identical.

OA and authors' rights

Heather Morrison, Open Access, Authors' Rights and the Commons, a presentation at the Canadian Library Association Preconference 2008: Copyright 0.9, Vancouver, British Columbia (Canada), 2008. 

Abstract:   Open Access (OA) is beginning to open up interesting conversation about scholarship and copyright. There are already more than 3,300 fully open access, peer-reviewed scholarly journals listed in DOAJ, many millions of items available in open access archives. Research funders, universities and faculty themselves are requiring OA. A traditional copyright transfer agreement in which all rights are assigned by the author to the publisher, does not make sense in this environment. Most publishers are modifying how they work with authors. One approach is a more liberal copyright policy, which leaves some rights with the author. Some publishers use a license to publish approach, leaving copyright with the author and clarifying rights to publish.  Many authors are negotiating copyright, whether individually or through the use of Authors' Addenda. Some publishers and authors are using Creative Commons licenses.

New OA journal of massage therapy

The International Journal of Therapeutic Massage & Bodywork: Research, Education, & Practice is a new peer-reviewed OA journal from the Massage Therapy Foundation.  The journal has issued a call for papers, and expects to publish the first issue in August 2008.

Presentations from Fourth Nordic Conference

The presentations from the Fourth Nordic Conference on Scholarly Communication (Lund, April 21-23, 2008) are now online. (Thanks to the INIST Libre Accès blog.)


Monday, May 26, 2008

More on free/open and text/data

Stevan Harnad, OA Primer for the Perplexed, Open Access Archivangelism, May 25, 2008.  From the summary:

OA1 is Free Access and OA2 is Licensed Re-Use. Green OA self-archiving by authors, mandated by their universities or funders, can in principle provide OA1 or OA2, for either articles or data or both. However, it would be difficult, resisted by many authors, and probably unjust for universities to mandate Green OA1 for data or to mandate Green OA2 for either articles or data. (Funders are in a position to mandate more.)

Researchers may not want to make their data either freely accessible/useable or re-usable, and they may not want to make their articles freely re-useable. However, all researchers, without exception, want their articles freely accessible/usable (OA1).

This is the reason Green OA1 mandates are the highest priority. Authors all want Green OA1 and they report that they will comply, willingly (see Swan studies) and actually do comply (see Sale studies) with Green OA1 mandates from their universities and funders to self-archive their articles.

Moreover, OA1 for articles prepares the way and is likely to lead to OA1 and OA2 for data, as well as to some OA2 for articles.

That is why Green OA1 self-archiving and Green OA1 self-archiving mandates should be assigned priority....

Comment.  I agree with nearly all of this.  But I want to note two exceptions:

  • There are a couple of ways to interpret the claim that OA1 (gratis OA, weak OA, "free access") is a higher priority than OA2 (libre OA, strong OA, "licensed reuse").  If it means that we ought to give all our energy to the first, and succeed in attaining it, before lifting a finger for the second, then I can't agree.  However, if it means that we shouldn't delay progress on the first while we work on the second, then I agree and have often said so myself.  But I'd put the point this way:  We should work for both at once.  When we find ourselves in circumstances when the first is attainable but the second is not, which happens often, then we should accept the first, celebrate our victory, and keep working for the second.
  • I agree that, today, it would be politically difficult to adopt a green OA mandate which applied to data files or which used the stronger species of OA (OA2, libre OA).  But I don't agree that it would be unjust.  Stevan is right to predict resistance to such policies, today, but that resistance would be an artifact of historical conditions and customs (a topic for another day!), and these conditions and customs are changing even now.  I not only expect that such policies will be widely welcomed one day, but I'm working for that day.
  • BTW, there are already green OA data mandates at CIHR and ERC, and calls for them from the Chinese Academy of Sciences and the Organisation for Economic Co-operation and Development.

PKP releases OCS version 2.1

OA journals as social networks

Jon McGlone, Open Access Journals, a paper for a graduate seminar published online with CommentPress, May 8, 2008. 

From Chapter 8, Towards the Revolution: Open Access Journals as Social Networks:

In the world of scholarly journal publishing, the Public Library of Science, a non-profit publisher of several peer-reviewed open access journals, is already beginning to establish [a social networking] environment for scholars in the scientific community. On the social network level, users can create free accounts that include professional profiles, including areas of interest and research, school affiliation, and other professional details. Once a profile is created, the user is then able to “contribute” responses to articles. These usually take the form of advanced commenting, where a respondent can compose a reply that challenges or supports a statement in the article, always using citations in support of their arguments. Responses are moderated, and usually appear within a week of submission. Such types of responses mirror those conversations often found in the back pages of journals that are a type of post-publication review. Yet, as part of the digital environment, benefit from the cumulative participation of others and are not limited to two opposing voices.

Some of the journals at the Public Library of Science (PLoS) increase the user’s ability to comment on articles and contribute to the post-publication review of an article. In their publication on genetics, post-publication review manifests itself in three separate categories: notes, comments and ratings....

The implications of social network technology is yet to be fully harnessed by PLoS, but it appears that an initial framework of user accounts and profiles, rating and commenting is available. It would be interesting to see open access journal publishers begin to merge the social technologies of Facebook with the publishing and post-print review technology of PLoS, allowing users to connect with one another, and represent their real-world connections in the virtual environment and connect to relevant peer-reviewed scholarly material. Creating these types of services for scholars can add tremendous value to open access journals, and make them viable competitors to for-profit journals that are publishing only in the digital environment....

From the conclusion:

While it does not appear that open access journals will overturn the for-profit publishing industry, open access journal publishers do stand a chance to compete in the digital environment by harnessing the talents of their greatest ally and supporter, the library. In the past, the research library emerged in a brick and mortar sense to provide a knowledge commons for scholars, housing and preserving print journals. Today, similar type of action needs to be taken by libraries to ensure the growth of a new type of knowledge commons. This paper has looked at the rise of journals in the scholarly environment, their privatization, and how open access can help restore the open system that benefitted scholars of a distant past. It also has discussed the links between human modes of producing knowledge, and how revolutions to production of knowledge can stand to change humans. In today’s digital environment, where social networks increasingly become an integral part of life for many, such an application to scholarly communication seems rather necessary and increasingly fitting. Yet it is only through the continued support, advocacy and technical research into open access publishing, coupled with the conceptual thinking of scholars like Harnad that scholarly skywriting—or the fourth revolution of the production of knowledge—can truly be actualized.

Updated ranking of world repositories

Webometrics has updated its ranking of world repositories.  From the site:

Our schedule is to publish an updated version of the Ranking Web of World Repositories two times per year (January and July) as we already do with the other Rankings produced by our Cybermetrics Lab. In the meantime we want to offer a second and last beta version that collects some of the comments and advices we have received. The major changes are as follows:

- New repositories have been added, like PubMedCentral that was inadvertently missed from the previous edition. Other repositories not focused on research papers have been excluded.

- Rich files ranking is now including only Adobe Acrobat pdf files, as the numbers involved for other formats are very low for ranking purposes. For the same reason, only data extracted from Google and Yahoo are considered for this indicator.

- Scholar ranking is build now on the mean between the total number of items and those published between 2001 and 2008. This is to increase the weight of the "fresh papers" deposited.

Proposal to webmasters for 2009 edition

Usage data could be very useful for the Ranking Web of World Repositories as it has been suggested by most of the academic community. It is difficult to obtain data from every repository and in fact most of the figures are not comparable because there is no standard yet about statistics collection. We suggest using a common, simple and free system to solve, at least partially, this problem. We ask to webmasters to install Google Analytics at their websites. We will then be able to kindly ask you for the statistics collected during the period from July 1st -December 31st so this data can be used for the January 2009 edition of the Ranking.

In the latest ranking, the top five repositories in descending order are arXiv, SSRN, RePEc, E-LIS, and Citeseer.  Note that the ranking is a weighted average of four criteria, not a simple ordering by size.
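To see why a weighted average of several criteria need not match an ordering by size, here is a minimal sketch. It rests on stated assumptions: the excerpt names only the "Rich files" and "Scholar" indicators, so the other two criteria appear as placeholders (`criterion_3`, `criterion_4`); the equal weights, the 0-to-1 scores, and the repository names are invented for illustration and are not Webometrics data or the Lab's actual formula. The `scholar_indicator` function follows the excerpt's description of a mean between the total number of items and those published between 2001 and 2008.

```python
def scholar_indicator(total_items, recent_items):
    """Mean of the total item count and the count of recent items."""
    return (total_items + recent_items) / 2


def weighted_score(indicators, weights):
    """Weighted average of indicator scores; weights need not sum to 1."""
    if set(indicators) != set(weights):
        raise ValueError("indicators and weights must use the same names")
    total_weight = sum(weights.values())
    return sum(indicators[name] * weights[name] for name in indicators) / total_weight


# Hypothetical equal weights; the real weighting is not given in the excerpt.
weights = {"rich_files": 1, "scholar": 1, "criterion_3": 1, "criterion_4": 1}

# Hypothetical normalized scores. repo_a holds far more files than repo_b.
repositories = {
    "repo_a": {"rich_files": 0.9, "scholar": 0.2, "criterion_3": 0.3, "criterion_4": 0.2},
    "repo_b": {"rich_files": 0.5, "scholar": 0.6, "criterion_3": 0.7, "criterion_4": 0.6},
}

ranked = sorted(
    repositories,
    key=lambda name: weighted_score(repositories[name], weights),
    reverse=True,
)
print(ranked)
print(scholar_indicator(1000, 400))
```

With these made-up numbers, repo_b outranks repo_a even though repo_a scores higher on the file-count criterion, because the combined score rewards balance across all four.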

OAN is six

Today is the sixth birthday of Open Access News.  This morning it had 13,921 posts, and should pass 14,000 in early June.


Sunday, May 25, 2008

Report on the APE 2008 meeting

Svenja Hagenhoff and Chris Armbruster have written a report on the APE 2008 conference, Academic Publishing in Europe (Berlin, January 21-23, 2008).  (Thanks to Arnoud de Kemp.)  Excerpt:

In the Opening Keynote, Prof. Dr. Rolf-Dieter Heuer (Research Director, DESY Hamburg, Director-General elect, CERN, Geneva) stressed the traditional importance of preprints in High-Energy Physics since the 1960s, with online circulation beginning in 1991. In a community in which the authors are the readers and vice versa, repositories have become the vehicle of scholarly communication as researchers need full access to text, data and all kinds of ancillary objects (e.g. conference slides). Journals serve as evaluation agencies and keepers of the record. CERN and the Helmholtz Alliance have committed themselves to establish open access as the publishing solution for HEP by redirecting subscription money to pay for publishing. The Sponsoring Consortium for Open Access Publishing in Particle Physics (SCOAP3) estimates that EUR10M is needed annually to fund the publishing of about 5,000 articles. Nearly half the sum for SCOAP3 has already been pledged by major European players and efforts are underway in North America and East Asia. Prof. Heuer clarified that he sees HEP OA publishing as an ideal test-bed for scientific OA publishing more generally - in order to get the costs for peer review and publication controlled in the long run.

In the second keynote Dr. Arne Richter (Executive Secretary, European Geosciences Union) gave a visionary presentation of the future confluence of the internet and open access. Any scientific community may organise itself to publish the best journal in the field, strive for the highest impact factor and comprehensively enable re-use by adopting a Creative Commons Attribution License. Rent-seeking publishers would be unable to stop this trend because of the complementary nature of open access and the internet, which favours open content that may be searched, mined, downloaded, re-used and so on. Moreover, digital publishing technology and software has advanced to the point at which much of the publishing process may be automated, enabling a business model based largely on service charges for authors in need of support with preparing an article for publication....

Dr. Ulrich Pöschl (Max Planck Institute for Chemistry, Mainz) demonstrated how open access journals may reinforce their mission and standing by adopting a collaborative peer review process by having public peer review and an interactive discussion (e.g. the journal Atmospheric Chemistry and Physics)....

Dr. Birgit Schmidt (Göttingen University Press) emphasised that GUP was pro-actively pursuing an open access publishing strategy [for books], relying on a repository and connecting to DRIVER – the Digital Repository Infrastructure Vision for European Research. Up to 40% of GUP publications are in STM....

The closing panel Information in Science and Society was chaired by Arnoud de Kemp (Electronic Publishing Working Group in Börsenverein). The panel consisted of Barbara Casalini (Managing Partner, Casalini Libri, Fiesole), Gary Coker (Director of R&D, MetaPress, Birmingham (USA)), Dr. Annette Holtkamp (Scientific Information Specialist, DESY, Hamburg), Dr. Elisabeth A.L. Mol (Editorial Director, Springer Science+Business Media, Dordrecht), Prof. Dr. Rudi Schmiede (Darmstadt University of Technology) and Dr. Ing. Herman P. Spruijt (Vice-President, International Publishers Association, Geneva). Firstly, panellists gave their impressions, noting the presence of the Humanities alongside STM, voicing the conviction that OA was here to stay, encouraging further dialogue between the proponents of subscription-based and open access business models and highlighting that preservation costs are gaining more attention....Finally, a desire was expressed to investigate the consequences of top-level green mandates by funders such as NIH and the ERC....

Green OA and open data

On Thursday, Peter Murray-Rust posted some thoughts on open data in chemistry.  When I blogged them, I added this comment:

I follow and agree with all of this, with one exception:  [PMR said,] "Green Open Access is irrelevant to Open Data (I think it makes it harder, others disagree)."  I don't understand the claim or the argument, but I imagine we'll hear more in time....

Today Peter responded to my comment in a blog post.  (Thanks, Peter.)  Excerpt:

Green Open Access describes a process - primarily of an author self-archiving her “paper” to an Institutional repository or their own web page. There are mechanisms for indexing repositories....

Green Open Access results in the full-text (versions may vary) of a paper being publicly visible, indefinitely, without price barriers. There are no default permissions - Green does not per se remove any permission barriers. In particular GOA does not actively support the extraction of data (of course an author may be permitted by some publishers to allow data extraction)....

GreenOA does not, in general, say anything about copyright or licences. The paper may or may not carry a publisher’s copyright, an author’s copyright and (frequently) none. There is almost never a formal licence. There is almost always no formal statement of policy for re-use....

There is no explicit mention in the GreenOA upload model for items other than the “full-text”. The repositories may provide such support but - at least in the early days - the focus was completely on full-text only....

I hope we can all agree on these and I’ll start making my argument here....

So by default GreenOA items are designed to be human-visible but without any support for Data, in any of upload, legal access and technical access. The primary goal of Stevan Harnad - expressed frequently to me and others - is that we should strive for 100% GOA compliance and that discussions on Open Data, licences and other matters are a distraction and are harmful to the GOA process. I suspect that many other do not take such a strong position. However if Open Data is irrelevant or inimical to GOA then it is hard to see GOA as supportive of Open Data.

However my main argument is that lack of support for Open Data in GOA is potentially harmful to the Open Data movement. Let’s assume that Stevan’s approach succeeds and we get 100% of papers in repositories through University mandates, funders et. al. (I’ll exclude chemistry from the argument). GOA will encourage the deposition of full-text only.

So a GreenOA paper may often be a cut-down, impoverished, version of what is available - for a price - on the publishers website. It may, and usually will, lack the supporting information (supplemental data). It will probably not reproduce any permissions that the publisher actually allows. So - if we concern ourselves with matters other than human eyeballs and fulltext - it is almost certainly a poorer resource than the one on the publisher site....

So my major concern is that GreenOA will lead to substandard processes for publishing scientific data. I’d be happy to find Repositories that insist on data upload. I doubt they are common.

So here is a challenge to the community: How many instances are there of crystallographic data (CIF) self-archived with GreenOA papers. It’s allowed to archive the data. There are enough publishers (Wiley, Elsevier, Springer) who allow GreenOA. If no-one can find examples then again I would justify the use of “irrelevant”....

Many funders (Wellcome, and we heard from Robert Kiley 8 other major UK medical funders) require ultra-strong-OA for their archival. Because they care about data. And several publishers (PLoS, BMC) also insist on CC-BY. This is, of course, great for scientific data. But it’s a long way from GreenOA.

Comments 

  • First, I generally agree with PMR's opening characterization of green OA.  I'd only add that we should distinguish green OA itself from the strategy proposal (which I do not endorse) to slow down on the pursuit of open data until we succeed with open texts.  As usual, I think we should proceed on all fronts at once.  I generally agree as well with PMR's understanding of the state of open data in OA repositories.  But in describing this state, I'd put the accent in a different place. 
  • It's true that most OA repositories today are optimized for texts and not optimized for data.  It's also true that few institutions (universities, funders, publishers) encourage or require the deposit of data files in repositories.  Finally, it's true that most OA repositories will accept data files, even if few researchers are depositing data files.  With this background, my response reduces to two quick points:
    1. First, it doesn't follow that green OA is "irrelevant" for open data, merely that we are under-using the opportunities it provides for open data.  We shouldn't confuse researcher practices or institutional policies with repository capacities or green OA.  If under-using an opportunity made it irrelevant, then conservation would be irrelevant to climate change and green OA would be irrelevant even to text files. 
    2. Second, we have a long way to go to make most repositories as useful for data files as they are for text files.  But it doesn't follow that green OA is irrelevant or harmful for open data, merely that its capacity to help users do useful work with OA data files must continue evolving.
  • There are many projects trying, in many different ways, to make green OA even more relevant and useful for data than it is now, e.g. by increasing data deposits in repositories and allowing fuller use of data already on deposit.  For example, see ASSDA (from ANU), CESSDA (from NSD), Commons of Geographic Data (from the U of Maine), DANS (from the Royal Netherlands Academy of Arts and Sciences), LEAP (from AHDS), LinkingOpenData (from W3C), Pangaea (from a coalition of German research institutions), and StORe (from JISC).

Introduction to the Health Commons

John Wilbanks, Executive Director of Science Commons, has made a 6.5 minute video on his vision for a Health Commons.  I recommend it as a succinct overview of the obstacles slowing down the development of new cures and the solution he's proposing.

For more detail, see the just-released white paper he co-authored with Marty Tenenbaum, Health Commons:  Therapy Development in a Networked World, May 2008.  Tenenbaum is the founder of CommerceNet and CollabRx.  Excerpt:

The current path to drug discovery also perpetuates old traditions of information and intellectual property control. This deeply set inability to capture collective learning dooms everyone to revisit infinitely many blind alleys. The currency of scientific publication encourages individual scientists to hoard rather than share data that they will never have the time or resources to exhaustively mine. And, the wealth of “negative” information gleaned from clinical trial data is mostly lost to the need for companies to safeguard their commercial investments. Although computational and systems biology, aided by Moore’s law, make it feasible to systematically search the vast space of targets, leads, and interactions, this potential is limited in practice by lack of access to data, compound libraries, specimens, and shared services essential for economies of scale. As a result, many biological promising leads, and the knowledge surrounding them, are ultimately discarded.

Thankfully, we have a rare moment in time where we can change the entire system in one motion by establishing a collaborative ecosystem of knowledge and research services that can be rapidly assembled to develop new therapies with unprecedented efficiencies and economies of scale. We can create the same radical increase in efficiency for scientific research that commerce saw in the 1990s, as secure Internet transactions transformed many vertically integrated industries into horizontally integrated ecosystems of service providers and consumers. The explosion of contract vendors in biotechnology, covering the spectrum from gene to protein to drug discovery, development and trials, is one factor. The emergence of the Semantic Web for science is part of the story, as is the existence proof that common use licensing can create explosive value in software and culture. And the power of the network to bring these elements together into a unified system, a Health Commons, is the final piece of the puzzle....

Health Commons is a coalition of parties interested in changing the way basic science is translated into the understanding and improvement of human health. Coalition members agree to share data, knowledge, and services under standardized terms and conditions by committing to a set of common technologies, digital information standards, research materials, contracts, workflows, and software....

Scientific publishing is integral to the drug development process. But in the digital age, we must question whether the unit of a published paper is really the most efficient means of disseminating scientific knowledge. The elegance, clarity, and value of a carefully assembled, constructively peer-reviewed, professionally copy-edited and laid-out research article is clear. However, in this process, much information is delayed, or worse, lost. Interim data and results are typically discarded, especially the results of failed experiments, dooming others to waste time rediscovering them over again. Clinical trial data may never be published; a trial that fails because of an unknown toxicity, for which data has been captured previously, is both expensive - and tragic for the patients involved. Although journals and funding agencies are committed, in principle, to requiring data associated with publications be made available, in practice, this only succeeds in the few cases for which community endorsed repositories exist. And beyond access to data, there’s the deeper issue of making the conclusions conveyed in a scientific paper available in a structured form that can be understood and manipulated by computers as well as human scientists.

In Health Commons, all this will be different.  By integrating the TOPAZ publishing platform, which currently supports PLoS ONE, into Health Commons, publication of research results will be a visible and automatically staged process. Knowledge will simply be promoted from one’s personal repository in the Commons to be shared with one's laboratory, shared with one's collaborators, and ultimately to be made publically accessible. PLoS, along with other participating publishers, will provide vetting at many levels from community voting to review boards, as appropriate. Review by one's peers will occur at many stages: formal editorial boards could still provide traditional journal imprimaturs alongside more radical experiments in community voting....

Beyond ensuring timely access to knowledge by humans, semantic annotation is also the key to making that knowledge machine-understandable....

Because the range of potential applications is unlimited, computer access to published data and knowledge is likely one day to be at least as important as eyeball access....

The Health Commons is too complex for any one organization or company to create. It requires a coalition of partners across the spectrum....

Health Commons is a new and very practical project, not just a plan or vision.  The founding partners are Science Commons, CommerceNet, Public Library of Science, and CollabRx.

More on Microsoft's withdrawal from Academic Search, Book Search, and book scanning

Here are some comments from around the web on Microsoft's decision to pull the plug on Academic Search, Book Search, and book scanning.

From Brewster Kahle at the Internet Archive (note that Microsoft has been a partner in the Open Content Alliance, run by the Internet Archive):

The Internet Archive operates 13 scanning centers in great libraries, digitizing 1000 books a day. This scanning is financially supported by libraries, foundations, and the Microsoft Corporation. Today, Microsoft has announced that it will ramp down their investment in this area. We very much appreciate their efforts and funding in book scanning over the last 3 years. As a result, over 300,000 books are publicly available on the archive.org site that would not otherwise be.

To their credit, they said they are taking off any contractual restrictions on the public domain books and letting us keep the equipment that they funded. This is extremely important because it can allow those of us in the public sphere to leverage what they helped build. Keeping the public domain materials public domain is where we all wanted to be. Getting a book scanning process in place is also a major accomplishment. Thank you Microsoft.

Funding for the time being is secure, but going forward we will need to replace the Microsoft funding. Microsoft has always encouraged the Open Content Alliance to work in parallel in case this day arrived. Let's work together, quickly, to build on the existing momentum. All ideas welcome.

Onward to a completely public library system!

From Farhad Manjoo at Salon:

The company says it "recognizes" that closing these services will "come as disappointing news" to publishers and Web searchers. And yet Microsoft says it must shut them down anyway, because letting people search through books and academic journals no longer fits into the company's business strategy.

What's that new strategy? Microsoft wants to help people who have "high commercial intent." I am not making that up. Satya Nadella, the company's vice president for search, actually uses those words. Microsoft would simply prefer to build a search engine just for people looking to buy stuff....

On the other hand, if you are, inexplicably and ungratefully, simply looking for information, Microsoft wants no part of that. Why don't you go to Google or some kind of soup kitchen, you no-good freeloader?

This is heroically stupid. Seriously, is it any wonder that this company -- this company which has, for a decade now, flailed about in all its efforts online -- has found itself so outgunned by that Ph.D.-machine over in Mountain View? ...

To be sure, Google wants to make money, and it, like Microsoft, has been fantastically successful at that. But on many of its products, Google makes no money at all.

It sees no cash in scanning library books or searching scholarly journals....

But Google derives enormous indirect benefits from these non-commercial projects. College students, for instance, spend endless hours on Google's Web search engine, as well as on Google Scholar and Google Books, as part of their research. Where do you suppose the students will be inclined to go, later on, when they're looking for sunglasses?

Google's willingness to spend on not-in-it-for-the-money projects also surely helps it recruit the best minds in tech. I've spoken to Googlers who joined the firm primarily because they believed in its mission....

From Rick Prelinger on the Association of Moving Image Archivists list:

...The good news is that Microsoft is removing the restrictions that it had placed on the out-of-copyright books they paid to scan. These books will be available through the Internet Archive and the Open Library (http://openlibrary.org). The Open Library supports full-text queries. MSFT is also letting the IA keep the extensive scanning infrastructure that it partly paid to develop....

The bad news is that MSFT's significant support for digitization will be winding down. We are working to find funding so that we can continue, and even increase, our efforts. We would like to keep the cultural heritage that's held by the world's major libraries accessible through the public and not-for-profit sector, rather than through a small number of commercial enterprises.

I think there's an important lesson here for public and nonprofit archives and libraries. We can't rely on the commercial sector to build and maintain persistent, long-lasting collections. If we're going to fulfill our mission to preserve cultural heritage, we will have to find ways to do it within noncommercial institutions, organizations that can take a longer view without falling victim to short-term pressures.

[PS:  For a response to Prelinger's concluding lesson, see Jim Lindner's post to the same list.  Thanks to Klaus Graf for links to them both.]

From Danny Sullivan at Search Engine Land:

...Gosh, Google somehow seems to be able to run a sustainable business model and devote some energy and resources into indexing books and scholarly information, even if those generate little to no revenue. They do it in part because they think it's good business to provide all types of searches, not just those that will earn them money.

In the middle of a search war, I can understand that a "distraction" like book and academic search might seem like something to Microsoft that has to go. However, Microsoft's not hurting for cash to keep it up, if it wanted. Dropping it makes Google seem less like the evil giant working for its own benefit that Microsoft would hope people view it as....

From Richard Wallis at Panlibus:

...It is interesting that they have come to the realisation that the best way for a search engine to make book content available will be by crawling content repositories created by book publishers and libraries.  The question, of course, is whose search engine.

Without doing much reading between the lines, it is clear that Microsoft have failed to see a business model in the worthy job of digitizing the world’s books.  I wonder if there is one, or whether the answer lies with open data projects like the Open Library, the Million Book Project, and the sharing of libraries.

OA and public intellectuals

Kate Milberry, The public intellectual: Bridging the scholar/activist divide, a presentation at the Annual Conference of the International Communication Association (Montreal, May 22-26).  Excerpt:

The public intellectual, to my mind, is one who not only engages in civic life, but is motivated by a sense of responsibility and a shared humanity to “be of service.” This is guided by a simple statement: Be the change you want to see in the world....There are at least four strategies for being useful, for making change:

1. Public dissemination of research/ideas: The case of Open Access

2. Civic engagement: Walking the walk

3. Subversive Teaching

4. Norms-based research...

Public dissemination of research: The case of Open Access

The public intellectual freely shares her intellect, research, thoughts and ideas with the broader community - through public lectures, media interviews, and popular and academic articles.

This strategy counteracts monopolies of knowledge, which Innis identifies as biased idea structures that control and legitimate – or authenticate – knowledge. They further promulgate and reinforce power imbalances within society while at the same time concealing such imbalances.

According to Innis, knowledge monopolies develop in conjunction with closed communications....

The academy is perhaps one of the main producers of monopolies of knowledge, aside from industry and privatized science. The academic journal publishing system – as one example - is heavily reliant on copyright and expensive subscriptions, which effectively restrict this knowledge to the rarified environs of the ivory tower....

Open knowledge production is a self-conscious practice that has historical and theoretical roots in the technical development of the computer and computer networking....

The Creative Commons initiative is one major outcome of the FSM, moving the copyright debate to another level. Declaring only “some rights reserved”, Creative Commons uses private rights to make public goods....

The idea of open knowledge production has now moved beyond software, into cultural production, and also into academia, with initiatives like open access journals, open genetics, open geodata and open content....

These publishers rely on the free labour of academics – in terms of writing, editing and reviewing – yet claim the copyright for all articles and continue to increase subscription fees beyond the rate of inflation. This makes their journals unaffordable to many universities, whose libraries have been forced to cancel subscriptions, reducing the number of titles they carry....

The open access movement, which supports free and open access to all scholarly research online, has contested the old (print-based) publishing model, demonstrating that knowledge is a non-depletable resource – a public good, and not a commodity....