Open Access News

News from the open access movement

Saturday, May 30, 2009

Google searches on public data

Google launched its search service for public data in late April.  It didn't get much attention before it was overshadowed by the publicity surrounding the mid-May launch of Wolfram|Alpha.  But it's definitely worth a look.   Here's a quick comparison of the two.

Like Alpha, Google's public data search returns graphs displaying data in response to a search query.  Like Alpha, it cites the sources for its data.  Like Alpha, it only knows what it knows.  While Alpha will return a polite error message when you ask about data it doesn't have ("Wolfram|Alpha isn't sure what to do with your input"), Google defaults to the results of an ordinary Google search on your search string.

Unlike Alpha, when Google returns a graph, the graph is interactive, giving you the option to add or subtract lines of relevant data.  For example, if you search for "unemployment rate USA", you'll get a chart as the first hit on the return list, and an ordinary hit list below it.  If you click on the chart, you'll have options to superimpose on the US curve the unemployment curves for any state or combination of states.  If you expand the outline under a state's name in the left sidebar, you'll have the same options to view the unemployment curves for any county or combination of counties. 

Like Alpha, when Google public data has answers, it's very useful.  When it doesn't, we can only hope that Google keeps adding new datasets.

Also see Google's help page on public data search, and its page on how to add new open datasets to the service.

Don't confuse the public data search service with other recent data-related Google innovations such as Google Squared, Rich Snippets, Wonder Wheel, and Timeline (which shouldn't be confused with Google's News Timeline).  For a good review of the latter cluster of innovations, see Laura Gordon-Murnane's article in the new Information Today.

Comment.  Wolfram|Alpha and Google are both proving that making datasets OA enables third parties to amplify their utility.  Wolfram and Google are certainly not the first to do so, but they're among the most conspicuous and influential.  The lesson:  If you have a dataset you're willing to make OA, then make it OA.  If you don't know of free online tools to make the data queryable, interactive, or visual, don't wait for someone to develop them.  Just make the file OA and let other people work on that side of things.  For years now we've had this situation with texts:  if you make a text freely available online, others will find it, use it, crawl it, and at the very least improve its discoverability.  One reason to be excited:  We're entering that age for data files.  Another reason:  the enhancements possible for data files are much richer than those possible for text files.

Google Wave and open science

Cameron Neylon is one of the first to see the implications of Google Wave for open science.  From his reflections (today):

Yes, I’m afraid it’s yet another over the top response to yesterday’s big announcement of Google Wave, the latest paradigm shifting gob-smackingly brilliant piece of technology (or PR depending on your viewpoint) out of Google. My interest, however is pretty specific, how can we leverage it to help us capture, communicate, and publish research? And my opinion is that this is absolutely game changing - it makes a whole series of problems simply go away, and potentially provides a route to solving many of the problems that I was struggling to see how to manage.

Firstly, let’s look at the grab bag of generic issues that I’ve been thinking about. Most recently I wrote about how I thought “real time” wasn’t the big deal but giving the user control back over the timeframe in which streams came into them. I had some vague ideas about how this might look but Wave has working code. When the people who you are in conversation with are online and looking at the same wave they will see modifications in real time. If they are not in the same document they will see the comments or changes later, but can also “re-play” changes....

Another issue that has frustrated me is the divide between wikis and blogs. Wikis have generally better editing functionality, but blogs have workable RSS feeds, Wikis have more plugins, blogs map better onto the diary style of a lab notebook. None of these were ever fundamental philosophical differences but just historical differences of implementations and developer priorities. Wave makes most of these differences irrelevant by creating a collaborative document framework that easily incorporates much of the best of all of these tools within a high quality rich text and media authoring platform....The Waves themselves are XML which should enable straightforward parsing and tweaking with existing tools as well.

One thing I haven’t written much about but have been thinking about is the process of converting lab records into reports and onto papers.  While there wasn’t much on display about complex documents a lot of just nice functionality, drag and drop links, options for incorporating and embedding content was at least touched on. Looking a little closer into the documentation there seems to be quite a strong provenance model, built on a code repository style framework for handling document versioning and forking....

Finally the big issue for me has for some time been bridging the gap between unstructured capture of streams of events and making it easy to convert those to structured descriptions of the interpretation of experiments.  The audience was clearly wowed by the demonstration of inline real time contextual spell checking and translation. My first thought was - I want to see that real-time engine attached to an ontology browser or DbPedia and automatically generating links back to the URIs for concepts and objects. What really struck me most was the use of Waves with a few additional tools to provide authoring tools that help us to build the semantic web, the web of data, and the web of things....

Google don’t necessarily do semantic web but they do links and they do embedding, and they’ve provided a framework that should make it easy to add meaning to the links. Google just blew the door off the ELN [Electronic Laboratory Notebook] market, and they probably didn’t even notice.

Those of us interested in web-based and electronic recording and communication of science have spent a lot of the last few years trying to describe how we need to glue the existing tools together, mailing lists, wikis, blogs, documents, databases, papers....That problem, as far as I can see has now ceased to exist. The challenge now is in building the right plugins and making sure the architecture is compatible with existing tools. But fundamentally the framework seems to be there. It seems like it’s time to build.

New issue of Research Information

The June/July issue of Research Information is now online.  Here are the OA-related articles.

Excerpt from Murphy's article on Co-Action:

Just over two years ago, in the heart of the Nordic countryside, three women embarked on a new venture: to launch a journal publisher and consultancy service. As well as its all-female founding team and base away from any established commercial or publishing hub, the new publisher, Co-Action Publishing, has bucked tradition by opting for the open-access (OA) publishing model.

The three founders, Anne Bindslev, Caroline Sutton and Lena Wistrand, are all former executives of the Nordic division of Taylor and Francis. In their old jobs they had noticed a growing interest in OA from the large publisher’s society clients and they concluded that this was the most promising approach for a new, small publisher.

Sutton said: ‘2007 was an interesting time. BioMed Central and PLoS [had] been around for some time and Hindawi had converted its last two subscription titles to OA. Such publishers had shown that it really was a viable model, but at the same time there were not too many people doing it – few of the established publishing houses were entertaining the idea of OA publishing – so we could still be early into the market.’

Sutton does not think that it would be possible today to launch a new publishing house based on subscription journals. ‘It takes five or six years for a journal to really become established enough to generate a profit. With OA and a publication fee model you are earning revenue at the same time as you are incurring costs,’ she explained.

‘We used our own savings rather than having an external investor and have tried to make everything as virtual as possible. It surprises me that more small OA publishing companies haven’t been formed already.’ ...

Another project the company has embarked on is to provide services to other groups which want to create their own open access venture. They can, of course, hand a whole venture over but otherwise they can buy ‘pieces of help’ in packages put together in partnership with its suppliers.

Sutton believes there will be a lot of growth in independent projects from groups that can run a publication themselves but might need help in setting up their systems. Co-Action Publishing has a new tool to help with this, called

As a small company with a base outside of the world’s publishing centres, Sutton believes that it is critical to talk with others in the industry. The company is a member of STM. It has also been involved with other OA publishers such as PLoS, Hindawi and BioMed Central in setting up a new trade organisation to specifically address the interests of OA publishers – the Open Access Scholarly Publishers Association (OASPA)....

[Quoting Sutton:] ‘Given that Co-Action Publishing ended its first year with a small deficit and this year we expect to break even, I have to say that our formula for OA publishing is working for us.’

The benefits go beyond business issues though: ‘Working as an OA publisher is professionally stimulating,’ said Sutton. ‘On the one hand, the ties we have to the research community, to libraries, research councils, and academia in general are much stronger – there is a sense that we are working from the same side of the fence.

On the other hand, active contributors to OA publishing discussions are talented people who dare to envision where scientific communications may be heading. This combination offers a fantastic platform from which to design innovative solutions and create new opportunities that are beneficial to the research community,’ she concluded.

Podcast on OA to government data

Transforming Government Data, a 52 minute podcast interview with Clay Johnson first broadcast on NPR, May 26, 2009.  (Thanks to ResourceShelf.)  From the blurb:

Sunlight Lab's Director Clay Johnson was a guest on the nationally-syndicated The Kojo Nnamdi Show, a program produced by National Public Radio-affiliated WAMU FM, where he joined a panel discussion on how non-profits and cities like Washington, D.C., are enlisting help from civic-minded developers to help make government data more open and usable.

Taking innovation and access seriously

Kaitlin Thaney, $120m - will it help, and a look at the greater issues, Sniffing the beaker, May 29, 2009.  Thaney is the project manager at Science Commons.  Excerpt:

...[I support the] NIH's recent commitment of $120 million over five years for drugs and therapies for rare and neglected diseases....But the real question...[is] will that make a difference?

One of [the] meetings I attended over the last 3+ months of travel was a recent summit on innovation and access by the National Organization of Rare Diseases (apparently "neglected" is implied? :) ) ...It was their annual meeting, and as somewhat expected, included an initial 1.5 hour "congratulations" and offering of thanks to one another for their "contributions to rare disease research" as a kickoff to the meeting.

Excuse me here for being cynical and a bit brash, but were they congratulating one another for a drug pipeline that a) is insanely costly, b) takes approx. 17 years to get a drug to market IF it succeeds, c) doesn't work in favor of the community they're serving? ...

[I]n the midst of all of the congratulating for thousands of people in their "network" not having any sort of drug or treatment, these two words ["innovation" or "access"] - the themes of the event - went unmentioned.

I raised my hand to have a turn, puzzled by all of this and made a comment, which was later backed and echoed by Janet Woodcock of the FDA and Francis Collins, the famous geneticist (thanks to both). I talked about the reasons we were all there - to talk about "innovation" and "access" in terms of access to research, accelerating scientific discovery, new "innovative" models to help fix this broken pipeline we all were dancing around, and get therapeutics and results to patients faster, cheaper and more successfully. It was astonishing a) how many people were nodding and smiling when I brought this to the forum and b) the fact that if not said, it may have gone unmentioned for the rest of the meeting. All of a sudden, the tone changed - with Francis Collins emphasizing the importance of Open Access and Janet Woodcock even saying "Put information into the public domain".

Small wins in an area that still needs a bit of coaching (like others, certainly) on making better use of a poorly funded wing of disease research.

Will $120 million over 5 years make a difference? Certainly, in some respect. How large of a difference depends on what model is constructed to hopefully better share the scientific knowledge we're pumping tens of hundreds of thousands of dollars into, the funding model, etc. Perpetuating the "walled garden" approach does not "fix" the system....

PS:  Note that Francis Collins may soon be the next Director of the NIH.

DataFerrett for US govt open data

In January (or so) the US government released DataFerrett, a free tool for searching, browsing, combining, and analyzing open data released by the federal government.  DataFerrett can draw data from many different sources, display it in graphs or tables, and produce reports. 

Friday, May 29, 2009

Forthcoming delayed OA journal of workplace learning

Impact: Journal of Applied Research in Workplace E-learning is a forthcoming journal published by the E-learning Network of Australasia with a 6-month delayed OA policy. (Thanks to Mark Lee.)

New OA journal on spatial concepts

The Journal of New Frontiers in Spatial Concepts is a new peer-reviewed OA journal from the Karlsruhe University Press.  Though the journal has been publishing articles since February 2009, in German and English, it will not officially launch until June 9.  It's Karlsruhe's first online journal.

Also see the announcement in German or Google's English.

Wilbanks talk on libraries and the commons

John Wilbanks has posted a slidecast (slides and audio) of his talk on libraries and the commons, prepared for the Canadian Association of Research Libraries.

More on the Wellcome Trust OA mandate

Robert Kiley, Open Access Mandates:  View from the Wellcome Trust, a slide presentation from today's RIN/RSP meeting, Research in the open: How mandates work in practice (London, May 29, 2009). 

The other presentations are not yet online.

New open anthropology community

Open Anthropology Cooperative is a new community site. (Thanks to Open Anthropology.)

OCLC releases OAI tool for museums

Online Computer Library Center, OCLC releases software suite to help museums exchange data, press release, May 22, 2009. (Thanks to Information Today.)

OCLC Research has released a software suite to help museums exchange object descriptions and share data, the result of a cooperative effort made possible by a grant from The Andrew W. Mellon Foundation to further develop infrastructure for museum data exchange.

OCLC is using the Mellon grant to fund projects involving OCLC Research on behalf of the RLG Partnership and its art museum partners to build an information architecture and model behaviors that museums can use to routinely exchange data.

Museums participating in this effort have a common interest in being able to share information about collection items and digital images from their own institutions, with other art museums, and with content aggregators such as ARTstor or OCLC.

The software was released as part of the OCLC Research Museum Data Exchange Project, which supported museums from the RLG Partnership in defining requirements for tools, and created or contracted the creation of code. ...

Museums now have access to COBOAT and OAICatMuseum 1.0 software. COBOAT is a metadata publishing tool developed by Cognitive Applications Inc. (Cogapp) that transfers information between databases (such as collections management systems) and different formats. ...

COBOAT software is now available on the OCLC Web site under a fee-free license for the purposes of publishing a CDWA Lite repository of collections information ...

OAICatMuseum 1.0 is an Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) data content provider supporting CDWA Lite XML. It allows museums to share the data extracted with COBOAT using OAI-PMH. OAICatMuseum was developed by OCLC Research and is available under an open source license ...
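For readers who haven't seen OAI-PMH in action, here is a minimal sketch of what a harvest request to a data provider like OAICatMuseum might look like.  The verb (ListRecords) and the metadataPrefix and set arguments come from the OAI-PMH specification.  The base URL and the "cdwalite" prefix value are assumptions for illustration only; neither appears in the press release, and a real repository will tell you its supported prefixes in response to a ListMetadataFormats request.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only (not from the press release).
BASE_URL = "https://museum.example.org/oaicatmuseum/OAIHandler"


def build_list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    'verb', 'metadataPrefix', and 'set' are standard OAI-PMH arguments.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# "cdwalite" is an assumed prefix name for CDWA Lite XML.
url = build_list_records_url(BASE_URL, "cdwalite")
print(url)
```

A harvester would fetch this URL, parse the XML response, and follow any resumptionToken it contains to page through the remaining records.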

CISTI launches gateway to datasets, resources on data management

National Research Council Canada Institute for Scientific and Technical Information, NRC-CISTI launches gateway to scientific data, press release, May 14, 2009. (Thanks to Fabrizio Tinti.)

Scientific data generated during the research process can be an important resource for researchers, but only if it is accessible and usable. Thanks to a new initiative of the NRC Canada Institute for Scientific and Technical Information (NRC-CISTI) researchers now have a central gateway for easier access to Canadian scientific, technical and medical (STM) data sets and other important data repositories.

The Gateway to Scientific Data will help ensure that the valuable data generated by Canadian researchers is more easily accessible so that it can be re-used for other research endeavours. With the ability to access and use data from a multitude of sources, researchers will be better positioned to turn research into discoveries and innovations. ...

Along with links to data sets, the new Gateway provides links to selected policies and best practices guiding data management and curation activities in Canada. It also includes links to selected journals and upcoming conferences and meetings.

The Gateway to Scientific Data is part of NRC-CISTI's contribution to a broader national initiative undertaken by the Research Data Strategy (RDS) Working Group to address the challenges and issues surrounding the access and preservation of data arising from Canadian research. ...

Science Commons expands outreach on OA data

Tom Sinclair, ‘Open source’ technology is his passion, The Nelson Institute Blog, May 27, 2009.

If the phrase “open source” warms your heart, you’ve got a soul mate in Puneet Kishor.

The Nelson Institute Ph.D. student in environment and resources has been named a Science Commons Fellow by Creative Commons, a nonprofit corporation that promotes free, legal sharing of creative cultural, educational, and scientific materials.

As such, Kishor says he intends to “evangelize” open access to geospatial data, science, and technology. His interest in open access stems from a recent stint as a science and technology policy fellow at the National Academy of Sciences and as an elected charter member of the Open Source GeoSpatial Foundation. ...

Pressure to publish and TA journal delays make OA journals look good

Melissa Gregg, Damn the publishers, The Australian Higher Education, May 27, 2009.  (Thanks to Colin Steele.)  Excerpt:

How much longer can Australian universities accept the lack of outlets to publish research in this country? The Excellence in Research for Australia [ERA] initiative will make publishing outcomes more important than ever....

Scholarly publishing for unknown authors is in a state of almost complete lock-down. Leading professors will tell you it's been that way for years....

The ERA emphasis on quality and quantity bears no realistic relationship to the opportunities that are available to the majority....

The situation is nothing short of alienating. The highlight of the job - getting published - has become an exercise in minimising losses from poor odds.

If emerging scholars were actually consulted about the changes affecting their prospects, they'd testify that open access journals with effective peer review systems already demonstrate alternatives to this model. Aside from the worldwide exposure it offers research, the great benefit of online publishing is its speed. It allows young academics to contribute to their field in a timeframe that can match today's steep requirements for employability....

Comment.  Note that the January 2009 draft guidelines for the ERA research assessment program expect that most research articles will be deposited in OA repositories.  (I don't know whether this expectation made it into the final version of the guidelines.)  Apparently that deliberate, direct support for green OA has been supplemented in practice by inadvertent, indirect support for gold OA, as scholars like Gregg discover that the delays at conventional TA journals hinder their career advancement under the new rules.

Thursday, May 28, 2009

Suggestions on cancelling journals

Barbara Fister, Notes from a Catastrophe: Easing the Pain of Budget Cuts, Library Journal, May 28, 2009.

Nine suggestions for librarians on how to cancel journal subscriptions when rising prices and shrinking budgets make it necessary.  Here's the ninth:

Take advantage of a teachable moment: Discuss with faculty how you see their students doing research. Help them understand how much full-text databases and the familiarity of Google have influenced undergraduate research practices. Talk about what's behind the crazy escalation in the cost of journals. Tell them how to find journals they can publish in that are open access and why that may make their own research more likely to be cited. You could even make an opportunity to take your own stand —as we did [at Gustavus Adolphus College] when our library passed its own open access pledge....

OA survey for scholars in the humanities and social sciences

OAPEN (Open Access Publishing in European Networks) is running a survey on OA for scholars in the humanities and social sciences. 

Don't be deterred by the German introduction.  The survey itself is in English.  (Here's the introduction in Google's English.)

PS:  In my first draft of this post, I mistakenly said that the whole survey was in German.  Thanks to Klaus Graf for the correction.

Over the horizon: document sharing with Google Wave

This morning at the Google I/O Developer Conference (San Francisco, May 27-28, 2009), Google launched a developer preview of a new communications platform called Wave.

Wave will be open source, rest on open standards (particularly HTML 5), and offer open APIs.  It's an ambitious, versatile tool that will implicate OA primarily in the ways in which it supports document sharing and collaborative document writing. 

For an early review of Wave's capabilities, see the article by Juan Carlos Perez in PC World or the article by MG Siegler in TechCrunch.

Also see the Wave "about" page and the Wave developer blog.

Update (5/29/09).  The 120 minute video demo of Wave wasn't available yesterday but it's available today.  Recommended. 

New platform for interacting with gov. data

Socrata is a new platform for publishing and interacting with government data. Most of the data currently on the site is OA, but the platform "enables [publishers] to set specific dataset pricing levels".

OA bibliography of Canadian books moves to new home

Library and Archives Canada, Launch of Book History Databases, press release, April 24, 2009. (Thanks to ResourceShelf.)

Library and Archives Canada is pleased to announce the launch of five major databases which catalogue thousands of publications on the history of print culture in Canada from its beginnings in the sixteenth century to the twenty-first century [History of the Book in Canada].

These scholarly electronic resources, which have been available through the Internet via Dalhousie University's website since 2003, were originally funded in 2000 by a Major Collaborative Research Initiative Grant from the Social Sciences and Humanities Research Council of Canada. Not only were they developed to support the preparation of publishing six volumes (three in English, three in French) chronicling the history of print culture in Canada, but also to establish important resources for advancing the field when the project concluded. ...

Recognizing the value of this rich suite of databases for Canadian scholarship, Library and Archives Canada has agreed to receive and relaunch the databases, so that they will continue to be publicly accessible and will grow in importance as new records are added and thus contribute to Canadian studies and History of the Book scholarship in significant ways.

Notes on OA to PSI in Canada

Mark Tovey, Change Camp Ottawa: Open Data and Open Access, WorldChanging Canada, May 23, 2009. Notes on a session on public sector information at ChangeCamp Ottawa (Ottawa, May 16, 2009).

Milestone at Open Ed. News

Open Education News recently passed the 500 post milestone.

Comment. Congratulations!

Article metrics at PLoS

Peter Binfield, Article-Level Metrics at PLoS (presentation to NISO), everyONE, May 27, 2009.

A couple of weeks ago I had the opportunity to give a presentation entitled “Article-Level Metrics (at PLoS and Beyond)”, to a webinar organised by [the National Information Standards Organization]. The presentation, and synced audio, can be viewed at Myplick. The web resources which are mentioned in the presentation are located here. ...

Although a handful of journals have now started to provide online usage data for each article, PLoS is going further than this. We are at the start of a program to provide citation data, usage data, social bookmarking activity, media coverage, blog coverage, commenting activity, ’star’ ratings, and more, on every article that we publish. This presentation explains our motivation for this program, as well as what we have done so far and our plans for future developments. ...

Wednesday, May 27, 2009

Hoping to launch as an OA publisher

Verlagsstarter has been nominated to win some stART.hilfe funds from the Duisburg Philharmonic Orchestra and upload-Magazin to help it launch as an OA publisher.

More on OA at U. Pittsburgh Press

Peter Murray, Online Editions of Out-of-Print Books Results from Library/Press Partnership at Univ of Pittsburgh, Disruptive Library Technology Jester, May 26, 2009.

... Earlier today, I had a conversation with Rush Miller, library director at the University of Pittsburgh, about the joint effort between the university press and the university library system. Cynthia Miller (press director) and Rush arrived at approximately the same time 15 years ago at the University of Pittsburgh. Over the course of that time, the two have shared many discussions about open access content. A few years ago, they established a model for working together: the press would clear the rights for books (the press generally had the rights to publish in paper, but not digital) while the libraries would digitize the books, mount them on library servers, and do the graphic design work for the online site. With this model, they mounted 15 titles from the press’ Latin American series. The libraries also supplied the Chicago Digital Distribution Center (CDDC) with the digital scans for the Bibliovault print-on-demand service. The library has seven full-time people in the digital services department, plus support from systems analysis and developers from elsewhere in the library.

They had been closely studying the usage and sales data with the trial content and had found that online access didn’t necessarily cannibalize print sales. In fact, one title sold about 100 copies last year while having near zero sales the previous few years. (Adoption for a course is the suspected reason, and the item was probably found because the digital edition was online). Books that have been out of print for 20 years are now getting use as soon as the digital editions are available.

With the initial success, the libraries and press moved forward with digitizing and mounting the 500-title backfile represented by this announcement. This was a significant effort on the part of the press to clear the rights for all of these titles — about a year’s worth of work. The partners are already looking forward to another round of titles to be digitized and mounted online. ...

See also our past post on the recent announcement.

Radio interviews with Bora Zivkovic in Serbian

Bora Zivkovic has linked to podcasts of his interviews with Radio Belgrade on OA in Serbian.

More on the access crisis: UC libraries face budget cuts of up to 20%

The University of California Libraries has released an Open Letter to Licensed Content Providers, May 26, 2009.  (Thanks to ResourceShelf.)  Excerpt:

The University of California Libraries ask all information providers with whom we negotiate content licenses to respond to the major fiscal challenges affecting higher education in California in a spirit of collaboration and mutual problem-solving. We expect to work with each of our vendors at renewal to develop creative solutions that can preserve the greatest amount of content to meet the information needs of the University of California’s students, faculty, and researchers.

The University of California Libraries, including the California Digital Library (CDL), share the economic concerns expressed in the Statement to Scholarly Publishers on the Global Economic Crisis issued by the Association of Research Libraries and the Statement on the Global Economic Crisis issued by the International Coalition of Library Consortia. The economic crisis affecting libraries is particularly acute in California....

As a state-supported institution, the University of California has experienced significant budget reductions in fiscal year 2009, with more reductions to come. The $531 million shortfall now anticipated in state funding for the 2009-10 fiscal year amounts to nearly 17 percent of the $3.2 billion the state provides UC annually. Numerous cost containment measures are in place across the university, including salary and other compensation freezes for senior managers, hiring curtailments for other staff, travel restrictions, and other mandated reductions. More information about the UC budget situation is available on the University’s Web site....

UC Libraries are being hit hard by the budget reduction mandates in effect at each of the UC campuses. Targeted reductions to library materials budgets for fiscal year 2010 vary across the campuses, with some as high as 20%. Many campuses have been alerted that additional cuts will be levied in fiscal year 2011. Coupled with the typical inflationary increases for scholarly publications, the erosion of library buying power will have a profound and lasting impact on all of the UC libraries. Monographic purchasing has already been seriously curtailed, and every electronic content license is being placed under careful scrutiny.

Comment.  In addition to the ARL and ICOLC statements mentioned in the letter, also see the statements from RIN (in March 2009) and NERL (in April 2009). 

Bringing Gutenberg ebooks to more readers

Michael Hart, New Goal Set for Project Gutenberg: One Billion Readers, Project Gutenberg News, May 24, 2009. (Thanks to ResourceShelf.)

The first goal of Project Gutenberg was simply to reach totals of estimated audiences of 1.5% of the world population, or the total of 100 million people.

With the advent of cell phone access we are now setting our goal at 15% of the world population or 1 billion.

Given that there are approximately 4.5 billion cell phones now in service around the world, that means we would have to reach just over 1/5 of all cell phone users to accomplish this. ...

This has to include many more languages than English, of course, so our effort also has to be multi-lingual, if we are to reach anyone beyond the number of people comfortable enough with English to read our eBooks on their cell phones.

As many of you know, we already have well over a thousand book titles in French, followed by lesser numbers in German and the other more popular languages, but not nearly enough to really, sincerely, say we are offering a library in these languages. ...

OJS in Romanian and Welsh

Yahoo Image Search adds CC filter

Danish anthropology journal opens access to its backfile

Ethnologia Europaea has decided to provide OA to its back issues, with a three-year moving wall.  However, it has only digitized its file back to 2004.  It's still looking for funds to digitize the issues from 1966-2004.

OAN is 7

I forgot to mention that yesterday was the 7th birthday of OAN.  Yesterday we had 17,391 posts, which comes to about 7 a day for 7 years.

Tuesday, May 26, 2009

Blog notes from Open Repositories conference

Charles Bailey has posted a roundup of links to blog coverage of Open Repositories 2009 (Atlanta, May 18-21, 2009).

Update. See also notes by Elliot Metsger, Peter Murray-Rust, and Les Carr.

Next NIH Director: probably Francis Collins, probably soon

Francis Collins said to be contender to run NIH, Los Angeles Times, May 23, 2009.  Excerpt:

Francis S. Collins, the scientist who led the U.S. government drive to map the human genetic code, is the leading candidate to run the National Institutes of Health, a source familiar with the selection process said.

Screening for Collins, 59, is in the final stages, said the source. Collins would take over an agency that President Obama has made key to his plans for reviving the U.S. economy and overhauling healthcare. The 27 institutes and centers under the NIH umbrella employ more than 18,000 people and fund research at thousands of universities and medical schools.

The former head of the National Human Genome Research Institute, a member agency, Collins became a driving force in the race to catalog the 3 billion letters of the human genetic code. As director of the institutes, Collins will face calls to boost spending on cancer research and free science from politics as well as financial conflicts of interest.

"NIH is a huge enterprise, and I think Francis has very good experience with getting the best out of a huge enterprise from what he did in the genome project," said David Baltimore, a biology professor at Caltech who won the 1975 Nobel Prize in medicine, in a telephone interview in February. "He's also very well liked in Congress."

Collins didn't respond to efforts to reach him. The White House declined to comment....

Comments.  This matters for two reasons:

  1. Collins is not just a leader in mapping the human genome, but in making the results OA.  He has also defended OA at the NIH's PubChem against anti-OA lobbying by the ACS.  Kathy Hudson, Director of the US Genetics and Public Policy Center, described Collins as "a tireless champion of data sharing and open access to scientific information...."  When Celera made its genomic data OA in 2005, Collins told the Baltimore Sun that "[t]his data just wants to be public....It's the kind of fundamental information that has no direct connection to a product, it's information that everybody wants, and it will find its way into the public."  Collins would be the most experienced defender of OA ever to take the reins of a US federal agency.
  2. The fact that Collins is in the final stages of vetting means that we'll soon have an NIH Director.  The position has been vacant since Elias Zerhouni stepped down in October 2008, and the leadership vacuum has impaired the fight against the Conyers bill.  Note David Baltimore's assessment that Collins is "very well liked in Congress." 

Also see our past posts on Collins.

Canadian cities moving on open data

City of Vancouver embraces open data, standards and source, CBC News, May 22, 2009. (Thanks to Michael Geist.) See also our past post.

Vancouver city council has endorsed the principles of making its data open and accessible to everyone where possible, adopting open standards for that data and considering open source software when replacing existing applications. ...

[City councillor Andrea] Reimer had argued that supporting the motion would allow the city to improve transparency, cut costs and enable people to use the data to create new useful products, including commercial ones. She had also noted that taxpayers paid for the data to be collected in the first place. ...

According to Reimer, only a few other cities such as Washington, D.C., San Francisco and Toronto have started moving toward this kind of increased openness. ...

Toronto Announces Open Data Plan at Mesh09, Visible Government, April 13, 2009.

City of Toronto mayor David Miller announced [the city]'s plans for an open data catalogue at Mesh09 [Toronto, April 7-8, 2009] last week. Miller, who is in charge of the 6th largest government body in Canada, made a strong case for the benefits of open government data. His arguments (transcribed from video) deserve repeating:

... I am very pleased to announce today at Mesh09 the development of, which will be a catalogue of city generated data. The data will be provided in standardized formats, will be machine readable, and will be updated regularly. This will be launched in the fall of 2009 with an initial series of data sets, including static data like schedules, and some feeds updated in real time.

The benefits to the city of Toronto are extremely significant. Individuals will find new ways to apply this data, improve city services, and expand their reach. By sharing our information, the public can help us to improve services and create a more liveable city. And as an open government, sharing data increases our transparency and accountability. ...

WHO innovation plan approved after dropping R&D treaty

William New, Broad Plan On IP, Innovation In Developing Countries Approved At WHO, Intellectual Property Watch, May 22, 2009.

Applause broke out at the annual World Health Assembly Friday as agreement was reached at the end of a five-year process to devise a plan for boosting research and development on and access to drugs needed by developing countries. Now with the full assembly’s approval, the focus turns to five-year implementation and as-yet unclear ways to pay for it. ...

Agreement in committee was reached after a group of developing countries eager to discuss a possible treaty on biomedical R&D dropped a demand to include the WHO as a stakeholder in discussions about the treaty ...

The approved global strategy and plan of action on public health, innovation and intellectual property aims by 2015 to train over 500,000 R&D workers, improve research infrastructure, national capacity and technology transfer, and lead to numerous other outcomes such as creating 10 public access compound libraries and 35 new health products (vaccines, diagnostics and medicines). ...

The WHO legal counsel gave an opinion to the committee that dropping the WHO as a stakeholder would not prejudice the R&D treaty issue as it is addressed in a separate expert working group on financing to continue deliberations this year under a mandate from the 2008 assembly. Those proposals are still on the table and could go to the assembly next year, the counsel said. It also would not prevent any member state from making any proposals to the Executive Board as is standard WHO process. ...

To accomplish all of the proposed activities was estimated by the secretariat to cost nearly $150 billion over the period of implementation. But several participants de-emphasised those estimates as hard to verify. ...

Meanwhile, NGOs Health Action International and IQsensato this week issued a proposed way for countries to monitor implementation of the strategy and action plan. The proposal is available here. ...

Comments. Background:

  • The World Health Assembly, the governing body for the World Health Organization, formally approved the Global Strategy and Plan of Action on Public Health, Innovation and Intellectual Property. The broad plan was drafted and revised through a working group over several years. The WHA had approved a draft plan last year, which did not include complete timeframes, progress indicators, estimated costs, and lists of stakeholders.
  • The plan includes an element directly related to OA, element 2.4(b), "strongly encouraging" publicly-funded researchers to self-archive. This element was watered down from a mandate in 2007.
  • The plan also suggests working on an R&D treaty. The draft treaty includes an OA mandate. The final version of the plan approved retains the reference to the treaty, but removes WHO from the stakeholders: i.e., WHO will not proceed with work on the treaty under the aegis of this plan. However, discussions on the treaty can proceed independently of WHO, and the treaty can continue to be discussed through other mechanisms at WHO (and in fact is already included in ongoing discussions on another topic). So the removal of WHO from the list of stakeholders in an R&D treaty doesn't kill the treaty, but it delays WHO's involvement indefinitely.

Draft code of conduct for public health data sharing

On May 8, Elizabeth Pisani released the first draft of the Bamako data sharing code of conduct.

The code arose from last year's Global Ministerial Forum on Research for Health (Bamako, Mali, November 17-19, 2008), where participants formulated the Bamako Call to Action on Research for Health, which included a call for "open and equitable access to research data, tools, and information...."  For more background, see Pisani's slide presentation at the November 2008 meeting on the need for a data sharing code of conduct, and a report on the discussion following Pisani's presentation. 

From the May draft code:

...What is driving the exponential growth in knowledge in areas such as genetics, astrophysics, information technology? Data sharing....

Epidemiology and public health have been left behind in this data sharing revolution, mired in a culture that restricts access to data and information. This is in part because of a perceived need to protect the privacy of individuals involved in research. But public health is a public good; in public health research there’s an ethical imperative to use information gathered from individuals to benefit the greatest possible number of people. Public health deserves to advance at the same speed as genetics, where data sharing has led to an explosion of progress. The World Health Organisation and several funders of public health research, led by the Wellcome Trust, are thus supporting the development of a code of conduct to encourage greater sharing of public health data. The code seeks to provide guidance for funders of data collection and for institutions that collect and analyse data, including those who perform secondary analysis on data collected by other people. The principles espoused by the code are universal....

The draft code presented here is the product of initial discussions between epidemiologists and data managers from all continents. They gathered with a number of representatives from governments, international organisations and major funders of public health research in London on October 6th, 2008 to agree on the core principles in the code. The discussions of this Working Group were informed by a background paper which reviewed the major challenges to more open exchange of public health data, challenges that can be categorised broadly as incentive-related, capacity-related, ethical and technical. The draft code is structured around these four areas. The background paper has been updated to reflect the outcome of the meeting, and is appended here....

A code of conduct on data sharing is an important first step in striking the balance between the advancement of science and the rights and needs of individuals and communities....

To the extent possible, the code promotes the sharing of micro-level data -- that is, individual level records. There may occasionally be reason to restrict access to individual level data. There is rarely any reason at all to restrict access to aggregated data....

We support the maximum public access to data of public health importance compatible with the following principles:

  • The protection of privacy of individuals from whom data are gathered
  • Fair reward for the work of data collectors and primary investigators
  • Maximum public health benefit delivered in a reasonable time frame....

Limited-time exclusive access for primary researchers

Data are available to the research team involved in data collection and their institutional partners for a fixed period (between six and 18 months) before they are shared. This allows the research team a head start on data analysis and publication....

Following a period of exclusive access for primary researchers where necessary, the most common levels for access to data of public health importance will be:

Fully open access

Data (anonymised where necessary) are made available in machine-readable formats on publicly-accessible websites. This is most desirable and should be encouraged where feasible and compatible with privacy....

Controlled public access

Data are made available to authorised users after a screening process. This is likely to be the most common form of access for data of public health importance....

Collaborative access among scientists

Data are made available to other scientists in a collaborative network. Collaborative access may be necessary for complex datasets that include sensitive information where anonymisation is difficult (e.g. longitudinal data sets including HIV status)....

Exclusive access for primary researchers

Data are only available to the research team involved in data collection and their institutional partners. This is currently the norm in public health data collection, but it is precisely this norm that the current code seeks to change. There are few cases in which this degree of exclusivity is necessary in the long term....

Increasing the incentives to share data

Under the Code of Conduct on Data Sharing we agree to:

Put past data sharing performance on a par with publication as a criterion for evaluating the performance and job suitability of scientists, as well as evaluating grant proposals.

Reward concrete plans for data sharing when evaluating funding proposals for research and routine health systems functions such as surveillance.

Develop citation standards and indices for shared data sets; commit to using them when publishing secondary analysis.

Require registration of public-health related research and data collection in open access data-bases to facilitate data discovery and create demand for shared data.

Encourage submission of micro-data to public repositories as a condition for journal publication of research results.

Promote a “creative commons” approach, in which derived datasets and secondary analysis files based on shared data are in turn made publicly available.

Support an ombudsman system to oversee the fair use and proper acknowledgement by secondary users of shared data....

Using technology to increase data sharing

Under the Code of Conduct on Data Sharing we agree to:

Commit to a single metadata standard for datasets of public health interest....

Ensure that metadata are open access and machine-readable, even for data that are shared under the controlled or collaborative access standards.

Support the development of “open source” software for management, documentation and analysis of public health data....

Taking the code forward...

In trying to meet the needs of [a] huge and varied constituency, the current draft code is vague: phrases such as “promote x” and “encourage y” predominate. As the code develops, we hope that it will become more concrete: “Funding institutions commit to investing in x”, “Secondary analysts agree to provide y”....


More on Merck's Sage

Rick Mullin, Merck Seeds An Open Database With Computers And Data, Chemical & Engineering News, May 25, 2009.

Stephen Friend and Eric Schadt came to Merck & Co. in 2001 when the drug company purchased Rosetta Inpharmatics. They will be leaving this summer and taking with them a data-packed 10,000-processor computer cluster at Rosetta's facilities in Seattle.

Friend and Schadt are launching Sage Bionetwork, an open-platform database for sharing and disseminating complex disease biology data. What's spurred them is the massive influx of biological data in drug research and the need for collaboration in understanding the biological mechanisms of disease. ...

"We are headed toward a clinical genomic Tower of Babel where people each have their own view of what's going on and can't talk to each other," Friend says. "The reason I am leaving Merck is that I fundamentally believe we need a different mechanism, a space between the private and public sectors that will reward people for sharing and that will make disease biology a precompetitive space." ...

Friend says Merck's willingness to share the data in Seattle with other research organizations by simply handing it over to Sage is an indication that large drug companies are becoming more willing to work collaboratively and are beginning to broaden the definition of public, precompetitive data. "I don't think the pharmaceutical industry is willing to share compound data," he says, "but it is willing to share disease biology data."

Sage intends to expand its data center through partnerships. Its first partner, the Fred Hutchinson/University of Washington Cancer Consortium, is local. But Schadt envisions partnerships worldwide. The group is in talks with the Wellcome Trust, in London, and with potential partners in China. Sage's tentative launch date is July 1.

See also another story in the same issue on the use of cloud computing for research, including its use at Sage and comments by John Wilbanks.

See also our past posts on Sage.

Update. Schadt has announced he'll be taking a new day job rather than working at Sage full-time.

Update. See also this interview with Friend.

... [Q:] Do you see signs within pharma that pre-competitive sharing can gather momentum?

I think it is and it isn’t... People are waking up: Oh my god, I don’t know what to do with the data. Where they are not willing to be pre-competitive is when they start in with strategies. People have made a mistake in trying to get companies to cooperate when they absolutely need to have an advantage and to have something that’s theirs. That split between what is and what’s not pre-competitive has gotten garbled. Why won’t pharma companies work together? That’s the wrong argument. ...

Milestone for IR at Aberystwyth U.

Nicky Cashman, CADAIR's 2000, posted to JISC-REPOSITORIES, May 22, 2009.

Along with the recent announcement from Stirling and STORRE (many congratulations to you), I too would like to blow the proverbial trumpet for Aberystwyth University’s online repository CADAIR.

As of 21st May 2009, we now have over 2,000 deposits - 69 of which are theses, the majority of items being journal articles. The remainder is made up of book chapters and conference proceedings. ...

At present, Aberystwyth does not have a mandate for its academics and the dissemination of full text theses and dissertations is still optional. Therefore, with continual raising of awareness and encouraging academic staff and [post-graduate] students to deposit, the message is getting through and being positively taken on board by many.

EU geodata policy headed for UK law

Jo Walsh, INSPIRE Directive heading towards UK law, Open Knowledge Foundation Blog, May 24, 2009.

INSPIRE, the directive establishing a spatial data infrastructure for environmental information in Europe, is heading into UK law at last. [The Department for Environment, Food and Rural Affairs] is doing a consultation on the transposition of the law and [the Open Knowledge Foundation] will hopefully co-submit a response by 26th May with the Open Rights Group, a summary of the responses is on the okfn-discuss mailing list.

In short it is fairly good news for those of you who are tiring of having requests for information about data holdings from the likes of Ordnance Survey, Transport for London, refused under [Freedom of Information] on the grounds of commercial confidence. Public authorities affected by the Freedom of Information Act 2000 Schedule 1 will be obliged to make the metadata for their geodata holdings available to the public free of cost, from 24th December 2010 (okay, so it’s still a bit of a wait). Additionally, “view services” complying with the Web Map Service spec will have to be available in just over 2 years time, for which there will be a “presumption of public access”.

So we will see Ordnance Survey’s MasterMap available in full via WMS (if still restricted for commercial use) or there will be a very good reason why it is not. ...

Extending the Bermuda Principles

Elizabeth Pennisi, Group Calls for Rapid Release of More Genomics Data, Science, May 22, 2009.  (Thanks to Garrett Eastman.)  Accessible only to subscribers.  Excerpt:

In 1996, at a meeting in Bermuda, researchers participating in the Human Genome Project opened the floodgates of DNA data by agreeing to release sequence information daily into a public database....In 2003, the genome-sequencing community reiterated this pledge at a follow-up meeting in Fort Lauderdale, Florida, and came up with guidelines on how prepublication data should be used.

Now pressure is mounting to extend the Bermuda Principles to a broad range of publicly funded projects that go beyond sequencing. They include whole-genome association studies, microarray surveys, epigenomics scans, protein structures, large-scale screening of small molecules for biological activity, and functional genomics data, only some of which are now covered by prepublication data-release policies....Last week, at the International Data Release Workshop held in Toronto, about 100 researchers, ethicists, and funding agency representatives began to hammer out guidelines for such efforts....

At the meeting, participants debated how to ensure that researchers who release data early get credit for the work and a chance to publish their analyses first....

[Another] issue is how to permit access to the data while protecting privacy — a task complicated by the fact that some databases contain information from multiple countries that vary in their patient-protection rules....

In the next several months, a final report —and, it is hoped, a publication on the topic— should help spell out how to extend prepublication data release beyond the sequencing community and further the discussion on controlled-access databases....

PS:  I can't find a web site for last week's data-release workshop in Toronto where the new guidelines were taking shape.  If anyone can help, please drop me a line and I'll update this post.

US commitment to global health should include commitment to OA

The U.S. Commitment to Global Health:  Recommendations for the Public and Private Sectors, National Academies Press, May 20, 2009.  Prepublication edition of a major report from the Institute of Medicine (IOM) of the US National Academy of Sciences.

Also see the IOM splash page on the report and its press release from May 20, 2009.

From the press release:

To fulfill America's humanitarian obligations as a member of the international community and to invest in the nation's long-term health, economic interests, and national security, the United States should reaffirm and increase its commitment to improving the health of developing nations....

The study was sponsored by the Bill & Melinda Gates Foundation, Burroughs Wellcome Fund, Merck Company Foundation, Rockefeller Foundation, U.S. Department of Health and Human Services, U.S. Department of Homeland Security, and U.S. Department of State.

From Chapter 3:


3-3. The U.S. research community should promote global knowledge networks and the open exchange of information and tools that enable local problem solvers to conduct research to improve the health of their own populations.

  1. Funders of global health research should require that all work supported by them will appear in public digital libraries, preferably at the time of publication and without constraints of copyright (through open access publishing), but no later than six months after publication in traditional subscription-based journals. Universities and other research institutions should foster compliance with such policies from funding agencies and supplement those policies with institution-based repositories of publications and databases....

From Appendix F (by Anthony So and Evan Stewart):

...To ensure greater access to scientific publications, several strategies have been deployed. One has involved tiered pricing, and the other, the pooling of published research in open access journals or repositories....

Across disciplines ranging from electrical engineering to mathematics, the free, on-line access of journal articles corresponded to higher mean citation rates.  Several studies suggest that open access articles have a higher citation rate than closed-access articles.  This held true even when comparing open-access articles compared to non-open-access articles in the same journal. Importantly, the impact of open access publication on citations in journal publications was twice as strong in the developing world....

Several prominent medical research funders have made open access a condition of grant support....

The sharing of research data and materials enables the scientific community to confirm study findings and also to build upon the work of others. Access to these building blocks of research, however, may also be encumbered for reasons similar to those encountered over scientific publications. The difference is that access to data and materials enriches immensely the pursuit of new hypotheses that derive or go substantially beyond its original research use....

As with publications, open access may also multiply the impact of research data. For example, in a 2007 study of 85 cancer microarray clinical trial publications, the public sharing of available data contributed to a 69% increase in citations.  While half the trials in the study made their data publicly available, they comprised 85% of the total citations....

[T]he willingness of some funders and even some universities to support upfront fees for publication in open access journals is a promising step in this direction, perhaps one that might be emulated when patenting to protect public access is at stake [by paying the transaction costs of patent pooling]....

Yet arguably if publicly funded research were not freely available, the taxpayers would have paid for the results several times over—grants for the academic research, salaries for those academics giving their time for peer review, and subscriptions for such journals....

This calculus of “pay now or pay more later” might guide where the public ought to direct its investments to maximize the returns to the health care system. For example, in the value chain of scientific journal publication, paying the publication fees for open access journals is one way of supporting a business model that encourages the sharing of knowledge. Going further, the U.S. government could develop a system of supporting open access journals that publish peer-reviewed, publicly funded research. For those open access journals that charge publication fees, it could build support into the direct or indirect cost structure of grants. For those open access journals that do not charge fees, it could provide direct or indirect subsidies. Either way, it could support journals that provide open access rather than impose subscription fees on patients, providers and universities....


Cornell is considering an OA journal fund

Cornell University is considering a fund to pay publication fees at fee-based OA journals.  Thanks to Philip Davis for the alert and this summary and comment:

...John Hermanson, professor of biomedical sciences and chair of the library board, presented the Open Access author fund proposal on March 11 to the Cornell Faculty Senate. Like other proposals, the fund would cover the author processing fees for those who wish to publish in Open Access journals.

According to the Senate minutes, Cornell University Librarian, Anne Kenney, is “extremely interested” in allocating $25K of library funds, with the other $25K being matched by the Provost.

Anticipating the question on whether $50K was sufficient, considering that some journals charge up to $3,000 per article, Hermanson claimed:

We don’t know if that is enough. From the previous experiences we have been discussing initially, it’s probably more than is necessary because there are a number of open access journals already available to faculty to publish in and they are relatively low in terms of volume submission they have seen....

Of course, low demand may not be the future for author publishing funds, and this is where governance becomes a significant issue....

I spoke with Hermanson on this issue, and he responded that the details have not yet been worked out by the library board, but that the board understands the importance of implementing policies and priorities on how the monies should be spent.  These details should be no secret....

From his Cornell Faculty Senate presentation, Hermanson understands that publication is tightly coupled with the promotion and tenure of junior researchers and that denying publication funds may have serious deleterious effects on future careers....

Update (5/26/09).  Also see Stevan Harnad's comment:

It is beyond my powers of comprehension to fathom why Cornell University would want to throw $50K of scarce library funds at funding Gold OA publication (for at most 1% of Cornell's annual journal article output) without first mandating Green OA (for the remaining 99% of Cornell's annual journal article output) at no cost at all....

If and when all of Cornell's annual journal article output -- about 7.5K articles per year, according to Web of Science -- is made Green OA by a self-archiving mandate, and all other universities do likewise, the planet will have 100% Green OA to all journal articles. If and when the availability of universal green OA induces institutions to cancel all their journal subscriptions, then Cornell's $9M annual windfall cancellation savings will be more than enough to pay the peer review costs for Gold OA for its annual 7.5K articles. Paying a much higher price per article pre-emptively now, when the relevant funds are still tied up in subscriptions, while not even providing Green OA to 100% of Cornell's own research output, is a real head-shaker....

Comment (5/26/09).  I can't agree that OA journal funds are squandering money.  They support OA journals, which need support in parallel with (not just after) our support for OA repositories.  But I do agree, and have often argued (see also 1, 2, 3, 4), that "any university which understands the need for OA should also adopt a strong policy to ensure green OA for its research output....Unlike a gold OA policy, a green OA policy covers all the peer-reviewed articles published by faculty, regardless of the journals in which they choose to publish."
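For readers who want to check the numbers in this exchange, here is a rough sketch of the arithmetic. The inputs are the figures quoted above (a $50K fund, fees of up to $3,000 per article, about 7.5K Cornell articles per year, and Harnad's $9M estimate of subscription spending). The function and variable names are hypothetical, and the assumption that every funded article pays the maximum fee is mine, a worst case for the fund and not something either writer states.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# All names are illustrative; the dollar amounts come from the quoted text.

def articles_fundable(fund_usd: int, fee_usd: int) -> int:
    """Whole number of articles a fund can cover at a fixed fee per article."""
    return fund_usd // fee_usd


def share_of_output(articles: int, annual_output: int) -> float:
    """Fraction of a university's annual article output that is covered."""
    return articles / annual_output


def budget_per_article(annual_budget_usd: int, annual_output: int) -> float:
    """Dollars available per article if a budget were spread over all output."""
    return annual_budget_usd / annual_output


FUND_USD = 50_000            # proposed fund ($25K library + $25K Provost)
MAX_FEE_USD = 3_000          # "some journals charge up to $3,000 per article"
ANNUAL_ARTICLES = 7_500      # Harnad's Web of Science figure for Cornell
SUBSCRIPTIONS_USD = 9_000_000  # Harnad's estimate of annual subscription spending

funded = articles_fundable(FUND_USD, MAX_FEE_USD)
share = share_of_output(funded, ANNUAL_ARTICLES)
per_article = budget_per_article(SUBSCRIPTIONS_USD, ANNUAL_ARTICLES)

print(f"Articles fundable at the maximum fee: {funded}")
print(f"Share of annual output: {share:.2%}")
print(f"Subscription budget per article: ${per_article:,.0f}")
```

At the maximum fee the fund covers 16 articles, well under Harnad's "at most 1%" of annual output, and spreading the subscription budget over all 7.5K articles leaves $1,200 per article. Lower fees would raise the first number, so this is only a bounding sketch.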

More US institutions join SCOAP3

Monday, May 25, 2009

Brief guide to EPrints v3.2

Les Carr has posted a brief slide presentation on the features of EPrints 3.2 (currently in alpha release, with a public beta expected later this year).

Presentation on IRs and preservation

Dorothea Salo, Digital preservation and institutional repositories, presented at the Summer Institute for Data Curation (Urbana, Illinois, May 21, 2009). A slide presentation.

Presentation on OAI-ORE

Herbert Van de Sompel, An Overview of the OAI Object Reuse and Exchange Interoperability Framework, presented at Inforum 2009 (Prague, May 26, 2009). (Thanks to Charles Bailey.) A slide presentation on OAI-ORE.

See also our past posts on OAI-ORE.

OA books to solicit reader feedback

Keith Fahlgren, Collaborative Publishing Based on Community Feedback, O'Reilly Labs, May 21, 2009. (Thanks to Charles Bailey.)

... [O]ur first manuscript (Programming Scala) is now available for public reading and feedback as part of our Open Feedback Publishing System. The idea is simple: improve in-progress books by engaging the community in a collaborative dialog with the authors out in the open. To do this, we ... built a system to regularly publish the whole manuscript online as HTML with a comment box under every paragraph, sidebar, figure, and table.

After the impressive success of the Rough Cuts program from Safari Books Online, which we've long supported, and Real World Haskell, which used a similar system, we were extremely eager to try the idea out with more titles. Here's how Bryan O'Sullivan, one of the authors, summarized the idea once they were close to submitting their manuscript for publication:

How has our system of open, incremental development worked out? In my estimation, it has been a fantastic success, far overwhelming my expectations.

  • We have received 7153 comments so far.
  • That's an average of 1.73 comments per paragraph.
  • The usual number of technical reviewers for a technical book is 2.
  • 748 people have commented so far on our drafts.

Feedback from our readers has had a profound effect on the development of the book. ...

Bryan has since open sourced his Django-based feedback system ...

Yahoo releases open geodata

Gary Gale, Announcing GeoPlanet Data, The Yahoo! Geo Technologies Blog, May 20, 2009.

... Today at Where 2.0 2009 in San Jose, we’ve announced the public release of GeoPlanet Data, a downloadable resource of the geo data that underpins both [Yahoo!] GeoPlanet and Placemaker™.

GeoPlanet Data is a freely available, tab delineated download which is released under the Creative Commons Attribution license. Both GeoPlanet and Placemaker use GeoPlanet Data’s:

  • millions of place names in multiple languages
  • WOEIDs [Where on Earth IDs] - Geo Technologies’ unique and permanent identifiers
  • vertical and neighbouring relationships for each place ...
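
For readers who want to experiment with a tab-delimited download of this kind, here is a minimal sketch of reading such a file into a lookup table keyed by WOEID. The column names (`woeid`, `name`, `language`) and the sample rows are invented for illustration; they are assumptions, not the actual layout of the GeoPlanet Data files.

```python
import csv
import io

# Hypothetical sample: these column names and rows are made up for
# illustration and are not taken from the real GeoPlanet Data files.
SAMPLE = "woeid\tname\tlanguage\n1\tExampleville\tENG\n2\tVilla Ejemplo\tSPA\n"

def load_places(handle):
    """Read tab-delimited rows into a dict keyed by integer WOEID."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {int(row["woeid"]): row for row in reader}

places = load_places(io.StringIO(SAMPLE))
print(places[1]["name"])  # Exampleville
```

To use a real download, pass an open file handle instead of the in-memory sample and adjust the column names to match the file's header.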

On the pressure for society journals to sell out

Peter Murray-Rust, Should the Foology Society sell its journals to commercial publishers, A Scientist and the Web, May 17, 2009.

I received a request from a well-known learned society, which I will anonymise as the Foology Society ... A well-known scientist and long-standing member and officer of the Society (Prof. Foo) rang me and asked if I could give her informal advice about whether the Society should sell its flagship journal to a commercial publisher. The motivation was not primarily to raise revenue, but fear about the commercial prospects of society journals. She sent me the following which epitomizes the concerns of many of the society officers:

Libraries will target most of their cost cutting attentions on the smaller academic/not-for-profit sector subscriptions in order to protect the large commercial contracts such as the “Big Deal” and similar consortia. The days of small independent publishing are over, out-licensing is the only way to protect publishing income.

She regards this as a catastrophe for the society and the journal and asked if I could provide contrary views.

I immediately replied that on no account should the society sell its journal – it was the crown jewels and far too many societies had sold these. ...

As I blogged recently a major asset in C21 will be trust. I still trust learned societies to behave honorably (and when they do not it is deeply upsetting). I do not now trust commercial publishers to act honorably in all circumstances. The lobbying in Congress, Parliament, Europe by commercial publishers is often directly against the interests of scientists, most notably through the draconian imposition of copyright. The PRISM affair highlighted the depths to which some publishers will go to protect their income rather than the integrity of the domain. For Elsevier to finance PRISM to discredit Open Access science as “junk” while publishing “fake journals” means that no society can rely on their integrity. ...

It is possible that consortia such as Highwire can provide a critical mass but I don’t know enough about their purchasing or selling influence (if any). I believe they might provide the sort of bundle that Prof Foo needs.

How is the Foology Soc to continue to remain solvent? ... My own view is that societies must become points of rapid innovation, perhaps by teaming up with Universities and maybe through organs such as JISC. ...

Report from Sound Archives Film Image Repository project

Julie Allinson, SAFIR (Sound, Film, Archives Image Repository) Project: Final Report, report to JISC, January 2009. (Thanks to Charles Bailey.) Executive summary:

The SAFIR (Sound, Film, Archives Image Repository) Project was established to aid the University of York in starting a much longer project to establish a multimedia repository and flexible centralised Digital Library infrastructure. It was a small project with clear aims to

  1. Gather and examine user requirements;
  2. Evaluate and select software based on those requirements;
  3. Devise policies, processes and profiles to support the ingest of data and the creation of metadata;
  4. Implement the software with some access control and basic interoperability with other systems;
  5. Review the copyright status of resources and clear any copyright as necessary.

The project approach was to devise a work plan and a set of work packages, recruit staff and to focus on key project goals. In selecting an open source software product for the Digital Library infrastructure, the project took a more developmental direction and thus some level of re-planning and flexibility have been required, including agreeing a 6 month extension to the project. The project team work closely together and also with colleagues in Computing Service, the Library and the Virtual Learning Environment team. Engagement with external users and colleagues is critical and has been achieved through an Academic Advisory Group and direct meetings with academics. A Steering Group provides strategic leadership.

The project has achieved its aim and now has a functioning implementation of the Fedora Commons software with the open source Muradora add-on in place to provide a public interface for searching, browsing and accessing objects. Two collections of image objects have been migrated into the Digital Library and a basic level of access control is in place. An extensible metadata creation tool has been developed to create rich image-specific metadata. This tool uses a number of techniques to expedite the process of describing images. A range of policies and public documents have been created to support the development of the Digital Library, including a requirements specification containing high-level model and detailed requirements; and our content model for images, a document which provides a blueprint for how we create, describe and manage images within the Digital Library, including copyright and licensing issues. Intangible outcomes include the expertise we have built and the contributions to other work across the library and repositories community.

Creating a Digital Library is challenging and time consuming, particularly when building a bespoke system as we have chosen to do at York. The project continues to need careful management to ensure users remain engaged without raising expectations, along with good communication and realistic estimates on how long development will take. The decision to follow a software development path was taken in order to build in-house skill and tailor to our specific needs. The success of this will only become known as we roll out our system to users and make further customisations.

SAFIR is the start of our work to build a Digital Library and has provided a focus for driving the project forward. Over the coming two years we will extend the project, working closely with users to test usability and extending our content collections to incorporate a richer range of digital object types and subject disciplines.

A Norwegian perspective on the OA impact advantage

Bjarne Røsjø, Meir sitering med åpen publisering, May 20, 2009.  On the advantages of OA, especially for citation impact.  Read it in Norwegian or in Google's English.  (Thanks to Karen Marie Øvern.)

Trends in digital scholarly communication

Nancy L. Maron and K. Kirby Smith, Digital Scholarly Communication: A Snapshot of Current Trends, Research Library Issues no. 263, May 2009.  Excerpt:

...Summary of Findings

Digital innovations are taking place in all disciplines....

Digital publishing is shaped powerfully by the traditions of scholarly culture....

Some of the largest resources with greatest impact have been in existence a long while....

Achieving sustainability —especially for those resources with an open-access mandate— is a universal challenge....

Final report on the launch of the Welsh Repository Network

Jackie Knowles and Stuart Lewis, Final Report of the Welsh Repository Network Start-up Project, JISC, April 21, 2009.  (Thanks to Charles Bailey.)  Excerpt:

The aim of the Welsh Repository Network (WRN) was to put in place an essential building block for the development of an integrated network of institutional digital repositories in Wales. The project entailed a centrally managed hardware procurement programme designed to provide every HEI in Wales with dedicated and configured repository hardware. In close collaboration with the technical, organisational and operational support specifically provided for Welsh Higher Education Institutions (HEIs) within the JISC funded Repositories Support Project (RSP), also delivered from Aberystwyth University, this initiative provided a cost-effective, collaborative and decisive boost to the repository agenda in Wales and helped JISC achieve the critical mass of populated repositories and digital content that is a stated objective of the Repositories and Preservation Programme....

At its most practical level the principal deliverable of the WRN project has been the provision of repository hardware capacity in each and every HEI in Wales which, in combination with the hands-on technical support provided by the RSP, enabled all 12 HEIs to have functional institutional repositories by March 2009. More generally, the project has contributed a series of case studies and test sites that provide the wider JISC community with practical insights into the process of matching alternative organisational models, repository types and hardware configurations to different geographical and institutional settings. The main conclusion to be drawn from the WRN is that while providing funds for procuring hardware helps to push repository development up the institutional agenda, the support that goes with the funding, especially the technical support, is a far more crucial factor in generating a successful and lasting outcome.

From the body of the report:

It was agreed that the phased establishment of the WRN would be characterised by the following principal components: ...An operational open access digital repository in every Welsh HEI, that forms an integral part of the learning, teaching, research and administrative fabric of its host institution....

The predicted outcomes from the project plan were as follows: ...Trigger creation of 10 new open access digital institutional repositories and place an additional 2 pilot repositories on a firm footing, thereby building capacity and contributing to the achievement of critical mass that is an objective of the Start-Up strand of the Repositories and Preservation Programme....Act as a catalyst for institutional commitment to open access in Wales and provide a powerful stimulus to collaborative repository development based on best practice and common standards....

The project has successfully facilitated all these outcomes....

A key aspect of the project has been that the technical support provided via the WRN has allowed the development of repositories at a level that would have otherwise been impossible, particularly in some of the smaller institutions involved....

More generally, the project has raised the profile of the research agenda within Wales, especially through the publicity received upon the launch of the repositories and the kudos of being the first country within the UK to offer national coverage of Open Access repositories. Moreover, the regular and well attended series of WRN meetings of institutional representatives held during the past two years and the associated email discussion list has created a genuine community of repository managers in Wales, who now exchange information and share ideas and who have derived confidence and encouragement in their institutional work from the mutual support that the network provides.

PS:  Also see our February post on the launch of the Welsh Repository Network, which gave the WRN some of the kudos noted in the report.

Amsterdam closes its OA journal fund

The University of Amsterdam has had to shut down its OA journal fund.  From the site:

Researchers of the UvA can no longer obtain funding from the Open Access fund.

The consequences of the closure of the fund are:

  • All honoured applications will be paid for. 
  • Requests will no longer be granted.
  • The settlement with BMC will not be extended. Articles which have been submitted before May 7, 2009 stating the BMC membership account of the UvA will be granted. Articles which are submitted after May 7, 2009 can unfortunately not be granted; the costs of these articles must be paid by the author(s) or the department.
  • The discount of PLOS will be maintained in 2009 as the membership for 2009 has been paid for.

The UvA's Open Access fund was active from 2007 to May 2009. Due to a precarious financial situation the UvA has decided that the OA fund will not be extended after 2009....

Comment.  This is a sad casualty of the recession, which is affecting OA and TA resources alike.  In my predictions for 2009, I thought this might happen ("It will be harder to launch or replenish funds to pay publication fees at fee-based OA journals..."), even though, as access declines, the recession will simultaneously strengthen the case for OA.

Also see our post on the launch of the Amsterdam fund in January 2007.

An OA mandate for the University of Pretoria

The University of Pretoria senate voted unanimously to adopt an OA mandate, to take effect immediately. Here's the key language:

  1. To assist the University of Pretoria in providing open access to scholarly articles resulting from research done at the University, supported by public funding, staff and students are required to
    • submit peer-reviewed postprints + the metadata of their articles to UPSpace, the University’s institutional repository, AND
    • give the University permission to make the content freely available and to take necessary steps to preserve files in perpetuity.
  2. Postprints are to be submitted immediately upon acceptance for publication.
  3. The University of Pretoria requires its researchers to comply with the policies of research funders such as the Wellcome Trust with regard to open access archiving. Postprints of these articles are not excluded from the UP mandate and should first be submitted as described in (1). Information on funders' policies is available at [Juliet].
  4. Access to the full text of articles will be subject to publisher permissions. Access will not be provided if permission is in doubt or not available. In such cases, an abstract will be made available for external internet searches to achieve maximum research visibility. Access to the full text will be suppressed for a period if such an embargo is prescribed by the publisher or funder.
  5. The Open Scholarship Office will take responsibility for
    • Adhering to archiving policies of publishers and research funders, and
    • managing the system's embargo facility to delay public visibility to meet their requirements.
  6. The University of Pretoria strongly recommends that transfer of copyright be avoided. Researchers are encouraged to negotiate copyright terms with publishers when the publisher does not allow archiving, reuse and sharing. This can be done by adding the official UP author addendum to a publishing contract.
  7. The University of Pretoria encourages its authors to publish their research articles in open access journals that are accredited.
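
As a rough illustration of item 4, the access decision could be sketched as below. This is one reading of the policy text, not UPSpace's actual implementation; the function name, parameters, and return labels are all hypothetical.

```python
from datetime import date
from typing import Optional

def visible_content(permission_granted: bool,
                    embargo_ends: Optional[date],
                    today: date) -> str:
    """Return a label for what the repository would expose under item 4."""
    # Permission in doubt or unavailable: only an abstract is exposed.
    if not permission_granted:
        return "abstract_only"
    # A publisher or funder embargo suppresses the full text until it ends.
    if embargo_ends is not None and today < embargo_ends:
        return "embargoed"
    return "full_text"

print(visible_content(True, date(2009, 12, 1), date(2009, 5, 25)))  # embargoed
```

Note that deposit itself is unaffected by this decision: under items 1 and 2 the postprint is still submitted immediately upon acceptance, and only its public visibility varies.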

Comment.  This policy breaks important new ground.  It's the first OA mandate for South Africa, and the first for Africa at large, either from a university or a funder.  And it's another unanimous vote!  I applaud the mandatory language, the requirements for both deposit and permission, and the timing (deposit immediately upon acceptance).  Kudos to all involved.

Update (5/25/09).  Also see Eve Gray's comments.


Sunday, May 24, 2009

1 year review from PARSE.insight

The PARSE.insight project has released a review on its first year of work. (Thanks to Fabrizio Tinti.)

The motivation of the PARSE.insight project is to contribute to the long-term access to the digital resources created by scientific endeavour. ...

In the first year of the project the main emphasis has been surveying communities with an interest in digital preservation to build up insight, and developing a draft roadmap for the e-infrastructure. ...

Another clear message [from the surveys] is that researchers would like to (re-)use data from both their own and other disciplines, and it is suggested that this is likely to produce more and better science. However more than 50% report that they have wished to access digital research data gathered by other researchers which turned out to be unavailable. ...

During the course of the year, it was proposed to broaden the project scope away from preservation towards a more general science data infrastructure. ...

In the sustainability and evaluation work, the focus has been on the progress towards an international standard for audit and certification of digital repositories. A workshop was held in the US at which excellent progress was made on the draft standard, which is now close to submission to the ISO process. ...

By the end of the project an important base of data will have been assembled concerning the attitudes and practices of a wide range of scientific communities concerning digital preservation and science data infrastructure. This will provide an excellent body of evidence for policy makers, strategists and funders. The data will be both broad and deep (from the interviews and case studies).

The project will revise its roadmap, which will influence the agenda of development in the science data infrastructure for the coming years. The roadmap will be complemented by an understanding of the gaps with respect to the current situation. Additional stakeholders will be involved, from the Alliance for Permanent Access and through the series of workshops that are to be organised.

See also our past posts on PARSE.insight.

Apple drops its objection to an iPhone app for OA books

Last Thursday Apple rejected Eucalyptus as an ebook reader for the iPhone because it would display OA books, some of which "contain inappropriate sexual content".  Today Apple reversed itself. 

For details, see the Thursday blog post by Jamie Montgomerie, the Eucalyptus developer, and his update today.  (Thanks to Cory Doctorow.)

Individualized requests as prompts to self-archive

According to the KEI staff, "Under US FOIA laws, if an agency receives three requests for the same documents, they are required to put the data on their own web page."

Comment.  It's a very enlightened rule.  I've long urged an equivalent rule for scholars, and this is a good opportunity to urge it again.  If you receive even one request for an email copy of one of your articles, then self-archive the article.  It takes about as much time as sending the article as an attachment to your requesting colleague.  It will save you time responding to future requests, and spare other readers the need to request their own copies.  Of course routine self-archiving is even better.  But if you forget, regard every query as a reminder.

Mixed news on the Medical R&D Treaty

James Love, The World Health Assembly takes step back, but leaves door open, for medical R&D Treaty, Knowledge Ecology Notes, May 22, 2009.  Excerpt:

This morning the 62nd session of the World Health Assembly agreed to a resolution on public health, innovation and intellectual property that, among other things, settled outstanding issues regarding the “stakeholders” for various parts of the Global Strategy and Plan of Action (GS/PoA). With regard to the issue of a possible medical R&D treaty, the outcome of the negotiation was something of a split decision. On the one hand, the WHA agreed that the WHO would not be a stakeholder, in terms of the specific element of the WHO Global Strategy document. On the other hand, it was also agreed that the proposal for an R&D treaty would legitimately be considered by the WHO Expert Working Group on R&D Financing (EWG), that is reporting to the WHA in 2010, and that any country could present a proposal for discussions on an R&D treaty at any future meeting of the WHA or WHO Executive Board (EB). So the R&D Treaty is out, but it is in the EWG, and it might be back at the WHA next year....

In the views of many NGOs, government negotiators and WHO staff, this was an absurd result. If discussions on a medical R&D treaty are supposed to take place, why would the WHO be excluded from the discussions? What does this say about the WHO? What does it mean for the future of an R&D treaty?

The primary opposition to the medical R&D treaty is coming from the pharmaceutical industry, which does not like where the conversation is headed, with demands for more transparency, ethical norms for research, and attention to priority setting and accessibility of products. (See, for example, the proposals here.) The pharmaceutical industry also sees a medical R&D treaty as something that competes for policy space and paradigm framing with treaties and agreements that focus on strong IPR.

There is also opposition from some high income countries that fear new obligations to pay for R&D for priority projects, such as for treatments for neglected diseases, new antibiotics, open databases or materials libraries for medical research, or other public health priorities and public goods.

The debate this week showed how willing the WHO Secretariat is, under Dr. Chan’s leadership, to bow to pressure from the pharmaceutical companies and the US and the European Union.

Whether or not this is a victory or a defeat for the treaty opponents remains to be seen. Clearly it is a setback to remove the WHO as a stakeholder. But now that the treaty has become “forbidden” fruit, more countries and NGOs are interested, and in some unexpected ways, the debate about an R&D treaty may have moved forward. For example, today some countries have already indicated an interest in revisiting the issue at the January 2010 WHO Executive Board (EB) meeting, or pushing for a treaty in the EWG, and there is likely to be a more detailed review of the policy inside the Obama Administration, which was surprised by the controversy....

PS:  Also see our past posts on the Medical R&D Treaty and especially our post from last Friday on recent US efforts to kill it.  Note that the draft treaty includes a provision, §13.1, which would mandate OA to publicly-funded research.

Putting IP first, environment second

Mark Weisbrot, Green technology should be shared, The Guardian, May 20, 2009.  Excerpt:

The battle over intellectual property rights is likely to be one of the most important of this century. It has enormous economic, social and political implications in a wide range of areas, from medicine to the arts and culture – anything where the public interest in the widespread dissemination of knowledge runs up against those whose income derives from monopolising it.

Now it appears that international efforts to slow the pace of worldwide climate disruption could also run up against powerful interests who advocate a fundamentalist conception of intellectual property.

According to Inside US Trade, the US chamber of commerce is gearing up for a fight to limit the access of developing countries to environmentally sound technologies (ESTs). They fear that international climate change negotiations, taking place under the auspices of the United Nations, will erode the position of corporations holding patents on existing and future technologies....

[B]ig business doesn't want to take any chances. Today they are launching a new coalition called the Innovation, Development and Employment Alliance (IDEA). (You've got to love the Orwellian touch of those marketing consultants). Members include General Electric, Microsoft and Sunrise Solar. They will reportedly also be concerned with intellectual property claims in the areas of healthcare and renewable energy....

Comments


  • On the one hand, IP interests are wealthy, skilled at lobbying, and well-represented in most national legislatures and WTO delegations.  On the other, environmental interests can be their match, even if they are not yet their match.  The growing but largely unsuccessful academic and consumer coalition to reverse the destructive evolution of IP law could gain decisive strength from the environmental movement.
  • But note that patent maximalism is largely separate from the kind of copyright maximalism that occasionally threatens OA.  For example, the Chamber of Commerce, which is leading the US contingent to put patent interests ahead of the environmental, has long supported the NIH OA policy.

Update (6/11/09).  Unfortunately it's beyond my scope to track this topic in detail.  But on June 10, the House of Representatives adopted HR 2410, which --in Section 329-- unmistakably puts patents first and the environment second.