Open Access News

News from the open access movement

Saturday, October 22, 2005

Book author wants her book in Google Print

Book author to her publishing company: your lawsuit is not helping me or my book, October 20, 2005. Jason Kottke quotes from letters and blog postings from Meghann Marco. Excerpt (quoting Marco):
I'm a book author. My publisher [Simon & Schuster] is suing Google Print and that bothers me. I'd asked for my book to be included, because gosh it's so hard to get people to read a book [but Simon & Schuster refused.]...Kinda sucks for me, because not that many people know about my book and this might help them find out about it. I fail to see what the harm is in Google indexing a book and helping people find it. Anyone can read my book for free by going to the library anyway. In case you guys haven't noticed, books don't have marketing like TV and Movies do. There are no commercials for books, this website isn't produced by my publisher. Books are driven by word of mouth. A book that doesn't get good word of mouth will fail and go out of print. Personally, I hope that won't happen to my book, but there is a chance that it will. I think the majority of authors would benefit from something like Google Print.

Friday, October 21, 2005

Balancing access and privacy for research data about people

The National Research Council Panel on Data Access for Research Purposes has published Expanding Access to Research Data: Reconciling Risks and Opportunities. Like all books published by the National Academies Press, it will be available in a full-text OA edition and in a priced, printed edition. In this case, the OA edition is ready now and the priced edition is still forthcoming. Excerpt:
The most critical data [for public policy] are microdata --data about individual people, households, and businesses and other organizations. The benefits of providing wider access to microdata for researchers and policy analysts are better informed policies. The risk of providing increased access to microdata is increased risk of breaching the confidentiality of the data....We believe that the changes we recommend will result in wider access to high-quality anonymized public-use files as well as to potentially identifiable microdata. But such expanded access requires expanded procedural and legal protections. The panel believes that users, like agencies, should be held accountable for safeguarding the confidentiality of microdata files to which they are granted access. We recommend that statistical agencies set up procedures for monitoring any breaches of confidentiality that may occur, as well as their causes and consequences. We recommend that agencies require auditing of license holders and penalties for violations of the license....The statistical system of the United States ultimately depends on the willingness of the public to provide the information on which research data are based. To ensure such willingness, there must be scrupulous attention to assuring the informed consent of data providers, as well as continuing research into public attitudes relevant to data collection, privacy, and confidentiality.

More on the OECD report on OA

Philipp Bohn, The OECD takes on digital content, INDICARE, October 21, 2005.
Abstract: The OECD’s Working Party on the Information Economy (WPIE) has recently published four extensive reports on digital content. Their relevance for the DRM discussion is analyzed in the course of this article. Where applicable, they are also contrasted with differing findings and positions.

From the body of the article:

The report then introduces the concept of open access publishing. Authors following this concept "grant to all users the free, irrevocable, worldwide, perpetual right of access to copy, use, distribute, transmit and display the work publicly and to make and distribute derivative works, in any digital medium for any responsible purpose" (Bethesda Statement of Open Access Publishing 2003). Articles and papers are usually based on publicly funded research. Accordingly, funding agencies and institutions are more and more adopting the open access policy. Thereby, they are stressing the importance of knowledge creation and distribution and the integration of all the actors and activities within innovative systems. According to the report, DRM does not lend itself to the idea of open access publishing, as it is primarily meant to limit users’ rights in terms of openness and interoperability....I largely agree with WPIE’s assessment of the situation in online gaming and scientific publishing, especially when it comes to open access publishing.

Deeper implications of digitization

Ben Vershbow, google is sued... again, if:book, October 20, 2005. Excerpt:
Google calls the publishers' suit "near-sighted." And it probably is. The benefit to readers and researchers will be tremendous, as will (Google is eager to point out) the exposure for authors and publishers. But Google Print is undoubtedly an earth-shaking program. Look at the reaction in Europe, where alarm bells were rung by France, warning of cultural imperialism, of an english-drenched web. Heads of state and culture convened and initial plans for a European digital library have been drawn up....Google's book scanning touches a deep nerve, and the argument over intellectual property, significant though it is, distracts from a more profound human anxiety -- an anxiety about the form of culture and the shape of thoughts. If we try to grope back through the millennia, we can find an analogy in the invention of writing. The shift from oral to written language froze speech into stable strings that could be transmitted and stored over distance and time....This change not only affected the modes of communication, it dramatically refigured the cognitive makeup of human beings (as McLuhan, Ong and others have described). We are currently going through another such shift. The digital takes the freezing medium of text and throws it back into fluidity. Like the melting of polar ice caps, it unsettles equilibriums, changes weather patterns. It is a lot to adjust to, and we wonder if our great-great-grandchildren will literally think differently from us....In Phaedrus, Plato expresses a similar anxiety about the invention of writing....[Quoting Plato:] "For this discovery of yours [writing] will create forgetfulness in the minds of those who learn to use it; they will not exercise their memories, but, trusting in external, foreign marks, they will not bring things to remembrance from within themselves. You have discovered a remedy not for memory, but for reminding. You offer your students the appearance of wisdom, not true wisdom. They will be hearers of many things and will have learned nothing; they will appear to be omniscient and will generally know nothing."...As I type, I'm exhibiting wisdom without the reality. I've read Plato, but nowhere near exhaustively. Yet I can slash and weave texts on the web in seconds, throw together a blog entry and send it screeching into the commons. And with Google Print I can get the quote I need and let the rest of the book rot behind the security fence. This fluidity is dangerous because it makes connections so easy. Do we know what we are connecting?

Another plea for Google to prevail

Victor Keegan, A bookworm's delight, The Guardian, October 21, 2005. Excerpt:
Google's latest ambition - to digitise practically every book ever written so they can be searched in a fraction of a second - is so alluring that I find myself hoping it will win the lawsuit brought against it by publishers, even though I can't for the life of me work out which side is legally in the right as regards copyright....[The project] could increase the knowledge of practically everyone willing to learn, cut down the years of research needed to do a PhD, and may even provide a legitimate reason for continuing improvements in our school and college examination results....There is no problem digitising books that are out of copyright - which could lead to a boom in the rapidly expanding print-on-demand industry. But publishers are very angry about Google scanning books that are still covered by copyright protection, even if they are out of print and even though Google has offered an opt-out clause for authors and publishers not wanting to be part of it. Google argues that for books not in the public domain it will merely provide pointers that contain the search terms used with, at most, a few lines of text. So if you want the whole book you will have to buy it at Amazon or your local bookshop. That seems fair enough and a lot less damaging to authors than readers going into their local library and photocopying page after page....Since, in the absence of this initiative, [in-copyright but out-of-print] books would stay in literary limbo, isn't Google doing the world a favour?...Google wouldn't exist if its content providers [for ordinary or non-book searchers] had demanded royalties. Why didn't they? It was partly because the internet hasn't managed to find an efficient system for collecting micro-payments. But it was also because there was, and is, a kind of collective, if subconscious acceptance that the benefits of having all that information available for nothing far outweighs the messiness of asking everyone to pay, say, 1p every time they view a page.

Can the Contract Commons help universities negotiate better deals?

Contract Commons is a new initiative to help schools and governments get better deals from vendors and to help vendors better understand the needs of their public-institution clients. (Thanks to Brian Robinson via Tom Hoffman.) From the web site:
We intend to make it easier and more cost-effective for vendors and clients to think through relevant issues, memorialize them in cogent and legal agreements and build balanced, ongoing relationships. Contract Commons will also build a public education contracting community for procurement officials. The community will have access to various tools, including: [1] "Best of breed" technology contracts for public education, [2] Annotations to those contracts provided by top legal and technology professionals, [3] A searchable library of contract clauses, [4] A community forum to encourage debate, discussion and collaboration among procurement officials and vendors, [5] Primers on open source technology and contracts, and advice on how to integrate existing procurement practices with open solutions, [6] An expert contract drafting "wizard" to walk procurement officials through the business and legal issues necessary to consider when negotiating for technology, [7] A clearinghouse of vendor information, including information about vendor products and contract terms to create broader markets in public sector technology.

(PS: Though not directly related to OA, I'd be interested to hear from university libraries that use Contract Commons to improve their bargaining power with publishers and database vendors.)

ICSU calls for equitable access to data

A new ICSU report calls for equitable access to scientific data. From the October 20 press release:
Complex changes in data production, distribution and archiving--and issues they raise regarding who pays for data, who preserves it and who has access to it--should prompt an international initiative that ensures current and future scientists worldwide will have the information they need, according to a new report on challenges to data management and access presented today to the International Council for Science (ICSU). The report--written by an expert panel appointed by ICSU -- was formally presented today at the ICSU 28th General Assembly in Suzhou, China. It calls for establishing an international scientific data and information forum to promote a more coordinated approach to data collection and distribution. Such a forum could also play a key role in ensuring that scientists in developing countries have equitable access to scientific data and information....[According to Roberta Balstad, director of Columbia University's Center for International Earth Science Information Network and chair of the ICSU Priority Area Assessment (PAA) on Data and Information:] "For example, we don't always have the necessary legal and regulatory frameworks in place to get the full benefit of scientific data. We lack a coherent approach to preserving and archiving the incredible wealth of information being produced. And the more the access to long-term reservoirs of data becomes central to the modern scientific enterprise, the more it exacerbates inequities between scientists in rich and poor nations."...The panel examined a range of issues that affect data generation, quality and access. For example, its report notes that while public sector funding of data collection has been "a major factor" driving scientific progress over the past 50 years, decisions regarding data are often fragmented and taken without consultation with the scientific community. 
The result in "extreme cases" can be actions driven by political, administrative or budgetary factors that do damage to scientifically valuable data series. Meanwhile, the panel cautions that as the private sector plays a greater role in amassing and disseminating data, there is a risk that market demand, not scientific priorities, will determine what is collected and preserved and who has access. The panel notes that commercial interest in data collections can lead to license and user fees and intellectual property claims on data that become impediments to research. The report recommends that data produced commercially or through public-private partnership be provided for research and education purposes either free or at nominal cost. Price and other access barriers to scientific data weigh most heavily on researchers in poor countries...."A major problem for scientists in low-income countries is their lack of access to scientific publications, both as a means of learning about research in other parts of the world and as an outlet for their own research results," the report observes. Scientists are frequently charged not only to view but also to publish articles.

(PS: The press release doesn't link to the report it discusses, and the closest thing I can find at the ICSU web site is this report from December 2004. Note to ICSU: Can you make life easier for readers who want to learn what you're doing?)

More on OA and security

Yesterday the International Council for Science (ICSU) issued a press release on scientific freedom. Excerpt:
Warning that changes in the global political climate and concerns about international terrorism pose new challenges to scientific freedoms, the International Council for Science (ICSU) today urged its members to consider a renewed and broader commitment to the organization's bedrock Principle of the Universality of Science. A statement on threats to the Principle was formally presented by ICSU's Standing Committee on Freedom in the Conduct of Science to the ICSU 28th General Assembly in Suzhou, China....The committee's review of the Principle of Universality cites two distinct threats. There are today greater restrictions on the freedom to associate, which are leading to the relocation or cancellation of scientific conferences. There are also increasing restrictions on the freedom to pursue science, including politically motivated boycotts against countries and scientific institutions, and new security policies that have a chilling effect on such matters as hiring decisions, access to equipment and materials, and scientific publication....The committee also points to a new emphasis on security that has imposed restrictions that, even when driven by legitimate concerns, end up "undermining the Principle of Universality." According to the committee, "these issues are often complex and may manifest themselves as cumbersome or time-consuming new procedures and regulations or even re-interpretation of existing regulations" that prompt, among other things, censorship by authorities or "self-censorship by scientific publishers." "They affect individual scientists," the committee observes, "but also have broader policy implications involving careful judgments as to the appropriate balance between the freedom to pursue science and national and international policy imperatives." 
The committee has proposed that ICSU adopt a restatement of its Principle of the Universality of Science that will serve both as strong call for scientists to recognize their responsibilities while insisting on maintaining their rights. The proposed language declares that:
"This principle embodies freedom of movement, association, expression and communication for scientists as well as equitable access to data, information and research materials. In pursuing its objectives in respect of the rights and responsibilities of scientists, the International Council for Science (ICSU) actively upholds this principle, and, in so doing, opposes any discrimination on the basis of such factors as ethnic origin, religion, citizenship, language, political stance, gender, sex or age."

Live blogging of OAI4

Jeremy, OAI Day 1, Part I, The Digital Librarian, October 20, 2005. Excerpt:
Apparently, the conference is to be webcast, and the video will be uploaded five minutes after each session. URL for agenda and presentations: here....Welcome, Maximillion Metzger: CERN has been involved since the early 90’s in managing their own institutional repository. CERN participation in the open access movement due to its obligation to make the results of its work public. Erland Kolding-Nielson: LIBER is a major organization of major European research libraries. Its aim is to assist the research libraries to become a functional network across national boundaries. Herbert Van de Sompel: Herbert’s talk is a technical talk about OAI-PMH, especially about utilizing the OAI-PMH protocol for harvesting actual resources in addition to just metadata.

The entry also includes notes on talks by Michael Nelson, Simeon Warner, and Stu Weibel.

Also see OAI Day 1, Part II, October 20, 2005. Excerpt:

The first presentation after the break was the Ockham presentation. Eric [Lease Morgan] spoke (loudly) for Ockham, and how Ockham promotes open access to scholarly communication via lightweight protocols such as OAI-PMH....Eric described the Ockham alerting service to show a fairly simple alerting tool that utilizes OAI-PMH to pull content and SRU/W for search access. It alerts via email and RSS. Eric then described MyLibrary@Ockham to show a ‘how to find more like this one’ type of service. Eric related the registry to Herbert’s earlier presentation on aDORe and his use of OAI-PMH and Compound Objects alongside services. The best demo was Eric’s display of applying the spell recommender service to a search of the British Library. Stu Weibel presented for Jeff Young on Jeff’s WikiD project. WikiD extends the notion of Wikis to support multiple collections containing arbitrary XML schemas, and also it provides a number of different interfaces for access, such as OpenURL, SRU/W, RSS, OAI-PMH, etc....Johan Bollen presented on “A framework for assessing the impact of units of scholarly communication based on OAI-PMH harvesting of usage information.” Johan made a point that change in the scholarly communication process is a user-driven revolution (links, blogs, search engine use, etc.)....The final presentation of the day was by Tim Brody, on “Incites into Citation Linking using the OAI-PMH”. Tim showed a model for IR linking in Institutional Repositories. Tim recommends coupling OpenURLs within Simple DC accessed via OAI-PMH. His approach is to add OpenURLs not only for the object, but for the record itself, and to also add in reference data into a record as OpenURLs for cited items. Interesting, but it isn’t clear to me that the reward is great enough to overcome the barrier to adoption, especially if this is a bridging solution as opposed to a long-term solution. 
If folks start taking up the COinS approach, this may have a chance of being adopted, as COinS might encourage a range of functions that currently do not exist.
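
For readers new to the protocol that runs through these notes: OAI-PMH harvesting is done with plain HTTP requests that carry a `verb` parameter, and a harvester asking for Simple DC records uses the `oai_dc` metadata prefix. The following is only a minimal sketch of how such request URLs are put together. The base URL and the function names are hypothetical, chosen for illustration, and do not come from any of the talks described above.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint, for illustration only.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL.

    metadata_prefix "oai_dc" asks for Simple Dublin Core records.
    from_date and set_spec are optional selective-harvesting arguments.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def resume_url(base_url, token):
    """Build the follow-up request for the next batch of a long result list.

    In OAI-PMH the resumptionToken is an exclusive argument, so it is
    sent with the verb alone.
    """
    return base_url + "?" + urlencode({"verb": "ListRecords", "resumptionToken": token})


print(list_records_url(BASE_URL, from_date="2005-10-01"))
```

A harvester would fetch the first URL, parse the XML response, and keep calling `resume_url` with each returned token until the repository stops supplying one.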

New issue of Interlending & Document Supply

The latest issue of Interlending & Document Supply (vol. 33, no. 3) is now online. Here are the OA-related articles. Only abstracts are free online, at least so far.

More on Sun's GELC

Sharing Teaching Tools Online, Wired Campus Blog, October 20, 2005.
Scott McNealy, the CEO of Sun Microsystems, has long promoted the use of open-source software, in which volunteers work collaboratively online to build and improve computer code. Now he hopes to bring the same basic concept to the development of online teaching materials. In a keynote speech on Wednesday at the Educause annual meeting in Orlando, Mr. McNealy announced the creation of a nonprofit organization called the Global Education and Learning Community, which will provide a framework for educators to work together to develop and distribute educational resources online. For now the project is focusing on K-12 education, but in an interview after his speech, Mr. McNealy said that he hopes to extend the system to higher education eventually. "It's optional. No parent, teacher, or student has to use GELC content, but it's there," he added. "It's just a broader and deeper and richer platform on which to build your curriculum, and at zero cost."

More on Publishers v. Google

John Battelle, The AAP/Google Lawsuit: Much More At Stake, Searchblog, October 20, 2005. Excerpt:
I spent some time yesterday and this morning speaking with Allan Alder, counsel for the AAP (see my initial post on this here). I came away convinced of what I initially suspected but so far had not stated: this is a far bigger issue than simply book publishers wanting to protect their business models (though there's plenty of that in here as well.)...[T]here are a few larger issues percolating here that bear discussion. First, who is making the money? Second, who owns the rights to leverage this new innovation - the public, the publisher, or ... Google? Will Google make the books it scans available for all comers to crawl and index? Certainly the answer seems to be no. Google is doing this so as to make its own index superior, and to gain competitive advantage over others. That leaves a bad taste in the publisher's mouths - they sense they are being disintermediated, and further, that Google is reinterpreting copyright law as they do it. And this is not just about books. If Google - and by extension, anyone else - can scan and index books without permission, why can't they also scan and index video? Look at who owns the book companies that are suing - ahhh, it's Newscorp (Harper Collins), Viacom (Simon&Schuster), Time Warner (Little Brown). As I said, I plan more posts/pieces on this, as the issues raised - of innovation, of intellectual property rights, of business models, of more perfect search - are fascinating. But they are also nuanced in that they reflect some of our most treacherous technology/policy debates: the tension between DRM and innovation, between a creator's rights and the public good, between open and closed (the Craigslist/Oodle debate, for example, is very much related to this). After staring at this for a day or so, it's clear to me that this case will go to court. No one wants to settle. Google is digging in, and so is the media world. Folks, we have a real battle on our hands.

Google signing up German publishers

Deutsche Welle staff, German Publishers Warm to Google Library, Deutsche Welle, October 20, 2005. Excerpt:
This week Google introduced, the German version of the search engine Google posted online this time last year. Google lets users type in a search term then scans its digitized library for the word or expression and produces a list of books where the term is mentioned. Users can then click on the results and a few pages of the text appear allowing users to read up online. The idea is to supplement Internet users with another source of material. In contrast to their American colleagues, German publishing houses have reacted well to the database. Google said that it is discussing terms with all major German publishers. Langenscheidt, which publishes a large selection of dictionaries, said they had got on board. "We are starting with 160 books," Hubert Haarmann, head of the electronic publishing division and the publisher, told the Financial Times Deutschland. "We see it as an additional distribution possibility." An increased and more direct reach to the consumer is just one way Google is promoting its new project to skeptical publishers. The company also says that publishers will be able to monitor interest in titles through the search engine, and use the information in deciding whether to reprint certain books. Google has also promised publishers a cut of the advertising that will appear on the site. Not all German publishers are on board. In fact, the German association representing publishers has announced it will begin its own project where publishers can scan their own books, rather than let someone else do it.

(PS: Deutsche Welle misunderstands what's happening. The interviewed German publishers --like most US publishers-- support the Google Publisher project. There's no evidence yet whether they support or oppose the very different Google Library project. For more on the difference, see my article from the October SOAN. Basically, Google Publisher is opt-in and Google Library is opt-out. All five publishers now suing Google over its Library project are happy participants in the Publisher project.)

More on Publishers v. Google

There is now a Slashdot thread on the publisher lawsuit.

ODF plug-in for MS Office

Two Australian groups have developed an OpenDocument Format plug-in for Microsoft Office. For details, see the project page or Sam Varghese's story in The Age.

More on Publishers v. Google

Susan Kuchinskas, Google Print Hits The Fan, October 19, 2005. Excerpt:
Many AAP members are participating in the Google Print Publishers' Program, which lets them offer books for copying, specify how much of a book can be revealed to searchers and earn a share of revenue from ads shown by Google against search results. But AAP President Patricia Schroeder said publishers were angry about the Library Program. "Part of why they were so surprised that they went ahead with the library program is that every one of the plaintiffs is one of their partners in Google Print," Schroeder said. "It's a funny way to treat your partners."...[The Google Library project] was presented as a fait accompli [to publishers], and Google already had been scanning books in the collection of the University of Michigan for nearly a year. Jim Gerber, Google director of content partnerships, recently told that the search Goliath had to wait until all the contracts with libraries were signed before it could reveal the project to publishers....In response to publishers' complaints, Google added two new features to the Library Project. Publishers can give Google a list of books they want added to their accounts if Google scanned them from the library; or they can give the company a list of books they didn't want scanned....Google claims that scanning and indexing the books is covered by fair use guidelines; it's taken to describing the Library Project as "creating a digital card catalog." "If that's all they're doing, they only need to copy the bibliographic material. If they want to make it searchable, it's not longer a card catalog," Schroeder responded.

In an e-mailed statement, David Drummond, Google's vice president for corporate development and general counsel, said, "Google Print is an historic effort to make millions of books easier for people to find and buy. "Creating an easy to use index of books is fair use under copyright law and supports the purpose of copyright: to increase the awareness and sales of books directly benefiting copyright holders. This short-sighted attempt to block Google Print works counter to the interests of not just the world's readers, but also the world's authors and publishers." Schroeder said, "I keep being blown away by how they seem to think they have the right to take everything from everybody, because it's going to be so good for you. 'You don't get it, but fine, we're going to take it.'" She said that regardless of whether it might, for example, be good for someone in Bangladesh to be able to search through a book, "We have the right to decide these things." The AAP and the other publishing organizations have been criticized for "old media thinking," but Schroeder said the organization will continue to work with the Open Content Alliance, which has similar plans to build a searchable index of works in print.

Note from OAI4

JISC has posted a short note from Neil Jacobs on the CERN workshop on Innovations in Scholarly Communication (OAI4), now in progress (Geneva, October 20-22, 2005). Jacobs is the manager of JISC's Digital Repositories Programme. Excerpt:
This international gathering has developed considerable momentum since the first workshop in 2001. Initially a wholly technical meeting, it now brings together representatives from the library, technical, publishing and academic communities to address the latest developments in repositories and the technical and metadata standards which underpin their development and use. Topics under discussions today include: the building of federations of repositories in order to make their contents available in increasingly useful and relevant ways to users; improving measures of research output, and developments in open access. Another theme is the need for the re-engineering of the infrastructure that supports scientific data creation and analysis, so we can do more of it, better, faster and cheaper. Among other questions delegates will be asking, and perhaps answering, are: how does one know which version of a research paper one is getting from a repository? And how does one know one can trust the repository to give this information?

More on OA to books

Denise Troll Covey, Acquiring Copyright Permission to Digitize and Provide Open Access to Books, CLIR, October 2005. An OA report also available in a priced ($25) printed edition.
Abstract: What are the stumbling blocks to digitization? Is copyright law a major barrier? Is it easier to negotiate with some types of publishers than with others? To what extent does the age of the material influence permission decisions? This report, by Denise Troll Covey, principal librarian for special projects at Carnegie Mellon University, responds to many of these questions. It begins with a brief, cogent overview of U.S. copyright laws, licensing practices, and technological developments in publishing that serve as the backdrop for the current environment. It then recounts in detail three efforts undertaken at Carnegie Mellon University to secure copyright permission to digitize and provide open access to books with scholarly content.

The case for OpenDocument Format

David Berlind, Could ODF be the Net's new, frictionless document DNA? Between the Lines, October 12, 2005. A compelling case for the OpenDocument Format, partly as the endpoint for nearly every kind of work and partly as the midpoint for translating any kind of format into any other kind.

Thursday, October 20, 2005

Data on one journal's OA experiment

T. Scott Plutchak, The impact of open access, Journal of the Medical Library Association, October 2005. (Thanks to Charles W. Bailey, Jr.) Excerpt:
Between June of 2004 and May of 2005, the number of unique users accessing the Journal of the Medical Library Association (JMLA) and its predecessor, the Bulletin of the Medical Library Association (BMLA), on the National Library of Medicine's PubMed Central (PMC) system averaged just over 20,000 per month. When I first saw these numbers on the PMC administration site, I was astonished. The members of the Medical Library Association (MLA) itself (who we might presume are the main audience of the JMLA) number only about 4,500, and the print run of the journal is generally in the neighborhood of 5,000 copies. It seemed likely to me that the number of unique readers in any given month would be just some fraction of that core audience....I wondered if PMC has some kind of formula that they use to translate the number of IP addresses into number of readers, so I emailed Ed Sequeira, the project coordinator, at PMC. Further astonishment! He pointed out that it was likely that my supposition about DHCP was balanced by the aggregation of users behind corporate firewalls and then told me that, from surveys that they have done, there are half again as many actual users per IP address. Thirty thousand unique readers?...I can think of few things more likely to gladden the heart of an editor than this kind of evidence of the reach and impact of the journal on which he lavishes so much time and attention. I have no doubt that we would not be seeing these sorts of numbers if the JMLA were not freely available on the Web. From the standpoint of readership and reach, MLA's experiment with open access would appear to be a resounding success. But much of the discussion of open access during the past few years has focused on the risks. What of those?...So I looked at the revenue and membership figures for the last ten years. 
I wanted to examine the trend lines and see if anything appeared to change significantly around 2001/02, when the JMLA went up on PMC....Subscriptions had been falling for a decade, but the drop from 2002 to 2003 was far more dramatic than the previous declines. The number of subscriptions declined again in 2004, although not as dramatically, but revenue went up slightly, thanks to a modest rate increase. Whether this indicates a trend or not is still too early to say....Perhaps more worrisome from the standpoint of the long-term health of the association is the impact of an open access journal on the members' willingness to remain members. Here, the results are more encouraging. Total membership has declined during the entire period, but the biggest drop occurred in 2000/01, just before the PMC debut....To probe the views of members further, I worked up a quick online survey....I asked what degree of impact the JMLA's free availability had had on their decision not to renew their membership. Seventeen respondents fit in that category. Fourteen indicated little to no impact, two were neutral, and one indicated that it had had a major impact. When I asked the current members if the JMLA's free availability would make them more or less likely to renew their membership, 61% indicated that it would have no bearing; but, for 30%, it would make them somewhat to much more likely to renew. On the downside, 5% felt that it would make them much less likely to renew....Other questions in my survey indicated that the free availability would make people much more likely to read articles from the older issues and would make potential authors more likely to submit manuscripts. These, of course, are the things that an editor loves to hear....Despite what I said near the beginning of this editorial, it is too early to label the experiment an unqualified success. But has the attempt been worth it so far? I look again at the PMC statistics. Twenty to thirty thousand unique users? 
Has it been worth it? Oh, yes!

(PS: Read the whole article for Plutchak's judicious qualifications on the data. This is an exemplary report of a journal OA experiment. I wish other journals would follow suit, even if their experiments are less successful. We need to know not only what works and what doesn't, but what works in which niche.)
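
The readership arithmetic in the excerpt can be checked with a few lines of Python. This is only a rough sketch. It assumes the "just over 20,000" monthly figure counts unique IP addresses and that PMC's "half again as many actual users per IP address" can be applied as a flat multiplier of 1.5. The function name and the rounded inputs are illustrative, not anything PMC or the JMLA publishes.

```python
# Rough sketch of the readers-per-IP estimate described in the excerpt.
# Assumption: "half again as many actual users per IP address" is treated
# as a flat multiplier of 1.5; the real survey-based figure may differ.

def estimate_readers(unique_ips, users_per_ip=1.5):
    """Scale a count of unique IP addresses to an estimated reader count."""
    return round(unique_ips * users_per_ip)

monthly_unique_ips = 20_000  # "just over 20,000 per month", rounded down
print_run = 5_000            # "in the neighborhood of 5,000 copies"

readers = estimate_readers(monthly_unique_ips)
readers_per_printed_copy = readers / print_run

print(readers)                   # estimated unique readers per month
print(readers_per_printed_copy)  # estimated readers per printed copy
```

Under those assumptions the estimate lands on the "thirty thousand" Plutchak mentions, several times the size of the print run.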

Two studies of OA among society publishers

Gary D. Byrd, Shelley A. Bader, and Anthony J. Mazzaschi, The status of open access publishing by academic societies, Journal of the Medical Library Association, October 2005. (Thanks to Charles W. Bailey, Jr.) Excerpt:
The following is a brief report on the results of two recent studies conducted in partnership with the Association of Academic Health Sciences Libraries (AAHSL) and designed to look at the changing publishing practices of academic societies. Carried out from July 2003 through December 2004, these studies looked at the characteristics of journals published by academic societies affiliated with the Association of American Medical Colleges (AAMC), the Association of Learned and Professional Society Publishers (ALPSP), and High Wire Press as well as titles listed in the Directory of Open Access Journals (DOAJ). The first study was cosponsored by AAHSL and AAMC through its Council of Academic Societies (CAS), which included some ninety-four member societies representing academic disciplines taught in schools of medicine. The primary goal of this study was to help these societies, as well as AAMC member institutions and their libraries, understand the problems and opportunities faced by the CAS society journals as they shift from paper to electronic publishing. The second study was cosponsored by ALPSP, High Wire Press, the American Association for the Advancement of Science, and AAMC and was conducted by the Kaufman-Wills Group in Baltimore, Maryland. Called “Variations on Open Access,” this study sought to determine the potential impact of open access publishing on the business, editorial, and licensing practices of scholarly society journal publishers.

Holtzbrinck launches searchable book repository

VNU staff, Macmillan parent opens digital repository, Information World Review, October 19, 2005. Excerpt:
Holtzbrinck Group, the parent company of scientific and business journal publisher Macmillan, is to develop a searchable online repository for digital book content, which will be made available to other publishing companies. The facility, provisionally called BookStore, will offer services ranging from a basic digital storage facility to branded "interfaces" for e-commerce. Publishers will then be able to sell their material in various digital formats to users, and will have the option of making content accessible to search engines such as Google and Yahoo. Germany-based Holtzbrinck hopes it will appeal to trade publishers that are keen to exploit digital opportunities, but are loath to invest heavily in proprietary systems or to surrender control of copyright material - a point of issue with Google....The plans were welcomed by Google Print. Senior product manager Adam Smith said: "We are pleased to see publishers moving more into the digital environment, and welcome all efforts to make more content digital and searchable."

Using CC licenses on non-OA content

Jordan S. Hatcher, Can TPMs help create a commons? INDICARE, October 19, 2005.
Abstract: The Common Information Environment (CIE) recently released a report concerning the possibility of using Creative Commons licenses for information produced by public sector bodies (Barker et al. 2005). One of the issues that came up during the study was the compatibility of Creative Commons (CC) licenses and Digital Rights Management technologies (referred to here as Technical Protection Measures). Many public sector bodies felt that password protection schemes were a practical necessity and would not consider CC if they could not place materials behind a password. This article expands upon the conclusion in the report that CC licenses do allow password schemes and considers a broader scope of TPMs. Though any organization or individual looking to implement TPMs on CC licensed content must tread carefully, TPMs can be used to enhance the attractiveness of CC licenses.

GELC moving off on its own

Sun Microsystems is spinning off its Global Education and Learning Community (GELC) as a separate non-profit organization. GELC supports open-source software, open content, open standards, and open infrastructure for education. For more details, see yesterday's press release.

More on Publishers v. Google

Burt Helm, Google's Escalating Book Battle, Business Week, October 20, 2005. Excerpt:
[While] the lawsuit is a setback to Google's library program, the case has broader implications: A ruling against Google could disrupt its aims to digitize and make searchable all kinds of media and information...."If Google were to lose this, it might hinder not just Google Print but all sorts of technologies," says Fred von Lohmann, senior intellectual property attorney at the Electronic Frontier Foundation. Chances are next to nil that a negative ruling would jeopardize the Internet search business, since Google's practice of copying Web pages for search purposes is generally accepted by Web publishers. But a legal ruling forcing Google to gain explicit permission from all other copyright holders could hobble attempts to apply the same method to existing media, like books, film, or sound recordings in programs like Google Print and Google Video. "Web search would not exist today if you had to go door-to-door asking permission," says Google's intellectual-property counsel Alexander MacGillivray. The sheer volume of information is too vast to get permission on a case-by-case basis....In the summer negotiations, members of the AAP proposed that Google use the Bowker database, which assigns a specific number, called an ISBN, to every book published since 1967, for determining which books required permission. For all out-of-print copyrighted titles without ISBN numbers, the AAP would have a "more relaxed" agreement with Google, says Alan Adler, general counsel of the AAP. The AAP offer was untenable, Google's MacGillivray says. "If copyright law were such that if [a library] wanted to create a card catalog it had to find every single person who had the rights to these books...imagine how few books" would be accessible, he says. "It turns copyright law on its head."...Google says that, regardless of the suit, it plans to resume scanning copyrighted books on Nov. 1....Almost exactly a year ago, publishers showered praise on the search giant when it announced a slightly different program called Print for Publishers at the Frankfurt Book Fair in Germany. In that program, Google invited publishers to send it specific titles that it would scan so that it could make excerpts show up in search results, and publishers lauded the program as an innovative way to promote new and old titles alike. All of the companies involved in the litigation are partners with Google in that program, and say they plan to remain partners.

More on Publishers v. Google

Hiawatha Bray, Publishers battle Google book index, Boston Globe, October 20, 2005. Excerpt:
Jonathan Zittrain, holder of the chair in Internet governance and regulation at Oxford University and a visiting professor at Harvard Law School, said he's rooting for Google to win the lawsuit. But he added that from a legal perspective, the case is a tossup, with good arguments on both sides. "As a matter of legal doctrine, this is a case on which reasonable people can disagree," Zittrain said. But he believes that Google's project would provide substantial benefits to the public. At the Harvard University Library, director Sidney Verba said the lawsuit was unfortunate but might at least provide clear legal guidance on the digitizing of copyrighted books. "I'm sorry that it had to go to court," said Verba, "but I'd love to see some legal decision about it to clear things up, because it is right now terribly uncertain."

More on the publisher suit against Google

Scott Carlson, 5 Big Publishing Houses Sue Google to Prevent Scanning of Copyrighted Works, Chronicle of Higher Education, October 20, 2005 (accessible only to subscribers). Excerpt:
In their complaint, filed in the U.S. District Court for the Southern District of New York, the [five publishers] charged that Google is infringing copyright to "further its own commercial purposes." The publishers asked the court to forbid Google to reproduce their works and to require Google to delete or destroy records already scanned. The only remuneration the publishers seek is that Google pay their legal fees....David Drummond, Google's vice president for corporate development, released a statement denouncing the lawsuit as "shortsighted." He said it "works counter to the interests of not just the world's readers, but also the world's authors and publishers." He said that Google's project falls under copyright law's fair-use provision, that it would make books easier to find and buy, and that it would inevitably "increase the awareness and sales of books directly benefiting copyright holders." Patricia S. Schroeder, president of the publishers' group, said that publishers had been taken aback when Google announced its library-scanning project late last year. She said the publishers had held meetings with Google, in the spring and through the summer, repeatedly asking the company not to scan books under copyright. For a while this summer, Google stopped scanning copyrighted books while the negotiations were going on. But then Google announced that it would resume scanning books under copyright [on November 1]. "We don't seem to be able to get their attention," Ms. Schroeder said. "Instead, we get, 'This is for the global good,' and, 'This will be good for you, but you just don't get it.' We seemed to be talking past each other." "The real fear is that if Google can do this, anyone can do this," she added. "The precedent is just terrifying." Asked why the publishers did not also sue any of the universities involved, many of which are discussed in the complaint, Ms. Schroeder said: "Google is clearly the instigator. They are the driving force behind this." 
James L. Hilton, interim university librarian and associate provost for academic-, information-, and instructional-technology affairs at the University of Michigan [one of the libraries letting Google scan its books], said he was disappointed by the lawsuit. "We believe that this project has enormous benefit for humanity" in allowing people to search entire texts of obscure and long-out-of-print works through a computer, he said in a telephone interview Wednesday afternoon from the Educause conference, in Orlando, Fla...."From a public-policy standpoint," he added, "I think it would be very unfortunate if a judge decided to shut this down."

More on the publisher suit against Google

John Battelle, Here We Go Again: Publishers Sue Google, Searchblog, October 19, 2005. Excerpt:
I really don't get this. I have been both a publisher and an author, and I have to tell you, these guys sue for one reason and one reason alone, from what I can tell: Their legacy business model is imperiled, and they fear change. Of course, if they can get out of their own way, they'll end up making more money. But that never stopped these guys - the MPAA, the RIAA, and now, the AAP.

More on the OCA

Michael Bazeley, Consortium aims to digitize classic books, tech papers, Knight-Ridder, October 19, 2005. Excerpt:
"Our goal is to help with the expansion of human knowledge," said Dave Mandelbrot, Yahoo's vice president of search content. "What we'd like to see in two or three years is a major collaborative effort where libraries are contributing material and publishers are providing permission" to digitize their content....In fact, Mandelbrot said, the group's goal is to make the content available so that any search engine can index it and make it available. Digitizing the world's cultural archives, from television shows to classic books, has long been a burning ambition of Internet Archive founder Brewster Kahle. In fact, Kahle's Internet Archive has already launched an effort to digitize books called the Million Book Project, a collaboration with Indian and Chinese agencies and Carnegie Mellon University. "The real crime is that we have all these people using the Internet for research, but we don't have some of the best content on it,' Kahle said....In addition to working with libraries to scan older content whose copyright is expired, the consortium will collaborate with publishers and authors who want to make their works available on the Web. In some instances, the Open Content Alliance will give copyright-holders the option of releasing their material under a Creative Commons license, an alternative licensing scheme that encourages re-use and distribution of content.

More on the Open Content Alliance

Wade Roush, Digitize This, MIT Technology Review, October 20, 2005. Roush interviews David Mandelbrot, Yahoo's VP for search technology. Excerpt (quoting Mandelbrot):
Over the time we were discussing forming the alliance, Google did launch their program, and we looked at their program for ideas about what they were doing and things we might want to do differently. We do want to have copyrighted works available through the Open Content Alliance -- but only with the express permission of the copyright holder. Secondly, we mainly want the alliance to focus on this theme of openness. One of the things we've seen with other [digitization] programs is they tend to use proprietary technologies to host the content, so it's impossible for third-party search engines to crawl it. So we're using XML and PDF and making the content easily crawlable by search engines. It was important to make this project open so that entities that contribute know they're not just benefiting one search engine....When it comes to these digitization efforts, the publishers have primarily been speaking through the publishers' associations rather than individually, because they're concerned about any kind of retribution that could come from search engines if they're critical of any particular effort. But what we have heard from the publishers' associations is that they're very happy about the approach we're taking. The Association of Learned and Professional Society Publishers, for instance, has been very positive about our program, because of the fact we are working with the copyright holders in advance....We're encouraging participation in the alliance by all entities that are engaged in digitization efforts. The Open Content Alliance has already had a very preliminary discussion with Google about its participation, and we encourage Google to contribute work that they digitize to this alliance. We don't see the alliance as offering a competing digitization effort, but rather as establishing a set of guidelines for the sharing of content.

Some European authors worried about Google Print

Now that Google Print is expanding into Europe, some European authors are expressing their anxiety and disapproval. (Some, undoubtedly, have the opposite reaction.) Nancy Gohring has one account in Digit (October 20) and she and China Martens have another in Macworld (October 20).

More on authors vs Google

Jonathan Band, The Authors Guild v. The Google Print Library Project, October 15, 2005. Excerpt: "On September 20, 2005, the Authors Guild and several individual authors filed a complaint in federal district court in New York alleging that Google is engaging in 'massive copyright infringement' through the Google Print Library Project. This culminated months of publisher condemnation of the initiative, which involves scanning the collections of five major research libraries and making the full text of the books searchable on Google. Despite the allegations of infringement, libraries, users, and some authors have welcomed the Project, insisting that it will actually stimulate demand for books by helping readers identify books that contain the information they seek. These varying perceptions of the Print Library Project stem in part from confusion over exactly how much text will be viewable in response to a search query. Publishers and authors should carefully study precisely what Google intends to do and understand the relevant copyright issues before supporting the Authors Guild’s lawsuit."

Google Print books accessible to some, not all

Klaus Graf, How Google Print is Blocking Not-US-Citizens, Archivalia, October 19, 2005. Excerpt:
I do not agree with the praise of Google Print e.g. in the weblog entries of Peter Suber's "Open Access News". Google is - unlike Yahoo's - definitively no advocate of the Public Domain. Google is blocking users outside of the US from viewing books which are Public Domain world wide (resp. in the US, the European Union, Canada, Australia etc.). From the items of the Google Library program one can see outside the US only items published before ca. 1865 freely....Google should give free access to all books published before 1923 (PD in the US) AND of which the author is 70 years dead (European copyright term). Google is claiming copyright for simple facsimile reprints of PD works made e.g. by the publisher Kessinger. This is clearly COPYFRAUD in the US! The cooperating libraries (including Oxford outside the US) should not support a project which is only serving US interests not serving the enrichment of world wide knowledge and Public Domain. I would like to read from Open Access advocates like Suber clear words against Google's discriminating practice.

(PS: I've praised Google Print for the access it provides, not for the access it denies. I'm as puzzled and unhappy as Klaus that Google blocks European access while supporting US access to scanned books in the public domain, and I've written about it twice --here and here. I've also praised Brewster Kahle's book-scanning projects, which are now part of the Open Content Alliance, for surpassing Google Print in providing true open access.)
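
Klaus's proposed rule is simple enough to write down as a test. The sketch below is a deliberate simplification for illustration, not legal advice and not anything Google has implemented. It assumes the US cutoff is "published before 1923" and that the European term expires at the end of the 70th calendar year after the author's death, and it ignores renewals, anonymous works, editions, and country-by-country exceptions. The function name and the sample years are hypothetical.

```python
# Simplified sketch of the rule Klaus Graf proposes: show a book freely
# if it is public domain in the US AND under the European life+70 term.
# Assumptions (not from Google): US cutoff is publication before 1923;
# the European term runs to the end of the 70th calendar year after the
# author's death. Real copyright status depends on many more factors.

US_PD_CUTOFF_YEAR = 1923
EU_TERM_YEARS = 70

def free_under_graf_rule(pub_year, author_death_year, current_year=2005):
    """Return True if the book passes both the US and European tests."""
    pd_in_us = pub_year < US_PD_CUTOFF_YEAR
    # An unknown death year (None) is treated conservatively as still protected.
    pd_in_eu = (
        author_death_year is not None
        and current_year - author_death_year > EU_TERM_YEARS
    )
    return pd_in_us and pd_in_eu

# Hypothetical examples, evaluated for the year 2005:
print(free_under_graf_rule(1900, 1930))  # old book, author long dead
print(free_under_graf_rule(1900, 1950))  # old book, author died too recently
print(free_under_graf_rule(1925, 1930))  # published after the US cutoff
```

On this test many books published well after 1865 would qualify, which is the gap Klaus is pointing to.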

OA satellite images of Pakistan restored

Declan Butler, UN opens access to earthquake shots, Nature, October 19, 2005. Excerpt:
High-resolution satellite images of Kashmir, which was hit hard by a magnitude-7.6 earthquake on 8 October, have begun to reappear on public websites, much to the relief of aid workers. The pictures were removed last week from all public-access websites belonging to the United Nations (UN) and its relief partners, including the International Charter on Space and Major Disasters (see 'Quake aid hampered by ban on web shots'). A senior official at the charter, who asked not to be named, told Nature that the UN decided to ban public dissemination of photos of the area after a meeting on 10 October. The official told Nature that the meeting discussed an official reminder from Pakistan about the political sensitivity of the area, which was issued after the earthquake. Pakistan and India have long fought over Kashmir, and there were concerns that pictures could compromise security in the region. Tasnim Aslam, a spokeswoman for Pakistan's foreign ministry, told Associated Press in Islamabad yesterday that "No one in the Pakistan government has made a request that such maps be removed." Nature's sources emphasize that the UN decision was a precaution against a deterioration in relations with Pakistan. After pressure from relief groups seeking wider access to the images, the UN met again on 17 October, and reversed its decision. It sent a memo to all involved parties on the morning of Tuesday 18 October advising them that the ban on photos had been lifted....Since the ban has been lifted, AlertNet has published detailed maps of the region based on satellite-images, showing, for example, which roads are blocked.

New OA journal on PPARs

PPAR Research is a new peer-reviewed, open-access journal from Hindawi Publishing. From the web site:
PPAR Research is a multidisciplinary journal devoted to the publication of original high-quality, peer-reviewed research articles on advances in basic research, as well as preclinical and clinical trials, involving Peroxisome Proliferator-Activated Receptors (PPARs)....Published research encompasses, but is not limited to, the following areas: [1] Molecular features of PPARs and their obligatory heterodimer partners RXRs, [2] Biological processes involving PPARs and/or their obligatory heterodimer partners RXRs, [3] Discoveries of natural substances and synthetic agents that act as PPAR or RXR modulators, and [4] Preclinical studies and clinical trials involving modulators of PPARs and/or RXRs.
For more detail, see the press release.

University of Pennsylvania cuts 2,255 subscriptions, blames price hikes

Jesse Rogers, As costs rise, library cuts journals, The Daily Pennsylvanian, October 20, 2005. Excerpt:
Students combing the stacks at Van Pelt Library may notice they have a little extra breathing room. The library has cut 2,255 journal subscriptions from its 2004-05 holdings, as journal prices have increased faster than the library's budget. But the size of the materials budget -- $13.1 million allotted for books, journals, magazines, periodicals, films and electronic resources -- is not to blame, library officials said. Rather, officials blame big publishing companies, which they say have raised prices as the companies have bought up academic journals over the last two decades. In 1993, journals accounted for 64 percent of the materials budget. This number has increased to almost 70 percent in the 2005 materials budget. Publishing giant Reed Elsevier claims 18 percent of the market in science, technology and medical journals. An annual subscription to the chemistry journal Tetrahedron, published by Elsevier, costs the library $31,600. The Brown University library system has also criticized price increases. However, it has not had to cancel subscriptions since the early 1990s, as its materials budget has kept pace with journals' price increases. The 2005 Brown materials budget stands at $5 million, of which 66 percent is devoted to journals. As research libraries across the nation decry price increases, Penn's library system is calling for reform through its Winning Independence Web site. Linked to the library system's Web site this September, the site encourages professors to be active on journals' editorial boards and to push for fair pricing policies. At the heart of the uproar over pricing is frustration -- on the part of the library and some professors -- with publishers' restrictive copy agreements. Many journal publishers require faculty members to sign over their copyright as a condition for publication. 
This prevents professors from submitting published journal articles to online archives such as Penn's [OA repository] Scholarly Commons, which is one way the library can increase its holdings in the face of a limited budget.

Tagging, authority, and findability

Gene Smith interviews Peter Morville on tagging, authority, and findability in an October 19 posting on You're It. Excerpt:

Gene: How is authority related to findability?

Peter: My authority article stirred up a fascinating discussion on Web4Lib centered around this question. Historically, librarians have been comfortable with the notion that the most frequently cited academic papers (and their authors) are also the most popular, findable, and authoritative. But many are horrified by the migration of this concept to the public web. In truth, the comparison is not totally fair. Scholars invest more thought and structure into their citations than we invest in our links. But the revolution in authority is real. In a world where we can select our sources and choose our news, we must increasingly make our own decisions about what to believe and who to trust. And thanks to the well-documented anchoring bias, we’re highly influenced by the first information we find. In this sense, Google’s algorithms are as much about authority as relevance. And this is why the subtitle of Ambient Findability is “What We Find Changes Who We Become.”

Gene: I know many people who don’t get tagging. Do you think tagging is a novelty? Or can you see some persistent value in it that will keep tags around?

Peter: I hate tagging. It’s too much work. It’s so much easier to drag and drop an email message into a folder than it is to construct keywords that define its aboutness. And with respect to refindability, using Google Desktop’s full text search is infinitely better than relying on the semantic poverty of tags. On the other hand, as one element of Google’s multi-algorithmic search solution, tags in the form of links are a wonderful source of collective intelligence. Also, as ubiquitous computing yields an Internet of (non-textual) objects, user-defined tags will be important alongside the manufacturer-supplied metadata.

Ask Microsoft to support the OpenDocument Format

Microsoft has said that it will support the OpenDocument Format if there is enough consumer demand. The OpenDocument Fellowship has launched a petition to register consumer demand.

New address for NIH public-access policy FAQ

The NIH public-access policy FAQ has moved to a new address.

(PS: My FAQ, which focuses on answering publisher objections, has not moved.)

Collaborative OA to research data

Linda O'Brien, E-Research: An Imperative for Strengthening Institutional Partnerships, Educause Review, November/December 2005. Excerpt:
[L]ibraries have know-how not only in managing, making accessible, and preserving scholarly resources but also in forming federations and collaborations to share published scholarly work. But the nature of scholarly communication is changing, with researchers wanting access to primary research data, often in digital form. No longer is scholarly communication a final discrete publication that is to be managed, made accessible, and preserved. Libraries may even risk fading from existence if they don’t respond effectively to the changing environment. In e-research, it is the primary research data that must often be managed, made accessible, and curated. Clifford Lynch argues that the role of libraries will shift from primarily acquiring published scholarship to managing scholarship in collaboration with researchers who develop and use this data. Currently in the majority of existing e-research projects, the researchers, having the domain-specific knowledge, have sought to perform these tasks of managing and making accessible the research data. This data may be generated across multiple countries and across multiple research projects. Many are now realizing that this data is valuable beyond their initial research, which has a limited life. But who will take responsibility for the longer-term curation of and access to this data? Unlike their recognition of the need for IT know-how, those in the research community have not often recognized the role that librarians could play in providing specialist know-how in managing, preserving, and making accessible the research data. Research is changing dramatically. It is becoming more multidisciplinary, more collaborative, more global, and more dependent on the capabilities offered through advanced networks and large data storage.

JISC will tackle the version control problem

JISC is seeking proposals "to undertake a scoping study on version identification in relation to content in academic repositories." Proposals are due by November 18, 2005. See JISC's Word doc for the project details.

Publishers join authors in suing Google

Five publishers have sued Google for copyright infringement. From the October 19 press release by the Association of American Publishers (AAP):
The Association of American Publishers (AAP) today announced the filing of a lawsuit against Google over its plans to digitally copy and distribute copyrighted works without permission of the copyright owners. The lawsuit was filed only after lengthy discussions broke down between AAP and Google’s top management regarding the copyright infringement implications of the Google Print Library Project. The suit, which seeks a declaration by the court that Google commits infringement when it scans entire books covered by copyright and a court order preventing it from doing so without permission of the copyright owner, was filed on behalf of five major publisher members of AAP: The McGraw-Hill Companies, Pearson Education, Penguin Group (USA), Simon & Schuster and John Wiley & Sons. The suit, which is being coordinated and funded by AAP, has the strong backing of the publishing industry and was filed following an overwhelming vote of support by the 20-member AAP Board which is elected by, and represents, the Association’s more than 300 member publishing houses. “The publishing industry is united behind this lawsuit against Google and united in the fight to defend their rights,” said AAP President and former Colorado Congresswoman Patricia Schroeder. “While authors and publishers know how useful Google's search engine can be and think the Print Library could be an excellent resource, the bottom line is that under its current plan Google is seeking to make millions of dollars by freeloading on the talent and property of authors and publishers.”...Over the objections voiced by the publishers and in the face of a lawsuit filed earlier by the Authors Guild on behalf of its 8,000 members, Google has indicated its intention to go forward with the unauthorized copying of copyrighted works beginning on November 1. 
As a way of accomplishing the legal use of copyrighted works in the Print Library Project, AAP proposed to Google that they utilize the well-known ISBN numbering system to identify works under copyright and secure permission from publishers and authors to scan these works. Since the inception of the ISBN system in 1967, a unique ISBN number has been placed on every book, identifying each book and linking it to a specific publisher. Google flatly rejected this reasonable proposal. Noting the existence of new online search initiatives that respect the rights of creators, such as the “Open Content Alliance” involving Yahoo, Hewlett-Packard, Adobe and the Internet Archive, Mrs. Schroeder said: “If Google can scan every book in the English language, surely they can utilize ISBNs. By rejecting the reasonable ISBN solution, Google left our members no choice but to file this suit.” As a twelve-term Member of Congress, Mrs. Schroeder served as the Ranking Member on the House Judiciary Subcommittee on Courts and Intellectual Property. Mrs. Schroeder noted that while “Google Print Library could help many authors get more exposure and maybe even sell more books, authors and publishers should not be asked to waive their long-held rights so that Google can profit from this venture.”

(PS: Note the key concession that authors and publishers may benefit from Google's project. Apart from the use of ISBNs, the issues in the publisher suit are the same as the issues in the author suit, which I examined in the October 2 SOAN. The new suit will generate a lot of news coverage, but I'll try to blog only the stories and analysis that say something new.)

Google's response appeared on the company blog (October 19). It's written by David Drummond, Google's General Counsel and VP for Corporate Development. Excerpt:

We've been asked recently why we're so determined to pursue Google Print, even though it has drawn industry opposition in the form of two lawsuits, the most recent coming today from several members of the American Association of Publishers. The answer is that this program, which will make millions of books easier for everyone in the world to find, is crucial to our company's mission. We're dedicated to helping the world find information, and there's too much information in books that cannot yet be found online. We think you should be able to search through every word of every book ever written, and come away with a list of relevant books to buy or find at your local library. We aim to make that happen, but to do so we'll need to build and maintain an index containing all this information. It's no surprise that this idea makes some publishers nervous, even though they can easily remove their books from the program at any time. The history of technology is replete with advances that first met wide opposition, later found wide acceptance, and finally were widely regarded as having been inevitable all along. In 1982, for instance, the president of the Motion Picture Association of America famously told a Congressional panel that "the VCR is to the American film producer and the American public as the Boston Strangler is to the woman home alone." But Sony, makers of the original Betamax, stood its ground, the Supreme Court ruled that copying a TV show to watch it later was legal, and today videotapes and DVDs produce the lion's share of the film industry's revenue. We expect Google Print will follow a similar storyline. We believe that our product is legal (see Eric Schmidt's recent op-ed), that the courts will vindicate this position, and that the industry will come to embrace Google Print's considerable benefits. Even today, despite its lawsuit, the AAP itself recognizes this potential. 
The Google Print Library Program, AAP president Pat Schroeder said this morning, "could help many authors get more exposure and maybe even sell more books.” We look forward to the day that the program's opponents marvel at the fact that they actually tried to stop an innovation that, by making books as easy to find as web pages, brought their works to the attention of a vast new global audience.

Wednesday, October 19, 2005

Two new OA portals of health info for lay readers

Healthline is a new, searchable OA portal on health topics. Excerpt from Chris Sherman's review in Monday's issue of SearchDay:
Healthline is a specialized medical search engine that offers high-quality, authoritative information that's easy to find, even if you don't speak medicalese.... Healthline [is] the reincarnation of a site formerly known as [...]. Healthline is a specialized search engine that focuses exclusively on reliable, doctor-vetted information, covering 62,000 web sites with between 45-50 million pages. The site also features hosted content licensed from reliable content providers....[M]ost of us aren't trained in the use of medical terminology. Healthline addressed that problem by mapping a medical taxonomy over hundreds of common lay terms for diseases, medications and other health related terms. When you search, lay terms are translated and relevant medical information is presented. Even better, you're offered a number of query refinement tools to help you broaden or narrow your search....One of the coolest features is called a "health map," a visual display of all of the concepts related to your query. Health maps resemble flow-charts, showing phases of diagnosis, treatment, alternatives and so on. Want to explore one of those areas in depth? Just click the relevant box on the flowchart and a new search is run....If you prefer browsing, Healthline makes that easy too, offering more than 200 "channels" that each focus on a popular health topic. Each topic has featured articles, current news and related health tools and channels. Healthline also offers some useful tools for registered users. You can save, annotate or rate content that you find, or email articles to other users. All of this information is private, and is never shared, according to the company's privacy policy. Healthline is one of the best, easiest to use health information sources I've yet found on the web. The "patient friendly" interface combined with first-rate, vetted content make it an excellent resource for anyone researching health related information.

Also see the United Health Directory, another new, searchable OA portal of health information. UHD is less comprehensive and flexible, but some users may prefer its short, clear articles.

Update. Also see Tara Breton's review of Healthline in the October 24 issue of Information Today.

NDLTD wiki

More additions to PubChem

PubChem now contains bioassay data from ChemBlock and updated structures from NIAID.

More on PLoS Clinical Trials

Astara March, Online journal to cover clinical trials, UPI, October 18, 2005. Excerpt:
PLoS Clinical Trials, a new online journal, will be launched next spring to report results of all randomized controlled clinical trials on humans in all medical and public-health disciplines, its sponsor said Monday. The Public Library of Science in San Francisco said the only requirements for publication will [be] 1) that the trial was conducted according to the Helsinki guidelines on human research, 2) it is classified by an internationally accepted registry -- such as the International Standard Randomized Controlled Trial Number or [...] -- and 3) it is reported accurately according to CONSORT criteria....Each submitted trial will undergo peer review by a statistician and a clinical researcher in the appropriate specialty, the statement said. The journal will publish a summary of the reviewers' comments along with each article, and readers also will be able to comment on articles at a special section of the PLoS Web site. Dr. Harold Varmus, former director of the National Institutes of Health, and currently president of the Memorial Sloan-Kettering Cancer Institute and chairman of the board of PLoS, said the purpose of the journal was to make as much research information available as soon as possible. "The important thing is getting the results of the studies out, so the scientific community can review them," Varmus told United Press International. "Knowing that a drug that has been approved for one use is not effective in a different situation may keep another group of researchers from wasting resources and reinventing the wheel."...Global Trial Bank, a non-profit subsidiary of the American Medical Informatics Association, will collaborate with the PLoS on data storage. After a trial is published in PLoS Clinical Trials, the results will be coded and entered into the GTB database for open-access searching, browsing and data-mining, and reciprocal links will be created between the two entities.
Each published trial also will be linked to a site on the international registry chosen by its investigators.

Opening the door to OA ebooks

Peter Brantley, PDFs, eBooks, and DRM, Shimenawa, October 19, 2005. Excerpt:
A long long time ago, in a galaxy far, far away (New York), I made the suggestion in an Open eBook Forum that if publishers were to adopt DRM for eBooks, then libraries should be able to receive a copy of the work with the DRM either disabled or removed. This copy could then be placed in an archive against the risk that in the future, electronic-only books might be irrevocably lost. Even more than with object formats or MIME types, DRM is of-the-day. Within CDL's own monograph-based text delivery system, our eScholarship Editions repository, I've long hoped that eventually we would be permitted by the UC Press to make available portable PDF- or Palm Reader-formatted versions of open-access texts. With Adobe's participation in the Open Content Alliance, this thought surfaces again, but perhaps in a more user-oriented way. Given that (for now) all of the works being contributed for ingest into the OCA system are out-of-copyright, there are no legal barriers to multiple format derivations. Common sense permits the observation that out-of-copyright texts are not likely to be heavily accessed reference materials, but fiction, early science (largely useless for all but sociology or history of science), and older historical narratives. These works might actually be read, but they won't be if they are locked into page-turned web-based UIs....Uniquely, though, this corpus of curated material might be attractive to a large-ish and diverse group of users, and therefore...might be a rare strategic opportunity to help advance the desirability of device-independent texts. I wonder: if Adobe is truly committed to being more than an underlying technology provider to the OCA; if they truly support the concept that the content is open; and given that the content will be available in PDF: would they bless an open ebook pdf format, and leave it devoid of the DRM that Acrobat products currently incorporate?
Wouldn't that ultimately facilitate adoption of portable texts by the larger market, and therefore benefit everyone? I think so.

More on Google Library

Google Print Expands in Europe, Red Herring, October 18, 2005. An unsigned news story. Excerpt:
Google plans to expand its controversial Google Print book-scanning and digitizing program to eight European countries, as well as making books in several European languages searchable over the Internet. The company will open local-language sites as part of its digital library project in Austria, Belgium, France, Germany, Italy, the Netherlands, Spain, and Switzerland, in languages such as French, German, Italian, Dutch, and Spanish. Google was expected to make a formal announcement at the Frankfurt Book Fair on Tuesday....“They’ve actually had opposition in Europe as well as in the U.S., so they’ll keep facing it,” said Danny Sullivan, editor of [...]. “Then again, I have no doubt they also have publishers wanting to take part in the program, which is why they are expanding it.”...Some European publishers have also come on board, including Grupo Planeta and Grupo Anaya in Spain, De Boeck and Editions De L’Eclat in France, and Springer Science+Business Media in the Netherlands. Several large publishers in the United States are also participating, including Simon & Schuster, Random House, and Warner Books, according to USA Today, but other smaller publishers have opted out, including Rowman & Littlefield Publishing....Pat Schroeder, president and CEO of the Association of American Publishers, said that in the United Kingdom, Google is only copying books in the public domain, and she believes that Google should exercise the same policy in the rest of Europe as well as the United States. “We think maybe we have educated them, and we just hope they will bring their education back to our shores,” said Ms. Schroeder, a former U.S. Congressional representative. She sees Google’s current policy in the U.S. as a violation of copyright. “Google is going in and copying a full work without permission,” she said. “They’re making two copies, and giving one to the library. They say they have the right to make a copy because they say they’re only releasing a snippet.
Snippet is not a legal term. Google made it up.” “If Google can go and copy what’s in the library without permission, then everybody can copy it without permission,” she added. “Google is making new copyright laws by themselves and exercising eminent domain. The last time I looked, they weren’t elected.”

Comment. Four notes to Pat Schroeder. (1) A snippet is a fair-use excerpt. The word doesn't have to be a legal term. The excerpt has to fall within fair-use guidelines. (2) If Google's copying and snippet-sharing are protected as fair use, then everybody really can do the same. (3) Google is taking a bold step whose legality is uncertain. But its legality will be decided by a court, not by Google. (4) Is it possible you didn't already know these things?

Security trumps OA and disaster relief in Pakistan

Declan Butler, Quake aid hampered by ban on web shots, Nature, October 18, 2005. Excerpt:
Open-access satellite images are revolutionizing responses to disasters. Yet the government of Pakistan has forced aid agencies to remove pictures of earthquake devastation from the Internet....Three days after the 7.6-magnitude earthquake struck Kashmir on 8 October, the Pakistan government appealed for high-resolution satellite images to help relief efforts. But, apparently to protect national security, the government has since forced international agencies and relief organizations to remove these images from their websites. The International Charter on Space and Major Disasters put high-resolution images of the earthquake zone on its website last Friday, then pulled them off hours later. The charter, a consortium of space agencies, was created in 2000 to supply satellite images and data to communities in need of relief following a disaster....But a senior official at the charter, who asked not to be named, says that the Pakistan government had demanded that no photos be made accessible to the public, because it feared the images could compromise security in the Kashmir region - an area that has long been disputed territory between India and Pakistan. The UN and other aid agencies need Pakistan's cooperation on the ground, and had no choice but to comply, he says....The importance of publicly available satellite images and other geographical data was highlighted after the Indian Ocean tsunami in December 2004 and this year's hurricane Katrina. 
In the aftermath of the tsunami, thousands of geographers and researchers provided high-quality information on the Internet, "while government agencies were still struggling to call people back from their holidays", says Thierry Rousselin....[T]he lack of data is still being sorely felt on the ground [in Pakistan]....[An earthquake rescue worker with Pakistan's] Citizens Foundation, who did not wish to be named, pleaded in an e-mail to NASA and others: "Could you please tell me whether it's possible to get updated satellite imagery for this region (northern Pakistan), just like it was after Hurricane Katrina? Our country needs all the help it can get, and much more."

Finding out who visits ejournal web sites and why

The ALPSP is requesting proposals to investigate transient visitors to scholarly journal websites. From the site:
As the methods by which academics and others access research and all sorts of other electronic content change, the identification of who is visiting scholarly journals or related information online, and why, becomes ever more important. It is now the case that information seekers spend more time searching than they do analysing. Recent research has also demonstrated that a large proportion of visitors to academic websites are transient visitors, often non-subscribers who turn away before crossing payment barriers to available information, or visitors whose needs seem to be satisfied by an abstract or summary. The aim of this study would be to establish who these transient visitors are, and why they have visited in the first place. Were they looking just for an abstract? Have they been put off by the need to pay for an article? Are they unaware of a pay-per-view option (or, indeed, is one not offered)? Or are they just searching the web using a general search engine which has taken them to a particular site for no apparent reason other than a matching search term? The main aim of the research would be to develop methods to capture information on these transient visitors, to establish who they are, why they have entered a particular site, and whether they are satisfied or not satisfied with what they have obtained. It would also involve some analysis of who these visitors are, and whether an untapped source of potential revenue for content holders is being lost....Proposals must be received by 1st December 2005, and should be sent to Nick Evans.

Google Print shifts more power to readers

Michael Moskowitz, Go Google! Good riddance publishers! Another contribution to the EPS debate on Google Print. Excerpt:
Go Google! Good riddance to most publishers and to most academic institutions as well....Here’s the rub: as short-run [book-printing] technology improved, it became possible to produce at a reasonable product one at a time, i.e. print-on-demand. So in addition to there being little incentive for publisher to vet the quality of a book or to help improve it, there is little incentive to put it out-of-print. There are publishers who specialize in buying up backlist from other publishers and making the books available print-on-demand. Professional books of this sort are sold, often for more than $100, to the occasional desperate person or library. Book contracts are tradable commodities. Most authors have about as much say about what happens to their books as chickens have say about their eggs. Too many authors are stuck with their books in backlist limbo. They are not marketed, not readily available; and the author does not have the right to revise the book and take it elsewhere. They would be thrilled to have their books on Google. Most authors make very little money from their books. They want to get their ideas out there. It [is] a cliché worth repeating that the web has the potential to allow the greatest expansion in knowledge dissemination ever. Just as priests lost their ability to be the gatekeepers of knowledge with the invention of the book, publishers have lost their ability with the invention of the web. Publishers, with important exceptions, have also lost any moral authority to be arbiters or gatekeepers. Many academic and other small presses will publish nearly anything. Many big presses treat books as a generic marketable commodity with no defining characteristics. Coke or Pepsi? Will it sell? How do we sell it? “You can’t believe it because it’s in a book,” is truer now more than ever. Now it’s up to the people, or do I mean the consumer, to determine what is worth reading. 
To help make that determination, new forms of authority and expertise are arising on the web. Let’s not stand in the way.

Lost research impact in Australia

Brendan O'Keefe, Rewards await the far-cited, The Australian, October 19, 2005. Not in the free online section of the paper. Excerpt:
Australia's research impact could be increased to a level that would otherwise require an extra $425 million a year in funding if academics simply posted their papers on a personal or university web page rather than shelving them in a library or publishing them only in a little-read journal. Self-archiving, as proposed by Canadian academic Stevan Harnad, could bring academics and students greater exposure and more citations than through traditional publication. Harnad says the increase in impact of published research could elevate a university from a two-star ranking to five stars under the British research assessment exercise and Australia's proposed research quality framework....Harnad says even the richest university, Harvard, can afford to subscribe to only a small fraction of the 24,000 peer-reviewed journals in the world, which carry about 2.5 million articles per year. Smaller universities can afford even fewer subscriptions, so most published academic work remains unseen by most of the author's peers. "From the point of view of the author, the fact that so many potential users can't access his giveaway work is appalling," Harnad says. "There was nothing [that] could be done about this in [pre-internet] paper days; costs meant the only [way] you could provide access was by charging a toll. This is not true any more."...Harnad says failure to self-archive costs 50 per cent of the potential citations on Australia's $1 billion research spend....University of Tasmania computer scientist Arthur Sale, who has designed software that tells authors how often and where their work is being read, says research impact could skyrocket with self-archiving.
"If [the Government] contributed an extra $425 million per year for the ARC [Australian Research Council] and the NH&MRC [National Health and Medical Research Council], Australia's research impact as measured by citations would probably increase by about 40%....However, if it instead required a copy of all publications derived from research funded by ARC or NHMRC grants to be deposited in an institutional repository at the time of final acceptance, Australia's research impact would rise to the same level. That is what taxpayers are losing annually from delay in this necessary decision." QUT [Queensland University of Technology] eprint archive project officer Paula Callan oversaw the introduction of a self-archiving policy at the university in January last year. She says the concept was difficult to institute but, once authors saw the benefits, take-up spread. "In the feedback I've received, some expressed absolute wonder that their paper was being downloaded so often," Callan says.

October issue of RLG DigiNews

The October issue of RLG DigiNews is devoted to the certification of digital repositories (which is more about preservation than OA or interoperability).

Tuesday, October 18, 2005

Urgent call to preserve ejournals

Donald Waters (ed.), Urgent Action Needed to Preserve Scholarly Electronic Journals, October 15, 2005. This document represents the consensus of 18 academic librarians and university administrators who met at the Mellon Foundation offices on September 13, 2005, to discuss the preservation of ejournals. Excerpt:
The shift from print to electronic publication of scholarly journals is occurring at a particularly rapid pace....In the face of this shift, what makes preservation action so urgent for electronic scholarly journals -- and the risk of failure so high for the academy -- is the nature of the licensing regime under which these journals are now distributed. When research and academic libraries license electronic journals, they do not take local possession of a copy as they did with print. Rather, they use content stored on remote systems controlled by publishers, and economies of scale in electronic publishing are driving control of more and more journals into fewer and fewer hands....For electronic journals, the academy has as yet no functional equivalent in long-term maintenance and control over the scholarly record that "owning a copy" provided for printed journals. Unless and until it creates digital archiving services, the academy cannot fully shift to electronic-only journal publishing, and cannot fully achieve the system-wide savings and benefits associated with such a shift....Four key actions are essential. First, research and academic libraries and associated academic institutions must recognize that preservation of electronic journals is a kind of insurance, and is not in and of itself a form of access. Preservation is a way of managing risk, first, against the permanent loss of electronic journals and, second, against having journal access disrupted for a protracted period following a publisher failure. Storing electronic journal files in trusted archives outside the control of the publisher addresses the first risk. Mitigating the second risk requires investment in substitute access systems, which may cost more or less to construct depending on the quality and duration of disruption that the academic community would be willing to tolerate in the event of a failure.
Second, in order to address these risk factors and provide insurance against loss, qualified preservation archives would provide a minimal set of well-defined services. [Waters then lists six such services.] ...Third, libraries must invest in a qualified archiving solution....Finally, research and academic libraries and associated academic institutions must effectively demand archival deposit by publishers as a condition of licensing electronic journals.

Comment. I'm not the only OA activist who has tried to distinguish preservation from access so that the steps needed for preservation don't delay the steps needed for OA. But let's also admit that we can pursue OA and preservation in parallel so that neither effort delays the other. I've called preservation a separate essential --not part of OA but necessary-- and I stand by that. Preservation is essential not only for subscription ejournals but also for OA ejournals and repositories. I support the call by the Waters group. I particularly appreciate its diagnosis that the problem lies in licensing restrictions. Libraries rarely have permission to make copies for long-term local storage or to migrate these copies to new media and formats to keep them readable as technology changes. It's important for libraries to realize that OA removes these permission barriers and makes OA content easier to preserve than non-OA content. But it's just as important for OA activists to realize that this only makes preservation efforts permissible. It doesn't by itself get the job done. We still have work to do.

Update. The document has moved to a new URL (already incorporated above). Note to CLIR: please put a redirect page at the original URL.

Standards for OA to proteomics data

Andreas Kremer, Reinhard Schneider, and Georg C. Terstappen, A bioinformatics perspective on proteomics: data storage, analysis, and integration, Bioscience Reports, February/April 2005. Only this abstract is free online, at least so far:
The field of proteomics is advancing rapidly as a result of powerful new technologies and proteomics experiments yield a vast and increasing amount of information. Data regarding protein occurrence, abundance, identity, sequence, structure, properties, and interactions need to be stored. Currently, a common standard has not yet been established and open access to results is needed for further development of robust analysis algorithms. Databases for proteomics will evolve from pure storage into knowledge resources, providing a repository for information (meta-data) which is mainly not stored in simple flat files. This review will shed light on recent steps towards the generation of a common standard in proteomics data storage and integration, but is not meant to be a comprehensive overview of all available databases and tools in the proteomics community.

Developing an OA consumer health vocabulary

Qing T. Zeng and Tony Tse, A Case for Developing an Open-source First-Generation Consumer Health Vocabulary, Journal of the American Medical Informatics Association, October 12, 2005. Only this abstract is free online, at least so far:
Lay persons (consumers) often have difficulty finding, understanding, and acting on health information due to gaps in their domain knowledge. Ideally, consumer health vocabularies (CHVs) would reflect the different ways consumers express and think about health topics, helping to bridge this vocabulary gap. However, despite the recent research on mismatches between consumer and professional language (e.g., lexical, semantic, and explanatory), there have been few systematic efforts to develop and evaluate CHVs. This paper presents the point of view that CHV development is practical and necessary for extending research on informatics-based tools to facilitate consumer health information seeking, retrieval, and understanding. In support of the view, we briefly describe experiment with a distributed, bottom-up approach for (1) exploring the relationship between common consumer health expressions and professional concepts and (2) developing an open-access, preliminary (draft) first-generation CHV. While recognizing the limitations of the approach (e.g., not addressing psychosocial and cultural factors), we suggest that such exploratory research and development will yield insights into the nature of consumer health expressions and assist developers in creating tools and applications to support consumer health information seeking.

Help RepoMMan design a tool to work with OA repositories

The RepoMMan project is conducting a survey to help it design a software tool to help researchers work with digital repositories. From the survey site:
At a very crude level, a digital repository can be thought of as a database-driven website which is used to store, say, research outputs and which can be interrogated from other websites and through search engines to retrieve those outputs on demand. However, we believe that a repository can be useful to researchers well before they get to the finished-product stage. We hope that, through our work, researchers will be able to harness the potential of a digital repository to support them through the development of their research, from idea to output, whether they are working on their own or collaboratively. It is our intention that, when their research is completed, our software will allow the researcher to make the textual report available on the web and also to link it to such things as images, datasets and pointers to related papers or research. Effective location of research outputs available on the web relies heavily on the material being properly indexed through the use of 'metadata'; our tool will help generate this metadata. The results of this survey, together with information from personal interviews with active researchers, will help us design an effective software tool.

There is no deadline on responses, though a prize drawing among respondents who volunteer their email addresses will be held in mid-November.

OA to science in the developing world

Peter Suber and Subbiah Arunachalam, Open Access to Science in the Developing World, World-Information City, October 18, 2005. World-Information City is the print newspaper for the November 16-18, 2005, meeting of the World Summit on the Information Society in Tunis. Excerpt:
OA is a matter of special concern in developing countries, which have less money to fund or publish research and less to buy the research published elsewhere. Most libraries in sub-Saharan Africa have not subscribed to any journal for years. The Indian Institute of Science, Bangalore, has the best-funded research library in India, but its annual library budget is just Rs 100 million (about $2.2 million)....There are many successful OA initiatives in the developing world. These include Bioline International, which hosts electronic OA versions of 40 developing country journals; SciELO, which hosts more than 80 journals published in Latin American countries and Spain; and African Journals Online (AJOL), which provides free online access to titles and abstracts of more than 60 African journals and full text on request. The Electronic Publishing Trust for Development (EPT), established in 1996, promotes open access to the world's scholarly literature and the electronic publication of bioscience journals from countries experiencing difficulties with traditional publication. India is home to many OA journals that charge no author-side fees. All 10 journals of the Indian Academy of Sciences and all four journals of the Indian National Science Academy are OA journals....The Indian Medlars Centre of the National Informatics Centre is bringing out OA versions of 33 biomedical journals and has an OA bibliographic database, providing titles and abstracts of articles from 50 Indian biomedical journals. Medknow Publications, a company based in Mumbai, has helped 30 medical journals make the transition from print to electronic open access and most of them are doing much better now than before. OA archiving is even more promising than OA journals. It is less expensive, allows faster turnaround, and is compatible with publishing in conventional journals.

Google CEO defends Google Library

Eric Schmidt, The Book of Revelation, Wall Street Journal, October 18, 2005 (accessible only to subscribers). Schmidt is the CEO of Google. (Thanks to Thiru Balasubramaniam.) Excerpt:
Imagine sitting at your computer and, in less than a second, searching the full text of every book ever written....That's the vision behind Google Print, a program we introduced last fall to help users search through the oceans of information contained in the world's books. Recently, some members of the publishing industry who believe this program violates copyright law have been fighting to stop it. We respectfully disagree with their conclusions, on both the meaning of the law and the spirit of a program which, in fact, will enhance the value of each copyright. Here's why....For many books, these results will, like an ordinary card catalog, contain basic bibliographic information and, at most, a few lines of text where your search terms appear. We show more than this basic information only if a book is in the public domain, or if the copyright owner has explicitly allowed it by adding this title to the Publisher Program (most major U.S. and U.K. publishers have signed up). We refer people who discover books through Google Print to online retailers, but we don't make a penny on referrals. We also don't place ads on Google Print pages for books from our Library Project, and we do so for books in our Publishing Program only with the permission of publishers, who receive the majority of the resulting revenue. Any copyright holder can easily exclude their titles from Google Print -- no lawsuit is required. This policy is entirely in keeping with our main Web search engine. In order to guide users to the information they're looking for, we copy and index all the Web sites we find. 
If we didn't, a useful search engine would be impossible....Only by physically scanning and indexing every word of the extraordinary collections of our partner libraries at Michigan, Stanford, Oxford, the New York Public Library and Harvard can we make all these lost titles discoverable with the level of comprehensiveness that will make Google Print a world-changing resource....The program's critics maintain that any use of their books requires their permission. We have the utmost respect for the intellectual and creative effort that lies behind every grant of copyright. Copyright law, however, is all about which uses require permission and which don't; and we believe (and have structured Google Print to ensure) that the use we make of books we scan through the Library Project is consistent with the Copyright Act, whose "fair use" balancing of the rights of copyright-holders with the public benefits of free expression and innovation allows a wide range of activity...without copyright-holder permission. Even those critics who understand that copyright law is not absolute argue that making a full copy of a given work, even just to index it, can never constitute fair use. If this were so, you wouldn't be able to record a TV show to watch it later or use a search engine that indexes billions of Web pages....Backlist titles comprise the vast majority of books in print and a large portion of many publishers' profits, but just a fraction of their marketing budgets. Google Print will allow those titles to live forever, just one search away from being found and purchased....How many users will find, and then buy, books they never could have discovered any other way? How many out-of-print and backlist titles will find new and renewed sales life? How many future authors will make a living through their words solely because the Internet has made it so much easier for a scattered audience to find them? 
This egalitarianism of information dispersal is precisely what the Web is best at; precisely what leads to powerful new business models for the creative community; precisely what copyright law is ultimately intended to support; and, together with our partners, precisely what we hope, and expect, to accomplish with Google Print.

Educating faculty about OA

Dorothea Salo, Out of my bubble, Caveat Lector, October 17, 2005. Excerpt:
It’s easy to read the statistics. X percentage of research faculty think open access is a neat idea. Y would be willing to post their research materials. Z think they ought to keep their own copyrights. What’s not easy is smacking one’s nose into not-X, not-Y, and not-Z. What’s not easy is realizing that even among X, Y, and Z, these questions are purely theoretical for most. Sure, they think it’s a nice idea; doesn’t mean they’re aware or willing enough to do anything about it....I live in the middle of open-access evangelism. I read Peter Suber every day. My biggest category (well, apart from “css”) contains open-access linkage. With all this talk floating around out there, how could faculty be ignorant? Surely they’ve seen something. Then I went to a meeting where open access was confused with open source (my fault, that), where folks were concerned that open access would reduce awareness, where they worried (not entirely without justification this time) that e-publication reduced scholarly cachet. Oh, my. I have got a lot of education to do....I’m not panicking. The Evil Master Plan actually is proceeding reasonably apace, with one faculty visit definite for next month, and another likely. I’m working on squibs, handouts, an article for a local journal. The scales have fallen from my eyes, however. This is going to be an uphill climb.

(PS: Hang in there, Dorothea. Educating faculty is the front line.)

Blog coverage of Access 2005 conference

Planet Access is providing real-time blog coverage of Access 2005 (Edmonton, October 17-19, 2005). (Thanks to Richard Ackerman.)

Scirus indexes CalTech repository

Scirus now indexes CODA, the CalTech institutional repository. For more details see Elsevier's press release (October 17).

JISC funds help three journals explore OA

JISC has announced the publishers it will support in the third round of its Open Access Programme. From today's press release:
JISC today announced the winners of funding under the third round of its Open Access programme under which publishers are awarded seed money to explore open access models of publishing. Following the success of the first two years of the programme, the decision has been made to award three publishers funds to support open access delivery for their journals. A total of £84,500 will be awarded to some of the most important scholarly titles in their fields. These are: New Journal of Physics (published by the Institute of Physics Publishing); the journals of the International Union of Crystallography (IUCr); and the Journal of Medical Genetics (BMJ Publishing Group Ltd.). All three are previous recipients of funding from earlier rounds of this programme, a level of continuity that increases the growing evidence base provided by the programme, which is to be formally evaluated early next year. Peter Strickland, Managing Editor of IUCr, said: "As a result of the JISC funding so far, I am strongly convinced that providing authors the opportunity to make their papers open access works well, provides authors with extra choice and improves access to published content."...Andrea Horgan, Managing Editor of BMJ Journals, said: “Earlier JISC funds have enabled us to compare the profile and usage of open access papers with those that are behind closed access to gauge author [reader?] responses in both cases. This involves randomisation of papers to ensure the integrity of the results. This is an ongoing process and the additional grant money will allow us to randomise a further 20 papers from UK authors (10 to open access) to gather further evidence about this model.” Lorraine Estelle, JISC Collections Team Manager, said: “This programme continues to provide us with much-needed evidence about the impact of open access models of publishing on the conduct and dissemination of research."

Update. The IOP has issued its own press release (October 20).

New folksonomy/search/discussion tool

Shadows is a new service from the people at Pluck. It not only lets users tag web pages, share their folksonomies with others, and search the results (like Connotea, for example), but it also creates an open space for discussion and peer commentary on any web page. It does this by creating a "shadow page" --operating like a group blog-- for any existing web page. The service is free. For more detail, see the Pluck press release.

More on Google in Europe

Edward Wyatt, Google Opens 8 Sites in Europe, Widening Its Book Search Effort, New York Times, October 18, 2005. Excerpt:
Google said Monday that it had begun operating local-language sites in eight European countries for its Google Print program, its closely watched effort to make all of the world's books searchable online, expanding into territories where it has drawn fierce criticism. The Google Print sites - for France, Italy, Germany, the Netherlands, Austria, Switzerland, Belgium and Spain - enable users to search books provided by publishers in each country as well as English-language books in the Google library for which the company has secured local rights. Susan Wojcicki, a vice president for product management at Google, said in an interview Monday that the new sites currently could be used to search only a relatively small number of books. Many of those have been scanned since August, when the company, seeking to expand its online book program, began approaching European publishers. Google is planning to discuss the new sites this week at the Frankfurt Book Fair....That discussion is also to include Jean-Noël Jeanneney, the president of the French National Library, who began advocating for a European effort to digitize and catalog the Continent's library collections soon after Google announced agreements with five major libraries last December to digitize their collections of 15 million books and documents. The European sites work much the same as the main Google Print site. A user searching the German site for a word will receive links to books containing that word. The user can see some of the pages in each book where the word appears, review the book's bibliographic information and link to retailers that will sell the book directly to the user. Eventually, the European sites will give users access to data about foreign-language books in the collections of the New York Public Library and the university libraries of Stanford, Harvard, Michigan and Oxford.
Among the European publishers that have signed pacts with Google are Grupo Planeta and Grupo Anaya of Spain, De Boeck and Editions De L'Eclat of France and Springer Science & Business Media of Holland.

More on Google Library in Europe

Mark Beunderman, Google opens digital library ahead of EU governments, EU Observer, October 18, 2005. Excerpt:
The Internet search engine Google has launched its books scan engine in eight European countries. The move comes ahead of steps by the European commission and EU member states who also plan to set up their own European digital library. Google will on Tuesday (18 October) officially present the opening of its controversial digital library service in eight European countries at the yearly Frankfurt Book Fair, a major German cultural event. Last Sunday, Germany, France, Italy, the Netherlands, Austria, Switzerland, Belgium and Spain already witnessed the launch of "Google Print"....Google's move comes after European Commissioner Viviane Reding in September announced Brussels’ own project for digitisation and preservation of Europe's cultural heritage – generally seen as a competitor to Google Print. The commission initiative followed strong pressure from France, where Google Print after its earlier launch in the US and the UK had sparked fear for Anglo-American cultural domination.

More on Google Library in Europe

Bobby Pickering, Google clarifies Print differences in Europe, Information World Review, October 18, 2005. Excerpt:
Google UK has sought to clarify its position with the Google Print project, saying it is taking an entirely different approach in Europe....Rabin Yaghoubi, strategic partner development director in Europe, told IWR in an exclusive interview that the company believes it is acting completely legally in the US, under the laws of fair use. “But we want to make it clear that in Europe we are only scanning library works that are in the public domain and pre-1900”. The Google Print project is divided into two strands – a Publisher Partner programme and a Library Partner programme. The latter has involved seven libraries, with just Oxford’s Bodleian Library outside the US. Google is talking to other libraries in Europe to join the programme. “The library project will be of enormous use to authors and publishers. More than 85% of works are out of print, and it will make this knowledge available to researchers.” He said the two programmes deliver different results pages for searches in Google Print. “On Publisher pages there will be links to e-commerce sites, while the Library pages will have links back to the library, which will drive people to these libraries.” There will be no ads shown on pages from the Library programme.

Free online Filipino journals

A Filipino librarian (apparently Vernon Totanes) has posted a list of seven free online Filipino journals in the sciences and humanities. The journals are not yet in the DOAJ, but if they qualify, I join Totanes in hoping that they will be added soon.

More on the DLF Aquifer

Katherine Kott, The DLF Aquifer Initiative: progress and next steps, a slide presentation to the DLF board on October 6, 2005. Kott is the director of the Aquifer project.

DC Principles Coalition wants NIH to reconsider its public-access policy

The DC Principles Coalition has publicly released its October 17 letter to the NIH urging it to reconsider its public-access policy. The coalition recommends that NIH link from PubMed abstracts to full-text articles at publisher web sites (free online after a 12-month embargo) rather than host free online copies of the articles itself. Opponents of the NIH policy have proposed this alternative before, and their arguments are not likely to be more successful this time than in the past.

Wish-list for a researcher's search engine

Judit Bar-Ilan, Expectations versus reality – Search engine features needed for Web research at mid 2005, CyberMetrics, 1, 2 (2005).
Abstract: Web research is based on data from or about the Web. Often data is collected using search engines. Here we describe our "wish list" for the ideal search engine, explain the need for the specific features and examine whether the currently existing major search engines can at least partially fulfil the requirements of the ultimate search tool. The major search tools are commercial and are oriented towards the "average" user and not towards the Web researcher, and therefore are unable to meet all the requests. One possible solution is for the research community to recruit the necessary funding, resources and know-how in order to build a research-oriented search tool.

From the body of the article:

Document and word counts are often insufficient for Web research (especially when these numbers are unreliable). In order to study the documents themselves, we have to access them. Thus knowing that there are 11,203,349 pages that the search engine marked as relevant to our search, but being able to access only 1000 is not satisfactory. The ability to retrieve the whole set of results, and not only the first 250 or 1000, is essential for successful Web research.
(PS: Most of the features Bar-Ilan describes would be desirable for researchers investigating any topic, not just researchers investigating the web itself.)

CIBER's second author survey on OA issues

Ian Rowlands and Dave Nichols, New Journal Publishing Models: An International Survey of Senior Researchers, CIBER (Centre for Information Behaviour and the Evaluation of Research), September 22, 2005. Excerpt:
This survey reports on the behaviour, attitudes and perceptions of 5,513 senior journal authors on range of issues relating to a scholarly communication system which is in the painful early stages of a digital revolution....In determining where to publish, the author population as a whole does not attach much importance to being able to retain their copyright in the article, nor to gaining permission to place a pre- or post-print on the Web or in some kind of repository....Significantly, senior authors and researchers believe downloads to be a more credible measure of the usefulness of research than traditional citations, perhaps indicating a commercial opportunity for publishers....With regard to open access two significant shifts appear to have occurred since the last survey [PS: in March 2004]. Firstly, the research community is now much more aware of the open access issue. There has been a large rise in authors knowing quite a lot about open access (up 10 percentage points from the 2004 figure) and a big fall in authors knowing nothing at all about open access (down 25 points). Secondly, the proportion of authors publishing in an open access journal has grown considerably from 11 per cent (2004) to 29 per cent.....Authors strongly believe that, as a result of open access, articles will become more accessible and, somewhat less strongly, that budgetary pressures on libraries would ease as a result. They do not believe, however, that quality will improve....A clear majority of authors believes that mass migration to open access would undermine scholarly publishing. 
Of those who expressed an opinion, half believed this was likely; however, a good proportion of these people thought this would probably be a good thing so there is evidence of considerable dissatisfaction with the status quo....There is very little enthusiasm for author- or reader-facing charges, and a feeling that libraries should not have to make such a large contribution to the costs of the journals system as they bear at the moment. The favoured option is that a greater burden should be borne (in this order) by research funders, commercial sponsors and central government....Authors are not at all knowledgeable about institutional repositories: less than 10 per cent declared that they know ‘a little’ or ‘a lot’ about this development, and there are signs of a dragging of feet: a significant minority (38 per cent) of those expressing an opinion, declare a clear unwillingness to deposit their articles in an institutional repository....Looking at the author population as whole, two clusters of researchers with especially positive views about open access and the need for reform of the current system are evident. The most radicalised group (‘OA Enthusiasts’) makes up about 8% of the total population. This group is characterised by its youth, its geographical composition (with very strong representation from Asia, Africa and Eastern Europe) and a tendency towards more applied and clinical ends of the research spectrum. For a very large majority of mid-career and older researchers in the ‘Anglosphere mainstream’, open access issues are not at all high on their list of priorities. Not so far, anyway.

Nurturing OA in Canada, from the top

Arthur Carty, A global information system needs a culture of sharing, University Affairs, November 2005. Carty is the Canadian National Science Advisor. (Thanks to BNA Internet Law News.) Excerpt:
So, what is Canada’s vision for a 21st-century global system for disseminating and communicating research data? Above all, our goal must be to maximize the impact of research for societies everywhere, not just the developed world. People in developing nations must be able to access and contribute to the vitality of the global research information and communications system. An open-access philosophy is critical to the system’s success: if research findings and knowledge are to be built upon and used by other scientists, then this knowledge must be widely available on the web, not just stored in published journals that are often expensive and not universally available.

From a Canadian perspective, a 21st century research communications system would share certain attributes. It would: [1] take full advantage of the enormous potential of new information and communication technologies; [2] be capable of handling an unprecedented flow of information in a wide variety of formats; [3] bring Canadian research knowledge to the world and bring the world’s research knowledge to Canada; [4] be accessible by all Canadians, in all sectors, ensuring that public investment in scientific research leads to wealth creation and improvements in social and cultural well-being. With this type of system a researcher could access, from any corner of the globe, the full texts of relevant journal articles; a comprehensive set of monographs and theses; research data sets that underlie published outcomes; research reports and non-peer-reviewed research materials from both academia and government; and the electronic tools necessary to manage this volume of material. Creating a system with these attributes is no longer just a question of developing appropriate technologies; for the most part these already exist. Rather, it’s a matter of building, integrating and improving the technical infrastructure, operational standards, research support systems, regulations and institutional roles and responsibilities. It’s also a matter of nurturing a culture of open access and sharing, beyond what researchers have ever embraced. Canada is fortunate to have a number of key building blocks in place to facilitate the development of such a system. These include a network of institutional repositories at 26 university research libraries....Building an effective global information system consists both of this infrastructure and perhaps more importantly a culture of open access and sharing. 
This is harder to build than the nuts and bolts of the system because it requires a new mindset among researchers, administrators, governments and in some cases companies – everyone involved in the creation and dissemination of knowledge....However, filling archives, though necessary, will not be able to change the mindset of people in the research enterprise. We have to find ways to motivate researchers in all countries to preserve and exchange their research data, to publish their findings in open access journals and to deposit their published articles in institutional repositories....Institutions, too, need to know that their investments in expanding and improving the quality of their data archives and open-access repositories are recognized as measurable scientific outputs. Some of these issues will be broached at the World Information Summit taking place this month in Turin, Italy. Canada has to articulate a vision to meet the challenges outlined above. Unless we act, the unprecedented volume of research information will become too difficult to manage, and highly valuable research data will be lost, along with the public investment in our future.

More on PLoS Clinical Trials

Lila Guterman, Open-Access Publisher Plans to Start Clinical-Trials Journal That Welcomes Negative Results, Chronicle of Higher Education, October 18, 2005 (accessible only to subscribers). Excerpt:
Editors and researchers have worried for years that many clinical trials never appear in the medical literature. The human studies most likely to go unreported are small trials, ones that reach negative conclusions, and ones that don't achieve statistical significance. As a result, the research literature suffers from what editors call "publication bias." How can the literature become more representative of the research that's being done? The answer is simple, say some, and is the same as the solution they see for many other publishing woes: open access. That's why the Public Library of Science, commonly called PLoS, is scheduled to announce today that it is starting a freely accessible online journal called PLoS Clinical Trials....Open access, says Emma Veitch, the new journal's publications manager, makes it possible to publish "the sorts of trials which otherwise might not get out there." In contrast, journals that rely on subscriptions to pay the bills feel pressure to publish high-profile studies that will attract readers' attention, she says. Other studies deserve to be reported, adds Ms. Veitch, even if their results are disappointing....Kirby P. Lee, an assistant professor of clinical pharmacy at the University of California at San Francisco, says he hopes the new journal will help rebalance the medical literature. Mr. Lee has performed a study of more than 1,000 manuscripts submitted to three top-tier medical journals. At the Fifth International Congress on Peer Review and Biomedical Publication in Chicago in September, he reported that journal editors were not the source of publication bias. The journals, he found, tended to accept the same fraction of negative studies as they had received. His and others' research suggests that authors themselves are the source of the bias: They rarely submit for publication a study that has negative or statistically insignificant results. Mr. Lee says authors are hesitant to submit those studies to most journals because they fear editors will reject them. But PLoS Clinical Trials explicitly says it welcomes such research. "This open-access journal is wonderful," Mr. Lee says, "because it says, 'Hey, we just want to look at the data. Regardless of the outcome, if it's a strong study, we will publish it.' It will increase the available evidence so that clinicians can make appropriate decisions on therapies."

Monday, October 17, 2005

PLoS Clinical Trials calls for papers

PLoS has issued a call for papers for PLoS Clinical Trials, its sixth peer-reviewed, open access journal.

From the journal web site: 'Clinical trials --and particularly randomized trials-- are critical in delivering reliable evidence about the efficacy of an intervention. Clinical trial data can also provide important information about the potential adverse effects of treatment. Currently, not all trials on human participants are reported in the peer-reviewed literature. PLoS Clinical Trials aims to fill this gap. The journal will broaden the scope of clinical trials reporting by publishing the results of randomized clinical trials in humans from all medical and public health disciplines. Publication decisions will not be affected by the direction of results, size or perceived importance of the trial. As an open-access journal, all articles published in the journal will be immediately and freely available online. Join us in supporting these goals, and get your paper read by the widest possible audience: submit your trial results today.'

From the press release (October 18): 'The Public Library of Science (PLoS) today announces PLoS Clinical Trials, an innovative new journal devoted to peer-reviewing and publishing reports of randomized clinical trials in all areas of healthcare. The journal differs from other medical journals in one crucial respect. It will publish all trials that are ethically and scientifically sound and entered into an internationally accepted registry, regardless of the trial's size or whether the results are positive or negative. PLoS Clinical Trials is now accepting manuscripts in advance of its spring 2006 launch. Around half of all completed trial reports are thought to go unpublished. These unpublished trial reports differ systematically from those that are published in the direction and strength of the findings, thus distorting the evidence base for decision-making in healthcare. "Unpublished results undermine the trust between patients and investigators and slow the vast potential of medical progress," says Dr Christian Gluud of Copenhagen University Hospital, a member of the Advisory Board of PLoS Clinical Trials. Traditional medical journals publish only the highest profile clinical trials (typically positive trials), partly because the journals must attract revenues from subscriptions and selling reprints. PLoS Clinical Trials avoids this problem --it doesn't have to sell subscriptions or reprints to be viable, so it can publish the broadest range of trials.'

New low-cost ejournal platform

Scholarly Exchange is a new, low-cost publishing platform for ejournals. It's not open-source, but the overall cost is so low that it could easily support OA journals. From today's press release:
The fully configurable software platform, hosted and supported, enables scholars, societies, and publishers to create scholarly publications for fixed annual fees ranging from $750 to $1500. It is a cost-effective gathering point, where knowledge is collected, evaluated, and then shared - without the expense and restrictions of traditional journal publishers. SE offers a readily affordable pathway for even the smallest societies and groups to exchange and distribute knowledge. Scholarly Exchange plays no role in the creation of the information or its ultimate ownership, only in the sharing of the highest quality of scholarship as determined by the scholars who produce it. SE is not a publisher in the traditional sense but more a facilitator of the global academic forum of ideas. Under the 'sustainable support' model, groups of scholars, their societies, or publishers may create new publications or migrate existing ones. With guidance from Scholarly Exchange, they can register their journal or conference, configure the website to their needs, collect and peer-review content, make editorial decisions, display accepted articles, and export both content and metadata to a variety of database formats for storage elsewhere. Given the way in which the scholarly content is stored, it can readily be re-formulated for presentation as printed-and-bound volumes or as CD/DVD-based materials....Scholarly Exchange is an IRS-registered 501 (c) (3) public charity corporation with offices in Brookline, MA, Cupertino, CA, and Edmonton, AB.

Jonathan Band on Google Library

Jonathan Band, The Google Print Library Project: A Copyright Analysis, ARL Bimonthly Report 242, October 2005. A reprint of Band's essay blogged here on September 7.

Against OA for the 1918 flu virus genome sequence

Ray Kurzweil and Bill Joy, Recipe for Destruction, New York Times, October 17, 2005. An op-ed. Excerpt:
After a decade of painstaking research, federal and university scientists have reconstructed the 1918 influenza virus that killed 50 million people worldwide....To shed light on how the virus evolved, the United States Department of Health and Human Services published the full genome of the 1918 influenza virus on the Internet in the [open-access] GenBank database. This is extremely foolish. The genome is essentially the design of a weapon of mass destruction. No responsible scientist would advocate publishing precise designs for an atomic bomb, and in two ways revealing the sequence for the flu virus is even more dangerous. First, it would be easier to create and release this highly destructive virus from the genetic data than it would be to build and detonate an atomic bomb given only its design, as you don't need rare raw materials like plutonium or enriched uranium. Synthesizing the virus from scratch would be difficult, but far from impossible. An easier approach would be to modify a conventional flu virus with the eight unique and now published genes of the 1918 killer virus. Second, release of the virus would be far worse than an atomic bomb. Analyses have shown that the detonation of an atomic bomb in an American city could kill as many as one million people. Release of a highly communicable and deadly biological virus could kill tens of millions, with some estimates in the hundreds of millions. A Science staff writer, Jocelyn Kaiser, said, "Both the authors and Science's editors acknowledge concerns that terrorists could, in theory, use the information to reconstruct the 1918 flu virus." And yet the journal required that the full genome sequence be made available on the GenBank database as a condition for publishing the paper. Proponents of publishing this data point out that valuable insights have been gained from the virus's recreation. These insights could help scientists across the world detect and defend against future pandemics, including avian flu.

(PS: For a book-length statement of the contrary view --that the benefits of OA to genome data on pathogens outweigh the risk of abuse by terrorists-- see last year's study by the US National Research Council, Seeking Security: Pathogens, Open Access, and Genome Databases, September 8, 2004. Also see my comment last month in support of the study's conclusions.)

More on the launch of PLoS

Michael Hiltzik, Freedom of the Owner of the Press, Los Angeles Times, October 17, 2005. (Thanks to George Porter.) Excerpt:
"We started [PLoS] because we were outraged at the system," Michael Eisen told me last week. A biologist at UC Berkeley and Lawrence Berkeley National Laboratory, Eisen is co-founder, with Patrick O. Brown of Stanford University and former NIH director Harold E. Varmus, of the Public Library of Science, which publishes five journals of peer-reviewed scientific papers and has plans for many more. As its name implies, PLoS runs on the principle that the findings of scientific researchers should be openly accessible to all, not deposited in journals whose subscription fees can run to thousands of dollars a year....The idea is not chiefly to save money for universities at the expense of faculty members - indeed, for universities with large faculties, the new model may be more costly than the old. The real goal is to wrest research copyrights from journal publishers; when researchers are paying for publication, they, not the publishers, retain control of their papers. Eisen argues that PLoS eliminates many absurdities of traditional scientific publishing, in which a foundation or institution that might spend millions of dollars on a research project must turn its results over to a publisher gratis (scientific journals normally don't pay for articles) and then spend more money to read the findings. It also takes better advantage of the Internet, which has rendered obsolescent the paper publishing process that gave rise to the subscription model....The established scientific press, which includes giant profit-making corporations such as Reed Elsevier as well as not-for-profit institutions such as the American Assn. for the Advancement of Science (publisher of the journal Science), was arguing that subscriptions were the only way to pay for the rigorous peer review and production values that gave their publications credibility. "We knew we'd have to launch prestigious journals to prove that open access and high quality could be synonymous," Eisen says. 
PLoS Biology, their first journal, began publishing in October 2003. Once PLoS emerged as a potential competitor, Eisen says, publishers started to take open access seriously. Some agreed to make more material available publicly, generally after a delay of six months or longer. But they also mounted a sharp attack on the very principle of open access. There have been studies with titles such as "The Erroneous Premise of Open-Access Advocates," publicity campaigns aimed at science reporters, and lobbying about the dark side of government-maintained research repositories.

OA Citation Information report from JISC

The JISC Committee for the Information Environment (JCIE) Scholarly Communication Group has released its report on Open Access Citation Information (condensed version or expanded version). The report authors are Rachel Hardy, Charles Oppenheim, Tim Brody, and Steve Hitchcock. Excerpt from the condensed version:
A primary objective of this research is to identify a framework for universal open access (OA) citation services and an ideal structure for the collection and distribution of citation information and the main requirements of such services. The aim of the proposal is to increase the exposure of open access materials and their references to indexing services, and to motivate new services by reducing setup costs. A combination of distributed and automated tools, with some additional effort by authors, can be used to provide more accurate, more comprehensive (and potentially free) citation indices than currently exist....Recommendations: [1] Integrate reference parsing tools into IR software to allow the immediate autonomous extraction of reference data from papers uploaded by authors. [2] Automatically parse most reference formats deposited as free text, and present a properly parsed version back to the author interactively. Authors can then be invited to check the reformatted references and attempt to repair unlinked references. [3] Establish a standard means for IR software to interact with reference databases, e.g. through a published Web services interface, allowing IR administrators to choose between reference database providers (e.g. CrossRef, PubMed, CiteULike, etc.). [4] Create or adapt reference database services to support remote reference linking, i.e. using the partial data obtained from autonomous reference parsing to query, expand and link those references to the canonical reference. [5] Develop a standards-based approach to the storage and communication of reference data, e.g. using OpenURL context objects in OAI-PMH....

This proposal enhances open access by building services that exploit improved accessibility of data, but it is also predicated on open access content, and in that respect should not become a barrier to the wider adoption of OA and the provision of substantially more OA content on which it depends. The proposal needs to be tested in terms of technical implementation and usability as well as acceptability by authors before it is included in a production version of any IR software.
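
As a rough illustration of recommendations [1] and [2] in the JISC excerpt above, here is a minimal Python sketch of parsing a free-text reference into fields that could be shown back to the author for checking. The function name parse_reference, the regular expression, and the sample citation are all hypothetical. The report does not specify an implementation, and a real reference parser would have to handle many more citation formats than this single pattern.

```python
import re

# Hypothetical pattern for ONE common citation layout (an assumption, not
# taken from the JISC report):
#   "Authors (Year). Title. Journal, Volume, Pages."
REFERENCE_PATTERN = re.compile(
    r"^(?P<authors>.+?)\s+\((?P<year>\d{4})\)\.\s+"
    r"(?P<title>.+?)\.\s+"
    r"(?P<journal>[^,]+),\s*"
    r"(?P<volume>\d+),\s*"
    r"(?P<pages>\d+\s*-\s*\d+)\.?$"
)


def parse_reference(text):
    """Return a dict of citation fields, or None if the pattern does not match."""
    match = REFERENCE_PATTERN.match(text.strip())
    if match is None:
        # An unparsed reference would be handed back to the author to repair.
        return None
    return {name: value.strip() for name, value in match.groupdict().items()}


if __name__ == "__main__":
    # Invented sample citation, used only to exercise the pattern.
    sample = ("Doe, J. and Roe, R. (2004). An invented title for testing. "
              "Journal of Examples, 12, 34-56.")
    print(parse_reference(sample))
```

In the workflow the report describes, the parsed fields would then be used to query a reference database and link the citation to its canonical record. That step is omitted here because it depends on the service chosen.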

More on the Adelphi Charter

Here's some recent coverage of the Adelphi Charter (blogged here on October 14).

Another law professor defends Google Library

Tim Wu, Leggo My Ego, Slate, October 17, 2005. Wu is a professor at the University of Virginia Law School. Excerpt:
Google has become the new ground zero for the "other" culture war. Not the one between Ralph Reed and Timothy Leary, but the war between Silicon Valley and Hollywood; California's cultural civil war. At stake are two different visions of what might best promote authorship in this country. One side trumpets the culture of authorial exposure, the other urges the culture of authorial control. The relevant questions, respectively, are: Do we think the law should help authors maximize their control over their work? Or are authors best served by exposure --making it easier to find their work? Authors and their advocates have long favored maximal control --but we are undergoing a sea-change in our understanding of the author's interests in both exposure and control. Unlike, perhaps, the other culture war, this war has real win-win potential, and I hope that years from now we will be shocked to remember that Google's offline searches [e.g. of print books] were once considered controversial. What I've called the "exposure culture" reflects the philosophy of the Web, in which getting noticed is everything. Web authors link to each other, quote liberally, and sometimes annotate entire articles. E-mailing links to favorite articles and jokes has become as much a part of American work culture as the water cooler. The big sin in exposure culture is not copying, but instead, failure to properly attribute authorship. And at the center of this exposure culture is the almighty search engine. If your site is easy to find on Google, you don't sue --you celebrate....[There] is a common misunderstanding about Googleprint --it is a way to search books, not a way to get books for free. It is not, in short, Napster for books....The big question is whether [book searches] are good for authors...."It's not up to Google or anyone other than the authors, the rightful owners of these copyrights, to decide whether and how their works will be copied" says Nick Taylor, president of the Authors Guild.
Taylor isn't suggesting that book search engines are necessarily bad for authors. His objection is that Googleprint has deprived authors of their control --their right to decide whether to be in a book search in the first place....The idea that there is no tradeoff between authorial control and exposure is attractive. But it is also wrong. Individually, more control may always seem appealing --who wouldn't want more control? But collectively, it can be a disaster. Consider what it would mean, by analogy, if map-makers needed the permission of landowners to create maps. As a property owner, your point would be clear: How can you put my property on your map without my permission? Map-makers, we might say, are clearly exploiting property owners, for profit, when they publish an atlas. And as an individual property owner, you might want more control over how your property appears on a map, and whether it appears at all, as well as the right to demand payment. But the law would be stupid to give property owners that right. Imagine how terrible maps would be if you had to negotiate with every landowner in the United States to publish the Rand McNally Road Atlas. Maps might still exist, but they'd be expensive and incomplete. Property owners might think they'd individually benefit, but collectively they would lose out --a classic collective action problem....The critical point is this: Just as maps do not compete with or replace property, neither do book searches replace books. Both are just tools for finding what is otherwise hard to find.

Is search technology good enough to make tagging unnecessary?

Aliya Sternstein, Power search, Federal Computer Week, October 17, 2005. Excerpt:
A recent request for information...asks whether search technology is powerful enough to replace some government standards for information management. "Does current search technology perform to a sufficiently high level to make an added investment in metadata tagging unnecessary in terms of cost and benefit?" the Sept. 15 RFI asks. Responses are due by Oct. 21....Suggested approaches must meet the wide-reaching aim of identifying the most cost-effective means to search for, locate, retrieve and share information. The notice lists seven scenarios to provide context. For example, the government is looking for information on how to help a physician search multiple databases and Web sites for treatments for a defense contractor's unexplained illness. The doctor might not know which agencies provide information on unexplained or service-related illnesses. He or she would also need a way to search nongovernmental sources, and some of the information might not be easily accessible through traditional Internet search engines. In addition to tackling information sharing, vendors' suggested approaches must address the problem of access....The National Institute of Standards and Technology wants to withdraw the [tag-based] Government Information Locator Service [GILS] because the agency considers the search standard obsolete. A July 15 Federal Register notice states that recalling the standard...seems justified because most agencies now use commercial search tools to help people locate government information....

One global consortium is working with foreign governments on a massive information retrieval and sharing project that could influence the U.S. government's path. Earlier this month, groups from industry, government, academia and nonprofit organizations announced plans to provide online versions of books, academic papers, video and audio to the world. The Internet Archive, a nonprofit entity that offers access to historical collections in digital format, will host the Open Content Alliance (OCA). The National Archives of the United Kingdom has already contributed to the effort. The OCA "may significantly help the [U.S.] government in doing their public access mission," Internet Archive co-founder Brewster Kahle said....Kahle said he has been talking to GPO officials for the past year about joining the alliance. The alliance will unveil a technology Oct. 25 that performs nondestructive scans of book pages at high resolutions for 10 cents a page. That cost savings could appeal to GPO and its Federal Depository Library Program, he said.

OA as an alternative to the commercial internet

Richard Poynder, Time to take the red pill, Open and Shut, October 17, 2005. After an extensive discussion of Stephen Arnold's criticisms of Google from a keynote address at Internet Librarian International 2005 (London, October 10-11, 2005), Poynder shifts to OA. Excerpt:
As it turns out, one of the more organised and advanced initiatives with the potential to help create a non-commercial web is the open access (OA) movement — a movement, in fact, in which librarians have always played a very active role. For while the movement's original impetus was solely to liberate scholarly peer-reviewed articles from behind the subscription firewalls imposed by commercial publishers, there are grounds for suggesting it could develop into something grander, in both scope and scale. How come? As scholarly publishers have consistently and obdurately refused to cooperate with the OA movement in its attempts to make scientific papers freely available on the Web, the emphasis of the movement has over time shifted from trying to persuade publishers to remove the toll barriers, to encouraging researchers to do it themselves by self-archiving their published papers, either in institutional repositories (IRs), or in subject-specific archives like the arXiv preprints repository and PubMed Central, the US National Institutes of Health free digital archive of biomedical and life sciences papers....Clearly there is a valuable potential role here for information professionals, should they choose to seize the opportunity. After all, what better way for disenchanted librarians to make themselves indispensable in a new and relevant way — not by playing their traditional role as gateways to information (putting themselves between the information and the user), but as facilitators able to help researchers and other data creators collaborate and share information. If this means abandoning some of their traditional skills for new ones then so be it. Now there's a topic for discussion at Internet Librarian International 2006! The fact is, it's time for information professionals to stop bemoaning the loss of some perceived golden age, and take control of the Web. In short, it's time to reach for the red pill!

2005 pricing trends for STM journals

Gene Kean, 18th Annual Study of Journal Prices for Scientific and Medical Society Journals, JP: The Newsletter for Journal Publishers, No. 3, 2005. (Thanks to Colin Steele.) Excerpt:
For more than 18 years, prices of periodicals have increased at the rate of three times the Consumer Price Index, more than those of most consumer goods and services in the U.S. While the CPI has risen about 2.8% to 3.1% annually, the ALA Library Materials Price Index Committee studies show that prices of all U.S. periodicals increased annually about 9.4% on an average between 1988–2005 (Table 1). The 2005 edition of the U.S. Periodical Price Index (USPPI) brings better news than recent years’ studies, which featured double-digit increases in both 1998 and 1999, according to the LMPIC. In 2005, the average U.S. periodical (journal) price rose from $328.47 to $349.79, a 6.5% increase, lower than the 8.2% increase in 2004. The scope of this LMPIC study was a selected sample of 3,912 periodical titles published in the U.S....In general, library budgets have not kept pace with serials price increases. In addition, because of a weak dollar in some years, U.S. research libraries have had to bear price increases of 20% or more for some overseas journals. All of this has contributed to the current condition of the library market, which is very tight and selective in the purchase of new titles. All libraries and journals have been affected by budget cuts. Most libraries no longer purchase new titles without cutting at least one or two other titles....Our Allen Press 18-year study shows that prices for U.S. society-published journals increased an average of 7.4% annually during 1988–2005. The journal price increase was above the CPI but substantially below the average annual price increase for all U.S. periodicals of 9.4% during the same period (Table 1). Librarians interviewed said that nonprofit society journals tend to be lower priced and have much smaller price increases than commercially published journals. Because of this, librarians consider society journals good buys.
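
The percentages in the excerpt above can be checked with a few lines of arithmetic. This is a minimal Python sketch: the function names are hypothetical, and treating 1988–2005 as 17 annual compounding steps is an assumption for illustration, not a figure from the Allen Press study.

```python
def percent_increase(old_price, new_price):
    """Percentage change from old_price to new_price."""
    return (new_price - old_price) / old_price * 100


def cumulative_growth(annual_percent, years):
    """Multiplier after compounding annual_percent for the given number of years."""
    return (1 + annual_percent / 100) ** years


if __name__ == "__main__":
    # Average U.S. periodical price, 2004 to 2005, as quoted in the excerpt.
    print(round(percent_increase(328.47, 349.79), 1))  # prints 6.5

    # Assumption: 17 compounding steps between 1988 and 2005.
    years = 17
    print("all U.S. periodicals (9.4%/yr):", cumulative_growth(9.4, years))
    print("society journals (7.4%/yr):", cumulative_growth(7.4, years))
```

The first result matches the 6.5% increase reported for 2005. The compounding lines show how a gap of two percentage points a year widens over the study period.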

More on the Kaufman-Wills report and the Wellcome Trust policy

Sophie Rovner, Status Report On Open Access, Chemical & Engineering News, October 17, 2005. Excerpt:
A study of open-access publishing released last week shows mixed results for the enterprise. Meanwhile, Wellcome Trust --the U.K.'s largest nongovernmental funder of biomedical research-- has now committed to open access. Sixty percent of surveyed open-access journals are either making a profit or breaking even, says the study, which is billed as the first substantial effort of its kind. The study's sponsors are the Association of Learned & Professional Society Publishers (ALPSP), the American Association for the Advancement of Science, and HighWire Press. The remaining 40% of open-access journals are not yet covering their costs and face an uncertain future, the study says....One surprising finding is that author fees are less common [in open-access journals] than in subscription journals. Open-access journals rely more on grants, subsidies, and volunteer labor. Thus, “it doesn't follow that as the journals become more mature they would necessarily become more profitable,” Morris says. Publisher Matthew Cockerill, of the open-access publishing firm BioMed Central, says the report “contains some useful information” but “draws many unwarranted conclusions. The fact that many open-access journals operate at a loss is simply a sign that these are early days.” He adds that increased submissions and higher author fees are bringing BMC closer to profitability. Meanwhile, the Wellcome Trust has begun requiring grant recipients to deposit their research papers in the open-access PubMed Central article repository for release within six months of publication. Publishers are scrambling to develop a response.

Sunday, October 16, 2005


The OpenDocument Fellowship is calling for volunteers to help develop OpenFormula, the standard for spreadsheet formulas to extend the OpenDocument Format. Interested volunteers should join the mailing list.