Open Access News

News from the open access movement

Tuesday, November 06, 2007

Defending the Google Library project

Paul Courant, On being in bed with Google, Au Courant, November 4, 2007. (Thanks to Charles Bailey.) Courant is the Dean of Libraries at the University of Michigan, which was an early partner in the Google Library project. Excerpt:

...I believe that the University of Michigan (and the other partner libraries) and Google are changing the world for the better. Four years from now, all seven million volumes in the University of Michigan Libraries will have been digitized – the largest such library digitization project in history. Google Book Search and our own MBooks collection already provide full-text access to well over a hundred thousand public domain works....

So I’m puzzled when people ask, “How could serious libraries...abdicate their responsibilities as custodians of the world’s knowledge by offering their collections up as a sacrifice on the altar of corporate power? Why don’t they join the virtuous ranks of the Open Content Alliance partners, who pay thousands of dollars to digitize books at a rate of tens of thousands of volumes a year?” It seems like those who ask such questions have little appreciation of what Michigan and the other Google partners are actually up to.

Google is on pace to scan over 7 million volumes from U-M libraries in six years at no cost to the University. As part of our arrangement with Google, they give us copies of all the digital files, and we can keep them forever. Our only financial outlay is for storage and the cost of providing library services to our users. Anyone who searches U-M’s library catalog, Mirlyn, can access the scanned files via our MBooks interface. That’s right, anyone. (Copyright law constrains what we can display in full text, and what we can offer only for searching, but we share as much as we can consistent with prudent interpretations of the law.) For an example of an MBook, take a look at The Acquisitive Society by R. H. Tawney.

In a recent New York Times article about mass digitization projects, Brewster Kahle was quoted as saying: “Scanning the great libraries is a wonderful idea, but if only one corporation controls access to this digital collection, we’ll have handed too much control to a private entity.”

I agree with him....So I would be distressed if a single corporation controlled access to the collections of the great academic libraries, just as I find it troubling, on a smaller scale, that a handful of publishers control access to much of the current scientific literature.

But Google has no such control. After Google scans a book, they return the book to the library (like any other user), and they give us a copy of the digital file. Google is not the only entity controlling access to the collection – the University of Michigan and other partner libraries control access as well. Except we don’t think of it as controlling access so much as providing it.

Since 2005, Siva Vaidhyanathan has been making and refining the argument that libraries should be digitizing their collections independently, without corporate financing or participation, and that those who don’t are failing to uphold their responsibility to the public. “Libraries should not be relinquishing their core duties to private corporations for the sake of expediency.”

“Expediency” is a bit of a dirty word. Vaidhyanathan’s phrase suggests that good people don’t do things simply because they are “expedient.” But I view large-scale digitization as expeditious. We have a generation of students who will not find valuable scholarly works unless they can find them electronically. At the rate that OCA is digitizing things (and I say the more the merrier and the faster the better) that generation will be dandling great-grandchildren on its knees before these great collections can be found electronically. At Michigan, the entire collection of bound print will be searchable, by anyone in the world, about when children born today start kindergarten....

We are not relinquishing our duties in the name of expediency; we are working with a capable partner to create a far more useful resource than we could create on our own. (Would I prefer that a charitable foundation would support this work on the same schedule as Google, and make everything available to everyone, subject only to copyright restrictions? You bet. I would prefer it even more if that foundation would buy out all of the rights holders for all out of print works. Can someone tell me the name of the foundation, please? In the meantime, it seems to me that being in bed with Google is way better than sleeping alone.)

It’s true that the digitized files from Google’s scans are often far from perfect....I will discuss some of the specific steps we are taking to address quality in a future post, but for now I will just say that the solution of these problems will require the serious engagement of academic libraries, and that the visibility of the problems is essential to their solution....

Update. Also see the response by Siva Vaidhyanathan. The comments on Siva's post include a reply from Paul Courant and a further reply from Siva.

Posted by Peter Suber at 11/06/2007 10:24:00 AM.