Open Access News

News from the open access movement

Monday, August 15, 2005

Profile of the Internet Archive

Scott Kirsner, Saving the world as we know it, San Francisco Chronicle, August 15, 2005. Excerpt:
One of the great wonders of the modern world is being constructed here [in San Francisco], on a former military base called the Presidio....The Internet Archive has the ambitious goal of offering "universal access to human knowledge," and, in pursuit of that...the archivists are collecting every sort of digital file imaginable....Brewster Kahle is the MIT-educated former entrepreneur who began building the library in 1996, for the simple reason that "nobody else seemed to be doing it," he says. Now, he realizes that he has undertaken a task with no obvious stopping point....Each month, the Internet Archive collects the equivalent of one Library of Congress, says Kahle. The collection, available at, has already surpassed one petabyte. That's a million gigabytes....Kahle is starting an initiative to scan out-of-print books, and make them available online. Of course, many books that are out of print are still protected by copyright, so Kahle has also filed a lawsuit against the United States to free those works. (The suit is currently pending appeal.) Google's working on a similar book-scanning initiative in partnership with several large libraries, but Kahle says that Google seems more interested in making the text searchable, rather than offering the full text online as the Internet Archive hopes to do....The Internet Archive also sponsors a small fleet of Internet bookmobiles -- which operate in San Francisco, Egypt, India, and Uganda -- that allow people to find full-text books online and print out their own paperback copies. Kahle says the cost of lending a book out can approach $2 for some libraries; printing out a black-and-white copy on-demand can cost as little as 50 cents....When the organization runs up against technical barriers that seem insurmountable, it chisels away at them. It couldn't find a storage device on the market that was capable of holding a petabyte of data inexpensively, and consuming little power. 
So the Internet Archive simply built one on its own, called the petabox. (You can build your own in the basement, since they made the design available as an open-source document.)..."You have to think about getting it off its old media, and getting it to run," says Kahle. The Internet Archive already sought and won an exemption from the Digital Millennium Copyright Act of 1998, which prevented the group from breaking the software's copy protection. It seemed, to Kahle, a problem worth solving.