Carl Malamud has this funny idea that public domain information ought to be... well, public. He has a history of creating public access databases on the net when the provider of the data has failed to do so or has licensed its data only to a private company that provides it only for pay. His technique is to build a high-profile demonstration project with the intent of getting the actual holder of the public domain information (usually a government agency) to take over the job.
1. The short-term goal is the creation of an unencumbered full-text repository of the Federal Reporter, the Federal Supplement, and the Federal Appendix.
2. The medium-term goal is the creation of an unencumbered full-text repository of all state and federal cases and codes.
This is clearly public data, but Carl has written a letter to West Publishing, which accompanies the first data release on his site, asking for clarification about what information West considers proprietary versus public domain....
In private email, Carl wrote:
The SEC database was fairly straightforward, taking a couple of years of hard work. But, getting patents online took 5 years of drawing lines in the sand and sending shots across the bow. Our line in the sand here is all state and federal cases and codes, and I guess our shot across the bow is publishing a 3.6 gbyte tiff file and announcing our intention to systematically walk through the 5 million or so pages of federal case law.
That's a big challenge, but with computing power and storage getting ever cheaper, and with the dedication of volunteers like Carl, it does indeed seem like a possible project. (After all, when Carl pressured the SEC to put its Edgar database online in the early 90's, they said it would take years and millions of dollars. Carl did it in six weeks, and operated the database for two years before persuading the SEC to take it over.) ...
Routing around traditional publishers who want to create friction (or barriers to entry) for online access to data isn't easy. This is the same extended tussle that ScienceCommons.org is engaged in. In the end, the gatekeepers should lose, particularly where the public benefits so far outweigh the private returns to the publishers. A cure for Parkinson's, made possible because scientists can easily share data across disease silos, or another royalty for Reed Elsevier? You be the judge.
Peter Suber at 8/21/2007 03:17:00 PM.