Open Access News

News from the open access movement


Saturday, February 07, 2009

ACM recommends libre OA for government data

The Public Policy Committee of the Association for Computing Machinery has released its Recommendations on Open Government.  (Thanks to David Robinson.)  Excerpt:

Recommendations

  • Data published by the government should be in formats and approaches that promote analysis and reuse of that data.
  • Data republished by the government that has been received or stored in a machine-readable format (such as online regulatory filings) should preserve the machine-readability of that data.
  • Information should be posted so as to also be accessible to citizens with limitations and disabilities.
  • Citizens should be able to download complete datasets of regulatory, legislative or other information, or appropriately chosen subsets of that information, when it is published by government.
  • Citizens should be able to directly access government-published datasets using standard methods such as queries via an API (Application Programming Interface).
  • Government bodies publishing data online should always seek to publish using data formats that do not include executable content.
  • Published content should be digitally signed or include attestation of publication/creation date, authenticity, and integrity.
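To make the last recommendation concrete, here is a minimal sketch of an integrity attestation published alongside a dataset. It is not taken from the ACM document. The manifest fields (`published`, `sha256`, `size_bytes`) and the sample data are hypothetical, and a bare SHA-256 digest is weaker than a true digital signature: it shows that a file matches its manifest, not who issued the manifest.

```python
import hashlib
import hmac


def build_manifest(dataset: bytes, published: str) -> dict:
    """Describe a dataset by publication date, SHA-256 digest, and size.

    The field names are illustrative, not part of any government standard.
    """
    return {
        "published": published,
        "sha256": hashlib.sha256(dataset).hexdigest(),
        "size_bytes": len(dataset),
    }


def verify(dataset: bytes, manifest: dict) -> bool:
    """Return True if the downloaded bytes match the published manifest."""
    digest = hashlib.sha256(dataset).hexdigest()
    return (
        len(dataset) == manifest["size_bytes"]
        and hmac.compare_digest(digest, manifest["sha256"])
    )


# Hypothetical dataset contents, used only to exercise the functions.
data = b"agency,amount\nExample Agency,100\n"
manifest = build_manifest(data, "2009-02-07")

print(verify(data, manifest))         # the untouched copy matches
print(verify(data + b"x", manifest))  # a modified copy does not
```

A real deployment would sign the manifest itself with a key the publishing agency controls, so that readers can check authenticity as well as integrity.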

In an earlier post, David Robinson gave a good example of the potential benefits:

...Ultimately, we all want information about bailout spending to be available in the most user-friendly way to the broadest range of citizens. But is a government monopoly on "presentations" of the data the best way to achieve that goal? Probably not. If Congress orders the federal bureaucracy to provide a web site for end users, then we will all have to live with the one web site they cook up. Regular citizens would have more and better options for learning about the bailout if Congress told the executive branch to provide the relevant data in a structured machine-readable format such as XML, so many sites can be made to analyze the data....

It would be a travesty to make government the only source for interaction with bailout data: the transparency equivalent of central planning. It would be better for everyone, and easier, to let a thousand mashups bloom....

Update (2/10/09).  More from the ACM:  Make Recovery.Gov Web 2.0 Friendly.

...Today we turn our attention to an obscure requirement of [The American Recovery and Reinvestment Act], which requires a website called “Recovery.gov” to house all of the grant data that would be generated from spending under the act. USACM sent a letter calling for the website’s requirements to include the ability to download complete data sets in machine-readable form.

The legislation specifies a number of requirements for the website, including this one dealing with data accessibility:

  • “(3) The website shall provide data on relevant economic, financial, grant, and contract information in user-friendly visual presentations to enhance public awareness of the use of funds made available in this Act.” (our emphasis)

This is clearly an important provision, but it misses a key element for the web 2.0 culture, namely the reuse of that data. Last week, USACM released its recommendations on enhancing open-government, which recommended (among other things):

  • Data published by the government should be in formats and approaches that promote analysis and reuse of that data.
  • Data republished by the government that has been received or stored in a machine-readable format (such as online regulatory filings) should preserve the machine-readability of that data.

One important step that Congress could take toward making Recovery.gov more useful for the public is building these principles into the requirements specified by the legislation....