Open Access News

News from the open access movement


Friday, February 08, 2008

Peter Murray-Rust and the data-mining robots

Richard Poynder, Peter Murray-Rust and the data-mining robots, ComputerWeekly.com, February 5, 2008. (Thanks to Jennifer McLennan.)
Peter Murray-Rust, a reader in molecular informatics at the University of Cambridge, has a vision. In his vision, software robots roam the network collecting scientific information, which they aggregate and process to arrive at new insights. Sometimes they make scientific discoveries.

Before this can happen, however, an enabling infrastructure will need to be built - a task Murray-Rust has been dedicated to for 30 years. During that time he has also become a passionate supporter of the Open Data movement, which advocates for non-textual material such as chemical compounds, genomes, mathematical and scientific formulae, and bioscience data to be made freely available on the web.

Murray-Rust's epiphany came while on sabbatical in Zurich in the late 1970s, where he spent most of his time in a library, poring over the thousands of molecular structures published in chemical journals. "I spent six months going through the literature and came home with several hundred data points," he says. "Each data point was the product of a visit to the library to find a single piece of information in a journal."

For every molecule he wanted to research, he had to extract all the published data and then do a complex calculation. "A paper might give you the coordinates of the atoms of a molecule, for instance, but not the distances. So you had to do a little sum - and until you had done the sum you did not know whether the answer made sense," Murray-Rust says. ...
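To give a sense of the "little sum" described above, here is a minimal sketch of working out the distance between two atoms from their published coordinates. It assumes the coordinates are Cartesian and in ångströms; the function name and the sample values are hypothetical, chosen only for illustration, and are not taken from the article or from Murray-Rust's own work.

```python
import math


def interatomic_distance(atom_a, atom_b):
    """Return the straight-line distance between two atoms.

    Each atom is an (x, y, z) tuple of Cartesian coordinates
    (assumed here to be in angstroms).
    """
    # Sum the squared differences along each axis, then take the root.
    squared = sum((p - q) ** 2 for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared)


# Hypothetical coordinates, for illustration only.
atom_1 = (0.0, 0.0, 0.0)
atom_2 = (1.0, 2.0, 2.0)

print(f"{interatomic_distance(atom_1, atom_2):.3f}")  # prints 3.000
```

Doing this by hand for every pair of atoms in every structure is the kind of repetitive work that a program can do at once, provided the coordinates are available in machine-readable form.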

Leafing through the journals in Zurich, Murray-Rust was struck by how much valuable data was hidden within them, but was convinced that there had to be a better way of extracting them. ...

When the web exploded into life, Murray-Rust saw its potential immediately - particularly after Tim Berners-Lee outlined his vision of the Semantic Web, promising the advent of complex machine-to-machine interaction. Murray-Rust vowed to create a Semantic Web of chemicals. ...

[T]he main challenge is to persuade researchers and publishers to share their data, which is why Murray-Rust is now a passionate advocate of Open Data - a cause on which he spends an increasing amount of time, through activities such as lobbying publishers, educating researchers, and alerting the world to the issue via his blog.

He does not grudge this. "Sharing our knowledge is a necessary but not sufficient condition for saving the planet," he says. "And here I am not just talking about global warming - I am also talking about how we save the planet from disease, from ignorance, and from all sorts of other things. Open Data are a critical part of the infrastructure we need for 21st century living. And this goes way beyond science - it is also about things like map data, climate data, and traffic data. So you are going to be hearing a lot more about open data in the next five years." ...