Open Access News

News from the open access movement

Saturday, May 24, 2008

Harvesting chemical data from published articles

Peter Murray-Rust has blogged some notes on his talk at the Royal Society of Chemistry meeting, Open Access Publishing in the Chemical Sciences (London, May 22, 2008). He first endorses Christoph Steinbeck's summary of the talk, then adds some notes of his own:

The main thing we took away was the importance of factual data. No-one disputed that facts could not be copyrighted (though not all realised that copyright was only one of the methods used by publishers to control access and re-use - server-side beheading is completely effective). I asked the audience - > 30 composed of publishers, librarians, software companies, etc. - no actual chemists of course - whether anyone would object to our robots reading the literature and extracting the data from the papers whether as text, images, or tables. Half the audience thought I should, the rest didn’t vote against.

So, publishers, I’m going to start mining data from your sites. I hope you welcome this as a way forward to a new exciting era of data-rich science publishing. I hope that if you don’t agree you’ll let me know. I wouldn’t like to start and then get the lawyers sent. So please comment - it’s very important. I shan’t attack anyone who sends a reply. And you can send it by confidential email if you like.

There are a million new compounds each year in the scholarly literature. Our robots can produce huge amounts of good information from it. In some cases we get over 90% recall and precision - it depends on the type. This must be good for science. So please, publishers, let us know we can do it and we’ll publicly thank you. And if you don’t like the idea, please let us know why....

Posted by Peter Suber at 5/24/2008 09:56:00 AM.

