Open Access News

News from the open access movement


Thursday, June 12, 2008

Peter Murray-Rust at Talis Research Day

Owen Stephens has blogged some notes on the Talis Project Research Day (Birmingham, June 10, 2008).  Excerpt:

...First up - Peter [Murray-Rust]: ...

Peter has a go at publishers - claiming that publishers are in the business of preventing access to data, rather than facilitating it (at this point he asks if there are any publishers in the audience - two sheepish hands are raised). Peter also mentions that Chemistry is particularly bad as a discipline in terms of making data accessible - with the American Chemical Society being a real offender....

Peter...showing a graph on the levels of Atmospheric Carbon Dioxide. If this was in paper form and we wanted to do some further analysis - it would take a lot of effort to take measurements off the graph - but if we have the data from behind the graph, we can immediately leap to doing further work.

Peter...[s]howing a PDF of an article from Nature - and making the point that it all looks great (illustrations of molecules, proteins and reactions etc.) but is completely inaccessible to machines.

Peter noting that most important bio-data that is published is publicly accessible and reusable - but this is not true in chemistry. This means in the article, the data about the proteins is publicly accessible, but the information on chemical molecules is not - although covered in the same article....

Peter now showing how a data rich graph is reduced to a couple of data points to 'save space' in journals - a real paper-based paradigm - we need to get away from this....

Peter noting that most researchers have experienced data loss - and this can be a real selling point for data and publication repositories.

Peter showing a thesis with many diagrams of molecules, graphs etc. Noting there is no way to effectively extract the information about molecules from the paper, as it is a PDF. He is demonstrating a piece of software which extracts data from a chemical thesis - demonstrating this from a thesis authored in Word, and using OSCAR (a text-mining tool tuned to work in Chemistry) - and shows how it can extract relevant chemical data, can display it in a table, reconstruct spectra....

Peter now demonstrating 'CrystalEye' - a system which spiders the web for crystals - reads the raw data, draws a 'jmol' view (3d visualisation) of the structure, links to the journal article etc. This brings together many independent publications in a single place showing crystal structures. Peter saying this could be done across chemistry - but data is not open, and there are big interests that lobby to keep things this way (specifically mentioning Chemical Abstracts lobbying the US Government)....

Peter saying that, for example, there should be a trivial way of watermarking images so that researchers can say 'this is open' - and then if it is published, it will be clear that the publisher does not 'own' or have copyright over the image....