Open Access News

News from the open access movement


Thursday, May 08, 2008

Profile of ChemSpider in Nature News

Geoff Brumfiel, Chemists spin a web of data, Nature News, May 7, 2008.

A chemist running a computer server from his home is quietly solving one of his colleagues' biggest frustrations by providing the community with an open-access source of chemical information.

Although biologists have enormous public databases of genes and proteins, chemists usually have to pay for access to data on molecules. Chemist Antony Williams is hoping to change this in a move likely to ruffle the feathers of the American Chemical Society. Williams, a private consultant based in Wake Forest, North Carolina, has started a website called ChemSpider that has compiled data on nearly 20 million molecules in a year.

The modest project has made chemists interested in open access take notice — last week, the number of daily users of the site surpassed 5,000. ...

Chemical data have long been available, but at a hefty price. The largest supplier of such information is the American Chemical Society's Chemical Abstracts Service. The service, which is more than a century old, includes data on roughly 35 million molecules. But university and industry chemists must pay thousands of dollars to use the database. The society will not reveal numbers, but fees for using the database are thought to make up a substantial portion of its US$311-million annual income from 'electronic services'. ...

In recent years, several public sources for chemical information have appeared on the scene. The largest, PubChem, is run by the National Library of Medicine in Bethesda, Maryland, and contains data on some 19 million chemical structures. But PubChem's data focus on biological information, according to Williams. Other potential sources of information, such as Wikipedia, lack the algorithms needed to search chemicals according to their structure. “I noticed there was this gap,” says Williams. “So I decided to try an experiment.”

Rather than building up a database, the ChemSpider service scans open-access sources, including PubChem and Wikipedia, for chemical data. It compiles the publicly available information in a single location, and allows users to follow links to the original source material. The site is maintained with modest profits from advertising and the work of about 30 active volunteers who double-check the data pulled in from outside.

The site is not without its flaws. ...

But Williams nevertheless believes that the service may be able to compete with for-profit services. ...

See also Williams' comments on the ChemSpider blog:
... The original investment in hardware and software costs has finally been recouped. Modest profits? No one gets paid for the work we do. ...