Open Access News

News from the open access movement


Monday, April 30, 2007

Collections of open molecules

Peter Murray-Rust, Repositories or Lists of Open Molecules, A Scientist and the Web, April 29, 2007.  Excerpt:

I am looking for lists (or repositories) of small molecules with connection tables (or machine-parsable molecular structures) which are Open. By Open I mean that anyone can, in principle, download, copy or clone part or all of the site, re-use the information and redistribute it without reference to the original site. At present I am aware of:

  • PubChem (10 million+, a superset of many Open datasets including NCI. I use this term to subsume everything at nih.gov)
  • ChEBI (> 25,000 terms collected at EBI, not all with connection tables)
  • MSD (> 5000 ligands in protein structures, collected at EBI)
  • WWMM (250,000 calculated structures from the NCI database). Reposited in DSpace.
  • Crystallography Open Database: crystal structures collected from the literature or donated. Soon to be complemented with CrystalEye. This should give nearly 100,000 crystal structures.
  • The BlueObelisk Data Repository (BODR). A collection of critical information gathered by BO volunteers, primarily as reference data for (Open) software. (It includes non-molecular stuff like elemental properties.) BODR is widely distributed on GNOME and other Open Source distros.

I’ve almost certainly missed some, so please let us know....