Open Access News: News from the open access movement
Doing science in a world of shared, voluminous data
Alexander Szalay and Jim Gray, 2020 Computing: Science in an exponential world, Nature, March 22, 2006. Excerpt:
[D]ata volumes are doubling every year in most areas of modern science and the analysis is becoming more and more complex....With data correlated over many dimensions and millions of points, none of the old steps — do experiment, record results, analyse and publish — is straightforward. Many predict dramatic changes to the way science is done, and suspect that few traditional processes will survive in their current form by 2020....As data volumes grow, it is increasingly arduous to extract knowledge. Scientists must labour to organize, sort and reduce the data, with each analysis step producing smaller data sets that eventually lead to the big picture. Analysing terabytes of data (one terabyte is 1,000 gigabytes) is a challenge; but petabyte data sets (of more than 1,000 terabytes) are on the horizon. One petabyte is equivalent to the text in one billion books, yet many scientific instruments, including the Large Synoptic Survey Telescope, will soon be generating several petabytes annually....

PS: Also see other Nature articles from the same issue on 2020 Computing (all OA). The Szalay-Gray article above is based on a longer report from Microsoft Research.