Open Access News

News from the open access movement

Wednesday, October 22, 2008

Data handling in different repository software

Dorothea Salo, Content, presentation, and behavior, Caveat Lector, October 20, 2008.

... DSpace and EPrints make certain assumptions about the files they take in. Key for our purposes is that they assume that all they have to do to mediate between a file and its end-user is serve it up in response to a request. Ask, give, end of story. ...

For a dataset, this ask-and-give assumption is pure disaster. Hardly anybody wants a whole dataset boiled down into a single file. Hardly anybody creates a dataset that way. Sure, they'll tell you they just have the one spreadsheet, but that doesn't count the data dictionary and the lab notebooks and the field notes and the et cetera. What's more, datasets don't want to be treated as unitary objects; ask-and-fetch just doesn't work. Query, slice-and-dice, facet, analyze, number-crunch, mash up - that's what people want to do with a dataset. They want it to have an API.

And all DSpace and EPrints can do is say "durrr, here's a file." ...

Les Carr, Data Access in Repositories - Don't Overlook What We Already Have!, RepositoryMan, October 21, 2008.

Dorothea Salo's latest blog entry takes EPrints and DSpace to task for not being able to help users analyse (query, slice-and-dice, facet, analyse, number-crunch, mash-up) data files.

You can already do that, at least you can in Microsoft Excel anyway. As an example, I chose a data file that is already in the MINDS repository (DSpace) and one that is in my school repository (EPrints) and created a new spreadsheet on my desktop that referenced data ranges in both of the archived data sets. ...

It is an interesting issue, to think what the data-oriented functions are that a repository can provide. However, we should not overlook the functions that we already have! ...
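
To make the contrast concrete, here is a minimal Python sketch (not drawn from either post) of the client-side approach Carr describes: the repositories hand over whole files, and the analysis happens on the user's desktop. The CSV contents, the column names, and the helpers `load_rows` and `mean_of` are invented for illustration; in practice the text would be downloaded from each repository's file URL.

```python
import csv
import io

# Hypothetical file contents standing in for two whole-file downloads,
# one from a DSpace repository and one from an EPrints repository.
DSPACE_CSV = "site,temp_c\nA,10.5\nB,12.0\n"
EPRINTS_CSV = "site,temp_c\nC,9.5\nD,14.0\n"


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def mean_of(rows, column):
    """Average a numeric column across already-loaded rows."""
    values = [float(row[column]) for row in rows]
    return sum(values) / len(values)


# The repositories only served files; combining and slicing is the client's job.
combined = load_rows(DSPACE_CSV) + load_rows(EPRINTS_CSV)
overall_mean = mean_of(combined, "temp_c")
print(len(combined), overall_mean)
```

This works for small files, which is Carr's point. Salo's objection is that the client must still fetch each file in full before it can query any part of it.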

Posted by Gavin Baker at 10/22/2008 11:36:00 AM.

