Open Access News

News from the open access movement

Thursday, July 12, 2007

Harvesting articles for local repositories through the DOAJ

Article-Level OAI-PMH Harvest Available from DOAJ, Disruptive Library Technology Jester, July 11, 2007. Excerpt:

Earlier this year the DOAJ began offering a new schema for registered articles that significantly improves the value of OAI-PMH harvested article content. Prior to this addition the only scheme available was Dublin Core, which as a metadata schema for describing article content is woefully inadequate....The new schema...includes elements for ISSN/eISSN, volume/issue, start/end page numbers, and author affiliation. There is also a <fullTextUrl> element that is a link to the article content itself (not the splash page of the article on the publisher’s site).

Article content using this schema is harvestable through the DOAJ OAI-PMH provider site....This is, in fact, the same schema journal publishers use to submit article content to the DOAJ article database. With these pieces in place, it is now conceivable to harvest open access journal article content through the DOAJ and add it to a local journal article repository (such as the Electronic Journal Center in the case of OhioLINK).

Thanks go out to the DOAJ folks for making this available!
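
For readers who want to try this, here is a rough sketch of the kind of harvest the excerpt describes: page through ListRecords responses, follow resumption tokens, and collect the value of each <fullTextUrl> element. The endpoint URL and the metadata prefix below are placeholders I made up, not values confirmed by the DOAJ, so check the DOAJ's own OAI-PMH documentation before running it. The function names are mine as well. The element is matched by local name because I am not assuming anything about the schema's namespace.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

# ASSUMPTIONS: placeholder endpoint and metadata prefix, not confirmed DOAJ values.
BASE_URL = "https://example.org/oai.article"
METADATA_PREFIX = "doajArticle"


def build_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent.
    """
    params = {"verb": "ListRecords"}
    if resumption_token:
        params["resumptionToken"] = resumption_token
    else:
        params["metadataPrefix"] = metadata_prefix
    return base_url + "?" + urllib.parse.urlencode(params)


def parse_page(xml_text):
    """Return (full-text URLs, resumption token or None) for one response."""
    root = ET.fromstring(xml_text)
    urls = []
    for record in root.iter(OAI_NS + "record"):
        for elem in record.iter():
            # Compare local names only, ignoring whatever namespace the schema uses.
            if elem.tag.rsplit("}", 1)[-1] == "fullTextUrl" and elem.text:
                urls.append(elem.text.strip())
    token_elem = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
    has_token = token_elem is not None and token_elem.text
    return urls, (token_elem.text.strip() if has_token else None)


def harvest(base_url=BASE_URL, metadata_prefix=METADATA_PREFIX, fetch=None):
    """Yield full-text URLs from every page, following resumption tokens."""
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url) as resp:
                return resp.read().decode("utf-8")
    token = None
    while True:
        page = fetch(build_url(base_url, metadata_prefix, token))
        urls, token = parse_page(page)
        yield from urls
        if not token:  # an empty or missing token marks the last page
            break
```

Calling `list(harvest())` against a real endpoint would return every full-text link the provider exposes. The `fetch` argument exists so the paging logic can be exercised against canned responses without touching the network.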

Also see this comment by Eric Lease Morgan:

Yep, kudos to DOAJ.

I saw this a week or two ago, and while I did not take advantage of their article-specific metadata scheme, I did use the Dublin Core metadata scheme to harvest about 54,000 of the articles and save them into a MyLibrary instance. I then used an indexer called KinoSearch to make them searchable. Finally I created a rudimentary searchable/browsable interface to the whole thing. See [this]. Ah, the possibilities are almost endless!
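
To give a feel for the harvest-then-index workflow in that comment, here is a toy sketch. It is not Morgan's code and it does not use MyLibrary or KinoSearch: it pulls a few Dublin Core fields out of an oai_dc ListRecords response and builds a small in-memory inverted index as a stand-in for a real indexer. The function names and the choice of fields (title and creator) are my own assumptions.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def dc_records(xml_text):
    """Extract id, title, and creators from each record in an oai_dc response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        header_id = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier", default="")
        titles = [(t.text or "").strip() for t in record.iter(DC_NS + "title")]
        creators = [(c.text or "").strip() for c in record.iter(DC_NS + "creator")]
        records.append({
            "id": header_id.strip(),
            "title": " ".join(titles),
            "creators": creators,
        })
    return records


def build_index(records):
    """Map each lowercased word in title or creators to the set of record ids."""
    index = defaultdict(set)
    for rec in records:
        text = rec["title"] + " " + " ".join(rec["creators"])
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(rec["id"])
    return index


def search(index, query):
    """Return the ids of records that contain every word in the query."""
    words = re.findall(r"\w+", query.lower())
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

A real setup at the scale of 54,000 articles would hand the parsed records to a proper search engine, but the shape is the same: parse, index, query.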