Open Access News

News from the open access movement

Saturday, April 08, 2006

Alf Eaton, Open Text Mining Interface (OTMI), HubLog, April 7, 2006. Excerpt:

The Open Text Mining Interface (OTMI) is a proposed method for making available the text of journal articles for indexing and analysis, while preserving any subscription model that funds the journals. This approach, presented in a Web 2.0 session at the Bio-IT World conference earlier this week, uses an Atom XML version of each article, with OTMI namespaced extensions, to provide all the sentences of the article in alphabetical order. Some extra information such as word frequency is also presented, but this could presumably be derived from the sentence text anyway. All the articles in the 2020 Computing issue of Nature have OTMI files linked using <link rel="OTMI" type="application/atom+xml" href=""/> - here’s an example file.

Comment. I have to commend the developers. Insofar as it's useful, however, OTMI will counteract what I've called the software strategy for OA: using very cool and useful tools optimized for OA files as incentives for authors and publishers to make their work OA. OTMI doesn't preserve information about which sentences are adjacent or even proximate, foiling attempts to reconstruct a readable version of the text. While this is an essential virtue of OTMI for toll-access publishers, I suspect that it's a vice for hard-core text-mining. There have to be text-mining applications for which OTMI files will be less useful than full-text originals with sentence-sequence and other contextual information intact. In any case, OTMI will reduce the number of text-mining apps that support the software strategy for OA.
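To make the trade-off concrete, here is a minimal Python sketch of the kind of view the excerpt describes: sentences sorted alphabetically, with word frequencies derived from the sentence text. It is an illustration only, not the OTMI format or the developers' code; the function names `split_sentences` and `otmi_like_view`, the naive sentence splitter, and the sample text are all assumptions made for this example.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def otmi_like_view(text):
    """Return (alphabetically sorted sentences, word-frequency Counter).

    Sorting discards the original sentence order, so adjacency cannot be
    recovered from the result. The word counts are computed from the
    sentences themselves, as the excerpt suggests is possible.
    """
    sentences = sorted(split_sentences(text), key=str.lower)
    words = re.findall(r"[a-z0-9']+", " ".join(sentences).lower())
    return sentences, Counter(words)


# Hypothetical sample text, not taken from any real article.
article = (
    "Zebrafish embryos were imaged. Cells divided rapidly. "
    "A control group of cells showed no change."
)
sentences, freq = otmi_like_view(article)
for s in sentences:
    print(s)
print(freq["cells"])
```

The sorted list still supports indexing and frequency analysis, but nothing in it records that the sentence about imaging came first, which is the contextual information a sequence-sensitive text-mining tool would need.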

Posted by Peter Suber at 4/08/2006 09:34:00 AM.

The open access movement:
Putting peer-reviewed scientific and scholarly literature on the internet. Making it available free of charge and free of most copyright and licensing restrictions. Removing the barriers to serious research.
