     May 25, 2001

Free searching of unfree texts

If the full-text of a journal is not online free of charge, the next best thing is free full-text searching of the journal's contents.

Scirus is a search engine specializing in scientific literature, officially launched just last month.  The content comes from pay and no-pay scientific sites all over the web.  Above all, it comes from the full corpus of Elsevier science journals.  The power comes from Fast Search and Transfer (FAST), the speedy and comprehensive Norwegian search engine.

Elsevier's portion of the Scirus database consists of 1,200 online journals, 1.2 million online articles, 30 million article summaries, 6.8 million links from citations, 8 science portals, and 50 million indexed web pages.  Most of the Scirus journals, including the Elsevier journals, use CrossRef.

Run a search on any scientific topic, and you'll get hits from pay and no-pay electronic science journals.  When a search brings up an article not freely available online, the hit is flagged with a special icon.  When the article is from an archive like BioMedNet or ChemWeb, you can register yourself with the archive (free of charge) and read what Scirus calls a "summary plus" of the article, but not full-text.  If you or your institution is a paying subscriber, you can see full-text.

There are two welcome innovations here.  First, texts which are not freely available are at least freely searchable.  Even users who have paid nothing get bibliographic citations and URLs for articles relevant to their search requests.  I remember a service in my field called Brain-Wave Philosophy, which searched the commercial database called the Philosopher's Index.  Brain-Wave charged $0.50 per search and $2.00 per citation.  I'm glad to report that its business model failed miserably and quickly.  Second, Scirus automatically searches large chunks of what is called the deep internet, the online databases whose content is invisible to ordinary users because it is not indexed by regular search engines.  By some estimates the deep internet has 500 times the content of the visible internet.

Cambridge University Press has introduced a similar but more limited service.  Users may create a personal page, called My CJO home page, at the Cambridge Journals Online (CJO) web site.  Once you sign up, at no charge, you may search all Cambridge journals.  As with Scirus, you get bibliographic citations and URLs matching your search query, even in journals to which you have no full-text access.  Paying subscribers can see full-text for each hit.  The Cambridge search engine is much less flexible than the Scirus engine, but it does add current awareness.  You can sign up to receive tables of contents of any Cambridge journal by email.  You can store searches to rerun later, but you cannot run them at automatic intervals and receive the hits by email.

If you know of other publishers offering free full-text searching of unfree literature, please let me know about them.



"My" CJO home page (Cambridge Journals Online)
(This link has been flaky during the past week.)

Postscript.  Here are the major search engines covering the "deep internet".  Each of them includes at least one scholarly database.

CompletePlanet, http://www.completeplanet.com/
The Invisible Web, http://www.invisibleweb.com/
ProFusion, http://beta.profusion.com/
Researchville, http://www.researchville.com/


Finding free online scholarship

By now every discipline has at least a few peer-reviewed journals with full-text on the internet free of charge.  How can you find the ones in your field?  How can you tell whether the number is growing, stable, or shrinking?

So far there aren't good answers to these questions.  In the last issue I mentioned Periodicals.net, which does a good job covering online periodicals of all kinds, although for a fee.  At the same time I should have mentioned the Directory of Scholarly Electronic Journals from the Association of Research Libraries, which is equally thorough.  It too is available only to paying subscribers.

Are there any free directories of free online scholarship?  There are many gateways or portals to subsets of this literature.  For example, most major libraries have a page of links to online texts.  But while some of these are more extensive than others, none is very successful at capturing the full range of FOS now available.

A few that seemed to aspire to completeness haven't been updated in a year or two, apparent victims of the continuing explosion of web content.  In this category we could put the guides produced by the University of Houston Libraries and the Electronic Frontier Foundation.  Of the more comprehensive lists still being updated, I can recommend the one produced by the Loughborough University Library.

The most thorough free guide I've seen is PubList.  Based on Ulrich's Periodicals Directory, it is international, organized by field, and very comprehensive.  Behind PubList in information per journal, if not also in range, is NewJour.  While it does not organize the journals by field, NewJour does support email notification of new journals.  Neither service is limited to FOS journals.

JAKE (Jointly Administered Knowledge Environment) organizes electronic full-text journals from cooperating sources, whether free or for-pay, and makes the contents searchable and linkable.  In that sense it goes well beyond a directory.  The JAKE software is open source.

Among the useful guides to the guides are _Electronic Journals_, from Harrassowitz, and _Serials in Cyberspace_, from the University of Vermont.

If you have a favorite guide to free online journals, text archives, and digital books, or to any significant subset of this domain, please post it on our new links page with a short description of its virtues. (More on the new links page below.)

What we need is an open directory of free online scholarship.  Like the Open Directory Project, it could harness the labor of an army of volunteer editors.  It should be organized by academic field, and indicate for each journal or archive when it began its online existence, whether it is peer-reviewed, whether it takes advertising, whether it publishes the full-text of any articles, whether it publishes full-text for all or only selected articles, whether it supports any of the standards for scholarly metatags, whether it supports current awareness (email delivery of tables of contents, abstracts, full issues, or stored searches), whether it supports online discussion, whether it offers all or only some of its services free of charge, whether it has a print edition, whether it accepts articles previously posted to the author's web site or to a preprint archive, how it archives back issues, and whether the archive is searchable.  You get the idea.

For each journal it should supply the URL for the journal home-page and (if they differ) the URLs for subscription information and submission details for authors.  It could also track the many gateways, guides, and portals to this literature, and indicate their disciplinary coverage, their regional coverage, and their update frequency.  The whole directory should be organized in a database so that users may browse it or run search and sort queries across its contents.
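
To make the idea concrete, here is a minimal sketch in Python of what one record in such a directory might look like, and how browse, search, and sort queries could run across the records.  Everything in it is an assumption for illustration: the class name DirectoryEntry, the attribute names, the search and sort_by helpers, and the three sample entries are hypothetical, and the record covers only some of the attributes listed above.  A real directory would sit on a proper database.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DirectoryEntry:
    """One journal or archive in a hypothetical open directory of FOS."""
    title: str
    discipline: str               # academic field, used for browsing
    home_url: str
    online_since: int             # year the online edition began
    peer_reviewed: bool
    full_text: str                # "all", "selected", or "none"
    free_of_charge: str           # "all" or "some" of its services
    current_awareness: bool       # email tables of contents, stored searches
    has_print_edition: bool
    searchable_archive: bool
    subscription_url: Optional[str] = None  # only if it differs from home_url
    submission_url: Optional[str] = None    # only if it differs from home_url


def search(entries, **criteria):
    """Return the entries whose attributes equal every given criterion."""
    return [e for e in entries
            if all(getattr(e, name) == value
                   for name, value in criteria.items())]


def sort_by(entries, attribute):
    """Return the entries sorted by a single attribute."""
    return sorted(entries, key=lambda e: getattr(e, attribute))


# Invented sample records, for illustration only.
SAMPLE = [
    DirectoryEntry("Example Journal of Philosophy", "Philosophy",
                   "http://example.org/ejp", 1998, True, "all", "all",
                   True, False, True),
    DirectoryEntry("Example Physics Letters", "Physics",
                   "http://example.org/epl", 1995, True, "selected", "some",
                   False, True, True),
    DirectoryEntry("Example Philosophy Review", "Philosophy",
                   "http://example.org/epr", 1996, False, "all", "all",
                   False, True, False),
]

if __name__ == "__main__":
    # Philosophy titles with full-text for all articles, oldest first.
    hits = sort_by(search(SAMPLE, discipline="Philosophy", full_text="all"),
                   "online_since")
    for entry in hits:
        print(entry.online_since, entry.title)
```

Keeping each attribute as a separate, typed column is what makes the search and sort queries possible.  A directory that stored the same facts in a free-text description could be browsed but not queried.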

I don't have the time to launch such a directory, though I'd be tireless in publicizing and supporting it if someone else launches it.  If this newsletter can help inspire and organize the effort, or if you know a foundation which might fund such a project, just drop me a line.

Directory of Scholarly Electronic Journals and Academic Discussion Lists
From ARL

Electronic Journals
From the Loughborough University Library

PubList, The Internet Directory of Publications



Electronic Journals: A Selected Resource Guide
From Harrassowitz.

Serials in Cyberspace
From the University of Vermont

Open Directory Project

Postscript.  What existing guides come closest to the open directory I described?  Let me know your thoughts.  Here's one to check out:



_New Scientist_ joins the debate

In the May 26 _New Scientist_ Stevan Harnad has a short piece defending what he calls the literature liberation movement.  It's not on the NS web site yet, but you can see it, as well as a longer version of the same argument, at Harnad's web site.

Stevan Harnad, The (Refereed) Literature-Liberation Movement


Copyright law relaxed for educational content

Existing copyright law prevents universities from broadcasting digital dramatic works like plays and operas to distance education students.  On May 17, the Senate Judiciary Committee approved legislation to change this and create an exemption for non-profit schools broadcasting encrypted content for educational purposes.  The Association of American Publishers helped negotiate the bill with university lobbyists, and approves the final language.  The full Senate is expected to pass the bill later this month.  Language in the bill makes it easy for Congress to change its mind if the encrypted data streams turn out to be easy to hack and steal.

Andrea Foster, Senate Committee Favors Letting Instructors Use More Digital Works in Online Classes
From _The Chronicle of Higher Education_

Text of the bill


Free internet access

ConnectNet helps bridge the digital divide by showing people without internet access how to get internet access free of charge.  For those who can't even visit the database on the web, there is a toll-free number providing the same information (866-583-1234).

ConnectNet (English)

ConnectNet (Spanish)

Rebecca Weiner, Finding Free Internet Access for Those Without
From _The New York Times_


P2P anti-censorship tool

P2P file-sharing tools always had the potential to go beyond music and movies to news stories and scholarly articles.  The hacker group Cult of the Dead Cow (CDC) has developed a new anti-censorship tool with just these uses in mind.  The tool, called Peekabooty, will allow users anywhere to download content prohibited or restricted where they live or work.  This could be an oppressive country, an oppressive cubicle, or a library running mandatory filtering software.  Peekabooty will be launched in July at the Defcon security conference.

Free online scholarship would mean universal access, if we didn't also have to reckon with the digital divide (poverty) and oppressive governments (censorship).  We need tools like Peekabooty to address the second problem.

Ann Harrison, Peekabooty challenges online censorship
From _Network World File-Sharing Newsletter_


In other publications

* A recent story (May 17) in the _Online Journalism Review_ discusses methods for translating articles formatted in XML into HTML for online publication.  The story focuses on newspaper publication, but the methods it identifies will be just as useful for scholarly journals.

* In a recent article (May 3) posted at Advogato, a grad student expresses frustration that his laboratory's automatic copyright on dissertations will limit the usefulness of his work.  He wonders what he can do about it, and his question has generated some answers and commiseration at the site.


Catching up

I missed last year's conference at New York University School of Law called "A Free Information Ecology in the Digital Environment."  But I just discovered the conference proceedings online, both in text and video format.  There are some important scholars here talking directly about the economic feasibility and political desirability of free online scholarship.

Conference abstracts, papers, and video (RealPlayer) files


New at the FOS web site

I write this newsletter because I care about a cause.  I recently wrote a brief statement of my editorial position, trying to answer the question, "What cause do I hope to advance through this newsletter?"  I wrote it partly to reveal the perspective which governs my writing and news-gathering, and partly to advance the cause itself.  I hope you'll take a look.  If you have comments, please post them to our discussion forum.

Editorial position of the FOS Newsletter


New at the site, cont.

Also since the last issue, I've created a page allowing users to submit FOS-related links.  Thanks to Phil Greenspun for writing the software to automate this process.  If you want to add relevant sites, or want to see the sites recommended by others, have a look.

User-constructed page of FOS links


Topica problems

Topica.com hosts our email list and discussion forum and is having some problems at the moment.  For example, I've bragged that our back issues and discussion postings are searchable.  They should be, but at present they are not.

Here is a more serious problem:

** If you receive this newsletter by email, then please delete the "easy unsubscribe" footer before forwarding it to friends or colleagues.  It contains a code identifying you as the original recipient of the email.  If someone down the forwarding chain clicks on the unsubscribe link, mischievously or inadvertently, then you will be unsubscribed. **

I don't need this, and Topica has been slow to answer email about the problems.  If you think you know a service which can do better, I'd like to hear about it.  See my criteria at this URL.


