Welcome to the Free Online Scholarship Newsletter
     May 7, 2001

The Semantic Web

Tim Berners-Lee created HTML and the World Wide Web.  Now he's the director of the World Wide Web Consortium.  With some colleagues, Berners-Lee is proposing innovations that would make web content more intelligible to machines, which in turn would help machines help us search and share it.  This is a significant step in ratcheting up the usefulness of the entire online network of scholarly information.

He calls it the "semantic web" and describes its implications for online scholarship in an April 12 contribution to the FOS debate hosted by _Nature_.  He has a longer exposition of the semantic web in the May 1 issue of _Scientific American_.

The semantic web will add structure to existing web content and bring automated reasoning tools to bear on that new structure.  Here's a peek at the details.  XML (eXtensible Mark-up Language) allows authors to create custom tags to reflect the unique structure and content of their documents.  RDF (Resource Description Framework) allows subject-verb-object assertions to be coded as XML tags.  The terms in these assertions will not merely be typed strings of letters, but links to identifying information and definitions.  A code called a URI (Uniform Resource Identifier) will point to these definitions.  The familiar URL is one type of URI.
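
To make the subject-verb-object idea concrete, here is a minimal sketch in Python rather than in RDF's own XML syntax.  It shows assertions whose terms are URIs instead of bare strings, and a lookup over them.  The example.org URIs and the names Triple and subjects_with are invented for illustration; they do not come from any real vocabulary or library.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One assertion: subject, verb, and object, each named by a URI."""
    subject: str
    verb: str
    obj: str


# Hypothetical namespace, made up for this example.
EX = "http://example.org/terms#"

assertions = [
    Triple("http://example.org/papers/42", EX + "hasAuthor",
           "http://example.org/people/ada"),
    Triple("http://example.org/papers/42", EX + "publishedIn",
           "http://example.org/journals/jfos"),
    Triple("http://example.org/papers/77", EX + "hasAuthor",
           "http://example.org/people/ada"),
]


def subjects_with(verb: str, obj: str, triples: List[Triple]) -> List[str]:
    """Return every subject asserted to stand in relation `verb` to `obj`."""
    return [t.subject for t in triples if t.verb == verb and t.obj == obj]


papers_by_ada = subjects_with(
    EX + "hasAuthor", "http://example.org/people/ada", assertions
)
print(papers_by_ada)
```

Because the verb is a URI rather than the word "author", two pages that use different words for the same relation could still point to the same identifier, and a query like this one would match both.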

These components of the semantic web (XML, RDF, and URI) already exist.  Some others have yet to be created.  Let's say that an "ontology" is a web file containing assertions about sets and their subsets, members, and properties.  Since concepts determine sets (a concept can be defined extensionally as the set of its instances), think of ontologies as formal definitions of concepts.  URIs will point to ontologies or parts of ontologies, so that the properties of a given concept, and its relation to other concepts, can be ascertained from a central and authoritative source.  Ontologies will also contain rules of inference suited to the concepts they define.  Together these will allow automated reasoning tools to draw inferences about the concepts tagged in a semantic web page.  This will enable intelligent software agents to answer user questions directly and provide "proofs" of the answers (web pages certifying the steps in the inference) on request.
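
Here is a rough sketch, again in Python, of how an ontology plus one rule of inference could yield an answer and its "proof".  The toy ontology, the concept names, and the function prove_membership are all hypothetical, and the only rule applied is that a member of a set is also a member of any set containing it.  Real ontology languages would be far richer than this.

```python
from typing import List, Optional

# Hypothetical ontology: each concept maps to the concept it is a subset of.
SUBSET_OF = {
    "Sonnet": "Poem",
    "Poem": "LiteraryWork",
    "LiteraryWork": "Document",
}

# Hypothetical membership assertions, as if read from a tagged web page.
MEMBER_OF = {"Sonnet18": "Sonnet"}


def prove_membership(item: str, concept: str) -> Optional[List[str]]:
    """Return the inference steps placing `item` under `concept`, or None."""
    current = MEMBER_OF.get(item)
    if current is None:
        return None  # nothing is asserted about this item
    steps = [f"{item} is a member of {current}"]
    seen = {current}
    while current != concept:
        parent = SUBSET_OF.get(current)
        if parent is None or parent in seen:
            return None  # the chain ends, or loops, before reaching `concept`
        steps.append(f"{current} is a subset of {parent}")
        seen.add(parent)
        current = parent
    return steps


proof = prove_membership("Sonnet18", "LiteraryWork")
if proof is not None:
    for step in proof:
        print(step)
```

The list of steps plays the role of the "proof": a user who doubts the answer can inspect each assertion and trace it back to the ontology it came from.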

The semantic web is not semantic because machines will have mental states or understand the meanings of words.  It is semantic because we will be able to delegate to machines much of the reasoning we do by virtue of understanding words, even if the machines will still do this reasoning syntactically.  The semantic web will behave as if it understood words, for example, forging connections where different disciplines use common concepts but not common terms, and supporting search engines which eliminate false positives and overlook differences of terminology to find an identity of underlying concepts. 

Readers interested in more detail and some clarifying examples should read the longer exposition in _Scientific American_.

Berners-Lee and Hendler, Scientific Publishing on the 'Semantic Web'
From the FOS debate in _Nature_

Berners-Lee, Hendler, and Lassila, The Semantic Web
From _Scientific American_

W3C page on the semantic web


Streams from neighboring fields

I won't routinely cover free online news, fiction, and popular magazines.  Their struggles have very few implications for online scholarship.  One reason is that scholarly writers are not paid for their journal articles. Hence they can move to a free dissemination model without any sacrifices they don't already make. 

But here are some recent events in the worlds of news and fiction which might have implications for FOS.

1.  Amazon has a new service called the honor system.  One use for it is to support online publications that can't survive on ad revenue and can't persuade enough readers to pay subscription fees.  The online publication puts an Amazon honor system icon on its page.  Users with means and good will click on it, go to an Amazon page describing the service, enter the amount of their donation, and click to transfer funds from their credit cards to the publication.  Amazon has 29 million customers with credit card information stored in its computers.  Most of these millions are serious readers.  If a periodical wants free-will donations from any of them, Amazon's honor system makes it easy for the periodical and for its readers.  To process a donation, Amazon charges 15 cents plus 15% of each donation.

For example, Content Exchange is an online marketplace where content creators and content publishers can meet and make deals.  It used to publish a free newsletter supported by advertising.  Pinched by falling ad revenues, it turned to the public radio model and solicited pledges, but failed to raise enough money.  Now it uses Amazon's honor system. 

I don't know of any online academic journals using Amazon's honor system.  If you do, let me know about them and I'll try to interview the editors to see how it's working.
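
For a journal weighing the honor system, the fee of 15 cents plus 15% of each donation can be worked out with a few lines of Python.  This is only a sketch: the function name split_donation is made up, and the rounding of fractional cents (half up) is my assumption, not something Amazon has stated.

```python
from typing import Tuple

FLAT_FEE_CENTS = 15  # 15 cents per donation
PERCENT_FEE = 15     # plus 15% of the donation


def split_donation(donation_cents: int) -> Tuple[int, int]:
    """Return (fee_cents, net_cents) for one donation given in cents."""
    # Integer arithmetic; rounding half up is an assumption.
    fee = FLAT_FEE_CENTS + (donation_cents * PERCENT_FEE + 50) // 100
    return fee, donation_cents - fee


for dollars in (1, 5, 20):
    fee, net = split_donation(dollars * 100)
    print(f"${dollars} donation: fee {fee} cents, publication keeps {net} cents")
```

Under these assumptions the flat 15 cents weighs most heavily on small gifts: a $1 donation leaves the publication 70 cents, while a $20 donation leaves it $16.85.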

Amazon's honor system

Content Exchange

E-Pledge Drives Don't Work
From _Wired_ magazine

2.  _Editor and Publisher_ is a 117-year-old print magazine covering the North American newspaper industry.  Its May 1 issue reports on a survey showing that free online news does not diminish revenues for print newspapers, but on the contrary tends to stimulate sales.

In earlier issues we've seen the same result reported for scholarly books.  If it's true for newspapers and scholarly books, is there any reason it wouldn't be true for scholarly journals?  Post your thoughts to our discussion forum.

Web Sites Don't Cannibalize Print
From _Editor and Publisher_ magazine

3.  Time-Warner believes that 20% of the novels rejected by commercial publishers deserve an audience.  (Insert your own punchline here.)  It has created a web service called iPublish to discover these hidden gems.  Cab-driving novelists submit excerpts from their work in progress to iPublish, which posts them to the web site.  Readers rate these excerpts and send authors feedback.  Authors and readers pay nothing to use the site.  iPublish editors follow the user ratings and their own reading to identify books with promise.  iPublish will then edit the promising manuscripts and publish them as eBooks, print books, or both.

How many scholarly books rejected by academic presses deserve an audience?  What's the best way to identify and disseminate them?  Post your thoughts to our discussion forum.


You Write, They Edit, iPublish
From _Wired_ magazine


Almost free online scholarship

Octavo Editions is doing something worth your notice.  It takes very rare books, digitizes each page with state-of-the-art graphics, and sells the results on CD-ROM at prices comparable to modern paperback books.  Since this form of access is expensive to create, Octavo needs a revenue stream to reimburse itself. 

See or download the Octavo edition of Shakespeare's 1623 First Folio, produced through a partnership with the Folger Shakespeare Library and available at Octavo's web site.  Apart from displaying the 1623 text without intrusive footnotes, the site offers a variable magnification feature which lets scholars study the volume's typography, print-through, water-staining, and marginalia.  Use it, if only to appreciate the richness of these high-res images.  It's hard to imagine a scholarly use for the First Folio that cannot be satisfied through this digital edition.

Octavo's work is significant both for access and for preservation.  Octavo is creating greater access to these books than scholars have ever had in the past, even scholars lucky enough to live and work near a library housing them.  And even if access need not be greater than it already is, Octavo's editions preserve nearly all the physical and textual details of interest to scholars.  Octavo isn't claiming that its file format and CD-ROM technology make an archiving medium superior to paper, but it does promise to translate its images to new archiving formats as they become available.  All digital scholarship faces the archiving problem.  Octavo is one approach to the solution.

Octavo, by the way, was founded by John Warnock, one of the co-founders of Adobe Systems.  He has a track record of creating document protocols which become standard. 

Octavo Editions, digital rare books

Octavo edition of Shakespeare's First Folio

Out of Print But Into Digital
From _Wired_ magazine


Correction.  Paul Ginsparg tells me that there have been a number of mainstream press stories on his archive, including a New York Times mention in 1995 and a longer story in April of 1998:

Physics on the Web Is Putting Science Journals on the Line
From _The New York Times_


This is the Free Online Scholarship Newsletter.

Please feel free to forward this newsletter to colleagues.  If you received this issue from a friend, you may subscribe yourself by sending an email to <suber-fos-subscribe@topica.com> or signing up at the FOS home page.

FOS home page

FOS Newsletter subscriptions, back issues, discussion

Peter Suber

Copyright (c) 2001, Peter Suber
