Open Access News

News from the open access movement


Wednesday, May 23, 2007

OpenBusiness on Freebase

Wikipedia for Data - Freebase, OpenBusiness, May 23, 2007.  Excerpt:

Want to know how many dentists are within a one-mile vicinity, whether they are next to a tube stop, and whether they specialise in teeth whitening? Freebase says it can not only give you this information, but that the database behind it will be built Wikipedia-style.

Normally one would say this is ‘nuts’, but the team behind it seems promising (and Tim O’Reilly thinks the idea is HUGE).
OpenBusiness has interviewed one of the minds behind Freebase.

Robert Cook is one of the co-founders of Metaweb, the company behind Freebase. The company attempts nothing less than to build a ‘better infrastructure for the Web’.

Also behind Metaweb is Danny Hillis, a serial inventor and entrepreneur who was behind the Connection Machine, a parallel supercomputer at MIT....

Freebase aims to be the Wikipedia for data. So naturally OpenBusiness was interested. Also, their business model seems cool. They say they will make money through an API program. Depending on the commercial vs. non-commercial nature, and the extent of services required by a developer, they will charge fees. How this all works, why they use Creative Commons, and what they think about OpenAPIs, read below:

1. Why did you start Freebase?

Freebase’s goal is to be a database of the world’s knowledge.  As a single unified database, Freebase will prove to be far more powerful than the sum of its data sources, as it connects people to films, films to places, places to science, science to schools, schools to sports and so on…

As a database, it lets people ask complex and extemporaneous questions like, “Find me child-friendly dentists within 10 miles of my home,” or “Give me photographs of John F. Kennedy in Europe prior to 1962,” or even “Find me all of the Venture Capitalists in Silicon Valley who share a board membership and went to college together.” ...

Even more than the technology, the bigger question for us was where all of this data would come from.  The internet has many thousands of significant databases, but most are hidden within websites or have restrictive licenses so that the data is locked up.  Fortunately, there are several hundred significant open databases that are in machine-readable form, and we have begun to import these.

But most importantly, there are now many examples of sites where people are eager to build collective knowledge.  Wikipedia is the best example of this, but there are countless other sites built from user contributions, the biggest being IMDb (which has since become a closed model) and Musicbrainz, a music database which in many ways surpasses commercial alternatives.  It’s this phenomenon that makes us believe that a large database can actually be built....

We...learned two critical things from Wikipedia:

A. Wikipedia has radically embraced a ‘post-hoc’ moderation model....

B. Wikipedia has exactly one article for one idea....

Freebase has adopted the radically open contribution model (our current closed Alpha notwithstanding), where users can add structured information with minimal effort, such as the closing time of a restaurant, a link to a digital camera’s online manual, or the name of a company’s founder.  Experts in a field are unimpeded by process.  Bad data becomes good data as many people find problems and fix them.

Also, like Wikipedia, Freebase has the same one-to-one mapping of database records (what we call “Topics”) to things in the world.  For instance, we have a single “Austin, Texas” topic that points to all of the companies based there, the movies filmed there, the tree species growing there and the famous people born there.  If there are two “Austin Texas” topics, they will get joined into a single one.
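
As an illustration only (this is not Freebase's actual data model or API), the one-topic-per-thing idea and the merging of duplicates can be sketched in a few lines of Python. The class `TopicStore`, its method names, the relation labels, and the placeholder values "Example Corp" and "Example Film" are all hypothetical and do not come from the interview.

```python
from collections import defaultdict


class TopicStore:
    """Toy store (hypothetical): one record per real-world thing, with named links."""

    def __init__(self):
        # topic -> relation name -> set of linked topics
        self.links = defaultdict(lambda: defaultdict(set))
        # merged (retired) topic id -> surviving topic id
        self.alias = {}

    def resolve(self, topic):
        """Follow merge aliases until the surviving topic id is reached."""
        while topic in self.alias:
            topic = self.alias[topic]
        return topic

    def link(self, topic, relation, other):
        """Record that `topic` is connected to `other` via `relation`."""
        self.links[self.resolve(topic)][relation].add(self.resolve(other))

    def merge(self, keep, duplicate):
        """Fold `duplicate` into `keep` so only one topic remains."""
        keep, duplicate = self.resolve(keep), self.resolve(duplicate)
        if keep == duplicate:
            return
        # Move the duplicate's outgoing links onto the surviving topic.
        for relation, others in self.links.pop(duplicate, {}).items():
            self.links[keep][relation] |= others
        # Repoint links from other topics that targeted the duplicate.
        for relations in self.links.values():
            for others in relations.values():
                if duplicate in others:
                    others.discard(duplicate)
                    others.add(keep)
        self.alias[duplicate] = keep

    def related(self, topic, relation):
        """Return the topics linked from `topic` via `relation`, sorted."""
        relations = self.links.get(self.resolve(topic), {})
        return sorted(relations.get(relation, set()))


store = TopicStore()
store.link("Austin, Texas", "companies_based_here", "Example Corp")
store.link("Austin Texas", "films_shot_here", "Example Film")  # accidental duplicate topic
store.merge("Austin, Texas", "Austin Texas")
print(store.related("Austin Texas", "films_shot_here"))
```

After the merge, a lookup under either spelling resolves to the same single topic, so the links contributed under both names are found together.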

3. You are using a CC license - why? ...

Freebase uses the very open “Creative Commons Attribution License” that allows anybody to use the data for any purpose, as long as they give attribution to the contributor....

We believe that the more open the license is, the larger the set of users, the larger the set of contributors, and therefore the larger and higher quality the data set.  We allow and encourage commercial use because we want people to start building businesses that use and contribute back data to Freebase....

PS:  For more background, see my post from 3/9/07 on the Freebase launch.