Exploring rich data is fun,
but finding it, formatting it, and
tagging it with metadata is drudge work barely fit for a trained
chimp. And if you want to share a large dataset online, you face two
troubling prospects: either a) that no one will find it, or b) that everyone
will find it.
A central, community-driven repository solves these
problems and presents amazing possibilities. Once we interconnect the
datasets along concepts they share, instead of 100,000 datasets, there's
just one. Study the physics of baseball by comparing the hourly weather
during every single baseball game to game outcomes. Uncover political
campaign irregularities by comparing neighborhood per-capita income,
historical voter trends, and public campaign finance records. Plan
real-estate decisions based on which keywords from news and other media
rank highly in each area. Let's start building tools that make this way of
thinking available to everychimp.
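As a rough illustration of what "interconnecting datasets along concepts they share" could look like, the sketch below matches baseball games to weather readings on a common (city, hour) key. Every field name and value is invented for the example; none of it comes from an actual infochimps.org dataset, and `join_on` is a hypothetical helper.

```python
# Hypothetical records: all field names and values are made up for illustration.
games = [
    {"city": "Boston", "hour": "2007-06-01T19:00", "home_runs": 3},
    {"city": "Denver", "hour": "2007-06-01T19:00", "home_runs": 5},
]
weather = [
    {"city": "Boston", "hour": "2007-06-01T19:00", "temp_f": 68},
    {"city": "Denver", "hour": "2007-06-01T19:00", "temp_f": 81},
    {"city": "Denver", "hour": "2007-06-01T20:00", "temp_f": 79},
]


def join_on(left, right, keys):
    """Inner-join two lists of dicts on the key fields they share."""
    # Index the right-hand rows by key; if a key repeats, the last row wins.
    index = {tuple(row[k] for k in keys): row for row in right}
    joined = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Merge the two rows; values from the left row take precedence.
            joined.append({**match, **row})
    return joined


combined = join_on(games, weather, keys=("city", "hour"))
for row in combined:
    print(row["city"], row["temp_f"], row["home_runs"])
```

The join only works because both datasets describe "where" and "when" in the same form, which is the kind of shared concept the paragraph above has in mind.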
Open: No redistribution bureaucracy, no
larcenous prices for government-generated data, no teaspoon-sized
servings from a sumptuous buffet. Apart from prior restrictions
attached by the data provider, everything produced by the
infochimps.org project is and will remain open.
Descriptive: Not just numbers — fields
describe the real-world concepts they embody. Why should ints and
strings get all the glory? Your data should arrive knowing that it
describes a 'location', or a 'money value', or a 'book', and that
it has taken the form of a 'latitude-longitude pair', or '2004 US
Dollars' or 'ISBN Number'.
Free: Free to download, free to share, free
to use, free to redistribute. Just share, give credit where
credit is due, and respect existing restrictions.
Universal: Stop parsing flat files. As
infochimps like you help to convert and import well-structured
files, we can bundle and serve them in universal formats like XML,
YAML, JSON, and Excel- or SQL-ready CSV.
Verifiable: We list all contributors and
sources. If you need to reach the origins - to verify information
or to shower them with thanks - it's all right there.
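To make the Descriptive and Universal points a little more concrete, here is a minimal sketch of a field that carries both the concept it describes and the form it takes, with the same record written out as JSON and as CSV. The `Field` class, its attribute names, and the sample values are assumptions made for this example, not an infochimps.org schema.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass


@dataclass
class Field:
    """A value plus its concept and form (hypothetical schema)."""
    name: str
    value: str
    concept: str  # e.g. 'location', 'money value', 'book'
    form: str     # e.g. 'latitude-longitude pair', '2004 US Dollars'


# Sample values are invented for illustration.
record = [
    Field("where", "30.27,-97.74", "location", "latitude-longitude pair"),
    Field("price", "12.50", "money value", "2004 US Dollars"),
]


def to_json(fields):
    """Serialize the fields as a JSON array of objects."""
    return json.dumps([asdict(f) for f in fields], indent=2)


def to_csv(fields):
    """Serialize the fields as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf,
        fieldnames=["name", "value", "concept", "form"],
        lineterminator="\n",
    )
    writer.writeheader()
    for f in fields:
        writer.writerow(asdict(f))
    return buf.getvalue()


print(to_json(record))
print(to_csv(record))
```

Because the concept and form travel with each value, the same record can be emitted in several formats without losing what the numbers mean.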
Posted by Gavin Baker at 5/09/2008 07:05:00 PM.