Google has launched the Google Labs edition of Google Squared, an attempt to capture structured open data and present it in tabular form. From the announcement:
Some information is easy to find. If you want to learn the rules of golf, you can search Google for [golf rules] and we'll return a list of relevant web sites right at the top. But not all your information needs are that simple. Some questions can be more complex, requiring you to visit ten, perhaps twenty websites to research and collect what you need.
For instance, I'm a big fan of roller coasters. In the past I've used Google to search for information about roller coasters, such as which ones are the tallest, fastest, and have the most loops. Finding this information used to take multiple searches — I'd find roller coaster sizes on one website, heights on another, and speeds on a third. By manually comparing the sites, I could get the information I was looking for, but it took some time. With Google Squared, a new feature just released in Google Labs, I can find my roller coaster facts almost instantly.
Google Squared is an experimental search tool that collects facts from the web and presents them in an organized collection, similar to a spreadsheet. If you search for [roller coasters], Google Squared builds a square with rows for each of several specific roller coasters and columns for corresponding facts, such as image, height and maximum speed....
This technology is by no means perfect. That's why we designed Google Squared to be conversational, enabling you to respond to the initial result and get a better answer. If there's another row or column you'd like to see, you can add it and Google Squared will automatically attempt to fetch and fill in the relevant facts for you. As you remove rows and columns you don't like, Google Squared will get a fresh idea of what you're interested in and suggest new rows and columns to add....
If you click on any fact, you'll see the sources Google Squared gathered it from as well as a list of other possible values that you can investigate. So even if your square isn't perfect at the beginning, it's easy to work with Google Squared to get a better answer in no time. Once you've got a square you're happy with, you can save it and come back to it later.
I like this. OA to structured data is useful but just the beginning. OA alone doesn't help users query the data or view its structure. However, we're seeing a wave of new tools to provide visualization and querying, after the fact, for a growing range of data files. This is another in that wave. (Note that these tools would never be developed if there weren't a large and growing number of OA data files to harvest as input.) The job is difficult and the first results are not always impressive. But many well-equipped players are entering the game and the results should steadily improve.
I particularly like the way Google lets users add and subtract rows and columns. For example, if you search on trees, you can subtract the rows devoted to mathematical trees. If you add a new column on "genus", Google runs a new search for the genus of each tree already in the table. If you add a row for "white pine", Google runs a new search for each parameter of white pines already represented by a table column. To see the full power of this feature, start with an empty table. Add a row for "white pine". Note that Google creates two default columns: "image" and "description". Then add your own column for "genus". While you're at it, add columns for "family", "class", and "species". Then add rows for "red oak" and a few other trees, and watch Google fill in the cells as well as it can.
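To make the add-a-row, add-a-column behavior concrete, here is a minimal Python sketch of a table that fills in cells as it grows. It is not Google's implementation, which isn't public. The `Square` class, the `Lookup` signature, and the `demo_lookup` function are all hypothetical names invented for illustration, and the lookup is a tiny hard-coded stand-in for a real web search. The sketch also skips the default "image" and "description" columns.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical signature: given a row item and a column name, return a fact or None.
Lookup = Callable[[str, str], Optional[str]]


class Square:
    """A toy table that fills in cells whenever a row or column is added."""

    def __init__(self, lookup: Lookup) -> None:
        self.lookup = lookup
        self.columns: List[str] = []
        self.rows: Dict[str, Dict[str, Optional[str]]] = {}

    def add_column(self, name: str) -> None:
        """Add a column, then look up its value for every existing row."""
        if name in self.columns:
            return
        self.columns.append(name)
        for item, cells in self.rows.items():
            cells[name] = self.lookup(item, name)

    def add_row(self, item: str) -> None:
        """Add a row, then look up its value for every existing column."""
        if item in self.rows:
            return
        self.rows[item] = {col: self.lookup(item, col) for col in self.columns}

    def remove_row(self, item: str) -> None:
        """Drop a row the user does not want (no error if it is absent)."""
        self.rows.pop(item, None)


# Stand-in for a web search: a small hard-coded fact table (assumption).
_FACTS = {
    ("white pine", "genus"): "Pinus",
    ("white pine", "family"): "Pinaceae",
    ("red oak", "genus"): "Quercus",
    ("red oak", "family"): "Fagaceae",
}


def demo_lookup(item: str, column: str) -> Optional[str]:
    """Return a known fact, or None when the cell cannot be filled."""
    return _FACTS.get((item, column))
```

Starting from an empty `Square(demo_lookup)`, calling `add_row("white pine")`, then `add_column("genus")`, then `add_row("red oak")` fills both genus cells. A cell with no known value stays `None`, which corresponds to a cell Google could not fill.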
Features to add: Let users save their tables as CSV or spreadsheet files. Let users upload a spreadsheet, modify it with new Googly rows and columns, and then download it again. Allow sorting by clicking on a column heading. Merge Google Squared with the spreadsheet in Google Apps: When cells contain numbers, users should be able to calculate on them. (For an example of a Google Square with numbers, search for hard drives. One of the default columns is "capacity". Add a column for "price" to get a second number. Imagine another column computing "price / capacity". For another example, create a new table with three nations in three rows, and add columns for "2006 GDP" and "2007 GDP". Imagine computing the unit and percentage changes from one year to the next.)
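The calculations and the CSV export imagined above are easy to sketch once a square is in hand as a list of rows. The Python below is only an illustration of the idea, assuming each row is a dictionary keyed by column name. The function names and the column names passed to them are hypothetical, and nothing here reflects an actual Google Squared feature or API.

```python
import csv
import io
from typing import Any, Dict, List

Row = Dict[str, Any]  # one table row, keyed by column name (assumed shape)


def add_ratio_column(rows: List[Row], numerator: str, denominator: str, name: str) -> None:
    """Add a computed column such as price / capacity; leave it empty if a value is missing or zero."""
    for row in rows:
        num, den = row.get(numerator), row.get(denominator)
        row[name] = num / den if num is not None and den else None


def add_change_columns(rows: List[Row], old: str, new: str) -> None:
    """Add the unit change and percentage change between two numeric columns."""
    for row in rows:
        a, b = row.get(old), row.get(new)
        if a is None or b is None:
            row["change"] = None
            row["pct_change"] = None
        else:
            row["change"] = b - a
            # Percentage change is undefined when the starting value is zero.
            row["pct_change"] = (b - a) / a * 100 if a else None


def to_csv(rows: List[Row], columns: List[str]) -> str:
    """Serialize the chosen columns as CSV text; missing values become empty cells."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

For the hard drive square, `add_ratio_column(rows, "price", "capacity", "price_per_unit")` would add the "price / capacity" column. For the three-nation square, `add_change_columns(rows, "2006 GDP", "2007 GDP")` would add both changes. Cells that Google could not fill are treated as `None` and carried through as blanks instead of raising errors.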
Peter Suber at 6/04/2009 11:33:00 AM.