Open Access News

News from the open access movement


Monday, June 16, 2008

Extracting geodata from PDFs

Roderic Page, From PDFs to Google Earth, iPhylo, June 13, 2008.
I've added a service to bioGUID that takes a PDF and attempts to extract latitude and longitude data ... returning those co-ordinates in either a Google Earth KML file, or in JSON format. ...

To see what it can do, try this URL to get a list of localities in the paper "Description of eight new species of shrub frogs (Ranidae: Rhacophorinae: Philautus) from Sri Lanka".

Then try this one to get the KML file, and open it in Google Earth. The service uses a bunch of regular expressions to try and extract latitude and longitude pairs from the text ...

The ultimate aim is to assemble a bunch of Open Access PDFs (say, from Zootaxa), run them through this service, then display the result on Google Earth. Think of it as a geography of taxonomy. ...
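
For readers curious what the regular-expression step might look like, here is a minimal Python sketch. It is not Page's code: the excerpt does not show the bioGUID expressions, so the pattern, the function names (extract_points, to_kml, to_json) and the sample sentence are all assumptions made for illustration. It handles only degree-minute(-second) pairs written latitude first, such as 06°45'N, 80°30'E, and assumes the text has already been pulled out of the PDF.

```python
import json
import re

# Hypothetical pattern (not the one bioGUID uses): a latitude followed by a
# longitude, each as degrees and minutes with optional seconds and a hemisphere.
COORD = re.compile(
    r"""(?P<lat_deg>\d{1,2})\s*°\s*
        (?P<lat_min>\d{1,2})\s*['′]\s*
        (?:(?P<lat_sec>\d{1,2}(?:\.\d+)?)\s*["″]\s*)?
        (?P<lat_hem>[NS])
        \s*[,;]?\s*
        (?P<lon_deg>\d{1,3})\s*°\s*
        (?P<lon_min>\d{1,2})\s*['′]\s*
        (?:(?P<lon_sec>\d{1,2}(?:\.\d+)?)\s*["″]\s*)?
        (?P<lon_hem>[EW])""",
    re.VERBOSE,
)


def to_decimal(deg, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = int(deg) + int(minutes) / 60 + float(seconds or 0) / 3600
    return -value if hemisphere in "SW" else value


def extract_points(text):
    """Return a list of (latitude, longitude) tuples found in the text."""
    points = []
    for m in COORD.finditer(text):
        lat = to_decimal(m.group("lat_deg"), m.group("lat_min"),
                         m.group("lat_sec"), m.group("lat_hem"))
        lon = to_decimal(m.group("lon_deg"), m.group("lon_min"),
                         m.group("lon_sec"), m.group("lon_hem"))
        # Discard matches that cannot be real coordinates.
        if abs(lat) <= 90 and abs(lon) <= 180:
            points.append((lat, lon))
    return points


def to_kml(points):
    """Build a small KML document; KML lists longitude before latitude."""
    placemarks = "\n".join(
        f"  <Placemark><name>Locality {n}</name>"
        f"<Point><coordinates>{lon:.5f},{lat:.5f}</coordinates></Point></Placemark>"
        for n, (lat, lon) in enumerate(points, start=1)
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n'
        + placemarks
        + "\n</Document>\n</kml>\n"
    )


def to_json(points):
    """Serialise the same points as a JSON array of objects."""
    return json.dumps([{"latitude": lat, "longitude": lon} for lat, lon in points])


if __name__ == "__main__":
    # Made-up sentence in the style of a locality record.
    sample = "Holotype collected at 06°45'N, 80°30'E, alt. 1,200 m."
    found = extract_points(sample)
    print(to_json(found))
    print(to_kml(found))
```

A real service would need many more patterns (decimal degrees, hemisphere before the number, longitude first, OCR damage to the degree sign), which is presumably why the post speaks of "a bunch" of expressions rather than one.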