Open Access News

News from the open access movement

Monday, January 21, 2008

OA texts to train translation software

JRC publishes texts to help development of computer-assisted translation systems, a press release from the EU's Joint Research Centre (JRC), January 21, 2008.  Excerpt:

The EU's Joint Research Centre (JRC) has published a million sentences translated into 22 official EU languages in a bid to help the development of computer-assisted translation technologies and software.

By offering free and open access to this collection of sentences, the EU hopes to foster multilingualism and provide a valuable resource for system developers to create machine translation software.

As part of its remit, the EU translates all its legal and political documents into all 23 official languages, meaning translators must work with 253 possible language pair combinations across 1.5 million pages a year. This also means there is a collection of translated texts which is of great value as a learning base for system developers....

Because the text is offered in context, it can also help develop and test grammar and spell checkers, online dictionaries and text classification systems....

Comment.  This is a great example of one of the most important but least discussed virtues of OA.  OA not only removes access barriers for readers and increases impact for authors; it also turns free online texts into free online data for sophisticated software that creates new forms of value for everyone.
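
A note on the arithmetic in the excerpt: the figure of 253 language pairs follows from counting unordered pairs among 23 languages, that is, 23 × 22 / 2. The short Python sketch below checks this. The function name and the placeholder language labels are my own illustration, not anything from the JRC release, and it assumes a pair is counted once regardless of translation direction.

```python
from itertools import combinations
from math import comb


def count_language_pairs(n_languages: int) -> int:
    """Count unordered language pairs: n choose 2."""
    return comb(n_languages, 2)


# Placeholder labels (hypothetical), standing in for the 23 official languages.
languages = [f"lang{i:02d}" for i in range(1, 24)]

# Enumerate the pairs explicitly as a cross-check on the formula.
pairs = list(combinations(languages, 2))

print(count_language_pairs(23))  # pairs among the 23 official languages
print(len(pairs))                # same count by enumeration
print(count_language_pairs(22))  # pairs among the 22 languages in the released corpus
```

If each translation direction were counted separately, the totals would simply double.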