ITS 2.0 - Official W3C-Recommendation

The Internationalization Tag Set (ITS) 2.0 now has the status of an official W3C Recommendation. Cocomore participated in the development of this standard, which encodes information that increases the quality and efficiency of translation and internationalization on the web. Within the EU-sponsored LT-Web project we not only contributed to the standard but also created a number of reference implementations that put it into practice and demonstrate its benefits. Much more detail on our work in the project can be found in the official project deliverables, which are now online (we were responsible for D3.1.1, D3.1.5 and D5.1.1). Short summaries follow below.

Drupal Modules

Within D3.1.1 of the MultilingualWeb-LT project, Cocomore implemented modules for translation and ITS 2.0 handling in the open-source CMS Drupal. The implementations are based on the Translation Management Tool (TMGMT) module, which is available for Drupal as a community module (https://drupal.org/project/tmgmt). They provide the following functionality:
  • The base TMGMT module models translation workflows with external language service providers (LSPs) in Drupal
  • Cocomore’s extensions added the following abilities:
    • Handle ITS 2.0 throughout the whole workflow
    • Apply global ITS 2.0 metadata at content node level
    • Handle ITS 2.0 annotation in Drupal WYSIWYG editors (where content is produced). Annotation is possible via menu bar, context menu and keyboard shortcuts.
    • Standalone ITS 2.0 editor (jQuery plugin) to support annotation in a separate process step, without modifying the actual content. Annotation is possible via menu bar, context menu and keyboard shortcuts.
    • Localization chain interface: round-tripping of data to and from the LSP's translation management system (TMS), including automatic data export and re-import
    • Interface with Enrycher for automatic annotation
These functionalities are embodied in the following modules:
  • Drupal TMGMT Workflow (TMGMT-module extension) to allow workflows with ITS 2.0 annotation
  • Drupal WYSIWYG editor: Plugin for ITS 2.0 annotation
  • Drupal TMGMT Translator Linguaserve: Localization chain interface (see also D3.2.2 and D4.1.3)
  • jQuery plugin for ITS 2.0 annotation in a separate step (new implementation)
  • Drupal Enrycher Integration (see also D3.1.3)
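
As an illustration of the kind of markup these modules deal with, the following is a minimal Python sketch that collects local ITS 2.0 annotations from an HTML fragment. It is not taken from the Drupal modules. The attribute names (`translate`, `its-loc-note`, `its-term`, `its-locale-filter-list`) follow the ITS 2.0 conventions for HTML5, while the class name, the function name and the sample fragment are assumptions made up for this example.

```python
from html.parser import HTMLParser

# A small subset of local ITS 2.0 markup in HTML5: the native `translate`
# attribute plus a few `its-*` attributes. Other data categories are ignored.
ITS_ATTRIBUTES = ("translate", "its-loc-note", "its-term", "its-locale-filter-list")


class ItsAnnotationCollector(HTMLParser):
    """Hypothetical helper that records ITS attributes found on start tags."""

    def __init__(self):
        super().__init__()
        self.annotations = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; names arrive lower-cased.
        found = {name: value for name, value in attrs if name in ITS_ATTRIBUTES}
        if found:
            self.annotations.append((tag, found))


def collect_its_annotations(html):
    """Return a list of (tag, {attribute: value}) pairs in document order."""
    parser = ItsAnnotationCollector()
    parser.feed(html)
    parser.close()
    return parser.annotations


# Invented sample content, for illustration only.
SAMPLE = (
    '<p>Install <span translate="no">TMGMT</span> and read the '
    '<span its-term="yes" its-loc-note="Product name, keep as is">Drupal</span>'
    " documentation.</p>"
)

if __name__ == "__main__":
    for tag, attributes in collect_its_annotations(SAMPLE):
        print(tag, attributes)
```

A real workflow would also have to evaluate global rules and inheritance, which this sketch leaves out on purpose.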
The modules are released under the GNU General Public License, version 2, and can be downloaded and modified. They are available at the following URLs:

ITS 2.0+CMS: Best Practices

One important application of ITS 2.0 is the preparation of web content within a CMS for optimized localization and translation. This is best done by implementing ITS 2.0 directly inside the CMS. The best practice documentation published as D3.1.5 discusses the issues that arise when using ITS 2.0 in connection with a CMS and suggests ways to deal with them. The document is informed by the experience gathered in the MultilingualWeb-LT project, where an ITS 2.0-aware translation workflow was implemented within the open-source CMS Drupal. Its aim, however, is to provide guidance that is as independent of the specific CMS as possible. An important aspect is therefore the set of CMS characteristics that interact with ITS usage and handling.

Not all internationalization-related issues can be resolved by the special markup described in ITS 2.0. The best practices in this document therefore go beyond the application of ITS markup and address a number of problems that can be avoided by correctly designing the XML format and by applying a few additional guidelines when developing content. This document and Internationalization Tag Set (ITS) Version 2.0 implement requirements formulated in the W3C Working Draft Requirements for Internationalization Tag Set (ITS) 2.0.

Drupal Machine Translation Training Module

Within D5.1.1 of the MultilingualWeb-LT project, Cocomore implemented a module that sends aligned original and translated data with ITS 2.0 markup to a machine translation (MT) provider for the data-driven creation or optimization of machine translation engines or models. The most common use case will be to train or tune a statistical MT model on the aligned data and to give special consideration, on top of the standard techniques, to the knowledge encoded in the ITS 2.0 markup. Other use cases are also conceivable, such as the systematic identification of problematic cases for manual adjustment of a rule-based MT system. While ITS-aware MT training was explored in more detail in D5.2, the scope of this deliverable is the extraction of annotated and aligned bilingual data from the Drupal CMS. This process is based on the ITS 2.0 capabilities added to Drupal as described in the deliverables for WP 3. It was successfully tested in the context of the business case described in these deliverables (translation of VDMA press releases). Based on 141 press releases that were translated from German to French and Chinese, we could provide a three-way parallel annotated corpus of some 12,000 sentences.
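
To make the idea of an annotated, aligned corpus more concrete, here is a minimal Python sketch that pairs source segments with their translations and keeps one ITS 2.0 annotation (the Translate data category) as an extra column. It does not describe the actual module or the exchange format used in the project. It assumes that segments are already aligned by position, and the `Segment` class, the function names, the tab-separated layout and the sample sentences are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Segment:
    """A source segment plus one ITS 2.0 annotation (Translate)."""
    text: str
    translate: bool = True


def align_segments(source, targets):
    """Pair each source segment with the target segments at the same index.

    `targets` maps a language code to a list of translated strings.
    Position-based alignment is an assumption of this sketch.
    """
    for lang, segments in targets.items():
        if len(segments) != len(source):
            raise ValueError(f"{lang}: expected {len(source)} segments, got {len(segments)}")
    rows = []
    for index, segment in enumerate(source):
        row = {"source": segment.text, "translate": "yes" if segment.translate else "no"}
        for lang, segments in targets.items():
            row[lang] = segments[index]
        rows.append(row)
    return rows


def to_tsv(rows, source_lang, target_langs):
    """Serialize aligned rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([source_lang, *target_langs, "translate"])
    for row in rows:
        writer.writerow([row["source"], *(row[lang] for lang in target_langs), row["translate"]])
    return buffer.getvalue()


# Invented sample data, for illustration only.
SOURCE = [Segment("Guten Tag."), Segment("VDMA", translate=False)]
TARGETS = {"fr": ["Bonjour.", "VDMA"], "zh": ["你好。", "VDMA"]}

if __name__ == "__main__":
    print(to_tsv(align_segments(SOURCE, TARGETS), "de", ["fr", "zh"]))
```

Keeping the annotation as a column, rather than filtering segments, leaves it to the MT provider to decide how the ITS 2.0 information is used during training.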