More than three years ago, the W3C published an excellent note on Best Practices for XML Internationalization. To illustrate their intentions, the authors of this note included many examples of bad design, which is indeed very helpful.

Why, then, do so many authors of XML data and (even worse) authors of tools that create XML data get this wrong and apparently use the bad design examples as templates for their data? Among others, I have come across the following in recent years:

  • Translatable content in CDATA sections
  • Translatable attribute values (in some cases even multiple paragraphs!)
  • Elements with names like <x_01>, <x_02>, <x_03> and so on
  • Documents that contain four or more languages
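
Most of these patterns can be spotted mechanically. What follows is only a rough sketch in Python, not a tool from the W3C note: the function name find_i18n_smells, the warning texts and the heuristics are my own assumptions (an attribute value with more than one word is treated as likely prose, and any second xml:lang value counts as a multilingual document), so expect false positives.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
# Assumed pattern for generic numbered names such as x_01, x_02, ...
NUMBERED_NAME = re.compile(r"^[A-Za-z]+_\d+$")


def find_i18n_smells(xml_text):
    """Return a sorted list of heuristic warnings for one XML document."""
    warnings = set()

    # ElementTree merges CDATA into ordinary text, so check the raw source.
    if "<![CDATA[" in xml_text:
        warnings.add("CDATA section present")

    languages = set()
    for element in ET.fromstring(xml_text).iter():
        if NUMBERED_NAME.match(element.tag):
            warnings.add("numbered element name")
        for name, value in element.attrib.items():
            if name == XML_LANG:
                languages.add(value.lower())
            elif len(value.split()) > 1:
                # Heuristic: several words in an attribute suggest prose.
                warnings.add("possible translatable attribute value")

    if len(languages) > 1:
        warnings.add("more than one language in one document")
    return sorted(warnings)


# Hypothetical sample that combines all four problems.
sample = """<doc xml:lang="en">
  <x_01 label="Click here to continue"><![CDATA[Hello <b>world</b>]]></x_01>
  <x_02 xml:lang="de">Hallo</x_02>
</doc>"""

for warning in find_i18n_smells(sample):
    print(warning)
```

For the sample document, all four warnings are reported. A check like this cannot tell whether an attribute value really needs translation; it only points a reviewer at the places worth a second look.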

When it comes to localization, XML can be an excellent format. If only the W3C recommendations were observed.