Web Site Globalization is like a Bowl of Spaghetti
by Steffan Berelowitz
Steffan Berelowitz founded Bit Group, Inc. in 1995, and over its 13-year history has helped to develop a client list of Fortune 500, mid-market and emerging businesses. In addition to his responsibilities at Bit Group, Steffan served as a trustee of the Massachusetts Technology Leadership Council (MA Software Council) from 2001-2006. Steffan served on the board of directors of the Jewish Community Centers of Greater Boston as the chair of the advisory board of the Center for Information Technology of Hebrew College. Steffan is a member of the Boston College Technology Council. He is also a member of the Technology Network, a national network of senior executives from the nation's leading technology companies. Steffan served as an Internet consultant to former senator and presidential candidate Senator Bill Bradley. A graduate of Boston College, Steffan has spent the past 15 years in online services and technology. In 1993, Steffan was one of the key founders of ArtNet.
For applications that have not been well structured for internationalization (I18N), the prospect of extracting and then replacing localized strings is a little like removing 30% of the strands of spaghetti in a bowl of pasta and then putting them back where they belong. If that’s not tough enough, each new strand may have to change color at a moment’s notice depending on who’s eating the pasta.
Even for applications whose strings are relatively well externalized, there remains the daunting challenge of managing text extraction, delivery to the translation partner, and reinsertion of the translated text. As it turns out, using the XML Localization Interchange File Format (XLIFF) is a great way to simplify and scale this process while making it much more efficient.
The underlying issue that led to the creation of XLIFF is the problem of format. A translator frequently receives source documents in diverse formats such as Microsoft Word, plain text, HTML, RTF, and XML (with an unlimited variation of DTDs). Translators first have to deal with the challenge of simply reading the source file. This requires identifying, and potentially obtaining and installing, the right software package. It’s not enough to have Microsoft Word or Adobe FrameMaker; each translator also needs the right version of that software. We can only picture the late nights and multi-hour support phone calls endured by the IT staff charged with the impossible task of setting up and maintaining these workstations.
“Hello, this is IT helpdesk, how can I help you?”
– “Would you please install Framemaker 6.0 on my workstation?”
“Pardon me, but didn’t I just install version 7 for you yesterday?”
– “Yes, but that was for a different job, and this file doesn’t open in the new version.”
“So would you like me to uninstall version 7?”
– “I am still working on the prior job, I need both version 6 and 7!”
The problem of format is not just confined to the translator. Publishers and software developers have the challenge of extracting and sending text (strings) from documents or applications for translation. How can you be sure that the sentence or string that you extracted from your source can be reinserted in exactly the same place? In other words, each strand of spaghetti must be uniquely identifiable. The easiest way to do this is to attach little tags on the ends of each strand of spaghetti with some additional information about that strand (like attributes).
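The "little tags" idea can be sketched in a few lines of Python. This is only an illustration, assuming the strings have already been externalized into a dictionary keyed by unique identifiers; the function names `build_xliff` and `read_translations`, the key names, and the language codes are hypothetical and not part of any particular tool. Each string is wrapped in an XLIFF 1.2 `<trans-unit>` whose `id` attribute is what lets the translated text find its way back to the right place.

```python
import xml.etree.ElementTree as ET

# Namespace URI used by XLIFF 1.2 documents.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)


def build_xliff(strings, source_lang="en", target_lang="fr", original="messages"):
    """Wrap each extracted string in a <trans-unit> carrying a unique id."""
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": original,
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for key, text in strings.items():
        # The id is the "tag on the strand": it identifies where the text came from.
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": key})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text
    return ET.tostring(root, encoding="unicode")


def read_translations(xliff_text):
    """Map each trans-unit id to its <target> text, skipping untranslated units."""
    root = ET.fromstring(xliff_text)
    translations = {}
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        target = unit.find(f"{{{XLIFF_NS}}}target")
        if target is not None and target.text:
            translations[unit.get("id")] = target.text
    return translations


if __name__ == "__main__":
    # Hypothetical externalized strings from a web application.
    print(build_xliff({"home.title": "Welcome", "home.cta": "Sign up"}))
```

Once the translation partner returns the file with `<target>` elements filled in, `read_translations` yields a dictionary keyed by the same ids, so each translated string can be reinserted exactly where its source was extracted.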
Fortunately, there is an organization that is dedicated to solving these kinds of standardization problems. The Organization for the Advancement of Structured Information Standards (“OASIS”) is a not-for-profit consortium that drives the development, convergence and adoption of open standards for the global information society. In February of 2008, OASIS members approved the XML Localisation Interchange File Format (XLIFF) version 1.2 as an OASIS Standard, a status that signifies the highest level of ratification.
Bryan Schnabel, co-chair of the OASIS XLIFF Technical Committee, describes XLIFF very simply as follows:
– XLIFF is “[a] powerful and concise format for content that needs to be translated.”
For a company seeking to build or better manage a global Web site, XLIFF provides far greater efficiency in the setup and maintenance of content for translation. The XLIFF schema remains stable even if document formats or software features change over time. For the translation service provider, XLIFF greatly reduces the complexity of managing source file formats and technologies, leaving more time and resources for the work of translation itself. For both localization customers and service providers, XLIFF also carries workflow metadata that improves communication between the two parties throughout the translation lifecycle. In sum, XLIFF greatly simplifies the challenge of selectively removing and replacing multicolored spaghetti from your bowl of pasta.
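As one illustration of that workflow metadata, XLIFF 1.2 allows a `state` attribute on each `<target>` element (with values such as `needs-translation`, `translated`, and `final`). The rough Python sketch below tallies units by state so that customer and provider can both see how far a job has progressed. The function name `summarize_states`, the sample document, and the labels `unspecified` and `no-target` are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Prefix mapping for the XLIFF 1.2 namespace, used in the path queries below.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical file returned part-way through a translation job.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="messages" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="home.title">
        <source>Welcome</source>
        <target state="final">Bienvenue</target>
      </trans-unit>
      <trans-unit id="home.cta">
        <source>Sign up</source>
        <target state="needs-review-translation">Inscription</target>
      </trans-unit>
      <trans-unit id="home.footer">
        <source>Contact us</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def summarize_states(xliff_text):
    """Count trans-units by the state recorded on their <target> element."""
    root = ET.fromstring(xliff_text)
    counts = Counter()
    for unit in root.iterfind(".//x:trans-unit", NS):
        target = unit.find("x:target", NS)
        if target is None:
            counts["no-target"] += 1  # label chosen for this sketch
        else:
            counts[target.get("state", "unspecified")] += 1
    return dict(counts)


if __name__ == "__main__":
    print(summarize_states(SAMPLE))
```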