You are planning a new website, and you want the website to be multilingual. What should you do? First you plan the structure of your website. Then you figure out how to solve the missing page dilemma. Finally you consider the users' browsing experience.
Language codes must follow the ISO 639-1 international standard. A small integer, as seen in some products, won't do. On the <html> element you should declare the main language of the page with a lang="en" attribute; in XHTML, add xml:lang="en" as well. Whenever you write text in a language other than that of the page, you must declare its language, as in <span lang="fr">mot étranger</span>. Only add a <span> if the text isn't already enclosed in an element of its own; otherwise put the lang attribute on that element.
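As a rough illustration of the markup rule above, here is a minimal Python sketch that wraps a foreign-language fragment in a <span>. The helper name mark_language is hypothetical, and the check only verifies the two-letter shape of the code; it does not confirm that the code is actually assigned in ISO 639-1.

```python
import re
from html import escape

# Shape check only: two lowercase ASCII letters, the form ISO 639-1 codes take.
# It does not verify that the code is actually assigned in the standard.
_TWO_LETTER = re.compile(r"[a-z]{2}")


def mark_language(text: str, lang: str) -> str:
    """Wrap a foreign-language fragment in a <span> that declares its language."""
    if not _TWO_LETTER.fullmatch(lang):
        raise ValueError(f"not a two-letter language code: {lang!r}")
    # escape() protects against markup characters inside the fragment.
    return f'<span lang="{lang}">{escape(text)}</span>'


print(mark_language("mot étranger", "fr"))
```

A small integer such as "7" is rejected by the shape check, which matches the point that product-specific numeric identifiers won't do.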
This makes it possible for services such as Google to offer a translation of the page, for spell checkers to pick the right dictionary, and for text-to-speech software to pronounce the text correctly. In the future the EU Commission's Machine Translation may play a role.
To display Latin, Greek and Cyrillic letters (and perhaps Armenian and Georgian) on the same page, you need a character set that contains all of these alphabets and more. Unicode is the international standard. For the web, it is typically encoded as UTF-8 when transmitted to the web browser.
Until all texts are in Unicode, it is imperative during the transition period to know which texts are still in a legacy 8-bit character set and which have already been converted.
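One way to keep track during the transition is to store an encoding label next to every text and convert on demand. The sketch below assumes a hypothetical StoredText record and uses ISO 8859-7 (legacy Greek) purely as an example of an 8-bit character set.

```python
from dataclasses import dataclass


@dataclass
class StoredText:
    raw: bytes
    encoding: str  # e.g. "iso-8859-7" for legacy Greek, "utf-8" once converted


def to_utf8(item: StoredText) -> StoredText:
    """Return a UTF-8 version of the text; converted items pass through."""
    if item.encoding.lower() == "utf-8":
        return item
    # A wrong label usually yields garbled text rather than an error,
    # which is why the label has to be recorded reliably.
    text = item.raw.decode(item.encoding)
    return StoredText(raw=text.encode("utf-8"), encoding="utf-8")


legacy = StoredText(raw="Ελληνικά".encode("iso-8859-7"), encoding="iso-8859-7")
converted = to_utf8(legacy)
print(converted.encoding, converted.raw.decode("utf-8"))
```

Because the label travels with the bytes, the conversion can be run repeatedly without double-converting text that is already in UTF-8.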
Bear in mind that content author and translator are two different roles. The content author should not be unduly burdened. In fact there is no need to involve the author in the translation workflow. When the author creates a new text or modifies an existing one, it should automatically be included in the next XLIFF batch sent to the translators.
We have a relationship with the Commission's translation centre. Rather than giving them a web interface on each website in which to type their translations, we send them the texts in XLIFF format. They send the translations back and we import them into the message catalogue. Why? Because they have tools that automate most of the work: spell checkers, grammar checkers, glossaries, translation memories, machine translation for a first draft, a quality assurance workflow and so on. Those tools are not available to them in a homemade interface. The only use you have for a web interface is to make quick edits to the texts.
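The import step could look roughly like the following sketch. It assumes XLIFF 1.2 files and models the message catalogue as a plain dictionary keyed by (language, unit id); the function name, the unit ids and the sample file are illustrative, not the actual system.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "{urn:oasis:names:tc:xliff:document:1.2}"


def import_translations(xliff_text: str, catalogue: dict) -> int:
    """Copy every non-empty <target> into catalogue[(language, unit id)]."""
    root = ET.fromstring(xliff_text)
    imported = 0
    for file_el in root.iter(XLIFF_NS + "file"):
        language = file_el.get("target-language")
        for unit in file_el.iter(XLIFF_NS + "trans-unit"):
            target = unit.find(XLIFF_NS + "target")
            # Units the translators have not filled in yet are skipped.
            if target is not None and target.text:
                catalogue[(language, unit.get("id"))] = target.text
                imported += 1
    return imported


# Illustrative file as it might come back from the translators.
returned = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="site" source-language="en" target-language="fr" datatype="plaintext">
    <body>
      <trans-unit id="nav.home"><source>Home</source><target>Accueil</target></trans-unit>
      <trans-unit id="nav.next"><source>Next</source><target></target></trans-unit>
    </body>
  </file>
</xliff>"""

catalogue = {}
print(import_translations(returned, catalogue), catalogue)
```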
Consider modified texts: it can take weeks for the retranslation to be finished. Should the obsolete translations be shown in the meantime? It depends on the text and on the nature of the change to the original. The safe default is not to show the old translation. Suppose your text is "Don't drive with more than 0.08% alcohol." and it is changed to "Don't drive with more than 0.05% alcohol." Translations that still state the old limit could land you in legal trouble.
Consider how the system can detect that a translation is outdated with regard to the source text. Usually a timestamp can be used: a translation is outdated if the source text was modified after the translation was last updated.
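A minimal sketch of the timestamp check, combined with the safe default of hiding outdated translations. The Message record, the function name, the dates and the French sentence are all made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Message:
    text: str
    modified: datetime


def text_to_show(source: Message, translation: Optional[Message],
                 show_outdated: bool = False) -> Optional[str]:
    """Return the translation, or None when it is missing or outdated.

    A translation counts as outdated when the source was modified after it.
    """
    if translation is None:
        return None
    outdated = translation.modified < source.modified
    if outdated and not show_outdated:
        return None  # safe default: hide the old translation
    return translation.text


# Illustrative data: the source was changed after the translation was made.
source = Message("Don't drive with more than 0.05% alcohol.", datetime(2009, 4, 1))
french = Message("Ne conduisez pas avec plus de 0,08 % d'alcool.", datetime(2009, 3, 1))

print(text_to_show(source, french))
print(text_to_show(source, french, show_outdated=True))
```

When the function returns None, the caller has to decide what to display instead, which is the missing page question again. The show_outdated flag lets harmless texts opt out of the strict default.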
Read more in multilingual techniques in content management systems.
One difference between websites and documents is the navigation. Websites often have one-word texts that are used everywhere, such as "Next", "Previous" and "Home". With only one or two words, the translator has no context and can make the wrong choice. The word "show" can be both a verb and a noun, and the two translate to different words in other languages. To prevent this, it is a good idea to be able to attach a note to the word in the XLIFF file.
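In XLIFF 1.2 such a remark can go into a <note> element inside the <trans-unit>. The sketch below builds a small file with Python's standard library; the function name, unit ids, file identifier and note wording are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", NS)  # serialise XLIFF as the default namespace


def q(tag: str) -> str:
    """Qualify a tag name with the XLIFF 1.2 namespace."""
    return f"{{{NS}}}{tag}"


def build_xliff(units, source_language="en", target_language="fr") -> str:
    """Build an XLIFF 1.2 document from (id, source text, note) tuples."""
    xliff = ET.Element(q("xliff"), {"version": "1.2"})
    file_el = ET.SubElement(xliff, q("file"), {
        "original": "website-navigation",  # hypothetical identifier
        "source-language": source_language,
        "target-language": target_language,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, q("body"))
    for unit_id, source_text, note in units:
        unit = ET.SubElement(body, q("trans-unit"), {"id": unit_id})
        ET.SubElement(unit, q("source")).text = source_text
        if note:
            # The note gives the translator the context the word lacks.
            ET.SubElement(unit, q("note")).text = note
    return ET.tostring(xliff, encoding="unicode")


units = [
    ("button.show", "Show", "Verb: button that reveals hidden details."),
    ("menu.show", "Show", "Noun: a performance, as in 'TV show'."),
]
print(build_xliff(units))
```

Giving the two uses of "Show" separate unit ids also means each can receive its own translation.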
Certain words, such as personal names and the names of small cities, rivers and lakes, don't have translations. Instead they are transliterated. We have a guide on how to present names in a multilingual context.
Can you trust the language preference to be set correctly in the user's browser? In most cases, but perhaps not always. It still is a very good starting point in my opinion. Read more about language negotiation in websites.
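The browser sends the user's preference in the Accept-Language request header, for example "da, en-GB;q=0.8, en;q=0.7". The following is a simplified sketch of using it as a starting point; the function names are hypothetical, and it handles only exact matches and a fallback to the primary language subtag, not every case a full implementation would cover.

```python
def parse_accept_language(header: str):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed weight as "not wanted"
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def negotiate(header: str, available, default: str) -> str:
    """Pick the best available language, falling back to the default."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # "en-gb" falls back to "en"
        if primary in available:
            return primary
    return default


print(negotiate("da, en-GB;q=0.8, en;q=0.7", ["en", "fr", "de"], "en"))
```

Since the header may be wrong or missing, the negotiated language works best as an initial choice that the user can override, for instance through a visible language selector.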
Document last modified: 2009/04/18. Content in this portal is modified daily by a community of providers.