The user's web browser sends the Accept-Language header to the server; it is supposed to represent the user's preferences for different languages. Typically, when a browser is installed, this preference list is set to the language in which the operating system has been configured to work. Browsers will hopefully offer intuitive configuration so that users can indicate their language preferences, which are then sent in requests to servers.
The value of Accept-Language is an unordered list of ISO 639-1 two-letter language codes, each with an optional territory. The list can be ordered with the q parameter, which is a value from 0.0 to 1.0 (a code without a q value counts as 1.0), and this is the way Internet Explorer operates. A typical value looks like this, for example:
Accept-Language: da, en-gb;q=0.8, en;q=0.7
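A minimal sketch of how a server might turn such a header value into an ordered preference list. The function name `parse_accept_language` is hypothetical, and the handling of malformed q values (treating them as 0) is an assumption made for illustration, not something the header format requires.

```python
def parse_accept_language(header):
    """Return the language tags from an Accept-Language value, most preferred first."""
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue  # skip empty items such as those in "en,,fr" or ""
        quality = 1.0  # a tag without a q parameter counts as q=1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # assumption: an unreadable q means "not wanted"
        if quality > 0:  # q=0 marks a language the user does not accept
            entries.append((-quality, position, tag))
    # Sort by descending quality; the position keeps the original order for ties.
    return [tag for _, _, tag in sorted(entries)]
```

Tags with equal q values keep the order in which the browser sent them, which is a common but not mandatory tie-breaking choice.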
Note: Spiders coming from search engines don't provide an Accept-Language line, meaning the non-default content is hidden from them unless you take precautions.
Authors are often seen arguing that it is pointless to apply language negotiation, because most people have no idea how to configure their browsers to use it properly. They are then inclined to respond to this belief (which may very well be true) by designing their own weird and wonderful language selection mechanism, exclusively for their own site, based typically on some kind of user dialogue leading to the setting of a cookie, or even on making guesses based on the user's IP address or domain name.
This approach has a number of issues:
Use the browser's Accept-Language header to set the initial choice of language. In most cases this is the correct language. Then allow the user to change it via some overriding mechanism.
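The recommendation above can be sketched as a simple priority order: an explicit user override first, then the browser's preferences, then the site default. The function `choose_language` and its parameters are hypothetical. The sketch assumes that `preferred` is the browser's list already sorted by preference, that `supported` holds lowercase codes, and that the override comes from something like a cookie or a query parameter.

```python
def choose_language(override, preferred, supported, default):
    """Pick the language to serve.

    override  -- language the user picked explicitly (or None)
    preferred -- language tags from Accept-Language, most preferred first
    supported -- set of lowercase language codes the site offers
    default   -- language to use when nothing else matches
    """
    # 1. An explicit choice by the user always wins, if the site offers it.
    if override and override.lower() in supported:
        return override.lower()
    # 2. Otherwise take the first browser preference the site can satisfy.
    for tag in preferred:
        tag = tag.lower()
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # "en-gb" falls back to "en"
        if primary in supported:
            return primary
    # 3. No usable preference (for example, a crawler that sends no header).
    return default
```

For example, `choose_language(None, ["da", "en-gb", "en"], {"en", "fr", "ja"}, "en")` falls through Danish, which the site lacks, and settles on English.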
Can you rely on the user having set the language choice correctly? The short answer is no. In some cases the person is using a web browser that is not their own. Consider the experience of a European person sitting in a Japanese Internet café. He somehow finds the "Internet options" dialogue box in the drop-down menus, and gets this:
Which tab is the content tab? Which button sets the preferred languages?
You sit in a Japanese Internet café, can't switch your preferences, the pages show in Japanese, and you must rely on the override mechanism to change to English. How do you find it? Here we have listed 8 choices. Try to change it to English.
If you don't know that 言語選択 means "Select language", how do you even find the language switch?
Here we have listed the language names in their own language.
Much easier, wasn't it?
With language negotiation, how will search engines ever discover other pages than the ones in the default language?
It depends on the language negotiation strategy of the site.
For sites that use simple URL rewriting, web crawlers can just go and find the alternative language resources. In other words, if the English version lives at http://example.com/en and the French version at http://example.com/fr, then web crawlers can find the resources under those paths and index them. The resources are still on the site. Language negotiation typically refers to software logic at the server that returns a specific language version of a resource when given a generic URI. Specific URIs in this case still work to retrieve a specific language.
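A minimal sketch of the distinction between specific and generic URIs, assuming the path-prefix layout from the example (/en, /fr). The name `split_language_prefix` and the `SUPPORTED` tuple are hypothetical. A path that starts with a known language code is served in that language without negotiation; any other path is a generic URI that the server would go on to negotiate.

```python
SUPPORTED = ("en", "fr")  # hypothetical set of languages the site offers


def split_language_prefix(path, supported=SUPPORTED):
    """Return (language, remaining_path); language is None for a generic URI."""
    segments = path.lstrip("/").split("/", 1)
    first = segments[0].lower()
    if first in supported:
        # Specific URI: the language is fixed by the path itself.
        rest = "/" + (segments[1] if len(segments) > 1 else "")
        return first, rest
    # Generic URI: leave the path alone and let negotiation decide.
    return None, path
```

Because a crawler requesting /fr/products never reaches the negotiation step, it can index the French version even though it sends no Accept-Language header.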
If the site uses extension-based language negotiation (as with Apache MultiViews), or if web crawlers cannot automagically find where the alternate languages live (perhaps they are generated dynamically), then things get more complicated. One way to handle this is to put a lot of extraneous META information in page headers. Another is to put a manual link page on the site that leads to the specific language versions; this might be the page users visit to change their language manually, or it might just be a page intended to guide web crawlers into specific language "channels". Most web crawlers will index both directories and try to reach links found on a site, and you can assist this with ROBOTS meta tags that direct the crawler to follow specific links. Providing links with specific language preferences in them for web crawlers to index can help get all of the language versions indexed, and you can use otherwise "hidden" pages to help the robot along...
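One way such a manual link page might be generated is sketched below. The `LANGUAGES` table, the function `language_link_page`, and the /en, /fr URL layout are assumptions for illustration. Each link carries an `hreflang` attribute and shows the language name written in that language, so that both crawlers and visitors who cannot read the current page can find their version.

```python
from html import escape

# Hypothetical language table: code -> name written in that language.
LANGUAGES = {
    "en": "English",
    "fr": "Français",
    "de": "Deutsch",
    "ja": "日本語",
}


def language_link_page(base_url, languages=LANGUAGES):
    """Build an HTML list linking to each language-specific version of the site."""
    base = base_url.rstrip("/")
    items = []
    for code, name in languages.items():
        url = f"{base}/{code}"
        # hreflang describes the language of the link target;
        # lang describes the language of the link text itself.
        items.append(
            f'<li><a href="{escape(url)}" hreflang="{code}" lang="{code}">'
            f"{escape(name)}</a></li>"
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

The same list can double as the visible language switcher, since every entry is readable by a speaker of the language it points to.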
Document last modified 2009/04/18.