What does the perfect setup for an international website look like? Or should you even set up new websites for the different languages and / or target countries?
This article presents a scenario that is recommended if there is no website yet and everything has to be built “from scratch”. The following subject areas are covered:
- Upper or lower case in URLs?
- URLs with Asian characters or other special characters
- Special characters in domain names
- One or more domains?
- National ccTLDs or gTLDs?
- Subdomains or subfolders?
- Different solutions for Russia and China
The reasoning behind each tip is explained as comprehensibly as possible. However, each tip can only be one possible recommendation. Especially if a website already exists, the right strategy can deviate from the one presented here in order to avoid ranking losses during the conversion.
Before transferring it to your own situation, the same applies as in medicine: “Ask your online marketing or SEO specialist (rather than your doctor or pharmacist) about risks and side effects.”
Upper or lower case in URLs?
Baidu, Yandex, Bing and Google can all deal with both lowercase and uppercase letters in URLs. However, URLs that look the same but differ in their combination of lowercase and uppercase letters are different URLs for Google. The reason is that most web servers (those based on Linux) also differentiate between these URLs and try to resolve them as separate documents. Two URLs that differ only in capitalization can therefore serve different content.
If a link does not use the exact capitalization of the URL, such a server returns a 404 error page.
Other web servers, however, make no distinction between lowercase and uppercase letters. No matter how a URL is written, such a web server would always return the same document.
However, since Baidu (and other international search engines) treats the differently capitalized URLs as separate URLs rather than as one canonical URL (as long as no canonical is defined), there would be a risk of duplicate content.
Since URLs are typed or dictated again and again, it is recommended to always write URLs in lowercase letters. This reduces the risk of a URL being typed or dictated incorrectly (“all lowercase” would be a typical remark on the phone before the URL is spelled out).
This drastically reduces the risk of duplicate content caused by URL duplicates and, far more commonly, the risk of generating URLs that do not deliver any content (usually 404 pages).
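The lowercase rule can be enforced automatically, for example by redirecting every request with uppercase letters to its lowercase form. The following is only a minimal sketch of that idea; the function names and the example.com URLs are assumptions for illustration, and in practice the rule usually lives in the web server or CMS configuration.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def lowercase_url(url: str) -> str:
    """Lowercase scheme, host and path; leave query string and fragment untouched."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.lower(),
        parts.query,      # parameter values may be case-sensitive, so keep them
        parts.fragment,
    ))


def redirect_target(url: str) -> Optional[str]:
    """Return the URL a 301 redirect should point to, or None if no redirect is needed."""
    normalized = lowercase_url(url)
    return normalized if normalized != url else None


# Hypothetical requests against a hypothetical domain
print(redirect_target("https://www.example.com/Products/Shoes.html"))
print(redirect_target("https://www.example.com/products/shoes.html"))
```

A request that is already lowercase yields no redirect, so the rule cannot create a redirect loop.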
Special characters in URLs
Particularly when working with other languages, one comes across special characters that are not part of the basic 26-letter alphabet. How should you deal with them in URL design?
In principle, the search services are also able to process special characters in URLs. The URLs are saved in the index and correctly linked in the search results. Nevertheless, there are still problems with the display of URLs with special characters in various applications that a user could use.
The reason is the UTF-8 encoding and the human-unfriendly representation of the special characters as % codes. Browsers accept URLs encoded in this way without any problems. But when a human user is faced with the encoded URLs, confusion and mistrust can arise.
- www.mydomain.de/schöne-umlaute.html is encoded as www.mydomain.de/sch%C3%B6ne-umlaute.html
- www.mydomain.des/akzénte-in-urls.html is encoded as www.mydomain.des/akz%C3%A9nte-in-urls.html
- www.mydomain.com/大家好.html is encoded as www.mydomain.com/%E5%A4%A7%E5%AE%B6%E5%A5%BD.html
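The % codes are simply the UTF-8 bytes of each special character written in hexadecimal. A short sketch with Python's standard library reproduces the encoding; the helper name `encode_path` is an assumption for illustration.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a URL path as UTF-8, keeping the slashes readable."""
    return quote(path, safe="/")


print(encode_path("/schöne-umlaute.html"))  # /sch%C3%B6ne-umlaute.html
print(encode_path("/大家好.html"))            # /%E5%A4%A7%E5%AE%B6%E5%A5%BD.html

# Decoding restores the human-readable form
print(unquote("/sch%C3%B6ne-umlaute.html"))  # /schöne-umlaute.html
```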
It is therefore advisable not to use URLs with special characters – not because of technical, but human hurdles.
This means that any search terms contained in the URL no longer appear in their original form. However, a keyword in the URL contributes only very little to good rankings anyway.
What special characters are there?
- German: fußgängerübergänge = fussgaengeruebergaenge, größenmaßstäbe = groessenmassstaebe, größenordnungsmäßig = groessenordnungsmaessig
- French: caleçon = calecon, canapé = canape, château = chateau, voilà = voila, noël = noel, l'aïeul = laieul
- Spanish: mañana = manana
- Russian: быть = byt, сказать = skazat, который = kotoryy
- Thai: ยินดีต้อนรับ = yinditxnrab, สวัสดี = swasdi, ยินดีที่ได้รู้จัก = yindithidirucak
- Chinese: 大家好 = dajiahao, 电脑 = diannao, 巧克力蛋糕 = qiaokelidangao
The search engines recognize the equivalents in the simple 26 letters of the alphabet and can infer the words in the correct spelling.
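For the Latin-based examples above, the conversion can be automated. The following is a minimal sketch, not a complete transliteration tool: the function name `ascii_slug` and the mapping table are assumptions, the umlaut mapping follows the German convention (ä = ae) and would not suit every language, and Cyrillic, Thai or Chinese text needs a dedicated transliteration table (for example pinyin) that is not covered here.

```python
import re
import unicodedata

# German convention: umlauts become two letters, ß becomes ss
GERMAN_MAP = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def ascii_slug(text: str) -> str:
    """Turn a Latin-script word or phrase into a lowercase ASCII URL slug."""
    text = unicodedata.normalize("NFC", text).lower().translate(GERMAN_MAP)
    # Split remaining accented letters into base letter + accent, then drop the accent
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Apostrophes are removed rather than replaced (l'aïeul -> laieul)
    stripped = stripped.replace("'", "").replace("’", "")
    # Anything still outside ASCII (e.g. Chinese characters) is dropped in this sketch
    ascii_only = stripped.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_only).strip("-")


print(ascii_slug("Fußgängerübergänge"))  # fussgaengeruebergaenge
print(ascii_slug("château"))             # chateau
print(ascii_slug("l'aïeul"))             # laieul
```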
The figure above shows the Baidu instant search while the word qiaokelidangao is being entered in Latin-letter syllables (it stands for “巧克力蛋糕”, which means chocolate cake; this alphabetical transcription is called “pinyin”). While the user is typing, Baidu automatically recognizes what the user actually wants to write (shown in Chinese characters in gray next to the actual user input).
Special characters in domain names
The recommendation to refrain from using special characters also applies to domain names.
The domain registration offices allow the registration of names with German umlauts for German domains. They also allow the registration of Chinese domain names with Chinese characters. Such domains are also called IDN domains (Internationalized Domain Names).
IDN domains are not percent-encoded as UTF-8. Instead, they are converted into a form that uses only characters from the ASCII character set (Punycode). Encoded IDN domains are easy to recognize by the prefix “xn--”.
- https://www.巧克力蛋糕.com/ encodes into https://www.xn--74qv5a31w884ajok.com/
- https://www.größenordnung.com/ encodes into http://www.xn--grssenordnung-jmb.com/
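The conversion between the readable and the “xn--” form can be tried out with Python's built-in `idna` codec. This is only a sketch: the helper names are assumptions, and the built-in codec follows the older IDNA 2003 rules (it maps ß to ss, for example), so registries that apply newer rules may produce a different result for some names.

```python
def to_ascii_domain(domain: str) -> str:
    """Convert an internationalized host name into its ASCII ("xn--") form."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain: str) -> str:
    """Convert an ASCII ("xn--") host name back into readable characters."""
    return domain.encode("ascii").decode("idna")


ascii_name = to_ascii_domain("www.巧克力蛋糕.com")
print(ascii_name)                     # each non-ASCII label now starts with xn--
print(to_unicode_domain(ascii_name))  # back to the Chinese characters
```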
Even in China, there are hardly any domains that work with Chinese characters in the search results. Most domains of this type are forwarded by the owner after registration to the safe domain with purely alphabetic and numeric characters.
- http://www.亚马逊.cn 301 redirects to the short URL http://amzn.to/1EdADRf
- http://amzn.to/1EdADRf 301 redirects to http://amazon.cn/
- http://amazon.cn/ 301 redirects to https://amazon.cn/
- https://amazon.cn/ 301 redirects to https://www.amazon.cn/
As an aside: Amazon China could optimize the forwarding chain a little ;-).
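How long such a forwarding chain is can be counted with a few lines of code. The sketch below does not query any server; the `REDIRECTS` table is a static copy of the chain listed above, and the function name is an assumption for illustration.

```python
from typing import Dict, List

# Static copy of the redirects described above (no live lookup)
REDIRECTS: Dict[str, str] = {
    "http://www.亚马逊.cn": "http://amzn.to/1EdADRf",
    "http://amzn.to/1EdADRf": "http://amazon.cn/",
    "http://amazon.cn/": "https://amazon.cn/",
    "https://amazon.cn/": "https://www.amazon.cn/",
}


def redirect_chain(start: str, redirects: Dict[str, str], max_hops: int = 10) -> List[str]:
    """Follow the redirect table from start and return every URL visited."""
    chain = [start]
    while chain[-1] in redirects and len(chain) - 1 < max_hops:
        target = redirects[chain[-1]]
        if target in chain:  # redirect loop, stop following
            break
        chain.append(target)
    return chain


chain = redirect_chain("http://www.亚马逊.cn", REDIRECTS)
print(f"{len(chain) - 1} redirects before reaching {chain[-1]}")
```

Ideally, the first URL would point to the final URL with a single redirect.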
One or multiple domains?
The Internet has its own ccTLD (country code top-level domain, such as .de for Germany) for every country in the world, and Internet users are usually very familiar with the ccTLD of their own country. When setting up its first website, a company therefore often opts for the domain extension of the country the website is intended for, if only for reasons of trust.
If you expand your distribution area and then want to be present with a website in the other countries, the next ccTLD domain for the respective new country often follows.
Each of these websites starts from “zero”. New content has to be created (often the existing content is translated into the new language). And the new website must also gain new trust and strength from the search engines. Over time, links are created and the new website gains strength and hopefully at some point it will also be found in the target market via the search engines.
Websites that were planned for several languages and target markets from the beginning have the advantage at this point. For administrative reasons, they often use only one domain, on which they serve several languages.
The advantage for search engine optimization is obvious: all the links that the website gains benefit the same domain. Regardless of whether the links point to the French or the English content, they point to the same domain. With a good internal linking concept and a good URL structure, all languages that are mapped on this one domain benefit from the backlinks.
For a new website to be planned, it is advisable to base it on a single-domain strategy. Experience has shown that the advantages outweigh the disadvantages.
ccTLD or gTLD?
A ccTLD sends most international search engines a strong signal about the country for which a website is likely (!) relevant.
In the Google Search Console, for example, you cannot even change the geo-localization of a ccTLD. It is automatically assigned to the country for which this domain extension was originally set up.
For Google in particular, however, geo-localization via ccTLD or the Search Console is not a primary signal. The language of the content and external signals, such as the country from which most of the traffic arrives on the website or the language / local orientation of the websites that link to the domain, are significantly stronger signals that can “overwrite” a setting in the Google Search Console.
The slightly positive geo-localization signal of a ccTLD can also be set manually in the Google Search Console for a gTLD (generic top-level domain) such as .com.
But what about other search engines?
Yandex (main target country: Russia)
The Russian search engine Yandex prefers local ccTLDs like .ru or .com.ru. Nevertheless, a generic TLD can also rank with Yandex and even achieve good rankings. Before you decide to register a local TLD for the Russian market as part of your international website setup, you should consider whether Yandex is really the most important search engine for your own Russian target group. In the Russian search engine market, Google is (depending on the source) on a par with Yandex or even above 50%.
Baidu (main target country: mainland China)
The claim that the Chinese search engine Baidu prefers Chinese ccTLDs such as .cn or .com.cn is an often cited SEO myth. The fact is that Baidu initially has no preference for a certain TLD. Baidu doesn’t have to either, as there are many other signals that can credibly assure that a website is optimized for the search market of mainland China (excluding Taiwan and Hong Kong). The TLD could even be misleading here. As a result, some of the largest Chinese Internet services have opted for a gTLD instead of a ccTLD.
Examples of the strongest domains in the Chinese market:
However, Baidu focuses on the mainland Chinese market. Therefore, it prefers websites that make it easy for visitors to stay within the Chinese language. The higher the chance that users could find non-Chinese content on the same domain, the less Baidu will believe that this website is a perfect fit for Chinese users.
So, unlike the international Google recommendation above to group all country and language versions onto one domain, it does make sense to build a separate domain with Chinese-only content for targeting Chinese users and Baidu.
If you are building a whole new website for Chinese users only, it also makes sense to use a Chinese ccTLD, because it shows users that the website is meant for them even before they visit it.
Which gTLD if a gTLD?
When choosing a gTLD for the international market, it is important to ensure that this domain extension is not unusual in any of the target markets, because an unusual TLD can make search engine users suspect spam. The CTR (click-through rate) in the search results could fall short of expectations and, in the worst case, be interpreted as a negative signal by the search engine.
In any case, you should avoid a geographically “pre-loaded” ccTLD if you have more than one target region in mind. For example, it can have a negative effect on the CTR if the French market is targeted with a .de domain.
If more than one search market is to be served, a gTLD should be selected for a new project. The .com TLD is the world’s most popular and recognized gTLD.
Subdomain or subfolder?
Following the recommendation for a generic TLD (probably a .com domain), the next phase of URL structure planning is due: subdomain or subfolder? If a website area is integrated into the website architecture once as a subdomain and once as a subfolder, with the same internal linking in both cases, then in my experience the subfolder gains significantly more strength.
This corresponds to the assumption of many SEOs that Google treats a subdomain like a new domain. The incoming links from the main website (usually the www subdomain) are treated like external links and do not contribute to the internal PageRank to the same extent as links to a subfolder would.
A subdomain is counted as a new domain
A subdomain therefore does not benefit fully from the backlink profile of the main website, but must first earn authority and strength through its own backlink profile.
The choice now seems to fall on an international subfolder strategy. However, this does not take into account the fact that Baidu prefers self-contained subject areas in clearly separated URL areas. For Baidu, the subfolder is only the second-best choice. The subdomain would be preferable for Baidu SEO at this point.
Different solutions for Russia and China
One might think that the new Chinese subdomain would start from “zero” authority. That would be true for Google – but not for Baidu. Baidu allows a stronger inheritance of “Pagerank” from the main domain to the associated subdomains (Disclaimer: Pagerank is a term coined by Google – the rough concept of Pagerank inheritance is known to all major search engines).
Despite the stronger inheritance between the main domain and its subdomains, one shouldn’t hope too much for a strong entry into Baidu’s organic rankings, because a website that is strong for Google does not have to be strong for Baidu. The reason is that Baidu is primarily geared to the market of the People’s Republic of China. For this reason, websites (and thus also linking websites) are given more weight if they are written in Simplified Chinese (the script customary in mainland China, in contrast to Traditional Chinese, which is written in Hong Kong and Taiwan) and in Mandarin (the vocabulary / grammar customary in China and Taiwan, in contrast to Cantonese, which is spoken in Hong Kong), and if they get backlinks from corresponding Chinese websites.
The use of a subdomain for the Chinese section of the website is a sufficient signal to Baidu that the entire subdomain is intended for the target market China.
There is another peculiarity when using subdomains with regard to SEO for China: Baidu allows more authority of the main website to be passed on to subdomains than is the case with Google. In this way, a Chinese subdomain benefits when the main website (although not in Chinese) has been able to build up authority from a Baidu perspective (e.g. through strong external links to the main website from strong EDU or GOV websites).
With a one-domain concept on a generic TLD, a suitable subfolder should be selected as the URL area for the various search markets for which Google is the most important search engine.
For search markets in which Google is only one of several search engines but still important (depending on the situation, e.g. Russia or South Korea), the subfolder concept is still recommended, as it supports your Google SEO.
If Google does not play a role (e.g. mainland China), a separate domain for this country is recommended. Since it shows the users of the target country that they will be served well, a ccTLD does make sense here.
If a new domain is not possible and the website for the non-Google country needs to be built on the international gTLD domain, a subdomain strategy is the way to go instead of the subfolder strategy.
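The rules above can be summarized as a small decision function. This is a sketch of the recommendation only: the `Market` type, the function name and the example.com / example.cn domains are assumptions for illustration, and whether Google is relevant in a market remains a judgment call.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Market:
    code: str              # folder or subdomain name, e.g. "fr" or "cn"
    google_relevant: bool  # is Google an important search engine in this market?


def base_url(market: Market, main_domain: str = "example.com",
             own_domain: Optional[str] = None) -> str:
    """Pick the URL area for a market following the rules described above."""
    if market.google_relevant:
        # Google markets: subfolder on the international gTLD domain
        return f"https://www.{main_domain}/{market.code}/"
    if own_domain:
        # Non-Google market with its own (ideally ccTLD) domain
        return f"https://www.{own_domain}/"
    # Non-Google market without a separate domain: subdomain as fallback
    return f"https://{market.code}.{main_domain}/"


print(base_url(Market("fr", True)))                            # https://www.example.com/fr/
print(base_url(Market("cn", False), own_domain="example.cn"))  # https://www.example.cn/
print(base_url(Market("cn", False)))                           # https://cn.example.com/
```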
Multilingualism or localization?
For many websites a multilingual orientation – one folder per language – will be sufficient. Sometimes, however, different information may exist for different countries, even if they use the same language.
A common example can be found in online shops, which offer different product prices, currencies and shipping costs depending on the country. However, the features of the products offered may also differ depending on the target market (preinstalled software, other connectors, etc.).
For the search services, it makes sense to also clarify these instances in suitable silos in the URL structure. For example, you can register individual folders in the Google Search Console as properties and assign them to a specific search market (country).
Overall, it makes sense to plan as few instances of a language as possible, because even with the correct use of hreflang and geo-localization via the Google Search Console, duplicate content can confuse search engines. Contrary to the webmaster's intention, Google can thus decide to rank the US version of a website in Great Britain instead of the UK version, for example because the US version has gained significantly higher authority and thus strength and relevance in the overall structure of the website.
This speaks in favor of a multi-language domain if it is not absolutely necessary to create country versions because significantly different information is to be presented in the different countries.
But which structure is suitable for exactly your planned project?
Figure: Decision diagram showing when which sub-domain / sub-folder concept is ideally used.
Concept A including Chinese for China
What about hreflang?
This article is not intended to go into detail on hreflang, as Baidu does not pay any attention to it and the official guide from Google is quite informative.
A few basic words about Hreflang should not be missing at this point:
- Hreflang is a markup option for the individual documents on a website, which is intended to help the search service to decide which documents on a website are relevant in which search market and in which language.
- Hreflang can, for example, help with the problem that Google in Austria displays the pages actually intended for Germany (and vice versa).
- Basically, hreflang works in such a way that every document that exists in at least two versions (e.g. once in German and once in English; or once in German for Germany and once in German for Austria) refers with hreflang references to the respective other versions (and to itself).
- In addition to the URL, the markup also includes the language information and, if necessary, the geo information (the country). It is important to ensure that hreflang is only used on canonical URLs that are to be indexed and only refers to URLs that exist and are to be indexed. Each document should also refer back to all other documents that refer to it.
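The reciprocity described in the last two points follows almost automatically if the same set of hreflang references is generated once per document group and placed on every version. The following is a minimal sketch of that idea; the function name and the example.com URLs are assumptions for illustration.

```python
from html import escape
from typing import Dict, List


def hreflang_tags(alternates: Dict[str, str]) -> List[str]:
    """Build one <link> element per language/country version of a document.

    Placing the complete list on every version makes each page reference
    all other versions and itself.
    """
    return [
        f'<link rel="alternate" hreflang="{escape(code)}" href="{escape(url)}" />'
        for code, url in sorted(alternates.items())
    ]


# Hypothetical document that exists in three versions
versions = {
    "de-DE": "https://www.example.com/de-de/",
    "de-AT": "https://www.example.com/de-at/",
    "en": "https://www.example.com/en/",
}
for tag in hreflang_tags(versions):
    print(tag)
```

Only canonical, indexable URLs should be passed into such a function, as described above.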
The strategies presented at this point are based on many years of experience and taking into account markets that are not necessarily only determined by Google. If we are commissioned to plan a new international website, the strategies presented are definitely our basis for the domain and URL strategy.
However, it cannot be denied that other strategies also have their advantages. Google is able to deal with more than this one internationalization setup. Setups that are not outlined in more detail here can also be very successful in organic search.
Especially when a website is already established, it can be advantageous to adapt an existing domain and URL architecture instead of establishing a new one.
It therefore makes sense for the responsible SEO specialist to become sufficiently familiar with the current situation before a strategy change is planned and implemented.