When it comes to international SEO (especially within the EU) there are many questions and myths. One of the myths I hear most often is that you shouldn’t use special characters in domain names or URLs, since the search engines can’t understand them. This is, of course, false. Before I get ahead of myself, let me explain a few terms that I will be using later in this article. But first, a disclaimer: I’m Danish, so my English might not be up to par with that of you native speakers. So be a pal and ignore any punctuation and spelling errors. I promise I’ll try to keep them to a minimum.
Language Specific Characters: I’ll be calling them LSCs from now on. These are special characters used in languages other than English, such as the Danish Æ, Ø and Å or the German Ä, Ö and Ü.
Internationalized domain name or IDN: This is a domain name that contains LSCs, such as “Café.fr”, “übercool.de” or a Greek-script name under “.com”. IDNs are used in Japanese, French, German and the very beautiful Danish language (and since I’m Danish I’m naturally completely unbiased in this regard), plus a lot of other languages.
Punycode: A way to translate special characters into ASCII (‘normal’ characters) that the web servers actually understand. For instance, “übercool.de” from the example above looks like this in Punycode: “xn--bercool-m2a.de”. Old versions of Firefox actually displayed the URL in the navigation bar as Punycode, making IDNs look strange and dangerous. Luckily they don’t do that anymore.
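If you want to see the translation for yourself, here is a minimal sketch in Python. It relies on Python’s built-in "idna" codec, which follows the older IDNA 2003 rules, so treat it as an illustration rather than a statement about what any browser or registry does. The function names to_punycode and from_punycode are my own, not part of any library.

```python
def to_punycode(domain: str) -> str:
    """Return the ASCII (Punycode) form of an internationalized domain name."""
    # The built-in "idna" codec lowercases the name and encodes each label
    # that contains non-ASCII characters with the "xn--" prefix.
    return domain.encode("idna").decode("ascii")


def from_punycode(ascii_domain: str) -> str:
    """Return the Unicode form of a Punycode-encoded domain name."""
    return ascii_domain.encode("ascii").decode("idna")


if __name__ == "__main__":
    for name in ("übercool.de", "café.fr"):
        encoded = to_punycode(name)
        print(name, "->", encoded, "->", from_punycode(encoded))
```

Running it on “übercool.de” gives the same “xn--bercool-m2a.de” as in the definition above, and decoding that string brings the ü back.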
ASCII code: The American Standard Code for Information Interchange. Well it’s basically English letters, no reason to make it more complex than that.
IDN support in e-mail clients and browsers:
When working with online marketing in Europe, it’s important to understand how email clients, browsers and the search engines handle these LSCs. So let’s jump right in there and get started.
In the past, IDN support in email clients and browsers was a huge problem, and back in the days of Internet Explorer 6 people had to download a plug-in from Windows Update to get IDN support. (This update is installed automatically unless you deselect it, and people who don’t upgrade their browsers usually don’t fool around with the advanced stuff.)
In Europe the problem is almost non-existent, especially in Scandinavia, where we are model citizens of the web who know to update our browsers (this is, of course, a completely unbiased and very objective remark that has absolutely, positively nothing to do with the fact that I’m Danish). In Europe the percentage of people using IE6 is below 2%, and most of these people do have the IDN plug-in installed, even though most are not aware of it. (Source: StatCounter.com)
When it comes to mail clients, the picture is roughly the same as the one I painted above. I have been unable to find any stats to prove or disprove this, so I will have to go by gut feeling and best guess. If we expect the percentage to be roughly the same (and I think we can safely do that), then we can conclude that only a minuscule percentage have a problem (if any at all). The people who don’t upgrade are often elderly people and computer newbies (or public institutions), and here another factor joins the others. These people will often use Bing or Google even when they already know where they want to go: if they wish to visit seomoz.org (a very likely scenario... OK, maybe not), they will more often than not type seomoz.org into Bing or Google rather than the address bar. The search engine then finds the site they are looking for and they click the link. That way, we eliminate the problem entirely.
Now that I have mentioned Google and Bing, let’s cover their support of IDNs. This one is actually very quick: they understand them perfectly, just like they understand LSCs perfectly. Now that we have covered the IDN domain name problem it’s time to move on to some more interesting stuff, but before I do, let’s summarize:
The search engines have no problem with IDNs, and since I am focusing on SEO in Europe, the numbers speak for themselves: the problem is practically non-existent. Now, if we are working with a local market, e.g. Denmark (wonderful country, by the way), then we will have a homepage in that particular language and we can safely assume that the people who visit it will most likely have the plug-in for IDN support installed. And if they don’t, they probably won’t be the type of user who is likely to make online purchases or be interesting for your online business at all.
If we are working with a major corporation, then the corporation will have (or at least should have, if it likes getting visitors from the search engines) a homepage targeting each market, again eliminating the problem. If it doesn’t but still targets all of Europe, the page and domain name will likely be in English, again eliminating the problem.
Of course there are special circumstances where you need to stop and think: What if your potential client is on a business trip to China? What about AdWords? Well, it just goes to show that you shouldn’t do anything without thinking. Both the AdWords and the mail client problems with IDNs can be solved with a domain alias, that is, by owning the domain without the LSC as well: own both “Cafe.com” and “Café.com”, and have the IDN domain forward all mail to the domain without the LSCs.
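To find the plain-ASCII alias you would want to register next to an IDN, something like the following sketch can help. It is only an assumption about how you might do it: the function name ascii_alias is hypothetical, and the approach (Unicode NFKD decomposition, then dropping the accent marks) only works for letters that are built from a base letter plus an accent. Letters such as Æ and Ø have no such decomposition and pass through unchanged, and a German Ü becomes a plain u rather than the traditional UE.

```python
import unicodedata


def ascii_alias(domain: str) -> str:
    """Suggest an accent-free alias for a domain name (hypothetical helper)."""
    # NFKD splits an accented letter into its base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", domain.lower())
    # Keep everything except the combining marks (the accents themselves).
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    idn = "Café.com"
    alias = ascii_alias(idn)
    print(idn, "->", alias)
    # Flag names that still contain non-ASCII letters and need manual handling.
    print("fully ASCII:", alias.isascii())
```

For “Café.com” this suggests “cafe.com”, which is the pair of domains from the example above.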
How do the Search Engines handle LSC?
I cannot speak for how the search engines handle all LSCs. Let’s face it, I might be able to read and write a few languages, but I’m no linguist. And there are quite a few languages out there, so I’ll stick to the ones I know. In the following I’ll use German as an example, since most of you probably don’t give a rat’s..*cough* tail.. about the Danish LSCs (a shame, but most likely true).
In German you have a few LSCs like Ö, Ä, Ü and more. Countries that traditionally have LSCs in their character sets use variations of the originals when typing on a keyboard that doesn’t have these characters: Ö becomes OE, Ä becomes AE and Ü becomes UE. Now here comes the cool part: the search engines do understand this, so if you search for UEber, many of the results you get will be for Über, and vice versa. But since it’s not an exact match, some search results will naturally outrank the “guesses” from the search engines.
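The keyboard fallback itself is simple enough to sketch in a few lines of Python, which can be handy when you want to list both spellings of a keyword for your research. This is only an illustration of the substitution rule described above; the table GERMAN_FALLBACKS and the function transliterate are names I made up, and how the search engines actually match the two spellings is not something this code can tell you.

```python
# Fallback spellings for the three German LSCs mentioned above (assumed table).
GERMAN_FALLBACKS = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "AE", "Ö": "OE", "Ü": "UE",
}


def transliterate(text: str) -> str:
    """Replace each German LSC with its two-letter keyboard fallback."""
    return "".join(GERMAN_FALLBACKS.get(ch, ch) for ch in text)


def keyword_variants(keyword: str) -> list[str]:
    """Return the keyword plus its fallback spelling, without duplicates."""
    fallback = transliterate(keyword)
    return [keyword] if fallback == keyword else [keyword, fallback]


if __name__ == "__main__":
    for word in ("über", "Über", "cool"):
        print(word, "->", keyword_variants(word))
```

Both spellings are worth checking by hand in the search engines, since they do not necessarily rank the same.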
So I ask you: if the search engines understand this and Germans will search for über instead of ueber (and they will), why on earth would you not want to own the domain über.de instead of ueber.de (or even better, both)? But I’m getting ahead of myself here. As they say: “PICS! Or it didn’t happen!” And your wish is my command!
If you look at the URLs, you’ll notice that Google highlights über when I searched for Ueber (but not the other way around). (Let me just emphasize that this actually works differently in my wonderful Danish: both Google and Bing will highlight Æ as AE and vice versa. It seems that’s not the case in German.) Bing screenshots below.
Just to prove my point about the Danish LSCs, here is a screenshot from google.dk.
What is interesting is that if we do the same search but use ÆØÅ instead of AEOEAA, the exact-match domain name ranks higher (spot 3 instead of 4), and when you type AEOEAA, Google Suggest will even suggest ÆØÅ.
Conclusion:
This was just one example; there are many similar examples in other languages. You cannot just expect the search engines to understand the special rules for each country’s LSCs. You have to research it. But what is clear is that the old dogma about not using IDNs in SEO is dead and gone. Sure, there are situations where it makes more sense not to do it. But if you are in Denmark, targeting Danish clients who do their shopping in Denmark, I would want that higher ranking for the exact match, wouldn’t you? Think before you act: will you (or your client) earn more money from the occasional paying client who happens to be in China, or will you get more bang for the buck by ranking one or two spots higher in the search engines? I know where I’d put my money. How about you?