Is it a good idea to use URL names with special characters? [closed]
Is it good search engine optimization to have URLs (page names) with non-English characters, such as Chinese names?
2010-10-05 15:11:36 - jpkeisala
I found out that there is an official transcription system called -jpkeisala
I'm not sure whether other (non-Chinese) websites can handle these types of URLs, given the limitations of their parsers and regular expressions. This could have a negative impact on your website's page rank. -zgorawski
This is not really related to programming. -Gumbo
Thank you for your request, but really, this should be asked at webmasters.stackexchange.com @jpkeisala -ericn
@fuzzybee Yes, sure, but this question is older than webmasters.stackexchange.com. -jpkeisala
From an SEO point of view:
GENERAL URL RULES:
- All URLs in a web property must follow these rules (listed in order of priority).
1) unambiguous (1 URL == 1 resource)
2) permanent (they don't change)
3) manageable (1 logic per site section, no complicated exceptions)
4) easily scalable logic
5) short
6) with a targeted keyword phrase
The targeted keyword phrase is the least important - but still important. If you can have short, scalable, manageable, persistent, unique URL logic - with non-English characters, go for it.
There are benefits to having the URL match the search term, as the search term will be highlighted in the SERPs. Also, the URL is the most common anchor text (as people tend to copy and paste URLs). If you use the keyword (in whatever language) in the URL, the URL keyword is also considered content and adds context to the page, another SEO plus.
So yes, do it, but only if it does not work against principles 1 to 5.
As of June this year, ICANN has approved the use of Chinese characters in domains without the use of .cn at the end.
This still does not mean that using an unencoded Chinese character in a URL is valid. Good information, though. -Pekka
@Pekka: And even if it's valid, that doesn't automatically mean it's a good idea to do. -Stefan Steiger
@Quandary Well this is easy to say for us Latin alphabet countries ... I can understand that not having domain names and email addresses in your native script is annoying. It's still terribly painful and expensive for us programmers who have to implement it, no doubt. -Pekka
@Pekka: I agree, although in my case German only has the additional ÄÖÜ, which can also be written in normal German as ae, oe, ue; the rest is the same. It's just a nuisance if you have several of these characters in a row, or if a colleague of mine uses one of them in a SQL column name while I have to merge my scripts in the ASCII-only Windows command line. I can very well imagine that this is a little different for Chinese users. Still, it's not a good idea. -Stefan Steiger
@Quandary Yes. Internationalization should have been taken into account 30-40 years ago. It would have prevented the crappy duct-tape fixes we have to use today ... Punycode, code pages ... -Pekka
Indeed. For example, take a look at the MinGW compiler for Windows. We are in 2010 and the C++ STL Unicode parts are still incomplete at best. MSFT is a little better, but not really by much. -Stefan Steiger
I wouldn't for one simple reason: email.
The e-mail protocol does not (yet?) support such characters. So if your domain was www.äüö.com, you could not use the email addresses <...>@äöü.com.
See the first comment for a workaround.
Wrong. You would have to use a browser to resolve äöü.com to its Punycode form and then replace äöü.com with xn--whatever.com. You're right, though, that the average user would not manage that, and MS programs like Outlook don't yet include a Punycode transform to do it automatically (Mozilla Thunderbird does, BTW). -Stefan Steiger
Maybe he wasn't referring to the domain name, but to the rest of the URL. I think it's okay if the website is only for a Chinese audience, and it is good for search engine optimization. -Yasen Zhelev
No, it is not. First, you will have problems registering your domain name in the DNS system (you will have to convert it to Punycode).
Second, Googlebot and BingBot rate keywords in URLs very highly (PageRank), and those keywords unfortunately won't be recognized if your URL is Punycode-encoded or otherwise encoded (well, maybe Google has fixed that, but MS probably won't for another year or two).
Third, for page names, the browser must support these languages, which is not guaranteed for anything that is not English.
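To make the Punycode conversion mentioned above concrete, here is a minimal sketch using Python's built-in `idna` codec. The helper names `to_ascii_domain` and `to_unicode_domain` are made up for illustration, and the codec implements the older IDNA 2003 rules, so registries or browsers that follow newer rules may treat some names differently.

```python
def to_ascii_domain(domain: str) -> str:
    """Return the ASCII (Punycode) form of an internationalized domain name."""
    # The built-in "idna" codec encodes each dot-separated label on its own;
    # labels containing non-ASCII characters get the "xn--" prefix.
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(ascii_domain: str) -> str:
    """Return the Unicode form of a Punycode-encoded domain name."""
    return ascii_domain.encode("ascii").decode("idna")


ascii_form = to_ascii_domain("äöü.com")
print(ascii_form)                     # ASCII-only name starting with "xn--"
print(to_unicode_domain(ascii_form))  # back to the Unicode form
```

This is the form that actually goes into DNS, and the form you would have to type into a mail client that does not do the conversion itself.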
Just no ... First of all, SEO wants your URL to be easily accessible, and I'm not sure people will type a URL that's as simple as:
First of all, your URL will be so tiresome ... This site is a simple tool for URL checks for SEO ...
Most web-based frameworks support slugification of your page names into accessible URLs.
So:
1 - Keep your URLs accessible.
2 - Define your page title and meta tags so spiders will read them correctly, as meta tags have no problem with special characters ...
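As a rough illustration of the slugification mentioned above, here is a minimal sketch in Python. The function name `slugify` and the exact rules are assumptions for this example, not the behavior of any particular framework.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated ASCII slug."""
    # Decompose accented letters into base letter + combining mark,
    # then drop everything that is not plain ASCII.
    decomposed = unicodedata.normalize("NFKD", title)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace each run of non-alphanumeric characters with a single hyphen
    # and trim hyphens from both ends.
    return re.sub(r"[^a-z0-9]+", "-", ascii_only.lower()).strip("-")


print(slugify("Crème Brûlée Recipe!"))  # creme-brulee-recipe
print(repr(slugify("中文")))             # '' (nothing survives the ASCII filter)
```

Note the second call: an ASCII-only slug works for accented Latin text but leaves nothing for a Chinese title, so a site with Chinese page names would need transliteration or encoded non-ASCII URLs instead.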
I'm not entirely sure about SEO. But since you've tagged it with usability, I want to add that it's not a very good idea. It is next to impossible for someone with a non-Chinese keyboard layout to enter your URL. Unless this is extremely important for search engine optimization, I would advise you to stay away from it.
If the URL contains Chinese characters, it should have a Chinese audience with Chinese keyboards. -Carlos Muñoz
Maybe, but a good web developer can't take this for granted. Especially if the site can lead to a sale. -Joyce Babu
You can take it for granted if it is stated in the specification. If you're working on websites that only make sense locally (in non-English locations), this makes perfect sense. -Carlos Muñoz
I am not looking for a fight. I just gave my opinion. I will never do it unless it is extremely important. Your website, your rules. My site, my rules :) -Joyce Babu
If most of the users are Chinese and search in their native language, the answer is YES.
URLs cannot contain non-ASCII characters. However, it is possible to encode non-ASCII characters in ASCII.
You can use IDN in the domain name part. I don't know how well that's supported, but it's there.
You can use %-escape notation for Unicode code points in the path part. This is well supported by current browsers and understood by search engines, so it is indeed good search engine optimization. We use it for accented European characters and everything works fine.
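The %-escape notation for the path part can be sketched with Python's standard `urllib.parse` module. The path `/wiki/中文` and the helper name `encode_path` are just examples; `quote` encodes the text as UTF-8 and escapes each non-ASCII byte.

```python
from urllib.parse import quote, unquote


def encode_path(path: str) -> str:
    """Percent-encode a URL path, leaving the "/" separators untouched."""
    # quote() encodes the string as UTF-8 and escapes every byte that is
    # not an unreserved ASCII character; "/" is kept as-is by default.
    return quote(path)


encoded = encode_path("/wiki/中文")
print(encoded)           # /wiki/%E4%B8%AD%E6%96%87
print(unquote(encoded))  # /wiki/中文
```

Browsers typically show the decoded form in the address bar while sending the encoded form over the wire.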