Thoughtful, detailed coverage of everything Apple for 34 years
and the TidBITS Content Network for Apple professionals

ICANN Tests Non-Roman Characters in Domain Names

For years, speakers of languages that use alphabets or symbols other than the Roman characters used in English and Western European languages have been rather put out. They can’t use their own characters to represent full domain names. While certain alphabets, like Cyrillic, are supported through a strange mapping system known as punycode for parts of a domain name, the top-level domain (TLD), like .com, still has to be entered in English. That’s about to change.

The global Internet naming and numbering authority, ICANN (Internet Corporation for Assigned Names and Numbers), launched a test on October 15th that enables a complete domain name to be entered using characters found outside of the Roman alphabet. The test domains are all called example.test in their native languages; the .test TLD is reserved for just such purposes. ICANN has put up a wiki page at each of the 11 test languages’ addresses for people to experiment with.

Punycode converts alphabetic letters and symbols that are not found in the Roman alphabet into an obscure sequence starting with xn--. The test TLD in Cyrillic, for instance, renders as xn--80akhbyknj4f in punycode. (This mapping created the potential for spoofing domain names that the folks at the Shmoo Group uncovered back in 2005; see “Don’t Trust Your Eyes or URLs,” 2005-02-14.)
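To make the mapping concrete, here is a minimal sketch in Python that round-trips the xn-- label quoted above, using the "idna" and "punycode" codecs from the standard library. The helper names to_unicode and to_ascii are made up for this illustration, and the built-in "idna" codec follows the older IDNA 2003 rules, so treat this as a demonstration rather than a description of how ICANN or any browser performs the conversion.

```python
# Sketch only: helper names are hypothetical, and Python's built-in
# "idna" codec implements the older IDNA 2003 rules.

ACE_LABEL = "xn--80akhbyknj4f"  # the label quoted in the article


def to_unicode(label: str) -> str:
    """Turn an ASCII 'xn--' label back into its native-script form."""
    return label.encode("ascii").decode("idna")


def to_ascii(label: str) -> str:
    """Turn a native-script label into the ASCII form used in DNS."""
    return label.encode("idna").decode("ascii")


native = to_unicode(ACE_LABEL)
print(native)                      # the Cyrillic form of the label
print(to_ascii(native))            # back to the xn-- form

# The raw punycode codec yields the same encoded string without the
# "xn--" prefix; the prefix is added by the domain-name (IDNA) layer.
print(native.encode("punycode"))
```

The last line shows that the xn-- prefix is not part of the punycode output itself. It is a marker that tells software the rest of the label is an encoded name and not ordinary ASCII text.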

Putting together this test almost provoked an international crisis. The original plan was to take the word hippopotamus in each tested language and insert the digits 1 and 8 in the middle to make it nonsensical. This was derailed when, according to one news service (although I can find no confirming documents at ICANN), an Israeli registrar found that the word ICANN suggested for that river-dwelling creature in Hebrew was actually an expletive. The less politically sensitive example.test was chosen instead.

There are still issues to be resolved, such as whether .com in every language or spelling would map to the U.S.-controlled .com domain. But this is the first real step towards eliminating the assumption of an English-speaking, Roman-character component to every domain name.
