by Tina Dam on March 6, 2010

Since the launch of the Fast Track Process, ICANN has received many questions about how the DNS Stability Panel will determine a confusingly similar string; that is, a requested string that is confusingly similar to an existing ccTLD, gTLD, or applied-for TLD.

The overall rules seem clear:

1) If you request an IDN ccTLD that is confusingly similar to an existing ccTLD, gTLD, or reserved name, then your request will be declined.

2) If you request an IDN ccTLD that is confusingly similar to a “validated” IDN ccTLD, then your request will be declined.

3) If you request an IDN ccTLD that is confusingly similar to another IDN ccTLD that is under evaluation but not yet “validated”, then both requests will be placed on hold until a solution is found.

4) If you request an IDN ccTLD that is confusingly similar to an applied-for gTLD string that has reached Board approval, and is hence considered an existing TLD, then your request will be declined.

5) If you request an IDN ccTLD that is confusingly similar to an applied-for gTLD string, then both parties will be informed.

Validation, for the purpose of the Fast Track Process, means that it has been established that the string is a meaningful representation of the corresponding country/territory name, and that it has successfully passed the DNS Stability Panel evaluation.
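The five rules above can be read as a simple lookup from the kind of conflict to the outcome. The following is only an illustrative sketch of that reading: the names Conflict, Outcome, RULES and outcome_for are hypothetical and not part of any ICANN system, and the actual determination is made case by case by the DNS Stability Panel.

```python
from enum import Enum


class Conflict(Enum):
    """What the requested IDN ccTLD string is confusingly similar to (hypothetical labels)."""
    EXISTING_TLD = 1                # rule 1: existing ccTLD, gTLD, or reserved name
    VALIDATED_IDN_CCTLD = 2         # rule 2: an already "validated" IDN ccTLD
    IDN_CCTLD_UNDER_EVALUATION = 3  # rule 3: another IDN ccTLD not yet "validated"
    BOARD_APPROVED_GTLD = 4         # rule 4: applied-for gTLD with Board approval
    APPLIED_FOR_GTLD = 5            # rule 5: applied-for gTLD without Board approval


class Outcome(Enum):
    DECLINED = "request declined"
    BOTH_ON_HOLD = "both requests on hold until a solution is found"
    BOTH_INFORMED = "both parties informed"


# Direct transcription of rules 1-5 as stated in the post.
RULES = {
    Conflict.EXISTING_TLD: Outcome.DECLINED,
    Conflict.VALIDATED_IDN_CCTLD: Outcome.DECLINED,
    Conflict.IDN_CCTLD_UNDER_EVALUATION: Outcome.BOTH_ON_HOLD,
    Conflict.BOARD_APPROVED_GTLD: Outcome.DECLINED,
    Conflict.APPLIED_FOR_GTLD: Outcome.BOTH_INFORMED,
}


def outcome_for(conflict: Conflict) -> Outcome:
    """Return the outcome the stated rules give for one kind of conflict."""
    return RULES[conflict]


print(outcome_for(Conflict.IDN_CCTLD_UNDER_EVALUATION).value)
```

The table says nothing about whether two strings are confusingly similar in the first place; that judgment is the hard part and is the subject of the rest of this post.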

What reasonably raises questions, however, is the notion of “confusingly similar” itself: exactly how is it established that two or more strings are so confusingly similar that they cannot co-exist in the DNS?

As the Final Implementation Plan states, any such determination is on a case-by-case basis. However, it is probably useful to provide some insight into how the panel makes such a determination.

While the determination is made by the DNS Stability Panel, Fast Track participants should know that ICANN staff will notify them of any concerns about confusability found during the initial review of a Fast Track request. The requester then has the opportunity to either (i) change the string they requested, (ii) withdraw the request and resubmit at a later stage, or (iii) continue with the request as originally submitted.

Type styles, fonts, etc.

Issue: A sufficiently creative choice of type styles or the exploitation of information about scripts that a given user may be unable to display can result in one character (or a sequence of characters) in one script being visually confusable with one or more characters (or character sequence(s)) in another script.

The issue becomes even more serious for closely related scripts (for example, Greek/Latin/Cyrillic).

While we are aware of the issues, some level of risk must be accepted. These kinds of issues cannot be completely guarded against, especially as type styles and fonts (just like languages and scripts) evolve and change over time.

Instead, determining confusability is focused on issues that may arise from the basic geometry of characters that is preserved, to a greater or lesser degree, across a variety of fonts, styles, and formatting.
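To make the point about closely related scripts concrete, here is a small sketch using Python's standard unicodedata module. The three characters chosen (capital A in Latin, Greek, and Cyrillic) are our own illustration, not an example taken from the panel's work; the helper name describe is hypothetical. They render almost identically in most fonts, yet they are three different code points.

```python
import unicodedata

# Three capital letters that look alike in most fonts but belong to different scripts.
# Escapes are used on purpose: the literal glyphs would be hard to tell apart here.
LOOKALIKES = [
    "\u0041",  # Latin
    "\u0391",  # Greek
    "\u0410",  # Cyrillic
]


def describe(ch: str) -> str:
    """Return the code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in LOOKALIKES:
    print(ch, describe(ch))
```

Because the DNS compares code points rather than shapes, strings built from such characters are distinct labels even when a user cannot tell them apart on screen.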

Two-character strings

Issue: Two-character strings that consist of Unicode code points in scripts such as the Latin, Greek, and Cyrillic script blocks are intrinsically confusable with currently defined or potential future country code TLD (ccTLD) strings based on the ISO 3166-1 alpha-2 codes.

This is particularly true when variations in font and presentation interface are considered. And it is not limited to the pairs of “visually confusable characters” identified in Unicode Technical Report #39. Those characters are based on Unicode Reference Fonts that are deliberately designed to reduce the potential for visual confusion.

Therefore, a very conservative standard is being used to assess applied-for strings that consist of two Greek, Cyrillic, or Latin characters, including a default presumption of confusability to which exceptions may be made in specific cases.
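As a rough sketch of where that default presumption would apply, the function below flags two-character strings made up entirely of Latin, Greek, or Cyrillic characters. It is an approximation under stated assumptions: Python's standard library has no script property, so the prefix of the Unicode character name is used as a stand-in for the script, and the name presumed_confusable is hypothetical. Exceptions in specific cases remain a matter for the panel.

```python
import unicodedata

# Assumption: the first word of the Unicode character name approximates the script.
CONSERVATIVE_SCRIPTS = ("LATIN", "GREEK", "CYRILLIC")


def presumed_confusable(label: str) -> bool:
    """True if the label has exactly two characters, all Latin, Greek, or Cyrillic."""
    if len(label) != 2:
        return False
    # unicodedata.name(ch, "") returns "" instead of raising for unnamed code points.
    return all(
        unicodedata.name(ch, "").startswith(CONSERVATIVE_SCRIPTS) for ch in label
    )


print(presumed_confusable("\u0440\u0444"))  # two Cyrillic letters
print(presumed_confusable("\u4e2d\u56fd"))  # two CJK ideographs
```

A True result here only means the string falls under the conservative default; it does not mean the string would be declined.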

How are strings ranked?

The Fast Track Process recognizes the following rankings for requested two-character IDN ccTLD strings. The higher the rank, the more likely it is that the applied-for string as a whole presents a significant risk of user confusion.

[6] Both characters are visually identical to an ISO 646 Basic Version (ISO 646-BV*) character. [International Organization for Standardization, "Information Technology – ISO 7-bit coded character set for information interchange," ISO Standard 646, 1991.]

[5] One character is visually identical to, and one character is visually confusable with, an ISO 646-BV character.

[4] Both characters are visually confusable with, but neither character is visually identical to, an ISO 646-BV character.

[3] One character is visually distinct from, and one character is visually identical to, an ISO 646-BV character.

[2] One character is visually distinct from, and one character is visually confusable with, an ISO 646-BV character.

[1] Both characters are visually distinct from an ISO 646-BV character.

Some disagreement may arise in assessing whether a string is confusingly similar with existing ccTLDs, gTLDs, or applied-for strings. Thus, these rankings are for guidance only, and the DNS Stability Panel makes its assessment based on the rankings and on the expertise of the panelists. In difficult situations, the panel may conduct extended evaluations that also can include drawing on additional linguistic expertise.

The likelihood of user confusion presented by a given two-character IDN ccTLD string does not depend strictly on the individual confusability of each character, if considered separately. The assessment of “visually distinct” and “visually confusable” takes into account both the individual features of each character and their combined effect.

In general, a two-character IDN string at rank [4] or higher presents a significant risk of user confusion.

In general, a two-character IDN string at rank [3] or lower does not present a significant risk of user confusion.
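The six rankings and the rank [4] threshold can be summarized in a few lines of code. This is a minimal sketch for illustration only: the names Similarity, RANKS, rank and presents_significant_risk are hypothetical, and the per-character classification is an input that still requires human judgment. As noted above, the rankings are guidance, and the panel also weighs the combined effect of the two characters.

```python
from enum import IntEnum


class Similarity(IntEnum):
    """How one character compares with an ISO 646-BV character."""
    DISTINCT = 0
    CONFUSABLE = 1
    IDENTICAL = 2


# Keys are (lower, higher) pairs so that character order does not matter.
RANKS = {
    (Similarity.IDENTICAL, Similarity.IDENTICAL): 6,
    (Similarity.CONFUSABLE, Similarity.IDENTICAL): 5,
    (Similarity.CONFUSABLE, Similarity.CONFUSABLE): 4,
    (Similarity.DISTINCT, Similarity.IDENTICAL): 3,
    (Similarity.DISTINCT, Similarity.CONFUSABLE): 2,
    (Similarity.DISTINCT, Similarity.DISTINCT): 1,
}

SIGNIFICANT_RISK_THRESHOLD = 4  # rank [4] or higher, per the general guidance


def rank(first: Similarity, second: Similarity) -> int:
    """Return the rank [1]-[6] for a two-character string."""
    return RANKS[tuple(sorted((first, second)))]


def presents_significant_risk(string_rank: int) -> bool:
    """Apply the general guidance: rank [4] or higher is a significant risk."""
    return string_rank >= SIGNIFICANT_RISK_THRESHOLD


r = rank(Similarity.IDENTICAL, Similarity.DISTINCT)
print(r, presents_significant_risk(r))
```

Note that one visually distinct character is enough to bring a string down to rank [3] or lower, whatever the other character looks like.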

What about confusable strings already in the DNS root zone?

Some have argued that we already have TLDs in the DNS root zone that could be considered confusingly similar, so there is no need to prevent future confusingly similar strings from being entered in the root zone as well. There is only one answer to such a statement: Just because there are issues today does not mean that we should make it worse for the future!

Finally, thank you to the DNS Stability Panel for all their work in this area and for generating the rankings based on their professional experience and prelaunch training!

Taran Rampersad 03.09.10 at 1:10 am

It might be a good idea to give examples. Otherwise, what I get out of this is that some group has decided that they will decide what is ‘confusingly similar’ by some guidelines – something that takes less than a paragraph to write.

Examples. Precedents. Any?

Tina Dam 03.09.10 at 1:16 pm

@Taran, good point. I will try to see if we can have a couple of examples that will not be seen as problematic by the corresponding country or territory.

Hopefully you can imagine that several characters in, for example, Cyrillic and Greek look a lot like the basic Latin characters (a, b, c…), and hence IDN ccTLD strings can look like existing ccTLDs (.gr, .cn, .br, .py, .kr, etc.).


Leonid Todorov 03.09.10 at 9:37 pm

From what I can see, there clearly is some confusion with respect to the name selection procedure for a new IDN TLD. So, why has Russia opted for .РФ?
The reason is two-fold.
First, in choosing between РУ (“RU” transliterated into Russian) and РФ, it is clear that РУ looks very similar to the combination of the Latin characters P and Y, which would confuse potential Russian users who have no command of English. Plus, such a combination makes no sense in Russian.
In contrast, “РФ” contains a unique non-Latin character “Ф” (pronounced [ef]), which indisputably distinguishes it from any other TLD, whether that TLD is based on the Latin or the Cyrillic script; furthermore, РФ is an informal common acronym for the Russian Federation (Российская Федерация) and as such is broadly recognized and actively used by Russian-language speakers.
РФ was selected by means of an online poll that ccTLD .RU ran for a few weeks back in 2008. The overwhelming majority of Internet users voted for .РФ – vox populi, vox dei – and since then the acronym “РФ” has been the “official” name of the soon-to-unfold (touch wood!) IDN TLD for Russia.
Hope this helps.

Tina Dam 03.11.10 at 5:48 pm

@Leonid, thanks for providing a real example. I think the explanation of the rationale for the РФ extension is very useful and can probably be helpful for other countries and territories considering participation in the Fast Track Process.


Nilya 03.24.10 at 12:08 am

for Russian Federation

Leontinka 03.24.10 at 2:29 am

which would confuse potential Russian users who have no command in English.

Tina Dam 06.02.10 at 2:57 pm

@Николай Филипов / Nikolay Filipov: I am sorry for the late reply. Unfortunately, I won't be able to help you much. The requests in the IDN ccTLD Fast Track Process are confidential, and I am unable to discuss any of them in any detail.

In general, the assessment of confusing similarity is done by a panel of experts that is external to ICANN staff. As such, it is not me or my colleagues who make these types of decisions, but experts in the field. I do want to note that the subject of confusing similarity is very important, and adding these IDNs means a much bigger risk of domain names being confusingly similar. The uniqueness principle is one of the main reasons that the Internet, or the domain name part of it, works so well. As such, we are very careful about the introduction of new TLDs of any kind.


Vassil Petev 08.24.10 at 7:28 am

[This information has already been submitted to ICANN through the regular feedback channels]

I do believe that there is a shortcoming in the way the Fast Track Process determines a confusingly similar string, which I have analyzed in detail in a blog post:
Putting String Similarity into Context: Bulgaria’s IDN (.бг) vs. Brazil’s ccTLD (.br)

Everyone’s comments will be greatly appreciated!

