Domain names with special characters (IDNs)
Your domain name can contain characters from any official EU language script. These characters include, for example, the Swedish å, the German ü, the Romanian ș and characters from the Bulgarian (Cyrillic) and Greek alphabets as a whole.
Domain names that contain these special, so-called non-ASCII, characters are called Internationalised Domain Names (IDNs).
IDNs are particularly important as the European Union has 28 Member States and 24 official languages and many of these languages have non-ASCII characters in their alphabets.
To know which non-ASCII characters can be used in your domain name, please consult our supported character list below.
There are also certain domain name rules that you should bear in mind when choosing to register an IDN.
Please note that, with the introduction of the .ею extension (Cyrillic string), the script of the second-level domain name must match the script of the TLD extension (.eu, .ею). In other words, if the domain name being registered is in Latin script, the top-level extension will be .eu; if it is in Cyrillic script, the top-level extension will be .ею. A registrar wishing to register an exclusively numeric domain name, possibly including hyphens, should specify the TLD extension during registration. If the extension is not specified, the .eu extension is set by default.
Internet users can still reach your website or email account using your IDN ACE string if their browsers or email applications don't yet support IDNs.
Supported characters and bundling tables
Classic (Non IDN) domain names consist of:
- Characters a to z
- Digits 0 through 9
- The hyphen (-)
- The .eu extension (always)
IDNs consist of:
- Digits 0 through 9
- Hyphen (-)
- Unicode characters from the Cyrillic, Greek or Latin scripts, as given in the complete list of supported characters.
- Characters from one script only: characters from different scripts cannot be combined. All the characters of the second level (i.e. the part before the extension) must come from a single script. Domain names made up of Latin or Greek characters will have the .eu extension, while domain names made up entirely of Cyrillic characters will have the .ею extension. The digits 0 through 9 and the hyphen can be used with all Latin, Cyrillic and Greek characters.
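The single-script rule above can be sketched in a few lines of Python. This is only an illustration, not EURid's registration logic: the function names `script_of` and `expected_extension` are hypothetical, and the script of a character is approximated from the first word of its Unicode name rather than taken from the official supported character list.

```python
import unicodedata

# Assumption: a character's script is approximated by the first word of its
# Unicode name. The authoritative source is the supported character list.
SCRIPTS = ("LATIN", "GREEK", "CYRILLIC")


def script_of(ch: str) -> str:
    """Return 'LATIN', 'GREEK' or 'CYRILLIC', or 'NEUTRAL' for digits and hyphen."""
    if ch == "-" or ch in "0123456789":
        return "NEUTRAL"  # usable with all three scripts
    name = unicodedata.name(ch, "")
    for script in SCRIPTS:
        if name.startswith(script + " "):
            return script
    raise ValueError(f"unsupported character: {ch!r}")


def expected_extension(label: str) -> str:
    """Return the extension a second-level label would get under the rules above."""
    scripts = {script_of(ch) for ch in label} - {"NEUTRAL"}
    if len(scripts) > 1:
        raise ValueError(f"mixed scripts in {label!r}: {sorted(scripts)}")
    if scripts == {"CYRILLIC"}:
        return ".ею"
    # Latin, Greek, and purely numeric labels (default when unspecified) get .eu
    return ".eu"
```

For example, `expected_extension("café")` gives `.eu`, `expected_extension("кафене")` gives `.ею`, and a label that mixes Latin and Cyrillic letters raises `ValueError`.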
The supported character list contains all the non-ASCII characters you are able to use in your domain name, as well as the homoglyph bundling tables. Each character is listed with its official Unicode number.
IDNA2008 and homoglyph bundling
Following the amendment of EC Regulation 874/2004, on 6 May 2015 EURid introduced a revised mechanism for handling Internationalised Domain Names containing non-ASCII characters (shift from IDNA2003 to IDNA2008) as well as the so-called “homoglyph bundling”.
Implications of moving from IDNA2003 to IDNA2008
IDNA stands for Internationalising Domain Names in Applications. It is a mechanism for handling internationalised domain names containing non-ASCII characters. For instance, IDNA2003 mapped IDNs as follows: café as a normalised IDN is converted into an ACE-string, namely: xn--caf-dma. The same applies to кафене which is converted into xn--80akarr4b.
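As a rough illustration of the conversion described above, Python's standard library can produce and decode ACE strings. Note the assumption: the built-in `idna` codec follows the older IDNA2003 rules, so it is suitable for reproducing the café example but not for checking IDNA2008 behaviour. The helper names `to_ace` and `from_ace` are hypothetical.

```python
def to_ace(label: str) -> str:
    """Convert a Unicode label to its ACE form (IDNA2003 rules via the stdlib codec)."""
    return label.encode("idna").decode("ascii")


def from_ace(ace: str) -> str:
    """Convert an ACE string back to its Unicode form."""
    return ace.encode("ascii").decode("idna")


print(to_ace("café"))           # xn--caf-dma
print(from_ace("xn--caf-dma"))  # café
# The same call works for Cyrillic labels such as "кафене"; the text above
# gives xn--80akarr4b as its ACE string.
print(to_ace("кафене"))
```

An ASCII-only label passes through unchanged, which is why classic domain names never need an ACE string.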
From the moment the EURid registration system supported the IDNA2008 protocol, the following updates entered into force:
A) The list of accepted characters is adjusted to those supported by the IDNA2008 protocol (see the most recent version of the supported character list). More specifically:
- ß and ς are no longer mapped to equivalent letters, but can be used in the input as fully accepted characters.
- The lower case mapping still converts upper case characters into their lower case equivalent (A → a, B → b, etc.), however there is an exception to this rule: Σ → σ.
- Mapping of ẞ → ß.
- ŀ and ŉ will continue to be mapped to separate characters: normal l followed by dot and apostrophe followed by normal n.
- Greek letters with iota below continue to be allowed on the input, and will be mapped to separate characters. For instance: ᾳ → αι.
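A minimal sketch of some of the mappings listed above, assuming a per-character lower-casing step followed by a small lookup table. The name `map_input` and the table `EXTRA_MAPPINGS` are hypothetical, and the table is deliberately incomplete: it holds only the ᾳ → αι example, and the ŀ and ŉ mappings are left out because the exact target code points are not given here. The authoritative mappings are those in the EURid tables.

```python
# Illustrative subset only; not the official normalisation tables.
EXTRA_MAPPINGS = {
    "\u1fb3": "\u03b1\u03b9",  # ᾳ (alpha with iota below) -> αι
}


def map_input(label: str) -> str:
    """Lower-case a label character by character, then apply extra mappings."""
    out = []
    for ch in label:
        # Lower-casing one character at a time means Σ always becomes σ,
        # never the final form ς; ß and ς themselves are left untouched.
        lowered = ch.lower()
        out.append(EXTRA_MAPPINGS.get(lowered, lowered))
    return "".join(out)


print(map_input("Straße"))  # straße (ß is kept, not mapped to ss)
print(map_input("ẞ"))       # ß
print(map_input("ΟΔΟΣ"))    # οδοσ
```

Lower-casing the whole string at once with `str.lower()` would turn a word-final Σ into ς, which is why the sketch works one character at a time.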
Once a domain name is case folded, it is normalised. In Cyrillic no further normalisation of the domain name is done. For the Latin and Greek scripts, the normalisation tables contain the characters that are actually normalised (transformed into another character or series of characters). The actual registered domain name is the domain name that results from this normalisation step.
B) Domain names from two different scripts that are visually indistinguishable, and might therefore lead to confusion, are bundled via the so-called “homoglyph bundling” procedure.
C) The legacy registrations, meaning all registrations existing prior to 6 May 2015 that are no longer compliant with the new registration rules, either because they contain characters no longer supported or contain sequences of characters no longer allowed, continue to be registered but specific Legacy rules apply.
As long as the legacy registration continues to be registered, standard transactions such as updates, renewals, transfers and reactivations from quarantine continue to be possible. Should the legacy domain name be deleted, its status becomes “not allowed” and it can no longer be registered.
Introduction to Homoglyph Bundling
Homoglyphs are characters which, due to similarities in size and shape, might appear identical at first glance. The homoglyphs below represent two unique characters belonging to two different scripts, or alphabets:
Cyrillic character а → Unicode number U+0430
Latin character a → Unicode number U+0061
With the introduction of the so-called “homoglyph bundling” procedure, domain names that might look confusingly similar are prevented from being registered.
With homoglyph bundling, when you register an IDN the registration system automatically bundles all the homoglyphs of that name (if there are any). Several domain names are therefore bundled at one time, and none of the other domain names in that bundle can be registered.
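The bundling idea can be sketched as follows, assuming each name is reduced to a "bundle key" in which every homoglyph is replaced by one representative character. Everything here is hypothetical: the two-entry table `HOMOGLYPHS`, the `Registry` class and its method names are for illustration only, and the real bundles are defined by the EURid homoglyph bundling tables.

```python
# Hypothetical two-entry table; the real bundling tables are far larger.
HOMOGLYPHS = {
    "\u0430": "a",  # Cyrillic а (U+0430) looks like Latin a (U+0061)
    "\u0435": "e",  # Cyrillic е (U+0435) looks like Latin e (U+0065)
}


def bundle_key(label: str) -> str:
    """Replace each character by its representative so homoglyph names collide."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in label)


class Registry:
    def __init__(self) -> None:
        self._taken: dict[str, str] = {}  # bundle key -> registered label

    def register(self, label: str) -> bool:
        """Register a label unless another name in its bundle already exists."""
        key = bundle_key(label)
        if key in self._taken:
            return False
        self._taken[key] = label
        return True

    def status(self, label: str) -> str:
        holder = self._taken.get(bundle_key(label))
        if holder is None:
            return "available"
        return "registered" if holder == label else "homoglyph blocked"


registry = Registry()
registry.register("ae")                # Latin a + Latin e
print(registry.status("\u0430\u0435"))  # Cyrillic а + е: homoglyph blocked
```

Keying the registry on the bundle key rather than on the name itself is what makes one registration block every other name in the same bundle.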
The Homoglyph Bundling rules can be summarised as follows:
A) Visually similar characters across different scripts are bundled.
- Latin e versus Cyrillic е
- Latin a versus Greek α (identical in uppercase: A and Α)
There are exceptions to this rule. A non-exhaustive list follows; more detailed information can be found in the homoglyph bundling tables:
- Latin ß and Latin ss,
- Latin ss and Greek β: these are characters from 2 different scripts, which are not visually similar,
- Greek ς and Greek σ,
- Greek α and Greek ἀ ἁ ἂ ἃ ἄ ἅ and Greek ᾀ ᾁ ᾂ ᾃ ᾄ ᾅ and
- Greek αi and Greek ἀi ἁi ἂi ἃi ἄi ἅi.
B) If one domain name in a homoglyph bundle exists, none of the other domain names in that bundle can be registered.
In the previous sentence, the word “exists” means having one of the following .eu domain name statuses: in use, registered (on hold, suspended, seized), withdrawn, quarantine. A WHOIS query for a domain name that is blocked by a bundle returns the status “homoglyph blocked”.
Domain names that are part of a bundle but were registered before 6 May 2015 will continue to be registered. Should they be deleted, they will not be available for new registration and will become “homoglyph blocked” in the EURid WHOIS database.
As described earlier, as a consequence of the implementation of the IDNA2008 standard protocol that replaces the currently deployed IDNA2003 protocol, new characters are going to be supported when registering a .eu domain name while others are going to be phased out.
This section aims to explain both the changes from the supported character perspective and the legacy policy for characters or sequences of characters being phased out.
Managing the introduction of the ß (Latin small letter Sharp S, Unicode U+00DF) and the ς (Greek small letter ending Sigma, Unicode U+03C2):
The IDNA2008 protocol supports both the German Eszett (ß) and the Greek ending sigma (ς) on input as fully allowed characters. Due to the introduction of the homoglyph bundling mechanism, both characters are part of the homoglyph bundling algorithm, meaning that registered domain names containing characters “ss” or the Greek normal sigma (σ) prevent domain names with German Eszett (ß) or Greek ending sigma (ς) from being registered.
However, considering the limited support of the newly introduced characters by many web browsers, a registrant who has registered a domain name containing characters “ss” or the Greek normal sigma (σ), or vice versa - German Eszett (ß) or Greek ending sigma (ς) - can request EURid to activate the corresponding domain name written with the German Eszett (ß) or Greek ending sigma (ς) - or vice versa with the characters “ss” or the Greek normal sigma (σ) - at any time. The two names must be assigned to the same registrant. They will coexist and both be invoiced to the registrar.
EURid will regularly check that the domain names are assigned to the same registrant and if not, will revoke the domain name activated by the latter registrant.
EURid will continue to investigate and assess the support of the IDNA2008 protocol through the most common client software (web browsers, email clients, …). When EURid, as well as the Internet and technical community, deems this support sufficient, the registrar of each domain name for which two “versions” coexist (those with characters “ss”/German Eszett (ß) or the Greek normal sigma (σ)/Greek ending sigma (ς)) will be requested to choose which domain name they wish to keep activated. The other domain name will be withdrawn and homoglyph blocked by the name that is kept.
Registrants of existing domain names which contain the aforementioned characters who wish to activate the corresponding domain name written with the equivalent characters will have to contact either EURid or the registrar to inform them of their choice to activate the equivalent domain name.
This policy supersedes the previously communicated policy that foresaw the following:
"To allow registrants of existing domain names which contain the characters “ss” or the Greek normal sigma (σ) to switch to the corresponding domain name written with the German Eszett (ß) or the Greek ending sigma (ς), EURid has designed the following policy: If a registrant has registered a .eu domain name with “ss” or Greek normal sigma (σ) before 6 May 2015, it will continue to exist and will remain registered. The registrant may keep the currently registered domain name, or may at any time request that the equivalent domain name with German Eszett (ß) or Greek ending sigma (ς) be activated. By requesting that the equivalent domain name be activated, the registrant and registrar accept that one year later the domain name with “ss” or Greek normal sigma (σ) is revoked and homoglyph bundled.
EURid will directly activate the domain name in the registrar’s portfolio. This is considered a normal new registration and is charged as such. Furthermore, the new domain name is going to have its own registration and expiry dates, independent of those of the original name. The currently registered domain name will enter into a one (1) year phase-out period. After the phase-out period the original domain name will be revoked and will be homoglyph blocked in the EURid WHOIS database, which will prevent it from being registered.
Please note that the option of requesting the activation of the domain name with the newly supported character has unlimited validity considering that in any case the currently registered domain name will prevent the equivalent domain name with German Eszett (ß) or Greek Final Sigma (ς) from being registered (therefore, having the homoglyph blocked status in the WHOIS database)."
Managing .eu domain names with hyphens in the second, third and fourth position, or with “ŀ” (L followed by middle dot but not followed by a subsequent L), or with "ı" (dotless i):
.eu domain names that have been registered
- with hyphens in the second, third and fourth position, or
- with “ŀ” (L followed by middle dot but not followed by a subsequent L), or
- with "ı" (dotless i)
are no longer supported. To give registrants time to find alternatives, these domain names will remain operational for a term of one (1) year, until 6 May 2016. After that date, they will be revoked and will not be allowed for re-registration. During the one (1) year phase-out period (until 6 May 2016) registrants can still update or transfer the domain name, or reactivate it when in quarantine. However, if deleted, these domain names will not become available for a new registration.