Authorized characters for names / IDNA
Last updated
IDNA (Internationalizing Domain Names in Applications) refers to an effort to make characters beyond ASCII available for registering domain names and distributing web applications across the web.
Previously only a small set of ASCII characters was available: a-z, 0-9, and - (hyphen), with _ (underscore) accepted in some DNS record names but not in hostnames. The goal was, of course, to make the web more international and less tied to the Latin/Western alphabet.
We believe this feature has far more disadvantages than advantages. One obvious flaw of supporting a large portion of Unicode is how easy it makes phishing attacks, because characters from different scripts can look identical. Internationalization of domain names is not at all a gain in security or accuracy; quite the opposite, in fact.
This phishing attack on www.apple.com is an example.
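To make the problem concrete, here is a minimal sketch of a homograph. It is not the exact string used in the attack mentioned above; it only swaps the first letter of "apple" for the Cyrillic letter U+0430, which most fonts render identically to the Latin "a". The helper name `describe` is hypothetical and introduced only for this illustration.

```python
import unicodedata

latin = "apple"
# Assumption for illustration: only the first letter is replaced,
# here with CYRILLIC SMALL LETTER A (U+0430).
spoof = "\u0430pple"


def describe(label):
    """Return the code point and Unicode name of each character in a label."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in label]


# The two strings look the same on screen but are different names.
print(latin == spoof)        # False
print(describe(latin)[0])    # U+0061 LATIN SMALL LETTER A
print(describe(spoof)[0])    # U+0430 CYRILLIC SMALL LETTER A
```

A user comparing the two names by eye has no reliable way to tell them apart; only inspecting the code points reveals the difference.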
Right now each extension applies its own policy on which characters are authorized, depending on whether you are deploying to .us, .com, .cn, .ru, etc. Again, this is a patchwork of fixes, not a unified policy. Hundreds of issues and discussions have emerged as a consequence of this feature.
Our approach is simply to drop everything beyond ASCII and go straight back to [a-z0-9], a set of 36 characters. A 36-character Latin alphanumeric system is very trustworthy and hard to confuse, even for uninformed users and/or users who are primarily familiar with a non-Latin set of characters.
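The [a-z0-9] rule can be sketched in a few lines. This is a minimal illustration, not the validation code of the dappy node software; the function name `is_valid_name` is hypothetical, and any additional constraints the real software applies (length limits, reserved names) are not modeled here.

```python
import re

# Assumption: a name is valid only if it is made entirely of the
# 36 characters a-z and 0-9, as described in the text.
NAME_PATTERN = re.compile(r"[a-z0-9]+")


def is_valid_name(name):
    """Return True if every character of name is in [a-z0-9] and name is non-empty."""
    # fullmatch requires the whole string to match, so a trailing
    # newline or any stray character makes the name invalid.
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_name("apple"))        # True
print(is_valid_name("\u0430pple"))   # False (first letter is Cyrillic U+0430)
print(is_valid_name("Apple"))        # False (uppercase is outside the set)
print(is_valid_name("my-app"))       # False (hyphen is outside the set)
```

Because the accepted set is a closed list rather than a blocklist of known lookalikes, a homograph is rejected without any knowledge of which scripts resemble which.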
See this IETF document (2003)
See how we validate names in dappy node software.