Unicode Homograph Exploits

Post by **biolizard89** » Sun Dec 22, 2013 2:03 pm

Does anyone have thoughts on preventing unicode homographs? Info on the attack: https://en.wikipedia.org/wiki/IDN_homograph_attack

If the Namecoin client disallowed homographs (either by rejecting transactions or by not loading them into nameindex.dat), would this break any non-d/ namespaces? Or should this check be done by nmcontrol (which would require nmcontrol to know the age of names and keep a list of all names, and would make it harder to notice in advance if registering a name would trigger the rejection)? How would hashing of names as per Gregory Maxwell's proposal affect the difficulty of defending against this attack?

Anyone have thoughts on this?
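
To make the question concrete, here is a minimal sketch of what a client-side homograph check might look like. It is an illustration only, not anything the Namecoin client or nmcontrol does: the `CONFUSABLES` table is a tiny hand-picked sample, and `skeleton` and `find_homograph` are hypothetical names.

```python
# Tiny, hand-picked sample of Cyrillic letters that resemble Latin ones.
# Illustrative assumption only; a real check would need a full
# confusables table covering many more scripts and characters.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
}


def skeleton(name):
    """Map each character to its lookalike so similar names compare equal."""
    return "".join(CONFUSABLES.get(ch, ch) for ch in name.lower())


def find_homograph(new_name, registered):
    """Return an already registered name that looks like new_name, or None."""
    target = skeleton(new_name)
    for existing in registered:
        if existing != new_name and skeleton(existing) == target:
            return existing
    return None


# Hypothetical names: the first character after "d/" is Cyrillic ES.
existing_names = ["d/citibank", "d/example"]
print(find_homograph("d/\u0441itibank", existing_names))
```

Note that a check like this needs the full list of existing names, which is the cost raised above for doing it in nmcontrol, and it would not be possible against hashed names without knowing the preimages.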

Post by **virtual_master** » Sun Dec 22, 2013 5:35 pm

biolizard89 wrote:Does anyone have thoughts on preventing unicode homographs? Info on the attack: https://en.wikipedia.org/wiki/IDN_homograph_attack

If the Namecoin client disallowed homographs (either by rejecting transactions or by not loading them into nameindex.dat), would this break any non-d/ namespaces? Or should this check be done by nmcontrol (which would require nmcontrol to know the age of names and keep a list of all names, and would make it harder to notice in advance if registering a name would trigger the rejection)? How would hashing of names as per Gregory Maxwell's proposal affect the difficulty of defending against this attack?

Anyone have thoughts on this?

I cannot see how hashing names would solve this problem, since you can hash anything, but maybe I am wrong. Hashing names could still make sense for other reasons.
Spoofing is also a general problem, and I don't think this particular form deserves more attention than the others.

wikipedia wrote:The internationalized domain name (IDN) homograph attack is a way a malicious party may deceive computer users about what remote system they are communicating with, by exploiting the fact that many different characters look alike, (i.e., they are homographs, hence the term for the attack). For example, a person frequenting citibank.com may be lured to click the link [сitibank.com] (punycode: xn--itibank-xjg.com/) where the Latin C is replaced with the Cyrillic С.

What looks like spoofing to people who write in Latin characters is not spoofing for Russians; for them, Latin characters could be the spoof.
The only way to avoid this type of attack completely is not to click on links in email, and instead to type the address into the browser yourself.

I am thinking about how hashing and validating content could offer a more general solution to this problem and to related ones.
(http://dot-bit.org/forum/viewtopic.php?f=5&t=1336)
Domains can be spoofed by other methods too, so protecting static content with a certificate in the blockchain would be a more general solution, covering content spoofing and seizure as well.
But spoofers could also hash their content and create a certificate.

Disabling Unicode characters would not solve the problem completely, and it would disadvantage other languages.
Splitting the domain system into character classes could solve it partially, but it would require a lot of work and would also create confusion.

I have an idea. Or an addition to your idea.
What about ranking domains by (domain age) × (registration fee paid)?
Domain age alone wouldn't be enough, since somebody could register a spoofable name in advance, and hashing names would even hide this. But I suppose a serious company or organization would pay the maximum registration fee of 200 NMC, while a spoofer would pay the minimum: he needs many names, so paying the maximum for each would cost him a great deal, and his ranking would be lower.
Showing the domain ranking would help if the user has already used the real domain.
This solution would also benefit the network: legitimate companies and organizations that pay the highest fee would be rewarded with a higher ranking and more security against spoofing (not only against contesting the name).
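
A rough sketch of how such a ranking could be computed, just to make the idea concrete. The units (age in blocks, fee in NMC), the function name `domain_rank`, and the example figures other than the 200 NMC maximum are assumptions for illustration.

```python
def domain_rank(age_in_blocks, fee_paid_nmc):
    """Rank = (domain age) x (registration fee paid).

    Assumed units: age in blocks, fee in whole NMC.
    """
    return age_in_blocks * fee_paid_nmc


# Hypothetical example: a legitimate site that paid the 200 NMC maximum,
# and a spoofer who registered twice as early but paid an assumed 1 NMC.
legit = domain_rank(age_in_blocks=10_000, fee_paid_nmc=200)
spoof = domain_rank(age_in_blocks=20_000, fee_paid_nmc=1)

print(legit, spoof)
print("legitimate domain ranks higher:", legit > spoof)
```

With these made-up numbers, registering early is not enough for the spoofer to overtake the legitimate domain; he would have to match the fee as well, for every name he wants to spoof.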

This ranking could also be applied to Namecoin IDs (id/), but there should be no contesting system, as anybody should be able to afford to keep an id/ with maximum security. The only reward for paying a higher fee for an id/ would be a higher pseudonymous authority/credibility and ranking.

Post by **Ben** » Sun Dec 22, 2013 8:57 pm

I don't think this is a problem Namecoin has to or should (because Namecoin is just an arbitrary key/value store, right?) handle. Every major browser has protection against this by showing mixed script domains as Punycode, so it's already a solved problem at the browser level.
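
For illustration, the general idea of falling back to Punycode for mixed-script labels can be sketched as below. This is a simplified assumption about the approach, not the actual rules of any browser: the script of a character is guessed from the first word of its Unicode name, and `scripts_in_label` and `display_form` are hypothetical names.

```python
import unicodedata


def scripts_in_label(label):
    """Guess the scripts used in a label from Unicode character names."""
    scripts = set()
    for ch in label:
        if ch.isascii() and not ch.isalpha():
            continue  # ASCII digits and hyphens are shared by all scripts
        # Crude heuristic: "CYRILLIC SMALL LETTER ES" -> "CYRILLIC"
        name = unicodedata.name(ch, "UNKNOWN")
        scripts.add(name.split(" ")[0])
    return scripts


def display_form(label):
    """Show a mixed-script label as Punycode, otherwise show it unchanged."""
    if len(scripts_in_label(label)) > 1:
        return "xn--" + label.encode("punycode").decode("ascii")
    return label


print(display_form("citibank"))        # all Latin, shown as typed
print(display_form("\u0441itibank"))   # Cyrillic ES + Latin, shown as Punycode
```

A label written entirely in one non-Latin script passes this check unchanged, so it only catches the mixed-script case.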
