[Freeipa-devel] [PATCH 0070] Normalization check only for IDNA domains

Fri Jun 27 09:21:30 UTC 2014

On 27.6.2014 10:58, Alexander Bokovoy wrote:
> On Fri, 27 Jun 2014, Jan Cholasta wrote:
>> On 27.6.2014 10:29, Alexander Bokovoy wrote:
>>> On Fri, 27 Jun 2014, Jan Cholasta wrote:
>>>> On 27.6.2014 10:15, Alexander Bokovoy wrote:
>>>>> On Fri, 20 Jun 2014, Martin Basti wrote:
>>>>>> On Fri, 2014-06-20 at 10:32 +0200, Jan Cholasta wrote:
>>>>>>> On 18.6.2014 16:49, Martin Basti wrote:
>>>>>>>> For compatibility with older versions, only IDNA domains should be
>>>>>>>> checked.
>>>>>>>> Patch attached.
>>>>>>>
>>>>>>> I'm not particularly happy about the u'\xdf' special case. Isn't
>>>>>>> there a
>>>>>>> better way to do this check?
>>>>>> I can't find a better way. u'\xdf' is mapped to ss, and ss is not an
>>>>>> IDN string.
>>>>>>
>>>>>> Or just remove this validation.
>>>>>>
>>>>>>> (BTW I really think this should be a warning, not an error, but that
>>>>>>> would require larger amount of work, so I guess it's OK for now.)
>>>>>> (More pain than gain)
>>>>> Main thing in this patch is that the check should not be done against
>>>>> non-IDN strings. I want this version of the patch to go in for that
>>>>> reason as currently you cannot even complete ipa-adtrust-install
>>>>> run due
>>>>> to IDN normalisation check being applied to non-IDN domains.
>>>>
>>>> On non-IDN domains, the only effect of IDN normalization is that it
>>>> lower-cases the names (right?), so the check should compare
>>>> lower-cased original name with the normalized name, instead of
>>>> special-casing certain characters etc.
>>> .. what's the reason to do such comparison then? lower-cased non-IDN
>>> name will be equal to lower-cased normalized non-IDN name by definition,
>>> so the check is not needed in this case, at all.
>>
>> The point is that it works for both IDN and non-IDN, without
>> u'\xdf'-style hacks.
> No, your proposal of comparing the lower-cased value with the normalized
> value is not going to work, because the lower-cased value is in general
> not equal to the normalized value for IDN names, only for non-IDN ones.
> The lower-case form of a non-ASCII Unicode character may map to a
> completely different character than normalization produces. Take, for
> example, the Turkish alphabet, where six letters have different case
> rules (uppercase dotted I, dotless lowercase i, upper- and lowercase G
> with breve accent, and upper- and lowercase S with cedilla), which will
> break your generalized check.
> So you'll need to split these cases anyway.
>

I see.
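
The u'\xdf' case from this thread already shows the mismatch. Below is a
minimal sketch, assuming Python 3 and the standard library's
encodings.idna.nameprep() (IDNA 2003) as a stand-in for whatever
normalization the patch actually uses; lowercase_matches_nameprep is a
hypothetical helper name, not something in the patch.

```python
from encodings.idna import nameprep


def lowercase_matches_nameprep(label):
    """Return True if lower-casing alone yields the nameprep'd label.

    Hypothetical helper for illustration; uses the stdlib IDNA 2003
    nameprep, which may differ from the normalization in the patch.
    """
    return label.lower() == nameprep(label)


# Plain ASCII label: nameprep only case-folds, so both forms agree.
print(lowercase_matches_nameprep("Example"))   # True

# u'\xdf' (sharp s): lower() leaves it alone, nameprep maps it to "ss".
print(lowercase_matches_nameprep("stra\xdfe"))  # False
```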

I'm still not comfortable with carrying the bit of knowledge about 
u'\xdf' in this particular spot. Can we check that a name is IDN some 
other way than "domain_name.is_idn() or u'\xdf' in value"?
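
One possible direction, purely as a sketch: decide from the raw input
whether the name needs IDN handling, by looking for any non-ASCII
character or an "xn--" label before normalization happens. That would
cover u'\xdf' without naming it. looks_like_idn is a hypothetical helper
in Python 3, not existing code, and whether it matches what
domain_name.is_idn() reports is an assumption that would need checking.

```python
def looks_like_idn(value):
    """Hypothetical check: does the raw name need IDN handling?

    True if any label contains a non-ASCII character or is already an
    ASCII-compatible "xn--" label. Operates on the input as typed,
    before any normalization.
    """
    labels = value.rstrip(".").split(".")
    for label in labels:
        if any(ord(ch) > 127 for ch in label):
            return True
        if label.lower().startswith("xn--"):
            return True
    return False


print(looks_like_idn("Example.COM"))          # False
print(looks_like_idn("stra\xdfe.example"))    # True
print(looks_like_idn("xn--abc.example"))      # True
```

With something like this, the normalization check would be skipped
whenever looks_like_idn() returns False.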

-- 
Jan Cholasta