[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Freeipa-devel] [PATCH] 971 detect binary LDAP data

Jan Cholasta wrote:
On 28.2.2012 18:02, Petr Viktorin wrote:
On 02/28/2012 04:45 PM, Rob Crittenden wrote:
Petr Viktorin wrote:
On 02/28/2012 04:02 AM, Rob Crittenden wrote:
Petr Viktorin wrote:
On 02/27/2012 05:10 PM, Rob Crittenden wrote:
Rob Crittenden wrote:
Simo Sorce wrote:
On Mon, 2012-02-27 at 09:44 -0500, Rob Crittenden wrote:
We are pretty trusting that the data coming out of LDAP matches
schema but it is possible to stuff non-printable characters into

I've added a sanity checker to keep a value as a python str type
(treated as binary internally). This will result in a base64
blob be returned to the client.

I don't like the idea of having arbitrary binary data where unicode
strings are expected. It might cause some unexpected errors (I have a
feeling that --addattr and/or --delattr and possibly some plugins might
not handle this very well). Wouldn't it be better to just throw away the
value if it's invalid and warn the user?

This isn't for user input, it is for data stored in LDAP. User's are going to have no way to provide binary data to us unless they use the API themselves in which case they have to follow our rules.

Shouldn't you try to parse it as a unicode string and catch
TypeError to
know when to return it as binary ?


What we do now is the equivalent of unicode(chr(0)) which returns
u'\x00' and is why we are failing now.

I believe there is a unicode category module, we might be able to
that if there is a category that defines non-printable characters.


Like this:

import unicodedata

def contains_non_printable(val):
for c in val:
if unicodedata.category(unicode(c)) == 'Cc':
return True
return False

This wouldn't have the exclusion of tab, CR and LF like using ord()
is probably more correct.


Freeipa-devel mailing list
Freeipa-devel redhat com

If you're protecting the XML-RPC calls, it'd probably be better to
at the XML spec directly: http://www.w3.org/TR/xml/#charsets

Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |

I'd say this is a good set for CLI as well.

And you can trap invalid UTF-8 sequences by catching the
UnicodeDecodeError from decode().

I don't think we should care about XML-RPC in LDAP-specific code at all.
If you want to do some additional checks, do them in XML-RPC-specific code.

We are trusting that the data in LDAP matches its schema. This is just belt and suspenders verifying that it is the case.


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]