[Freeipa-devel] [PATCH] 0015 Only split CSV strings once

Petr Viktorin pviktori at redhat.com
Tue Mar 20 17:02:06 UTC 2012


On 03/16/2012 12:55 PM, Petr Viktorin wrote:
> On 03/15/2012 08:55 PM, Rob Crittenden wrote:
>> Petr Viktorin wrote:
>>> https://fedorahosted.org/freeipa/ticket/2227 (Unable to add certain sudo
>>> commands to groups). What an interesting bug to get :)
>>>
>>>
>>> One problem with our CSV splitting is that it's not idempotent
>>> (baskslashes are "eaten" when there are escaped commas), but when we
>>> forward a call it gets done on both the client and the server. The
>>> attached patch fixes that in a somewhat hacky way. It's not a complete
>>> fix for the issue though, for that I need to ask what to do.
>>>
>>>
>>> Some Params use the CSV format when we just need a list of
>>> comma-separated values. Our flavor of CSV does escaping in two different
>>> ways. This is pretty bad for predictability, least-surprise,
>>> documentation, ...
>>>
>>> To recap, the two ways are escaping (escaped commas are ignored,
>>> backslashes also need to be escaped) and quoting (values are optionally
>>> enclosed in double quotes, to include a '"' in a quoted string, use
>>> '""').
>>> Escaping is good because users are used to it, but currently literal
>>> backslashes need to be doubled, making multivalue syntax different from
>>> single-value syntax, even if there are no commas in the value.
>>> Quoting is weird, but complete by itself so it doesn't also need
>>> escaping.
>>>
>>> Do we need to use both? Is this documented somewhere? Some of our tests
>>> check only for one, some assume the other. Users are probably even more
>>> confused.
>>>
>>>
>>> If we only keep one of those, the fix for #2227 should be quite easy.
>>> If not (backwards compatibility), we need to document this properly,
>>> test all the corner cases, and fix the UI to handle CSV escaping.
>>> Since it's subtly broken in lots of places, maybe it's best to break
>>> backwards compatibility and go for a simple solution now.
>>>
>>> Anyway, if I want to do a proper fix, CSV handling should be only done
>>> on the client. Unfortunately turning off CSV handling in the server will
>>> break the UI, which at places uses `join(',')` (no escaping commas, no
>>> single place to change it), even though it can just send a list.
>>
>> I'm with Honza, not too keen on the _forwarded_call part. I'd rather you
>> do a mix of comparing self.env.in_server and whether the value is a
>> basestring to determine if we need to split it.
>
> This was a hack necessary because the WebUI relied on server-side
> splitting (in a few cases, and it did not escape correctly). Now that
> Petr Vobornik's patch 99 is in, it is unnecessary.
> This discussion thread has an updated patch (0015-03); what I'm
> attaching now builds on top of that to add several new changes in tests.
> To summarize it: the splitting is only done in the client; from the RPC
> call on no CSV magic is involved at all.
>
> Please tell me your reasons for basestring checking. I think it's
> entirely unnecessary and just adds complexity.
> In the rest of the framework a string, as a parameter value, has the
> same effect as a list of that one string. Why break this?
> We are concerned about the API here, not the command line. Shortcuts to
> save the user's keystrokes in the common case should *not* win over
> predictability and making sure the easier way is the correct, robust way
> to do it. Compare -- with basestring checking:
>
> some_command(arg=escape_commas(some_string))
> - is correct
>
> some_command(arg=[some_string])
> - is also correct
>
> some_command(arg=some_string)
> - is WRONG because the string is not escaped
> - but is the easiest solution!
>
> The problem is much worse because it is subtle: it only affects values
> with commas and backslashes, so *the wrong solution would work* in most
> of the cases.
> And I can assure you plugin writers *would* pick the wrong way if it
> would usually work. Our own web UI had examples.
>
> Contrast to my proposed behavior:
> some_command(arg=some_string)
> - is correct
>
> some_command(arg=[some_string])
> - is also correct
>
> some_command(arg=escape_commas(some_string))
> - is wrong, but doesn't make much sense because at this level you don't
> have to worry about CSV at all
>
> Basestring checking would add unnecessary complexity for both us *and*
> the plugin writer. The only case it would make it easier for the plugin
> writer is if the plugin took CSV values ­-- but why force a random
> format, tailored for our specific frontend, into RPC clients?
>
> The client should translate the command line into a RPC call. There's no
> reason to arbitrarily expect the server to do part of the job.
>
>
>> I'm not sure why this is broken out of normalize. Isn't this normalizing
>> the incoming values into something more sane?
>
> "Normalization" implies it's an idempotent operation, which CSV
> splitting is not (at least its current incarnation, without the
> basestring checking):
> r"a,b\,c" splits to ["a", "b,c"] which splits to ["a", "b", "c"]
>
>

Just to make things clear: freeipa-pviktori-0015-04 is the latest 
version of this patch.
It's complemented by freeipa-pviktori-0021-03 which adds the tests for 
this (and other things) (but contains one more refactor).


If and when that's all reviewed and pushed, I'd like to discuss 
freeipa-pviktori-0018 (Simplify CSV escaping syntax). I'm not saying my 
solution there is perfect, but the flavor of Python's CSV parsing we're 
using now is quite quirky. It would be nice to get the syntax right, or 
at least better, while we're fixing this from a totally-broken state. 
Later there'll be backwards compatibility problems.
But, don't let that hold back this patch.

-- 
Petr³




More information about the Freeipa-devel mailing list