[Freeipa-devel] [PATCH] Password vault

Mon Mar 23 10:07:39 UTC 2015

Dne 11.3.2015 v 15:12 Endi Sukma Dewata napsal(a):
> Thanks for the review. New patch attached to be applied on top of all
> previous patches. Please see comments below.

Thanks. I have replied to some of your comments below.

>
> On 3/6/2015 3:53 PM, Jan Cholasta wrote:
>> Patch 353:
>>
>> 1) Please follow PEP8 in new code.
>>
>> The pep8 tool reports these errors in existing files:
>>
>> ./ipalib/constants.py:98:80: E501 line too long (84 > 79 characters)
>> ./ipalib/plugins/baseldap.py:1527:80: E501 line too long (81 > 79
>> characters)
>> ./ipalib/plugins/user.py:915:80: E501 line too long (80 > 79 characters)
>>
>> as well as many errors in the files this patch adds.
>
> For some reason pylint keeps crashing during build so I cannot run it
> for all files. I'm fixing the errors that I can see. If you see other
> errors please let me know while I'm still trying to figure out the problem.

Well, I did not use pylint, but pep8: 
<http://pep8.readthedocs.org/en/latest/>

>
> Is there an existing ticket for fixing PEP8 errors? Let's use that for
> fixing the errors in the existing code.

There is no ticket, but we still follow PEP8 in new code, so please do 
that. It shouldn't be too hard.

> ...
>
>> 3) The container_vault config option should be renamed to
>> container_vaultcontainer, as it is used in the vaultcontainer plugin,
>> not the vault plugin.
>
> It was named container_vault because it defines the DN for of the
> subtree that contains all vault-related entries. I moved the base_dn
> variable from vaultcontainer object to the vault object for clarity.

That does not make much sense to me. Vault objects are contained in 
their respective vaultcontainer objects, not directly in cn=vaults.

>
>> 4) The vault object should be child of the vaultcontainer object.
>>
>> Not only is this correct from the object model perspective, but it would
>> also make all the container_id hacks go away.
>
> It's a bit difficult because it will affect how the container & vault
> ID's are represented on the CLI.

Yes, but the API should be done right (without hacks) first. You can 
tune the CLI after that if you want.

>
> In the design the container ID would be a single value like this:
>
>    $ ipa vault-add /services/server.example.com/HTTP
>
> And if the vault ID is relative (without initial slash), it will be
> appended to the user's private container (i.e. /users/<username>/):
>
>    $ ipa vault-add PrivateVault
>
> The implementation is not complete yet. Currently it accepts this format:
>
>    $ ipa vault-add <vault name> [--container <container ID>]
>
> and I'm still planning to add this:
>
>    $ ipa vault-add <vault ID>
>
> If the vault must be a child of vaultcontainer, and the vaultcontainer
> must be a child of a vaultcontainer, does it mean the vault ID would
> have to be split into separate arguments like this?
>
>    $ ipa vaultcontainer-add services server.example.com HTTP
>
> If that's the case we'd lose the ability to specify a relative vault ID.

Yes, that's the case.

But I don't think relative IDs should be a problem, we can do this:

     $ ipa vaultcontainer-add a b c  # absolute
     $ ipa vaultcontainer-add . c    # relative

or this:

     $ ipa vaultcontainer-add '' a b c  # absolute
     $ ipa vaultcontainer-add c         # relative

or this:

     $ ipa vaultcontainer-add a b c         # absolute
     $ ipa vaultcontainer-add c --relative  # relative

or this:

     $ ipa vaultcontainer-add a b c --absolute  # absolute
     $ ipa vaultcontainer-add c                 # relative

> ...
>
>> 11) No clever optimizations like this please:
>>
>> +        # vault DN cannot be the container base DN
>> +        if len(dn) == len(api.Object.vaultcontainer.base_dn):
>> +            raise ValueError('Invalid vault DN: %s' % dn)
>>
>> Compare the DNs by value instead.
>
> Actually the DN values have already been compared in the code right
> above it:
>
>      # make sure the DN is a vault DN
>      if not dn.endswith(self.api.Object.vaultcontainer.base_dn):
>          raise ValueError('Invalid vault DN: %s' % dn)
>
> This code confirms that the incoming vault DN is within the vault
> subtree. After that, the DN length comparison above is just to make sure
> the incoming vault DN is not the root of the vault subtree itself. It
> doesn't need to compare the values again.

I see. You can combine both of the checks into one:

     if not dn.endswith(self.api.Object.vaultcontainer.base_dn, 1):
         raise ValueError(...)

>...
>
>> 14) Use File instead of Str for input files:
>>
>> +        Str('in?',
>> +            cli_name='in',
>> +            doc=_('File containing data to archive'),
>> +        ),
>
> The File type doesn't work with binary files because it tries to decode
> the content.

OK. I know File is broken and plan to fix it in the future. Just add a 
comment saying that it should be a File, but it's broken, OK?

> ...
>
>> 16) You do way too much stuff in vault_add.forward(). Only code that
>> must be done on the client needs to be there, i.e. handling of the
>> "data", "text" and "in" options.
>>
>> The vault_archive call must be in vault_add.execute(), otherwise a) we
>> will be making 2 RPC calls from the client and b) it won't be called at
>> all when api.env.in_server is True.
>
> This is done by design. The vault_add.forward() generates the salt and
> the keys. The vault_archive.forward() will encrypt the data. These
> operations have to be done on the client side to secure the transport of
> the data from the client through the server and finally to KRA. This
> mechanism prevents the server from looking at the unencrypted data.

OK, but that does not justify that it's broken in server-side API. It 
can and should be done so that it works the same way on both client and 
server.

I think the best solution would be to split the command into two 
commands, server-side vault_archive_raw to archive already encrypted 
data, and client-side vault_archive to encrypt data and archive them 
with vault_archive_raw in its .execute(). Same thing for vault_retrieve.

BTW, I also think it would be better if there were 2 separate sets of 
commands for binary and textual data 
(vault_{archive,retrieve}_{data,text}) rather than trying to handle 
everything in vault_{archive,retrieve}.

>
> The add & archive combination was added for convenience, not for
> optimization. This way you would be able to archive data into a new
> vault using a single command. Without this, you'd have to execute two
> separate commands: add & archive, which will result in 2 RPC calls anyway.

I think I would prefer if it was separate, as that would be consistent 
with other plugins (e.g. for objects with members, we don't allow adding 
members directly in -add, you have to use -add-member after -add).

>
>> 17) Why are vaultcontainer objects automatically created in vault_add?
>>
>> If you have to automatically create them, you also have to automatically
>> delete them when the command fails. But that's a hassle, so I would just
>> not create them automatically.
>
> The vaultcontainer is created automatically to provide a private
> container (i.e. /users/<username>/) for the each user if they need it.
> Without this, the admin will have to create the container manually first
> before a user can create a vault, which would be an unreasonable
> requirement. If the vault_add fails, it's ok to leave the private
> container intact because it can be used again if the user tries to
> create a vault again later and it will not affect other users. If the
> user is deleted, the private container will be deleted too.
>
> The code was fixed to create the container only if they are adding a
> vault/vault container into the user's private container. If they are
> adding into other container, the container must already exist.

This sounds like a job fit for the managed entries plugin. Have you 
tried using it for this?

>
>> 18) Why are vaultcontainer objects automatically created in vault_find?
>>
>> This is just plain wrong and has to be removed, now.
>
> The code was supposed to create the user's private container like in
> #17, but the behavior has been changed. If the container being searched
> is the user's private container, it will ignore the container not found
> error and return zero results as if the private container already
> exists. For other containers the container must already exist. For this
> to work I had to add a handle_not_found() into LDAPSearch so the plugins
> can customize the proper search response for the missing private container.

No ad-hoc refactoring please. If you want to refactor anything, it 
should be first designed properly and put in a separate patch.

Anyway, what should actually happen here is that if parent object is not 
found, its object plugin's handle_not_found is called, i.e. something 
like this:

     parent = self.obj.parent_object
     if parent:
         self.api.Object[parent].handle_not_found(*args[:-1])
     else:
         raise errors.NotFound(
             reason=self.obj.container_not_found_msg % {
                 'container': self.obj.container_dn,
             }
         )

> ...
>
>> 21) vault_archive is not a retrieve operation, it should be based on
>> LDAPUpdate instead of LDAPRetrieve. Or Command actually, since it does
>> not do anything with LDAP. The same applies to vault_retrieve.
>
> The vault_archive does not actually modify the LDAP entry because it
> stores the data in KRA. It is actually an LDAPRetrieve operation because
> it needs to get the vault info before it can perform the archival
> operation. Same thing with vault_retrieve.

It is not a LDAPRetrieve operation, because it has different semantics. 
Please use Command as base class and either use ldap2 for direct LDAP or 
call vault_show instead of hacking around LDAPRetrieve.

>
>> 22) vault_archive will break with binary data that is not UTF-8 encoded
>> text.
>>
>> This is where it occurs:
>>
>> +        vault_data[u'data'] = unicode(data)
>>
>> Generally, don't use unicode() on str values and str() on unicode values
>> directly, always use .decode() and .encode().
>
> It needs to be a Unicode because json.dumps() doesn't work with binary
> data. Fixed by adding base-64 encoding.

If something str needs to be unicode, you should use .decode() to 
explicitly specify the encoding, instead of relying on unicode() to pick 
the correct one.

Anyway, I think a better solution than base64 would be to use the 
"raw_unicode_escape" encoding:

     if data:
         data = data.decode('raw_unicode_escape')
     elif text:
         data = text
     elif input_file:
         with open(input_file, 'rb') as f:
             data = f.read()
         data = data.decode('raw_unicode_escape')
     else:
         data = u''

     ...

     vault_data[u'data'] = data

> ...
>
>> 26) Instead of the delete_entry refactoring in baseldap and
>> vaultcontainer_add, you can put this in vaultcontainer_add's
>> pre_callback:
>>
>>      try:
>>          ldap.get_entries(dn, scope=ldap.SCOPE_ONELEVEL, attrs_list=[])
>>      except errors.NotFound:
>>          pass
>>      else:
>>          if not options.get('force', False):
>>              raise errors.NotAllowedOnNonLeaf()
>
> I suppose you meant vaultcontainer_del. Fixed, but this will generate an
> additional search for each delete.
>
> I'm leaving the changes baseldap because it may be useful later and it
> doesn't change the behavior of the current code.

Again, no ad-hoc refactoring please.

> ...
>
>> 28) The vault and vaultcontainer plugins seem to be pretty similar, I
>> think it would make sense to put common stuff in a base class and
>> inherit vault and vaultcontainer from that.
>
> I plan to refactor the common code later. Right now the focus is to get
> the functionality working correctly first.

Please do it now, "later" usually means "never". It shouldn't be too 
hard and I can give you a hand with it if you want.

-- 
Jan Cholasta