[Pulp-list] Importer Sync APIs

Mon Nov 28 14:06:46 UTC 2011

On 11/22/2011 09:18 AM, Jay Dobies wrote:
> A few more other random thoughts now that I've had some coffee.
>
>> To be consistent, flexible and efficient, I suggest an API based
>> around a "ContentUnitData" class with the following attributes:
>> - type_id
>> - unit_id (may be None when defining a new unit to be added to Pulp)
>> - key_data
>> - other_data
>> - storage_path (may be None if no bits are stored for the content type -
>> perhaps whether or not bits are stored should be part of the content
>> type definition?)
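
For reference, the attribute list above could look roughly like this in Python. This is only a sketch of the proposal, not existing Pulp code: the field types, the defaults and the is_saved() helper are my assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ContentUnitData:
    """Sketch of the proposed container; types and defaults are assumed."""
    type_id: str
    key_data: Dict[str, Any]
    other_data: Dict[str, Any] = field(default_factory=dict)
    # None while the unit is only being defined and has not been added to Pulp.
    unit_id: Optional[str] = None
    # None when no bits are stored for this content type.
    storage_path: Optional[str] = None

    def is_saved(self) -> bool:
        # Hypothetical helper: a unit counts as saved once it has an ID.
        return self.unit_id is not None


# A new unit that has not been saved yet (example values only).
unit = ContentUnitData(type_id="rpm", key_data={"name": "bash", "version": "4.2"})
```
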
>
> I actually already wrote this about two weeks ago in the refinements of the importer APIs.
> The original intention was for the importer's "add" functionality which is meant to field
> user uploaded content units (not gonna go any deeper into that now). That made me happy to
> find this morning :)
>
>> The content management API itself could then look like:
>>
>> - get_units() -> two level mapping {type_id: {unit_id: ContentUnitData}}
>> Replacement for get_unit_keys_for_repo()
>> Note that if you're concerned about exposing 'unit_id', the existing
>> APIs already exposed it as the return value from
>> 'add_or_update_content_unit'.
>> I think you're right to avoid exposing a "single lookup" API, at least
>> initially - that's a performance problem waiting to happen.
>
> I'm unsure of how people would best want the return type. I can see an argument for
> wanting to organize by unit keys too.
>
> So to that end, I'm returning a custom object (currently named UnitBag, but I may find
> something better) that offers a bunch of transformations (queries is probably a better
> term) on the set of units. So "give me them by unit key" or "give me just the unit keys"
> or "give me them mapped by ID". Stuff like that.
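
To make the idea of a queryable result object concrete, here is one way it might be shaped. The method names and the Unit tuple are made up for illustration; I don't know what the real UnitBag looks like.

```python
from collections import namedtuple

# Minimal stand-in for a content unit; only the fields the queries need.
Unit = namedtuple("Unit", ["type_id", "unit_id", "key_data"])


class UnitBag:
    """Hypothetical wrapper offering several views of the same set of units."""

    def __init__(self, units):
        self._units = list(units)

    @staticmethod
    def _key(unit):
        # Assumes key_data values are hashable and mutually sortable.
        return tuple(sorted(unit.key_data.items()))

    def by_type_and_id(self):
        # "give me them mapped by ID": {type_id: {unit_id: unit}}
        result = {}
        for unit in self._units:
            result.setdefault(unit.type_id, {})[unit.unit_id] = unit
        return result

    def by_unit_key(self):
        # "give me them by unit key": {(type_id, key tuple): unit}
        return {(u.type_id, self._key(u)): u for u in self._units}

    def unit_keys(self):
        # "give me just the unit keys"
        return [dict(u.key_data) for u in self._units]


bag = UnitBag([
    Unit("rpm", "u1", {"name": "bash"}),
    Unit("erratum", "u2", {"id": "example-erratum"}),
])
```
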
>
>> - new_unit(type_id, key_data, other_data, relative_path) -> ContentUnitData
>> Does *not* assign a unit ID (or touch the database at all)
>> Does fill in absolute path in storage_path based on relative_path
>> Replaces any use of "request_unit_filename"
>>
>> - save_unit(ContentUnitData) -> ContentUnitData
>> Assigns a unit ID to the unit and stores the unit in the database
>> Associates the unit with the repo
>> Batching will be tricky due to error handling if the save fails
>> Replaces any use of 'add_or_update_content_unit' and
>> 'associate_content_unit'
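
The new_unit/save_unit split described above can be sketched with an in-memory stand-in for the database. Everything here (the InMemoryConduit name, the storage layout, using a UUID as the unit ID) is an assumption for illustration, not how Pulp does it.

```python
import os
import uuid
from dataclasses import dataclass, replace
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ContentUnitData:
    type_id: str
    key_data: Dict[str, Any]
    other_data: Dict[str, Any]
    unit_id: Optional[str] = None
    storage_path: Optional[str] = None


class InMemoryConduit:
    """Hypothetical per-repo API object; dicts stand in for the database."""

    def __init__(self, repo_id, storage_root):
        self.repo_id = repo_id
        self.storage_root = storage_root
        self._units = {}          # (type_id, key tuple) -> saved unit
        self._repo_units = set()  # unit IDs associated with this repo

    def new_unit(self, type_id, key_data, other_data, relative_path):
        # No unit ID and no database access; only the storage path is resolved.
        storage_path = None
        if relative_path is not None:
            storage_path = os.path.join(self.storage_root, type_id, relative_path)
        return ContentUnitData(type_id=type_id, key_data=dict(key_data),
                               other_data=dict(other_data),
                               storage_path=storage_path)

    def save_unit(self, unit):
        # Add-or-update by unit key, then associate the unit with the repo.
        # Assumes key_data values are hashable and sortable.
        lookup = (unit.type_id, tuple(sorted(unit.key_data.items())))
        existing = self._units.get(lookup)
        unit_id = existing.unit_id if existing else uuid.uuid4().hex
        saved = replace(unit, unit_id=unit_id)
        self._units[lookup] = saved
        self._repo_units.add(unit_id)
        return saved

    def get_units(self):
        # Two-level mapping {type_id: {unit_id: ContentUnitData}}
        result = {}
        for unit in self._units.values():
            if unit.unit_id in self._repo_units:
                result.setdefault(unit.type_id, {})[unit.unit_id] = unit
        return result
```

An importer would call new_unit(), write the bits to the returned storage_path, and only then call save_unit(), so a failed download never leaves a record behind.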
>
> Another thing I forgot to discuss in that blog is the idea of child units. So an errata is
> itself a unit, but has references to RPMs which are their own units.
>
> I'm thinking of sticking with the proposed model:
>
> link_child(parent_unit, child_unit)

We may want to model this as a 'references' association instead of parent/child.
Parent/child implies ownership or a hierarchical relationship, which I don't think exists here.
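
Something like the following is what I have in mind. It is just a sketch: ReferenceIndex and its method names are invented here, and the unit IDs are placeholders.

```python
from collections import defaultdict


class ReferenceIndex:
    """Hypothetical store of directed 'references' links between unit IDs."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # unit -> units it references
        self._incoming = defaultdict(set)  # unit -> units that reference it

    def add_reference(self, from_unit_id, to_unit_id):
        self._outgoing[from_unit_id].add(to_unit_id)
        self._incoming[to_unit_id].add(from_unit_id)

    def references_of(self, unit_id):
        # Return a copy so callers cannot mutate the index.
        return set(self._outgoing.get(unit_id, ()))

    def referenced_by(self, unit_id):
        return set(self._incoming.get(unit_id, ()))


index = ReferenceIndex()
# Two errata can point at the same RPM; neither one owns it.
index.add_reference("erratum-1", "rpm-1")
index.add_reference("erratum-2", "rpm-1")
index.add_reference("erratum-1", "rpm-2")
```

With this shape a unit can be referenced from any number of other units, and removing a referencing unit says nothing about what happens to the units it points to.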

>
> More details on that later, I'm still kinda fleshing it out.
>
>