
Re: [Pulp-list] Importer Sync APIs

On 11/22/2011 09:18 AM, Jay Dobies wrote:
A few more other random thoughts now that I've had some coffee.

To be consistent, flexible, and efficient, I suggest an API based
around a "ContentUnitData" class with the following attributes:
- type_id
- unit_id (may be None when defining a new unit to be added to Pulp)
- key_data
- other_data
- storage_path (may be None if no bits are stored for the content type -
perhaps whether or not bits are stored should be part of the content
type definition?)
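
For illustration only, the attribute list above could be sketched as a small Python class. The field types and defaults here are assumptions (the thread does not specify them), and this is written for a current Python 3 rather than whatever Pulp ran on at the time:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ContentUnitData:
    """Sketch of the proposed container; types and defaults are assumed."""
    type_id: str
    key_data: Dict[str, Any]
    other_data: Dict[str, Any] = field(default_factory=dict)
    # None when defining a new unit that has not yet been added to Pulp.
    unit_id: Optional[str] = None
    # None if no bits are stored for this content type.
    storage_path: Optional[str] = None


# A new, unsaved unit: no ID and no storage path yet.
rpm = ContentUnitData(type_id="rpm", key_data={"name": "foo", "version": "1.0"})
```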

I actually already wrote this about two weeks ago in the refinements of the importer APIs.
The original intention was for the importer's "add" functionality which is meant to field
user uploaded content units (not gonna go any deeper into that now). That made me happy to
find this morning :)

The content management API itself could then look like:

- get_units() -> two level mapping {type_id: {unit_id: ContentUnitData}}
Replacement for get_unit_keys_for_repo()
Note that if you're concerned about exposing 'unit_id', the existing
APIs already exposed it as the return value from
I think you're right to avoid exposing a "single lookup" API, at least
initially - that's a performance problem waiting to happen.

I'm unsure of how people would best want the return type. I can see an argument for
wanting to organize by unit keys too.

So to that end, I'm returning a custom object (currently named UnitBag, but I may find
something better) that offers a bunch of transformations (queries is probably a better
term) on the set of units. So "give me them by unit key" or "give me just the unit keys"
or "give me them mapped by ID". Stuff like that.
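
A rough sketch of what such a query object might look like, purely as an illustration. Only the name UnitBag comes from the message above; the method names, the `Unit` tuple, and the way a unit key is made hashable are all hypothetical:

```python
from collections import namedtuple

# Hypothetical stand-in for a unit; only the fields the queries need.
Unit = namedtuple("Unit", ["type_id", "unit_id", "key_data"])


def _hashable_key(unit):
    # Assumption: key_data is a flat dict, so sorted items form a stable key.
    return tuple(sorted(unit.key_data.items()))


class UnitBag:
    """Wraps a set of units and offers different views of them."""

    def __init__(self, units):
        self._units = list(units)

    def by_id(self):
        # "give me them mapped by ID"
        return {u.unit_id: u for u in self._units}

    def by_type_and_id(self):
        # The two-level mapping {type_id: {unit_id: unit}}
        result = {}
        for u in self._units:
            result.setdefault(u.type_id, {})[u.unit_id] = u
        return result

    def by_unit_key(self):
        # "give me them by unit key"
        return {_hashable_key(u): u for u in self._units}

    def unit_keys(self):
        # "give me just the unit keys"
        return [u.key_data for u in self._units]


bag = UnitBag([
    Unit("rpm", "id-1", {"name": "foo"}),
    Unit("erratum", "id-2", {"id": "RHBA-0001"}),
])
```

One bag can then serve callers who want the two-level mapping and callers who want to organize by unit key, without committing the API to a single return shape.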

- new_unit(type_id, key_data, other_data, relative_path) -> ContentUnitData
Does *not* assign a unit ID (or touch the database at all)
Does fill in absolute path in storage_path based on relative_path
Replaces any use of "request_unit_filename"

- save_unit(ContentUnitData) -> ContentUnitData
Assigns a unit ID to the unit and stores the unit in the database
Associates the unit with the repo
Batching will be tricky due to error handling if the save fails
Replaces any use of 'add_or_update_content_unit' and
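
To make the split between new_unit and save_unit concrete, here is a minimal in-memory sketch. The `SketchConduit` class, the storage root, the path layout (root/type_id/relative_path), and the use of UUIDs for unit IDs are all assumptions for illustration, not Pulp's actual behavior:

```python
import posixpath
import uuid


class ContentUnitData:
    def __init__(self, type_id, key_data, other_data,
                 storage_path=None, unit_id=None):
        self.type_id = type_id
        self.key_data = key_data
        self.other_data = other_data
        self.storage_path = storage_path
        self.unit_id = unit_id


class SketchConduit:
    """Hypothetical stand-in for the object an importer would call into."""

    def __init__(self, repo_id, storage_root):
        self.repo_id = repo_id
        self.storage_root = storage_root
        self.database = {}       # unit_id -> ContentUnitData (fake database)
        self.repo_units = set()  # unit IDs associated with this repo

    def new_unit(self, type_id, key_data, other_data, relative_path):
        # No ID is assigned and the "database" is not touched here.
        path = None
        if relative_path is not None:
            path = posixpath.join(self.storage_root, type_id, relative_path)
        return ContentUnitData(type_id, key_data, other_data, storage_path=path)

    def save_unit(self, unit):
        # Assign an ID if needed, store the unit, associate it with the repo.
        if unit.unit_id is None:
            unit.unit_id = str(uuid.uuid4())
        self.database[unit.unit_id] = unit
        self.repo_units.add(unit.unit_id)
        return unit


conduit = SketchConduit("repo-1", "/srv/sketch/content")
unit = conduit.new_unit("rpm", {"name": "foo"}, {}, "foo/foo-1.0.rpm")
id_before_save = unit.unit_id
saved = conduit.save_unit(unit)
```

The importer gets the absolute path back from new_unit, writes the bits there, and only then calls save_unit, so nothing is recorded for a download that failed.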

Another thing I forgot to discuss in that blog is the idea of child units. So an errata is
itself a unit, but has references to RPMs which are their own units.

I'm thinking of sticking with the proposed model:

link_child(parent_unit, child_unit)

We may want to model this as a 'references' association instead of parent/child. Parent/child implies ownership or a hierarchical relationship which I don't think exists here.
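
As a sketch of the difference, a 'references' association can be kept as a plain many-to-many index in which no unit owns another. The `ReferenceIndex` class and its method names are hypothetical:

```python
from collections import defaultdict


class ReferenceIndex:
    """Many-to-many 'references' links between unit IDs; no ownership."""

    def __init__(self):
        self._refs = defaultdict(set)

    def add_reference(self, from_unit_id, to_unit_id):
        self._refs[from_unit_id].add(to_unit_id)

    def referenced_by(self, from_unit_id):
        # Units that the given unit points at.
        return set(self._refs.get(from_unit_id, ()))

    def referrers_of(self, to_unit_id):
        # Units that point at the given unit.
        return {src for src, targets in self._refs.items()
                if to_unit_id in targets}


index = ReferenceIndex()
# Two errata can reference the same RPM, which a parent/child tree cannot express.
index.add_reference("erratum-1", "rpm-foo")
index.add_reference("erratum-2", "rpm-foo")
index.add_reference("erratum-1", "rpm-bar")
```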

More details on that later, I'm still kinda fleshing it out.
