
Re: [Pulp-list] Package path enhancements in pulp

Hi Pradeep,

We recently ran into an issue where in some situations package paths in
pulp could collide. The relevant bug is here #798656. Due to this we
decided to change the package path location to include the whole package
checksum instead of the first three characters. Though the change sounds
simple, the migration path is involved. The following wiki page
describes the changes in detail

I'm curious: under what situations might that collision occur?

One nice thing about the old directory structure is that its %{name}/%{version}/%{release}/%{arch} pattern matches koji's. I once had a reason in mind for why sharing the same structure would be beneficial, but I've since forgotten it. :P

There's another minor concern about the directory structure in general, with or without the extra level. The use of symlinks to point from the repo packages directory into grinder's multi-level structure takes a lot of disk activity to do any sort of scan that does a stat() on each RPM. Following each symlink requires traversing 4 or 5 directories that are unlikely to be in the fs cache. For example, compare times of '/bin/ls' and '/bin/ls -l' in the repo packages directory.
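
To put rough numbers on this, here is a minimal Python sketch that times a names-only listing against a pass that calls stat() through every entry. The function name time_listing and the idea of pointing it at a repo packages directory are my own assumptions for illustration; nothing here is part of pulp or grinder. For a cold-cache figure, run it before anything else has touched the directory.

```python
import os
import time


def time_listing(packages_dir):
    """Return (names_only_seconds, stat_seconds, entry_count) for one directory.

    Hypothetical helper for illustration; not part of pulp or grinder.
    """
    # Names only: reads the directory entries without touching the targets.
    start = time.perf_counter()
    names = os.listdir(packages_dir)
    names_only = time.perf_counter() - start

    # stat() through each entry: os.stat follows symlinks, so every call
    # has to walk the directories leading to the link's target.
    start = time.perf_counter()
    for name in names:
        try:
            os.stat(os.path.join(packages_dir, name))
        except OSError:
            pass  # dangling symlink or entry removed mid-scan
    with_stat = time.perf_counter() - start

    return names_only, with_stat, len(names)
```

Calling time_listing on the packages directory of a large repo and comparing the first two values should show how much of the scan time goes to resolving the links rather than reading the directory itself.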

Daily grinder syncs of large repos, like Fedora, can take quite a long time even when there are few changes. I suspect this to be a contributing factor. Has there been any thought about making this more efficient, perhaps by creating hard links, or by updating a database with grinder's sync status?
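
To make the hard-link idea concrete, here is a rough Python sketch that swaps each symlink in a directory for a hard link to the same file. The function name hardlink_in_place and the temporary-file suffix are hypothetical, and this is not how pulp or grinder behave today; it only shows the mechanics. It assumes the repo directory and the package store sit on the same filesystem, since hard links cannot cross filesystems.

```python
import os


def hardlink_in_place(packages_dir):
    """Replace each symlink in packages_dir with a hard link to its target.

    Hypothetical helper for illustration; not part of pulp or grinder.
    Returns the number of links converted. Symlinks that are dangling,
    point at non-regular files, or cross filesystems are left untouched.
    """
    converted = 0
    dir_dev = os.stat(packages_dir).st_dev
    for name in os.listdir(packages_dir):
        link_path = os.path.join(packages_dir, name)
        if not os.path.islink(link_path):
            continue
        target = os.path.realpath(link_path)
        if not os.path.isfile(target):
            continue  # dangling link or not a regular file
        if os.stat(target).st_dev != dir_dev:
            continue  # hard links cannot span filesystems
        # Create the hard link under a temporary name, then rename it over
        # the symlink so the package name never disappears from the repo.
        tmp_path = link_path + ".hardlink-tmp"
        os.link(target, tmp_path)
        os.replace(tmp_path, link_path)
        converted += 1
    return converted
```

The trade-off is that a hard link no longer records where in the shared store the file came from, so anything that relies on reading the symlink target would need another way to find it.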

The list archives don't have anything on the thinking behind this structure. File de-duping and bandwidth savings are clear benefits, but I'd like to hear thoughts on whether others have this same concern, or more likely whether I'm just not doing something right. ;)

