Jesse Keating wrote:
On Sat, 2008-03-29 at 22:59 +0900, John Summerfield wrote:
Is it difficult to release jigdos?

The problem is this. Jigdo would require the exploded bits be available by http. Trying to sync all that content around the mirrors in any sort of reasonable time frame is not going to happen. Hanging the entire tree off of a single http point is not going to happen either; that point would quickly drown under the connections. We can't just rely on rawhide, as tomorrow the rawhide content will be different and you wouldn't be able to complete your jigdo. If we're to have any sort of fast-to-the-public snapshotting, we have to use a delivery mechanism that is capable of spreading the bandwidth load throughout the users without bringing down the host. Right now, that only leaves us with bittorrent as an option. Now, if there were some way to combine bittorrent and jigdo, and if jigdo had better failover methods when mirrors are hit without the content, we could potentially do something better. Jigdo + rawhide + whatever you already have for the majority of the content, bittorrent to suck down the very last little bit, or something along those lines.
A suggestion I made a while back re: breaking network installs would help here. Rawhide currently changes monotonically (probably by directory renaming?), and when the primary mirror finishes its repo compose it ends up removing all the prior repo bits. Now, network installs request files by name, and they already have the list of file names to get from anaconda doing the metadata fetch and dep solving. If a network install is running during the repo change, it breaks and cannot complete, even if it was fetching the very last package when the repo changed. Each mirror rsyncs at random times, but whenever they do, that mirror ends up with the same behavior (almost monotonic change from one repodata package set to the next).
Keeping the prior day's (or two days') packages in place, but not including them in the current repodata, would fix this. It would allow installs already started to fetch the packages they expect to find. The mirrors already have that data, so it means a very light increase in disk space: files that would normally be deleted during the repo changeover would be kept alongside all the new files coming in, so on average a small increase exists.
Anyone syncing that repo tree would already have the files that take up the excess space, so it shouldn't impact bandwidth much, because still only the new packages need to be fetched.
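To make the retention idea concrete, here is a rough sketch of how a mirror-side cleanup step might decide what to keep. Everything in it is an assumption for illustration: the function name, the `dropped_at` bookkeeping, and the two-day grace period are hypothetical and not part of any existing compose or mirror tooling.

```python
from datetime import datetime, timedelta


def plan_retention(files_on_disk, current_repodata, dropped_at, now,
                   grace=timedelta(days=2)):
    """Split files on disk into (keep, delete) sets.

    files_on_disk    -- set of package file names present in the tree
    current_repodata -- set of file names listed in today's repodata
    dropped_at       -- dict: file name -> time it was first seen missing
                        from repodata (updated in place; hypothetical state)
    now              -- current time
    grace            -- how long to keep packages no longer in repodata
    """
    keep, delete = set(), set()
    for name in files_on_disk:
        if name in current_repodata:
            # Still published: keep it and clear any stale drop record.
            dropped_at.pop(name, None)
            keep.add(name)
            continue
        # Not in the current repodata: remember when that first happened.
        first_missing = dropped_at.setdefault(name, now)
        if now - first_missing >= grace:
            delete.add(name)
        else:
            keep.add(name)
    # Forget records for files that are no longer on disk at all.
    for name in list(dropped_at):
        if name not in files_on_disk:
            del dropped_at[name]
    return keep, delete
```

An install that started against yesterday's repodata would still find its packages by name, since they only leave the tree once the grace period has passed.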
For the jigdo issue, that would also leave the packages that need to be fetched in place for a day or two. If people reacted quickly, it would be possible to build the images. If they were lazy, they would need to build off the next jigdo snapshot, whenever that happened.
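Whether a given snapshot can still be assembled comes down to a set difference between the files the snapshot needs and the files a mirror still holds. A minimal sketch, assuming both are available as plain lists of file names (the function name is hypothetical, and this does not read the real jigdo file format):

```python
def missing_parts(required, available):
    """Return the sorted file names a snapshot needs that the mirror lacks."""
    return sorted(set(required) - set(available))


def can_complete(required, available):
    """True if every file the snapshot needs is still on the mirror."""
    return not missing_parts(required, available)
```

With the old packages kept for a day or two, `can_complete` would stay true for that window even after the repodata has moved on.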
-- Andrew Farris <lordmorgul gmail com> www.lordmorgul.net
gpg 0xC99B1DF3 fingerprint CDEC 6FAD BA27 40DF 707E A2E0 F0F6 E622 C99B 1DF3
No one now has, and no one will ever again get, the big picture. - Daniel Geer