
Re: Fedora extras metadata

On Fri, Mar 16, 2007 at 10:01:57PM +0100, Thomas M Steenholdt wrote:
> Michael E Brown wrote:
> >
> >Matt Domsch is working on just such a tool and is looking to have it in 
> >place for F7 release, afaik. The tool is Mirror Manager.
> >
> This looks like a very competent tool indeed and there's no doubt that 
> it will be very useful for a lot of cases. However, I have no idea how 
> the mirror validation in the package will work; I just hope it will be 
> implemented in a way that is usable without special tools. Having a 
> way to validate a mirror from within the FTP directory listing is very 
> valuable - especially to mirror scripts etc.
> It looks to me like the MirrorManager will notify the main site that the 
> sync has completed. This is useful, but probably not to other mirror 
> sites (or we may need specialized tools to perform the check). I could 
> easily be mistaken here, though!

First, the problem with the current mirroring is really twofold:

1) the storage array backing the master rsync servers is undergoing
   some serious stress.  This is causing it to be very slow for the
   master rsync servers that serve the data, thus the global mirror
   servers pulling from it are seeing very slow syncs.  Red Hat I/T is
   working on it.  (It isn't helping that the RHEL5 floodgates opened
   on Wednesday either - that just added stress to an already stressed
   set of people and colo networks).

2) the global mirrors aren't being notified when content has
   changed on the master, such that they should start a new rsync
   run.  mirrormanager takes a per-host email address, which the
   master sign-and-push scripts will eventually send an email to when
   the content changes.  As it stands, syncing every 6 hours when
   nothing has changed doesn't make any sense.
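The notification idea in (2) could look something like the sketch below. This is only an illustration of the approach, not the actual sign-and-push scripts; the sender address and the changed-directory path are invented placeholders.

```python
# Hedged sketch of "mail each mirror's contact address when content
# changes" -- the sender and paths are placeholders, not real config.
from email.message import EmailMessage

def build_notifications(contact_addresses, changed_dirs):
    """Build one notification message per per-host contact address.

    A real push hook would hand these to smtplib.SMTP().send_message();
    a mirror could then kick off an rsync run only when mail arrives,
    instead of blindly syncing every 6 hours.
    """
    body = "Master content has changed under:\n" + "\n".join(changed_dirs)
    msgs = []
    for addr in contact_addresses:
        msg = EmailMessage()
        msg["From"] = "sync-notify@example.org"  # placeholder sender
        msg["To"] = addr
        msg["Subject"] = "Fedora master content updated - time to sync"
        msg.set_content(body)
        msgs.append(msg)
    return msgs
```

The point of building one message per address is that mirrormanager stores the contact per host, so each host only hears about content it is supposed to carry.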

Now, mirrormanager has two methods by which it can know a given mirror
host is up-to-date.  First is a new report_mirrors script that uploads
directory data back to the database from the mirror itself.  Not
everyone will want to run that, and there's always the 'trust but
verify' model, so I've also got a (fast?) crawler that crawls each
host using HTTP HEADs and keepalives or FTP DIR calls looking for
content it should be carrying compared to the master list, and
tracking presence and up-to-date-ness on a per-directory level.  Those
that aren't up2date get dropped from the appropriate per-directory
lists (e.g. the repodata dirs) in real time.
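The "trust but verify" crawl described above can be sketched roughly as follows. This is an assumed, minimal illustration of HTTP HEAD checks over one keepalive connection, not MirrorManager's actual crawler; the host names, paths, and master metadata are all invented.

```python
# Hedged sketch of a per-path "HEAD crawler" -- compares what a mirror
# serves against the sizes recorded on the master.  Not the real code.
import http.client

def is_current(status, content_length, master_length):
    # A path counts as up to date when it exists (200) and its size
    # matches what the master recorded for that file.
    return status == 200 and content_length == master_length

def crawl(host, master_files, timeout=10):
    """Return the set of paths on `host` that match the master list.

    master_files maps path -> expected Content-Length as seen on the
    master rsync server.  One HTTP connection is reused (keepalive)
    for every HEAD request, so the crawl stays reasonably fast.
    """
    ok = set()
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        for path, size in master_files.items():
            conn.request("HEAD", path)
            resp = conn.getresponse()
            resp.read()  # drain the (empty) body so keepalive works
            length = resp.getheader("Content-Length")
            if length is not None and is_current(resp.status, int(length), size):
                ok.add(path)
    finally:
        conn.close()
    return ok
```

Paths that fail the check would then be dropped from the per-directory lists, as described above; the FTP DIR variant would do the same comparison against a directory listing instead of HEAD responses.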

That's the idea.  A lot of the code is implemented, there's more to
go.  If you're good with python, turbogears, and the like, I'm sure I
could put you to work on it.  Drop me a note.


Matt Domsch
Software Architect
Dell Linux Solutions linux.dell.com & www.dell.com/linux
Linux on Dell mailing lists @ http://lists.us.dell.com
