up2date and yum: failover mode?

Wed Oct 22 20:11:24 UTC 2003

On Wed 22/10/2003 at 21:13, seth vidal wrote:
> > 
> > The repositories are the same - they are just potentially out of sync.
> 
> END.
> 
> If they are not in sync then they are not the same.
> If they are the same, then they don't need to both be listed as separate
> repositories.

On a volatile repository like Rawhide, mirrors are always out of sync.
The best ones are maybe 90% synchronized, but they always seem to lack
the last ten or so updated packages.

This doesn't mean that one cannot use them, just that upstream must
*always* be checked too, to complete updates with those ten or so
packages.
> 
> > The aim is to maximize access to local/close/deep mirrors without losing
> > the ability to get the latest parts from the master site (accessing the
> > master site is costly for the upstream source and potentially for the
> > user if it means leaving an intranet through a congested shared pipe).
> 
> You're mixing the concepts of repository and mirror in ways that they
> shouldn't be mixed.

I may be mixing yum concepts. I was just talking about plain layman
site/mirror problems (and studiously avoided even writing "repository"
here ;)

> a repository is a set of packages - distinct from any other repository
> in that it provides different items - they don't have to be discretely
> different but at least not claiming to provide the identical things.
> 
> a mirror is an IDENTICAL COPY - hence the term mirror.

There is no such thing in reality. The deeper the mirror and the more
volatile the upstream source, the less true it is.

What you can have at best is a set of sources with different epochs of
the index/packages set.

Traditional mirroring just "assumes" mirror=upstream source.
Repository mirroring can be smarter, since the data is fully indexed and
so it is possible to check precisely the level of out-of-sync-eness
(freshness) without downloading the full upstream set.

Therefore it is possible to do a big part of the download from mirrors,
download from the source only the delta of packages that weren't
mirrored yet, and get exactly the same results as if the full download
was done from up-to-date upstream.
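
To make the freshness check concrete, here is a rough Python sketch. It
assumes each source publishes an index that can be read as a plain
name -> version mapping; that layout, the function name and the sample
data are all made up for illustration, and nothing here is how yum or
up2date actually does it. Versions are compared for equality only, so
no RPM version ordering is needed.

```python
def freshness(upstream_index, mirror_index):
    """Compare a mirror index against the upstream index.

    Both arguments are hypothetical dicts mapping package name -> version.
    Returns (ratio, stale): the fraction of upstream packages the mirror
    carries at the upstream version, and the set of package names that
    would still have to come from upstream.
    """
    stale = {name for name, version in upstream_index.items()
             if mirror_index.get(name) != version}
    if not upstream_index:
        return 1.0, stale
    return 1 - len(stale) / len(upstream_index), stale


# Made-up sample data: the mirror lags behind on one package out of four.
upstream = {"foo": "1.2-3", "bar": "2.0-1", "baz": "0.9-7", "qux": "4.1-2"}
mirror = {"foo": "1.2-2", "bar": "2.0-1", "baz": "0.9-7", "qux": "4.1-2"}

ratio, delta = freshness(upstream, mirror)
print(ratio, delta)  # 0.75 {'foo'}
```

Only the two indexes are fetched to get this answer, not the packages
themselves.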

> in yum it works like this:
> [repo]
> baseurl=mirror
>         mirror
>         mirror
>         mirror
> 
> [repo2]
> baseurl=mirror2
>         mirror2
>         mirror2
>         mirror2
> 
> 
> > All the download manager should need is a list of the X upstream level 1
> > sources, a list of all known mirror sources (sometimes a few 100 sites)
> > and be able to choose by itself 2-3 mirrors it'll poll and one upstream
> > source to check for completeness.
> 
> and if they're not in sync it will never know what it's getting - the
> suggestion that you appear to be saying is allow repo2 to provide
> something if repo fails.

No. My suggestion is to recognize that all mirrors are always more or
less out of sync, so you get as much stuff as you can from mirrors, and
you complete as needed from upstream.

It's not an all-or-nothing proposition. If only package foo was updated
since upstream was last mirrored it'd be stupid to download the gazillion
other updates from upstream too. What you need is to use the mirrors to
update everything except the foo package, and download foo from upstream
(and skip the intermediate versions of foo that may be present on the
mirrors).
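
As a sketch of that per-package choice (again assuming name -> version
indexes; the function, the mirror name and the data are hypothetical,
and "differs from upstream" stands in for "needs an update" since no
version ordering is implemented):

```python
def plan_downloads(installed, upstream_index, mirror_indexes):
    """Decide where each pending update should be fetched from.

    installed / upstream_index: dicts of package name -> version.
    mirror_indexes: dict of mirror name -> index, in preference order.
    Returns a dict of package name -> (version, source).
    """
    plan = {}
    for name, have in installed.items():
        wanted = upstream_index.get(name)
        if wanted is None or wanted == have:
            continue  # unknown upstream, or already at the upstream version
        source = "upstream"  # fallback when no mirror has caught up yet
        for mirror, index in mirror_indexes.items():
            if index.get(name) == wanted:
                source = mirror
                break
        plan[name] = (wanted, source)
    return plan


# Made-up data: the mirror only has an intermediate version of foo.
installed = {"foo": "1.0-1", "bar": "2.0-1", "baz": "0.9-6"}
upstream = {"foo": "1.2-3", "bar": "2.0-1", "baz": "0.9-7"}
mirrors = {"mirror-a": {"foo": "1.1-1", "bar": "2.0-1", "baz": "0.9-7"}}

plan = plan_downloads(installed, upstream, mirrors)
print(plan)
```

Here baz comes from the mirror, foo comes from upstream, and the
mirror's intermediate foo 1.1-1 is never downloaded.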

This also means you must explicitly tell your package manager what the
level 1 source(s) is (are), since while you needn't check all available
mirrors you *must* check upstream, if only to verify you're not using
mirrors that stopped being updated a year ago (i.e. if you always end up
not downloading anything from a mirror because it lags too far behind,
stop polling it systematically even if its response time/bandwidth is
excellent. This can happen if you're accustomed to daily syncs and this
particular mirror is updated every week).
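
One possible way to implement the "stop polling it" rule, as a sketch
only: count the consecutive runs in which a mirror contributed nothing
and drop it past a threshold. The class name and the threshold of five
runs are arbitrary assumptions of mine.

```python
class MirrorStats:
    """Track how often each mirror actually served a package."""

    def __init__(self, max_idle_runs=5):
        self.max_idle_runs = max_idle_runs  # arbitrary illustrative default
        self.idle_runs = {}  # mirror name -> consecutive useless runs

    def record_run(self, mirror, packages_served):
        """Call once per update run with the number of packages fetched."""
        if packages_served > 0:
            self.idle_runs[mirror] = 0
        else:
            self.idle_runs[mirror] = self.idle_runs.get(mirror, 0) + 1

    def should_poll(self, mirror):
        """False once the mirror has been useless for too many runs."""
        return self.idle_runs.get(mirror, 0) < self.max_idle_runs


stats = MirrorStats(max_idle_runs=3)
for _ in range(3):
    stats.record_run("fast-but-weekly", 0)  # always lags behind
    stats.record_run("slow-but-daily", 4)   # actually serves packages

print(stats.should_poll("fast-but-weekly"))  # False
print(stats.should_poll("slow-but-daily"))   # True
```

Response time and bandwidth are deliberately ignored here: a fast
mirror that never has what you need is still useless.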

And then there is the problem of partial mirrors, i.e. mirrors that only
carry some channels (example: updates.redhat.com only carries the
update part of ftp.redhat.com) or do not mirror rarely used ones
(sources, debug stuff, exotic arches...). The solution for the package
manager is probably to treat each channel as a separate repository,
though the mirror list the upstream project provides won't specify
which mirrors are partial (and this may change over time), so to
find X mirrors of a particular channel you may have to query Y>X
mirrors. But you can get around this by "remembering" which mirror
carried which channel, re-checking for better mirrors every month or so.
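
The "remembering" part could look roughly like this. It is a sketch
under assumptions: probe() is a hypothetical callable that asks a mirror
whether it carries a channel, and the 30-day expiry is just my reading
of "every month or so".

```python
import time

RECHECK_SECONDS = 30 * 24 * 3600  # "every month or so" (assumed value)


class ChannelCache:
    """Remember which mirror carries which channel, with expiry."""

    def __init__(self, probe, now=time.time):
        self.probe = probe  # hypothetical callable(mirror, channel) -> bool
        self.now = now      # injectable clock, handy for testing
        self.seen = {}      # (mirror, channel) -> (carries, checked_at)

    def carries(self, mirror, channel):
        key = (mirror, channel)
        entry = self.seen.get(key)
        if entry is None or self.now() - entry[1] > RECHECK_SECONDS:
            # Unknown or stale answer: ask the mirror again.
            entry = (self.probe(mirror, channel), self.now())
            self.seen[key] = entry
        return entry[0]

    def find_mirrors(self, channel, candidates, wanted):
        """Query candidates in order until `wanted` carriers are found."""
        found = []
        for mirror in candidates:
            if self.carries(mirror, channel):
                found.append(mirror)
                if len(found) == wanted:
                    break
        return found
```

With this, the first search for a channel may have to probe Y>X mirrors,
but later searches are answered from memory until the entries expire.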

Cheers,

-- 
Nicolas Mailhot