InstantMirror needs a rethink

Ian Burrell ianburrell at gmail.com
Thu Jan 24 20:16:13 UTC 2008


On Jan 23, 2008 4:02 PM, Warren Togami <wtogami at redhat.com> wrote:
>
> - Synchronization/locking of multiple connections downloading the same
> file is awkward and broken.

I think a locking scheme on the files could solve this problem.  The
file at its normal path would always be a complete download.  The first
downloading process would create a temp file and lock it.  When it
finishes, it renames the temp file to the real file and releases the
lock.  Any other downloading processes see the locked temp file and
wait for it to be unlocked.  An unlocked temp file indicates a download
failure, in which case a waiter would have to start the download over.
Things would be more complicated if we want the waiters to stream the
partial file as it is downloaded.
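
A minimal sketch of the scheme above, assuming a Unix host where
flock() advisory locks are available.  The function name fetch_once,
the ".tmp" suffix and the download callable are hypothetical and are
not part of InstantMirror.  It does not attempt to stream the partial
file to waiters.

```python
import fcntl
import os


def fetch_once(final_path, download):
    """Return final_path once it holds a complete download.

    `download` is a hypothetical callable that writes the body to an
    open binary file object and raises on failure.
    """
    tmp_path = final_path + ".tmp"
    while True:
        if os.path.exists(final_path):
            return final_path  # the real file is always complete
        fd = os.open(tmp_path, os.O_RDWR | os.O_CREAT, 0o644)
        with os.fdopen(fd, "r+b") as tmp:
            # Waiters block here until the current downloader is done.
            fcntl.flock(tmp, fcntl.LOCK_EX)
            try:
                same = os.path.samestat(os.fstat(tmp.fileno()),
                                        os.stat(tmp_path))
            except FileNotFoundError:
                same = False
            if os.path.exists(final_path):
                # Someone finished while we waited.  If we created a
                # stray temp file after their rename, remove it.
                if same:
                    os.unlink(tmp_path)
                return final_path
            if not same:
                continue  # temp file was replaced or removed; retry
            # We hold the lock on an unfinished temp file: either we
            # are first, or an earlier attempt failed.  Start over.
            tmp.truncate(0)
            download(tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
            os.rename(tmp_path, final_path)
            return final_path
```

If download() raises, the temp file is closed and left behind
unlocked, which is exactly the "download failure" marker described
above; the next caller to get the lock starts the download over.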

> - There is no good way to clean up aborted tmp files.
> - There is no good way to know what are old files that need pruning.
> - There is no good way of keeping track of the "Big Picture" of its own
> cache, "least recently used" knowing what files were unpopular locally
> and should be pruned.
>

These could be solved with a cache cleaning script.  The script would
remove aborted (i.e. unlocked) tmp files.  The least-recently-used
files can be determined from the atime of the files.
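
A rough sketch of such a cleaning script, assuming temp files carry a
".tmp" suffix and are locked with flock() while a download is running.
The function names and the max_bytes size limit are hypothetical.

```python
import fcntl
import os


def remove_aborted_tmp(cache_dir, suffix=".tmp"):
    """Delete temp files that no downloader currently holds locked."""
    removed = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if not name.endswith(suffix):
                continue
            path = os.path.join(root, name)
            try:
                with open(path, "r+b") as f:
                    # Non-blocking: fails at once if a download is active.
                    fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
                    # Only unlink if the path still names the file we locked.
                    if os.path.samestat(os.fstat(f.fileno()), os.stat(path)):
                        os.unlink(path)
                        removed.append(path)
            except (BlockingIOError, FileNotFoundError):
                pass  # still downloading, or already renamed or removed
    return removed


def prune_lru(cache_dir, max_bytes, suffix=".tmp"):
    """Delete complete files, oldest atime first, until under max_bytes."""
    entries = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            if name.endswith(suffix):
                continue
            path = os.path.join(root, name)
            st = os.stat(path)
            entries.append((st.st_atime, st.st_size, path))
    total = sum(size for _atime, size, _path in entries)
    removed = []
    for _atime, size, path in sorted(entries):
        if total <= max_bytes:
            break
        os.unlink(path)
        total -= size
        removed.append(path)
    return removed
```

One caveat: atime is only as good as the filesystem keeps it.  Mount
options such as noatime or relatime can leave it stale or coarse, in
which case the ordering here is only approximate.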

An alternative is to store some metadata about the cached files in a
separate database.  Berkeley DB or SQLite would work as would
per-directory or per-file data files.  This would make the most sense
if the ETag and Last-Modified values need to be stored for the caching
to work correctly.  It could also store the last-accessed-time.
Locking on the entries would be required and that would provide the
locking for simultaneous downloads.
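
As an illustration of the database variant, here is a small sketch
using SQLite through Python's sqlite3 module.  The table layout, the
column names and all function names are assumptions made for this
example.  The primary key on the path acts as the per-entry lock: only
one process can insert the row for a given file.

```python
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS cache_entry (
    path          TEXT PRIMARY KEY,
    etag          TEXT,
    last_modified TEXT,
    last_access   REAL NOT NULL,
    state         TEXT NOT NULL
)
"""


def open_db(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def claim_download(conn, path, now=None):
    """Return True if this caller inserted the row and owns the download."""
    now = time.time() if now is None else now
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO cache_entry (path, last_access, state) "
                "VALUES (?, ?, 'downloading')", (path, now))
        return True
    except sqlite3.IntegrityError:
        return False  # another process already has an entry for path


def mark_complete(conn, path, etag, last_modified, now=None):
    """Store the validators from the upstream response."""
    now = time.time() if now is None else now
    with conn:
        conn.execute(
            "UPDATE cache_entry SET etag = ?, last_modified = ?, "
            "last_access = ?, state = 'complete' WHERE path = ?",
            (etag, last_modified, now, path))


def touch(conn, path, now=None):
    """Record an access, replacing the filesystem atime."""
    now = time.time() if now is None else now
    with conn:
        conn.execute("UPDATE cache_entry SET last_access = ? WHERE path = ?",
                     (now, path))


def least_recently_used(conn, limit):
    """Return up to `limit` complete paths, least recently used first."""
    rows = conn.execute(
        "SELECT path FROM cache_entry WHERE state = 'complete' "
        "ORDER BY last_access ASC LIMIT ?", (limit,))
    return [row[0] for row in rows]
```

Unlike a file lock, a 'downloading' row is not released automatically
when its process dies, so this sketch would still need some way to
expire stale claims.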

 - Ian




More information about the fedora-devel-list mailing list