[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: archive-installer backend: review applied

On Fri, Jul 15, 2011 at 09:53:42AM -0400, David Cantrell wrote:
> I like this patch, but have some comments as well.  See below.

And again thanks!

> Lose the 'All rights reserved.' here.  If it's in our file, that's wrong.  
> I've received talkings-to about this.  Bottom line is we need to not 
> include the 'All rights reserved.' statement.

It was copied from somewhere -- don't have it in front of me, but
probably the yum backend.  Was just following precedent; happy to
remove it in both cases.  Wouldn't have removed it from the Red Hat
line without Red Hat permission, though!

> Style comment:  Blank line after a function definition and before the start 
> of a new one.


> How about using the 'magic' module (yum install python-magic) here instead 
> of keying on specific file name endings?  The magic module can give you the 
> description that file(1) would print out or the mime type.  I think the 
> mime type would be useful here and would prevent us from having users file 
> bugs when they have an image file named *.TAR.GZ and this code doesn't 
> work, for example.
> Example for magic module:
> import magic
> m = magic.open(magic.MAGIC_MIME)
> m.load()
> type = m.file("/path/to/image/file")
> 'type' will be something like:
> 'application/x-gzip; charset=binary'
> 'application/x-bzip2; charset=binary'
> 'application/x-xz; charset=binary'
> 'application/x-tar; charset=binary'
> 'application/octet-stream; charset=binary'   <- cpio gives me this, meh?
> This will require separating the type checks for the compressed image and 
> the datastream, but I think this approach will be more flexible down the 
> road.

Well, one of the answers is the octet-stream for cpio. Another is
that I thought that the magic module ate data, and so you have to
reopen URL streams and rewind rewindable streams.

I'd think that this would be OK for fallback when the name isn't
recognized.  But I could increase the recognition rate even without
that by comparing against .lower(); the typical way of dealing with
case-insensitive filesystems.

If you do "xz < foo.tar > foo.tar.gz" and you then get an error
because gunzip can't read foo.tar.gz, I don't think that's really
worth working around in an installer backend.  :)

> For extraction, is there any reason we can't use Python's 'tarfile' module?

I don't know if tarfile handles selinux, for one thing.  For another,
we still want it in another process at the other end of a pipe, for
performance reasons.  Last I checked, tar was higher performance than
tarfile for cases where functionality wasn't in question.  GNU tar
supports more different tar formats than tarfile, as I recall,
even if you ignore selinux.

> Likewise, for cpio I see there is the 'cpioarchive' module (yum install 
> python-cpio), but it does not seem as functional as the cpio program.

No indeed.  cpio handles a great many formats that aren't handled by

Overall, tar and cpio will be the first things updated with fixes and
improvements for tar and cpio, and the python modules generally won't
be even a distant second; I expect they'll be an invisibly-distant
fourth or fifth.

Basically, we'll get by far the best performance with pipes and
separate processes, and using relatively small specialist programs
in a pipeline is The Unix Way.  Additionally, you'd never know when
you're missing a feature someone depends on; that's a real potential
source of critical bugs.  This isn't a case of "the module ... really
just doing the same".  It would be a significant amount of work that
fixes no bugs and potentially introduces new bugs, that makes the
module larger and harder to read and maintain.

I say all this as a happy user of the tarfile module in other contexts.

> Is /mnt/source what we use for the install source?  I thought it was 
> /mnt/install/source.  I could be wrong, but for consistency across install 
> methods, we should probably make sure all these sync up.
> Unrelated to this patch, but just a general improvement that could be made: 
>  a patch to centralize the /mnt paths we use throughout anaconda.  Define 
> them as constants somewhere.

That would be cool; I think I copied that from the yum backend; wherever
I got it, it worked when I dumped an archive file onto a CD and tested
that case.

> Aside from my comments above, the rest of the patch I like.  Thanks for the 
> patch!

Continued thanks for the review.

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]