The Future of Fedora Package Management and the RPM Philosophy



It's great to see discussion about the future Fedora package SCM again.
However, I think that some people (myself included) are getting a little
bogged down in the details of how to manage branches in a source code
repository or the details of how particular SCMs work. Before we can
make intelligent decisions about SCMs and how we should use them,
there's one big question that we need to answer.

The question is: do we want to move away from RPM's philosophy of using
pristine sources plus patches to build binary packages? Answering this
question is a fundamental first step before we decide “what's next.”

Currently, all of the packages in Fedora consist of pristine sources,
patches to the pristine sources, and a spec file that contains
instructions that tell the builders how to take the pristine source,
apply the patches, and output a binary package. All of this package
building information is managed in a source code repository. The build
system pulls the information out of the package repository to build the
binary RPMs. This process has served the Fedora Project well so far. The
packages in Fedora, thanks to a lot of hard work by package maintainers
and reviewers – guided by the packaging guidelines – are already very
high quality and getting better all of the time.

What hasn't been done well so far is the management of the patches that
we apply to the sources. As far as the package repository is concerned,
patches appear out of thin air. This is because patches have not been
managed in a central and public manner by the Fedora project (other than
storing them as blobs in the package repository). Package maintainers
have had to devise their own systems for developing and managing
patches, and this has meant that the patches have been developed in an
ad-hoc and private manner.

This has made life difficult in a number of ways. It's harder for
package maintainers to manage patches, both on their own and in
collaboration with other package maintainers. It's harder to communicate
changes to upstream and downstream developers.

Managing patches in a central and public manner will have the following
benefits:

     1. Make the package maintainer's work easier by making it easier
        to forward-port patches as new upstream releases become
        available and to backport bug fixes and security patches that
        have been committed to the upstream SCM but are not yet part of
        a formal release. It will also be easier for package
        maintainers to collaborate on patches. Making the task of
        managing patches easier will indirectly improve the quality of
        the packages, because keeping packages current with upstream
        releases and fixes will take less effort.
        
     2. Make it easier to communicate changes to upstream developers (so
        that patches that fix bugs or add features can benefit the wider
        F/OSS community and the package maintainer doesn't need to
        maintain the patches indefinitely). Fairly or unfairly, there's
        the perception that many patches sit in our CVS and never get
        pushed upstream. Of course, some patches should never get pushed
        upstream since they represent Fedora-specific policy, but in
        most cases we'd all be better off if our patches were
        incorporated upstream.
        
     3. Make it easier for downstream developers (e.g. the RHEL and OLPC
        engineers or anyone else that repackages Fedora) to add their
        own customizations and communicate those changes back to Fedora
        or the upstream developers. Of particular concern here would be
        making it easier for downstream distributions to apply patches
        that implement policies specific to those distributions while
        keeping it easy for them to track changes in the Fedora patches.

There are (at least) two different approaches that we could take to
managing patches in a central, public manner. One method would keep the
traditional package repositories and add separate patch management
repositories. This method would preserve the pristine source plus
patches philosophy of RPM. Another more radical approach would be to
integrate the package and patch management repositories into one.

With separate package and patch management repositories, package
management remains largely the same. The SCM technology used to maintain
the package repository might change, but it would remain a collection of
pristine sources, patches, and spec files.

The patch management repository for a package would be different – it
would consist of a “vendor” branch that contained the unmodified
upstream code and it would contain a number of “patch” branches that
represent the patches to the upstream code that appear in the source
package repository. Ultimately, every patch that appears in the package
repository would have a branch in the patch management repository.
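
The one-branch-per-patch rule above lends itself to an automated check.
What follows is only a minimal sketch: the "patch/" branch prefix, the
convention that a branch is named after its patch file, and both
function names are assumptions made for illustration, not anything the
Fedora project has defined.

```python
import os


def branch_for_patch(patch_file, prefix="patch/"):
    """Map a patch file name to its (hypothetical) patch branch name."""
    stem, _ext = os.path.splitext(os.path.basename(patch_file))
    return prefix + stem


def patches_without_branches(patch_files, branches, prefix="patch/"):
    """Return the patch files that have no matching patch branch."""
    known = set(branches)
    return [p for p in patch_files
            if branch_for_patch(p, prefix) not in known]
```

Given the patch files from the package repository and the branch names
from the patch management repository, the second function lists any
patch that still "appears out of thin air."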

Since the development of patches would now happen in a central, public
manner, it would be easier to communicate changes upstream, downstream,
and within the Fedora community. Doing things in a central, public
manner also means that we can develop tools and procedures that will
make managing patches easier for the package maintainer.

If we want to be more radical, we could integrate the package and the
patch repositories. Package building would no longer use pristine
sources and patches to produce a binary package. Instead, the build
system would pull already-patched code out of the repository and build
the binary package from there.

The advantage to integrating the patch and package repositories would be
reducing the package maintainer's work. With separate package
repositories and patch repositories the package maintainer has to do
some work to export patches from the patch repository and import them
into the package repository (I believe that we could develop some tools
to minimize the amount of work it takes to export/import patches, but
there would always be some manual steps).
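
As a rough illustration of the kind of tool I have in mind, the sketch
below generates the spec file lines for a list of exported patch files.
The function name is hypothetical, and it assumes the classic
"PatchN:" / "%patchN -p1" spec syntax with every patch applied at the
same strip level. Exporting the patches from the SCM is left out because
that step depends on which SCM we choose.

```python
def spec_patch_lines(patch_files, strip_level=1):
    """Build spec file lines for an ordered list of patch files.

    Returns (declarations, applications): the "PatchN:" lines for the
    spec preamble and the "%patchN" lines for the %prep section.
    """
    declarations = []
    applications = []
    for number, name in enumerate(patch_files):
        declarations.append(f"Patch{number}: {name}")
        applications.append(f"%patch{number} -p{strip_level}")
    return declarations, applications
```

The maintainer would still have to review the generated lines and commit
them to the package repository, which is one of the manual steps that
would remain.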

The primary disadvantage to integrating patch and package management is
that we move away from RPM's philosophy of using pristine sources.
Pristine sources have been required because they make it easy to
verify that our copy of the sources matches the upstream copy, by
comparing MD5 checksums or checking GPG signatures. With an integrated
repository there are no longer pristine tarballs, so we can't verify
that our copy of the code matches the upstream copy by comparing
checksums or signatures on tarballs. Verifying code integrity is still
possible with an integrated repository, but it's potentially much more
difficult.
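
To show how little work the pristine-source check takes, here is a
minimal sketch of comparing a local tarball against a checksum published
by upstream. The function names are made up for illustration, and the
default of SHA-256 is my assumption; pass "md5" as the algorithm where
upstream only publishes MD5 sums. Checking a GPG signature needs an
external tool and is not shown.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as source:
        for chunk in iter(lambda: source.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_upstream(path, published_digest, algorithm="sha256"):
    """Compare a local file against the digest published by upstream."""
    expected = published_digest.strip().lower()
    return file_digest(path, algorithm) == expected
```

With an integrated repository there is no single tarball to feed to a
function like this, which is exactly the difficulty described above.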

Another disadvantage of integrated management is that, unless package
maintainers are disciplined about it, it can become difficult to
separate changes to the code that were done to implement Fedora-specific
policies (changing defaults in configuration files, moving files around
to suit the FHS, etc.) from changes that were done to fix bugs or
security problems (and thus might be of interest to upstream
developers). Guidelines on how to manage vendor, patch, and policy
branches will help maintain that discipline (but vigilance will be
necessary).

So would moving to an integrated package and patch repository be worth
it? It's hard to say. Some of the predecessors of RPM used modified
sources to build packages – one of the complaints about those early
package management systems was that it was hard to keep track of local
changes to the code. However, with the advantage of modern source code
management systems, keeping track of changes to the code shouldn't be an
issue.

In either case, we need to get the development and maintenance of
patches “out in the open,” and that means having patches developed in
central, public repositories.
