
Re: Heads-up: brand new RPM version about to hit rawhide

Dan Williams wrote:
On Tue, 2008-07-15 at 08:15 -0800, Jeff Spaleta wrote:
On Tue, Jul 15, 2008 at 8:01 AM, Dan Williams <dcbw redhat com> wrote:
Yeah, there is actually a benefit to the tarball+patches approach we take
right now, and that benefit is that it's extremely easy to see just what
we've done to the upstream package, and it's usually really easy to
extract those changes and push them upstream.  You don't want a
mega-diff that includes 20 specific patches.
I know of at least one example currently in our cvs where we went from
a set of separate small patch files to one encompassing patch file.  I
think it was a diff from git. If we move to a more advanced VCS, are we
going to have a harder time keeping patches separated? Or is it just a
matter of education on how not to reach for the easy-to-produce
mega-patch shortcut?

That's the problem here:  if we move to git (or any DVCS really), as a
packager you would have to be _much_ more aware of how to use the VCS to
achieve the same separation of patches and upstream source.  You'd need
to do something like topic branches for each patch and then merge each
topic branch into a "release" branch to ensure that each of the patches
was cleanly separated from the main source.
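
For concreteness, the topic-branch workflow described above might look roughly like the command sequence this Python sketch builds. It is only an illustration: the `topic/` branch prefix, the `release` branch name, the tag and the patch names are all made up, and only the git subcommands themselves are real.

```python
import shlex


def topic_branch_plan(upstream_tag, patches, release_branch="release"):
    """Return the git commands for one topic branch per patch.

    Every name here (tag, branch prefix, patch names) is hypothetical.
    """
    plan = []
    for name in patches:
        # Each patch lives on its own branch, started from pristine upstream.
        plan.append(["git", "checkout", "-b", f"topic/{name}", upstream_tag])
        plan.append(["git", "apply", "--index", f"{name}.patch"])
        plan.append(["git", "commit", "-m", f"Apply {name}"])
    # The release branch also starts from upstream, then merges every topic.
    plan.append(["git", "checkout", "-b", release_branch, upstream_tag])
    for name in patches:
        plan.append(["git", "merge", "--no-ff", f"topic/{name}"])
    return plan


if __name__ == "__main__":
    for cmd in topic_branch_plan("v1.0", ["fix-build", "add-selinux"]):
        print(shlex.join(cmd))
```

Because every topic branch starts from the same upstream tag, `git diff <tag> topic/<name>` would still give back each patch on its own.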

Well, I once spent 4 days learning all of git's source code (mind you, it
was a long time ago), and just recently I spent that long trying, and
failing, to get the exact sources of the kernel I was running when I found
a bug in it. Since the kernel is managed in git, I was quite appalled to
find out that Fedora doesn't have a repo anywhere with tags set so I could
just clone it, check out the right version and fire up a bisect run.

In the end, I settled for hacking the source rpm to run a "git commit -a"
between each patch file so I could at least bisect on the result of that,
and then exclude the Fedora patches from the list of possible culprits for
my particular problem.
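
The idea can be sketched outside the spec file as well. What follows is not the actual hack, just a minimal Python illustration that assumes an already unpacked tarball, an ordered list of patch files given as absolute paths, and the same `-p` strip level for every patch (real spec files can set that per patch). The function names are made up.

```python
import subprocess


def commit_per_patch_plan(patches, strip=1):
    """Build the commands that turn an unpacked tarball plus an ordered
    patch list into a git history with one commit per patch.

    Assumption: every patch applies with the same -p level.
    """
    plan = [
        ["git", "init", "-q"],
        ["git", "add", "-A"],
        ["git", "commit", "-q", "-m", "pristine upstream tarball"],
    ]
    for patch in patches:
        plan.append(["patch", f"-p{strip}", "-i", patch])
        # "git add -A" rather than relying on "git commit -a" alone,
        # so that files newly created by a patch are committed too.
        plan.append(["git", "add", "-A"])
        plan.append(["git", "commit", "-q", "-m", f"apply {patch}"])
    return plan


def run_plan(plan, source_dir):
    """Execute the plan inside the unpacked source tree."""
    for cmd in plan:
        subprocess.run(cmd, cwd=source_dir, check=True)
```

Bisecting the resulting history then points at either the pristine tarball or one specific patch.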

Mind you, with all the hackery I had to go through to get that working, I
can't say for sure that what I was looking at in the end was actually the
sources of my running kernel anyway so it could just as well have been
a complete and utter waste of time.

Anyways, different workflow or not, using a distributed version control
system provides three huge advantages over tarball + patches, namely:
* Endpoint-hacker access to the reason a particular patch is needed.
 Without this, it's extremely tedious to know what to test when altering
 code in the same area another patch touched, so regressions are much
 more likely.
* Easy access to the exact revision.
 I won't ever try to debug the Fedora kernel again. I'll just clone the
 vanilla kernel tree and find out which version fixes my particular issue or,
 if none of them does, start hacking on the upstream one instead.
 If some issue I'm seeing isn't in upstream, I'll just report it as being
 caused by one of the patches in Fedora. Hardly any work left for the poor
 Fedora kernel folks to do, what with the >100 patches you apply to the tarball.
* Bisection.
 If you've never used an SCM that has a bisect command, you won't know what I'm
 talking about and you won't know what you've missed. It's like telling your
 SCM "find the exact revision that introduced this bug", and it does it.
 Instead of looking at a source tree of 10k-5M LoC you get to see the single
 patch that introduced the bug you're looking for.
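
The search behind a bisect command is a plain binary search over history. Here is a minimal sketch, assuming a linear history in which the first revision is known good, the last is known bad, and the bug never goes away once introduced. `is_bad` is a hypothetical stand-in for whatever test you would hand to `git bisect run`; real git bisect also copes with non-linear history, which this does not.

```python
def first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() is true.

    Assumes revisions[0] is good, revisions[-1] is bad, and every
    revision after the first bad one is also bad.
    """
    good, bad = 0, len(revisions) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(revisions[mid]):
            bad = mid      # the culprit is at mid or earlier
        else:
            good = mid     # the culprit is after mid
    return revisions[bad]


if __name__ == "__main__":
    history = [f"r{n}" for n in range(1, 101)]   # 100 made-up revisions
    tested = []

    def is_bad(rev):
        tested.append(rev)
        return int(rev[1:]) >= 73                # pretend r73 broke things

    print(first_bad(history, is_bad), "found after", len(tested), "tests")
```

Each test halves the remaining range, so a hundred revisions need roughly seven builds rather than up to a hundred.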

Basically, moving to a DVCS and exploded source trees just raises the
bar for packagers since they'd have to learn quite a bit about how DVCS
works.  CVS + tarball + patches are quite easy for most people to learn,
but a DVCS + branches + merges is much more complicated if the
changesets are small.  And the changesets should always be small,
otherwise we're completely failing at pushing stuff upstream.

Why would you have packagers doing merges? They really shouldn't need to do
that. Only developers (and yes, package maintainers for really complex
projects, like the kernel) will need to know about branches and merges.
Packagers just need to know how to extract a tarball from a repository, so
that's a single command they need to know about.
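
That single command would presumably be `git archive`. Here is a minimal sketch of wrapping it, assuming upstream tags its releases. The package name, version and tag in the example are made up, and the function names are hypothetical.

```python
import subprocess


def archive_command(name, version, tag):
    """Build a git archive call that writes name-version.tar.gz with a
    name-version/ top-level directory inside it."""
    prefix = f"{name}-{version}/"
    return [
        "git", "archive", "--format=tar.gz",
        f"--prefix={prefix}",
        "-o", f"{name}-{version}.tar.gz",
        tag,
    ]


def export_tarball(repo_dir, name, version, tag):
    """Run the command inside a clone of the upstream repository."""
    subprocess.run(archive_command(name, version, tag), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Made-up package "foo", version 1.2, tagged upstream as v1.2.
    print(" ".join(archive_command("foo", "1.2", "v1.2")))
```

The `--format=tar.gz` shortcut needs a reasonably recent git. With older releases you would pipe the plain tar output through gzip instead.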

Maybe the fix here is to let package maintainers who want a DVCS style
use one, let those who don't keep the old style, and provide the
ability to switch between the two styles when a new maintainer takes
over the package?

I think that's what Doug has been after the entire time. Obviously, the
kernel is tons easier to manage if it's all in git, with patches committed
to it as changesets rather than separate files. I'd imagine the same goes
for every other project whose upstream is managed in git as well.

Andreas Ericsson                   andreas ericsson op5 se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
