fedora 7 schedule (was Re: Fedora 7 planing)

Wed Dec 13 03:48:54 UTC 2006

NB: Most of this email will probably seem obvious, since everyone here
is experienced and intelligent. I offer it on the off chance that it
isn't obvious, and in the hope that it will spur everyone to ask
questions and think critically about assumptions Fedora may have
carried over from RH that should be examined in light of how a
community project's goals differ from those of an enterprise operating
system.

On 12/12/06, Dave Jones <davej at redhat.com> wrote:
> On Tue, Dec 12, 2006 at 05:47:47PM -0500, Luis Villa wrote:
>
>  > Has Fedora, the community, actively post-mortemed past slips? If so,
>  > what were the most likely causes of past slips, and what steps are you
>  > taking to avoid them this time around?
>
> I recall at least 2-3 issues last time around.
>
> * Xen being horked and taking forever to get working.
>   Not particularly easy to fix given we're at times dependant upon
>   an unresponsive upstream.

So what would the plan be for a similar problem in the FC7 schedule?
Let's call this 'feature X'. Possible options would be:
(0) Prevent feature X from going into distro trunk until feature X was
actually ready-ish, such that there was never risk of delay from
feature X.
(1) Back feature X out completely, or at least to the FC6 state.
(2) Admit that feature X will not work in FC7.
(3) Delay FC7.

In my ideal development world, one aims for (0) as much as possible;
in GNOME, we'd typically do (1); it seems like in FC6 with Xen you
chose (3).

I would have suggested in the Xen case that you should have done (2),
since I presume Xen is too tightly tied to the kernel to allow for (0)
or (1). Why did you do (3) instead? Commitment to feature-based
releases over time-based releases? Some other reason?

Focusing on the future, what is Fedora's plan for FC7's 'feature X'
(one will almost inevitably occur)? (0)? (1)? (2)? (3)? None of the
above? [Note that (3) doesn't seem to be an option, given the hard
deadline, which suggests FC needs to assess 0-2 and set policies for
how to choose between 0, 1, and 2.]

I'd also note that (3) makes a ton of sense in an enterprise OS
context, where you've made hard commitments to customers about feature
lists, so that 0/1 (and particularly 2) are not options. This is the
kind of thing where Fedora can be (should be?) different from internal
RH engineering processes, I think. But that is a policy choice (to be
time-based, and not feature-based) that Fedora's leadership should
think about and make explicitly.

> * Trademark braindamage.
>   Not much we can do here either, other than pick sane upstream sources.

Agreed that 0, 1, and 2 don't seem to be real options in this
situation, so you probably had to do (3) here. (I assume this is
Firefox? Or something else?)

Since it is specifically external legal liability that prevents (2),
that suggests being as proactive as possible about all *legal* issues
in order to avoid delays. Understanding that perfect foresight is
impossible, who is in charge of assessing legal issues and being the
best humanly possible lookout for legal icebergs? What are they doing
right now to help meet this proposed schedule?

Perhaps similarly, I assume you might have other sorts of 'hard'
commitments which prevent doing (2) in other cases- marketing
commitments, perhaps, or... dunno? Is there a rubric for assessing
which things fall into (2) (OK to ship with them broken) and which
things fall into (3) (must delay)? Again, this is a Fedora leadership
policy choice that (it seems to me) should be made explicitly, and
made now, so that everything is more predictable and smooth at the end
of the cycle. (It is probably already done, but as the naive outsider
I don't know about it :), in which case the question is 'why didn't it
catch the trademark and Xen things'.)

> * Discovery of late breaking nasty bugs (Like the ext3-went-boom bug)
>   Whilst we'd all like it if this weren't to repeat itself ever again,
>   it can't be ruled out. Sometimes things just fall out of testing late
>   in the cycle, and right before release is when we really stress things
>   as much as possible.

Probably again a naive question, but why were changes to something as
critical as the file system being made late enough in the game to
delay the release? Aren't there freezes for such things? This sounds
like a good candidate for option (0): prevent it from getting into the
tree in the first place by testing it in a branch, or by freezing code
churn in critical subsystems further out from release.

Probably more usefully, if for some reason you can't have earlier
freezes for critical, complex subsystems, who is in charge of
encouraging early stress testing? Is there someone whose job it is to
evangelize widespread testing?

> Oh, and there was the 'not really test4, but sort of' release that
> was needed because test3, well.. stunk.   There were a number of really
> nasty bugs in that which meant it wouldn't even install for quite a
> few people. How that one got out the building alive is anyones guess.

'How that one got out of the building' sounds like a really critical
question to answer. Maybe *the* critical question to answer if you're
seriously planning on getting FC7 out on time. I think from your
comments about increased testing, you have a better answer than you
let on here, but it seems like it would be a good idea to make it
explicit and figure out what the policy is for the future. Probably
you already have, and I'm just making everyone slog through it again,
but just in case... :)

On 12/12/06, Jesse Keating <jkeating at redhat.com> wrote:
> Unfortunately we were hoping that more people were actually
> trying to install rawhide more often than they were, instead of just doing
> yum updates.

So that's one answer to 'how did that one get out of the building' :)
Again, then, who is in charge of evangelizing such testing during FC7?

So, yeah... it sounds like some more post-morteming might be needed,
and some work on policy.

More questions that you could ask, besides 'why did it go wrong last
time?', would be things like: how does Fedora define a showstopper? What
are the most-preferred/least-preferred methods for coping with a
showstopper? For a given problem, who gets to choose between those
methods? Who is in charge of proactively finding showstoppers as early
as possible? Who is in charge of creating communities of people who
find showstoppers as early as possible? What methods can be put in
place to prevent showstoppers from getting into the trunk in the first
place?

Some of these questions are obviously already answered, or being
answered. For example, I know the work on revision control systems
will make it easier to keep things out of the trunk, and to revert to
working versions if they do break. But the more *explicit* Fedora is
about answering these questions and making policy about them now,
instead of towards the end of a cycle, the better the chances of
getting Fedora out on time in the future.

HTH-
Luis