
Re: Discussion: what would not blocking on btrfs look like?



On 8/28/19 1:58 PM, Josef Bacik wrote:
On Tue, Aug 27, 2019 at 07:53:20AM -0400, Laura Abbott wrote:
On 8/26/19 11:39 PM, Neal Gompa wrote:
On Mon, Aug 26, 2019 at 11:16 AM Laura Abbott <labbott redhat com> wrote:

On 8/23/19 9:00 PM, Chris Murphy wrote:
On Fri, Aug 23, 2019 at 1:17 PM Adam Williamson
<adamwill fedoraproject org> wrote:

So, there was recently a Thing where btrfs installs were broken, and
this got accepted as a release blocker:

https://bugzilla.redhat.com/show_bug.cgi?id=1733388

Summary: This bug was introduced and discovered in linux-next, it
started to affect Fedora 5.3.0-rc0 kernels in openqa tests, patch
appeared during rc1, and the patch was merged into 5.3.0-rc2. The bug
resulted in a somewhat transient deadlock which caused installs to
hang, but no corruption. The fix, 2 files changed, 12 insertions, 8
deletions (1/2 the insertions are comments).

How remarkable or interesting is this bug? And in particular, exactly
how much faster should it have been fixed in order to avoid worrying
about it being a blocker bug?

7/25 14:27 utc bug patch was submitted to linux-btrfs@
7/25 22:33 utc bug was first reported in Fedora bugzilla
7/26 19:20 utc I confirmed upstream's patch related to this bug with
upstream and updated the Fedora bug
7/26 22:50 utc I confirmed it was merged into rc2, and updated the Fedora bug

So in the context of status quo, where Btrfs is presented as an option
in the installer and if there are bugs they are Beta blocking, how could
or should this have been fixed sooner? What about the handling should
have been different?


That's a fair question. This bug actually represents how this _should_
work. The concern is that in the past we haven't seen a lot of engagement.
Maybe today that has changed, as demonstrated by this thread.
I'm still concerned about having this be a blocker vs. just keeping it
as an option, simply because a blocker stops the entire release and it
can be a last minute scramble to get things fixed. This was the ideal
case for a blocker bug and I'm skeptical about all bugs going this well.
If we had a few more people who were willing to be on the btrfs alias and
do the work for blocker bugs it would be a much stronger case.


Out of curiosity, how many such issues have we had in the past 2
years? I personally can't recall any monumental occasions where people
were scrambling over *Btrfs* in Fedora. If anything, we continue to
inherit the work that SUSE and Facebook are doing upstream as part of
us continually updating our kernels, which I'm grateful for.

And in the instances where we've had such issues, has anyone reached
out to btrfs folks in Fedora? Chris and myself are the current ones,
but there have been others in the past. Both of us are subscribed to
the linux-btrfs mailing list, and Chris has a decent rapport with most
of the btrfs developers.

What more do you want? Actual btrfs developers in Fedora? We don't
have any for the majority of filesystems Fedora supports, only XFS. Is
there some kind of problem with communicating with the upstream kernel
developers about Fedora bugs that I'm not aware of?


Again, it's about length of overall development. ext and XFS have
a much longer history, which is something that's important
for file system stability in general. It's also a bit of a catch-22
where the rate of btrfs use in Fedora is so low we don't actually
see issues.

I note here that ext2 and ext3 are offered as file systems in
Custom/Advanced partitioning and in this sense have parity with Btrfs.
If this same bug occurred in ext2 or ext3 would or should that cause
discussion to drop them from the installer, even if the bug were fixed
within 24 hours of discovery and patch? What about vfat? That's
literally the only truly required filesystem that must work, for the
most commonly supported hardware so it can't be dropped, we'd just be
stuck until it got fixed. That work would have to be done upstream,
yes?


I don't think that's really a fair comparison. Just because options
are presented doesn't mean all of them are equal. ext2/ext3 and vfat
have been in development for much longer than btrfs, and length of development
is something that's particularly important for file system stability,
from what I've heard talking with file system developers. It's not impossible for there
to be bugs in ext4 for example (we've certainly seen them before) but
btrfs is only now gaining overall stability and we're still more likely to see
bugs, especially with custom setups where people are likely to find
edge cases.


Nope. We can totally use this because LVM has not existed as long (we
use LVM + filesystem by default, not plain partitions), and we still
encounter quirks with things like thinp LVM combined with these
filesystems. OverlayFS is mostly hot garbage (kernel people know it,
container people know it, filesystem people know it, etc.), and yet we
continue to try to use it in more places. Stratis is in an odd state
of limbo now, since its main developer and advocate left Red Hat.
There are plenty of examples of Red Hat doing crazy/experimental
things... I'd like to think Red Hat isn't supposed to be special here,
but in this realm, it seems like it is...



btrfs still doesn't give me the warm fuzzies and I also think this
is a bigger issue than other features simply because user data is at
stake. We do need to consider that the failure case is not "I can't do X"
but "my precious data which I have been trying to snapshot is now
inaccessible" in a way that's even worse than say rpm database
corruption. Even if it is in the advanced partitioning or not the
default, we can still end up with people clicking in because they
read an article about how btrfs was the hot new thing.

There are two parts to this here: killing off btrfs entirely and
btrfs as release criteria. I think you are correct that there's
enough community support to justify keeping btrfs around at least
in the kernel (I can't speak for anaconda here).

As for btrfs as release criteria, I'd feel much more confident
about that if we could have a file system developer on the btrfs
alias. I'm glad to hear the btrfs upstream community has been
receptive to bugs but it's still much easier to make things
happen if we have contributors who are active in the Fedora
community, especially if we want the advanced features that
btrfs has (which is why people want it anyway). So, who would
you suggest to work with us in Fedora?

You can always CC me; if I get an email from you or anybody else I recognize
from the fedora kernel team I'm going to pay attention to it.

Facebook runs more btrfs file systems than Fedora has installs, so we're pretty
happy with how it works stability-wise.  That being said, we're slightly more
fault tolerant than most users.  If you guys are hitting problems, chances are
we'll hit them eventually as well, so it makes sense for us to be on top of
them.

I agree it would be better if somebody inside Fedora was able to help out, but
again I'm only an email away.  Thanks,


So it appears you are on the btrfs alias already:

fedora-kernel-btrfs: fs-maint redhat com,josef toxicpanda com,bugzilla colorremedies com

This technically meets the requirements if you are willing to stay on this
alias and (continue to) help with requests as needed. I would feel more
confident if we had a few more people involved as well. Even better
would be proactively going through the bugzillas to help find the
btrfs ones.

Fedora is ultimately a community project and as far as the kernel
goes, there does seem to be enough interest from the community if
Josef et al. are willing to be involved. I hope to see this
continue.

Thanks,
Laura

