[libvirt] [PATCH v7 00/13] qemu: Add quorum support to libvirt

Peter Krempa pkrempa at redhat.com
Fri Jan 8 17:20:04 UTC 2016


On Thu, Dec 03, 2015 at 15:35:10 +0100, Matthias Gatto wrote:
> The purpose of these patches is to introduce quorum for libvirt
> I've try to follow this proposal:
> http://www.redhat.com/archives/libvir-list/2014-May/msg00533.html
> 

TL;DR: I'm concerned that the quorum implementation is not really useful
and will introduce a lot of code with little benefit.

---

So I have a few comments/observations regarding the quorum block driver
in qemu and its usability.

At first I'd like to ask you to describe your use case a bit more. I'm
currently lacking the motivation to do anything about this, as the
series is just partial and I don't really see any advantage of using the
quorum driver at all and can't come up with any useful use case.

Also a good use case is usually a good reason to drive development of a
feature and I'm afraid that this could become abandoned without any real
use.

My problems with supporting the quorum backend are:

1) No tracking of integrity
    As the quorum members don't have headers, failed quorum members are
    not recorded and remembered. The user or management app then has to
    do this externally for given storage devices.

2) No internal tracking of quorum members
    Members of the quorum don't have any header marking them as such and
    thus any images may be mixed together with unforeseen/catastrophic
    results. Higher level management then needs to take the role of
    remembering which images belong together. Reimplementing this looks
    like reimplementing a distributed storage system to me.

3) Lack of auto-resync
    Once the quorum gets a few inconsistencies it does not automatically
    resync like the Linux MD driver. With the current implementation the
    only way to resync this would be to issue a block-mirror (blockCopy)
    to /dev/null so that all blocks are read and rewritten to the
    identical copy. This also requires a user action.

    Additionally, a member of the quorum that was out of sync at any
    earlier point is not ignored until it has been resynced, which
    allows for split-brain/corruption scenarios.

4) Necessity for at least 3 copies
    Since a majority needs to win in a vote, you need at least 3 member
    disks for this to be fault-tolerant.

5) Lack of speedup
    Since all blocks are always read from all members and verified, the
    quorum backend doesn't really add any speed to the reads. This can
    be mostly attributed to the fact that fault tracking is not present.

    In other cases, due to internal error correcting codes it's very
    unlikely that a storage medium would return a corrupted sector
    without producing an error.

6) Almost every remote storage technology does quorums internally
    Any distributed storage (ceph/rbd, gluster, sheepdog, etc.) provides
    the quorum functionality internally, with the added benefit that its
    internal workings fix problems when a network split occurs.

7) Tools are restricted to qemu and qemu-img
    It's a "proprietary" implementation so for a rebuild you have to use
    one of the two tools. AFAIK qemu-img is not really user friendly for
    the less common disk backends and we don't really provide any
    abstraction on top of that. This means that there really aren't any
    reasonable tools to do an offline resync. (Okay, if you know which
    instance is okay, you can just copy it ...)
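
To illustrate point 4, here is a toy model of majority voting on a read.
This is only a sketch under my own assumptions, not qemu's actual code:
the names quorum_read() and majority() are made up for illustration, and
members are modelled as plain byte strings returned for one block.

```python
from collections import Counter


def majority(n_members):
    """Smallest vote count that is a strict majority of n_members."""
    return n_members // 2 + 1


def quorum_read(replies, threshold):
    """Return the value at least `threshold` members agree on, else None.

    `replies` is the list of block contents returned by each member
    (hypothetical model; real members would be block devices/images).
    """
    if not replies:
        return None
    value, votes = Counter(replies).most_common(1)[0]
    return value if votes >= threshold else None


# Two members, one of them corrupted: the vote is 1:1, so there is no
# way to decide which copy is the correct one.
two = quorum_read([b"good", b"bad"], majority(2))

# Three members, one of them corrupted: the two good copies outvote it.
three = quorum_read([b"good", b"good", b"bad"], majority(3))

print(two, three)
```

With two members a single bad copy already makes the read undecidable,
while three members tolerate exactly one bad copy, which is why fewer
than 3 disks give no fault tolerance in this model.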

This series also lacks any mechanism to warn the user/management app
that a block operation didn't get 100% of the votes in the quorum, so
it's not really possible for users to run a rebuild/diagnostic if
something fails.

Peter