[libvirt] [RFC PATCHv2 1/8] threshold: new API virDomainBlockSetWriteThreshold

Peter Krempa pkrempa at redhat.com
Mon Jun 15 13:19:56 UTC 2015


On Fri, Jun 12, 2015 at 13:29:25 -0600, Eric Blake wrote:
> qemu 2.3 added a new QMP command block-set-write-threshold,
> which allows callers to get an interrupt when a file hits a
> write threshold, rather than the current approach of repeatedly
> polling for file allocation.  This patch prepares the API for
> callers to register to receive the event, as well as a way
> to query the threshold via virDomainListGetStats().
> 
> The event is one-shot in qemu - a guest must re-register a new
> threshold after each time it triggers.  However, the
> virConnectDomainEventRegisterAny() call does not allow
> parameterization, so callers must use a pair of APIs - one
> to register the callback (one-time call) that will be used each
> time a threshold triggers for any guest disk, and another to
> repeatedly set the desired threshold (must be called each time
> a threshold should be changed).
> 
> Note that the threshold can either be registered by a byte
> offset, or by a thousandth of a percentage (a value between
> 0 and 100000, scaled to the disk size).  But the value is
> always reported as a byte offset, even when registered as a
> percentage.  I also considered having the setup parameter be
> a double, to allow a finer resolution on percentage; with the
> choice of an integer fixed-point scale, this means a 100G
> disk can only set a threshold to a granularity of 1M, but
> that is probably sufficient for the usage.
> 
> To make the patch series more digestible, this patch
> intentionally omits remote support, by using a couple of
> placeholders at a point where the compiler forces the addition
> of a case label within a switch statement.
> 
> * include/libvirt/libvirt-domain.h
> (virDomainBlockSetWriteThreshold): New API.
> (virConnectDomainEventWriteThresholdCallback): New event.
> * src/libvirt_public.syms (LIBVIRT_1.2.17): Export it.
> * src/libvirt-domain.c (virDomainBlockSetWriteThreshold): New API.
> (virConnectGetAllDomainStats): New stat.
> * src/driver-hypervisor.h (virDrvDomainBlockSetWriteThreshold):
> New hypervisor entry point.
> * tools/virsh-domain.c (vshEventWriteThresholdPrint): Print new
> event.
> * tools/virsh.pod (domstats): Document new stat.
> * daemon/remote.c (domainEventCallbacks): Add stub.
> * src/conf/domain_event.c (virDomainEventDispatchDefaultFunc):
> Likewise.
> 
> Signed-off-by: Eric Blake <eblake at redhat.com>
> ---
>  daemon/remote.c                  |  2 +
>  include/libvirt/libvirt-domain.h | 47 ++++++++++++++++++++
>  src/conf/domain_event.c          |  4 +-
>  src/driver-hypervisor.h          |  7 +++
>  src/libvirt-domain.c             | 95 ++++++++++++++++++++++++++++++++++++++++
>  src/libvirt_public.syms          |  5 +++
>  tools/virsh-domain.c             | 23 ++++++++++
>  tools/virsh.pod                  |  1 +
>  8 files changed, 183 insertions(+), 1 deletion(-)
> 

...

> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
> index d851225..7514656 100644
> --- a/include/libvirt/libvirt-domain.h
> +++ b/include/libvirt/libvirt-domain.h
> @@ -1297,6 +1297,17 @@ int                     virDomainBlockStatsFlags (virDomainPtr dom,
>                                                    virTypedParameterPtr params,
>                                                    int *nparams,
>                                                    unsigned int flags);
> +
> +typedef enum {
> +    /* threshold is thousandth of a percentage (0 to 100000) relative to

You managed to choose a unusual unit. Commonly used ones are 1/1000 and
1/1 000 000. Financial world also uses 1/10 000. Your unit of 1/100 000
is not among:

https://en.wikipedia.org/wiki/Parts-per_notation#Parts-per_expressions

I'd again suggest to use 1/1 000 000. Or if you want to be uber preciese
you might choose 1/(2^64 - 1).
 
> +     * image size rather than byte limit */
> +    VIR_DOMAIN_BLOCK_SET_WRITE_THRESHOLD_PERCENTAGE = (1 << 0),
> +} virDomainBlockSetWriteThresholdFlags;
> +int virDomainBlockSetWriteThreshold(virDomainPtr dom,
> +                                    const char *disk,
> +                                    unsigned long long threshold,
> +                                    unsigned int flags);
> +
>  int                     virDomainInterfaceStats (virDomainPtr dom,
>                                                   const char *path,
>                                                   virDomainInterfaceStatsPtr stats,
> @@ -3246,6 +3257,41 @@ typedef void (*virConnectDomainEventDeviceAddedCallback)(virConnectPtr conn,
>                                                           void *opaque);
> 
>  /**
> + * virConnectDomainEventWriteThresholdCallback:
> + * @conn: connection object
> + * @dom: domain on which the event occurred
> + * @devAlias: device alias
> + * @threshold: threshold that was exceeded, in bytes
> + * @length: length beyond @threshold that was involved in the triggering
> + *          write, or 0 if not known
> + * @opaque: application specified data
> + *
> + * The callback signature to use when registering for an event of type
> + * VIR_DOMAIN_EVENT_ID_WRITE_THRESHOLD with virConnectDomainEventRegisterAny()
> + *
> + * This callback occurs when a block device detects a write event that
> + * exceeds a non-zero threshold set by
> + * virDomainBlockSetWriteThreshold().  When this event occurs, the
> + * threshold is reset to 0, and a new limit must be installed to see
> + * the event again on the same device.  The intent of this event is to
> + * allow time for the underlying storage to be resized dynamically
> + * prior to the point where the guest would be paused due to running
> + * out of space, without having to poll for allocation values.
> + *
> + * The contents of @devAlias will be "vda" when the threshold is triggered
> + * on the active layer of guest disk vda.  Some hypervisors also support
> + * threshold reporting on backing images, such as during a block commit;
> + * when that happens, @devAlias will be "vda[1]" for the backingStore at
> + * index 1 within the chain of host resources for guest disk vda.

Is it perhaps worth to include a optional field that will contain the
file path since most use cases of this event will use a local block
device with the event? iscsi and NBD block devices then could return the
field empty as we now do in the bulk stats API

> + */
> +typedef void (*virConnectDomainEventWriteThresholdCallback)(virConnectPtr conn,
> +                                                            virDomainPtr dom,
> +                                                            const char *devAlias,
> +                                                            unsigned long long threshold,
> +                                                            unsigned long long length,
> +                                                            void *opaque);
> +
> +/**
>   * VIR_DOMAIN_TUNABLE_CPU_VCPUPIN:
>   *
>   * Macro represents formatted pinning for one vcpu specified by id which is
...

> diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
> index 7e6d749..53114d3 100644
> --- a/src/libvirt-domain.c
> +++ b/src/libvirt-domain.c
> @@ -5743,6 +5743,99 @@ virDomainBlockStatsFlags(virDomainPtr dom,
> 
> 
>  /**
> + * virDomainBlockSetWriteThreshold:
> + * @dom: pointer to domain object
> + * @disk: path to the block device, or device shorthand
> + * @threshold: limit at which a write threshold event can trigger
> + * @flags: bitwise-OR of virDomainBlockSetWriteThresholdFlags
> + *
> + * This function is used to set a one-shot write threshold.  It should
> + * be used in tandem with virConnectDomainEventRegisterAny()
> + * installing a handler for VIR_DOMAIN_EVENT_ID_WRITE_THRESHOLD.  If
> + * the hypervisor detects that a write request (whether guest data, or
> + * host metadata) would exceed the host byte offset specified in
> + * @threshold, then an event is raised, and the threshold is reset to
> + * 0 at that time.  The event registration is only needed once, but
> + * this function must be called each time a new threshold is desired;
> + * the event will only fire if a non-zero threshold is
> + * exceeded.
> + *
> + * By default, @threshold is specified in bytes, and must not exceed
> + * the size of the block device.  However, when @flags includes
> + * VIR_DOMAIN_BLOCK_SET_WRITE_THRESHOLD_PERCENTAGE, @threshold is
> + * instead a value between 0 an 100,000, as a thousandth of a percent
> + * of the current size of the disk, and the driver will compute the
> + * corresponding byte value.  For example, 80000 represents 80.000%.
> + * A driver may round the requested threshold to a granularity that
> + * can actually be supported.
> + *
> + * Setting a threshold allows event-based resizing of host resources
> + * that back a guest disk without having to poll the current disk
> + * allocation, while still having enough time to complete the resize
> + * before the guest would end up halting due to insufficient space.
> + * Calling this function to set the threshold back to zero will stop
> + * further firing of the event.  virConnectGetAllDomainStats() can be
> + * used to track the current threshold value, always in the form
> + * normalized to bytes.
> + *
> + * The @disk parameter is either the device target shorthand (the
> + * <target dev='...'/> sub-element, such as "vda"), or (since 0.9.8)

Since this will be added in 1.3.0 (or 1.2.15) the "since" statement is
not exactly true.

> + * an unambiguous source name of the block device (the <source
> + * file='...'/> sub-element, such as "/path/to/image").  Valid names
> + * can be found by calling virDomainGetXMLDesc() and inspecting
> + * elements within //domain/devices/disk. Some drivers might also
> + * accept strings such as "vda[1]" for setting the threshold of a
> + * backing image, useful when doing a block commit into the backing
> + * image.
> + *
> + * Domains may have more than one block device.  To set thresholds for
> + * each you should make multiple calls to this function.  If write
> + * thresholds are not supported, an application will have to instead
> + * poll virDomainGetBlockInfo() or similar to track allocation.
> + *
> + * Returns -1 in case of error, 0 in case of success.
> + */

Otherwise looks good to me.

Peter
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: Digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20150615/bcd2ac95/attachment-0001.sig>


More information about the libvir-list mailing list