[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [libvirt] [PATCH 1/2] qemu: don't log error for missing optional sources on stats




On 12/12/18 9:32 AM, Nikolay Shirokovskiy wrote:
> 
> 
> On 12.12.2018 16:28, John Ferlan wrote:
>>
>>
>> On 12/12/18 3:04 AM, Nikolay Shirokovskiy wrote:
>>>
>>>
>>> On 11.12.2018 17:33, John Ferlan wrote:
>>>>
>>>>
>>>> On 12/11/18 2:34 AM, Nikolay Shirokovskiy wrote:
>>>>>
>>>>>
>>>>> On 11.12.2018 01:05, John Ferlan wrote:
>>>>>>
>>>>>> $SUBJ:
>>>>>>
>>>>>> 'storage sources'
>>>>>>
>>>>>> On 11/12/18 7:58 AM, Nikolay Shirokovskiy wrote:
>>>>>>> Every time we call all domain stats for inactive domain with
>>>>>>> unavailable source we get error message in logs. It's a bit noisy.
>>>>>>
>>>>>> Are there ones in particular?
>>>>>
>>>>> Like this one:
>>>>>
>>>>> qemuOpenFileAs:3324 : Failed to open file '/path/to/optinal/disk': No such file or directory
>>>>>
>>>>
>>>> So this is the one that perhaps is bothersome to the degree of we're
>>>> ignoring that it doesn't exist, but then again the domain is not active,
>>>> so does it really matter.
>>>>
>>>
>>> I would prefer not to see false positive messages in logs if it is possible.
>>>
>>>>>>
>>>>>>> While it's arguable whether we need such message or not for mandatory
>>>>>>> disks we would like not to see messages for optional disks. Let's
>>>>>>
>>>>>> Filtering for the 'startupPolicy = optional' for domain stats (with
>>>>>> empty/unavailable local source) would seem to be a "new use" perhaps not
>>>>>> totally in line with the original intention.
>>>>>
>>>>> But I was not going to change behaviour only to stop polluting
>>>>> logs with messages like above.
>>>>>
>>>>
>>>> Your solution requires that someone has modified their definition to
>>>> include that startupPolicy='optional' attribute. I can be persuaded that
>>>> this is desirable; however, it would be nice to get Peter's opinion on
>>>> this especially since he knows the code and "rules" for backing stores
>>>> better than I do. IOW: From a storage backing chain perspective does it
>>>> make sense to use that.
>>>
>>> Not presicely, rather if someone already has optional disk then why logging
>>> this kind of errors.
>>>>>
>>>>>>
>>>>>>> filter at least for cases of local files. Fixing other cases would
>>>>>>> require passing flag down the stack to .backendInit of storage
>>>>>>> which is ugly.
>>>>>>
>>>>>> Yeah, I remember chasing down into backendInit (virStorageFileInitAs)
>>>>>> from qemuDomainStorageOpenStat for
>>>>>>
>>>>>>    virStorageFileBackendForType:145 : internal error: missing storage
>>>>>> backend for 'none' storage
>>>>>>    virStorageFileBackendForType:141 : internal error: missing storage
>>>>>> backend for network files using iscsi protocol
>>>>>>
>>>>>> where the 'none' comes from a volume using a storage pool of iSCSI
>>>>>> disks. I chased for a bit, but yes it got messy fast.
>>>>>>
>>>>>>>
>>>>>>> Stats for active domain are fine because we either drop disks
>>>>>>> with unavailable sources or clean source which is handled
>>>>>>> by virStorageSourceIsEmpty in qemuDomainGetStatsOneBlockFallback.
>>>>>>>
>>>>>>> We have these logs for successful stats since 25aa7035d which
>>>>>>> in turn fixes 596a13713 which added substantial stats for offline
>>>>>>> disks.
>>>>>>>
>>>>>>> Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy virtuozzo com>
>>>>>>> ---
>>>>>>>  src/qemu/qemu_driver.c  | 5 +++++
>>>>>>>  src/qemu/qemu_process.c | 9 +++++++++
>>>>>>>  src/qemu/qemu_process.h | 2 ++
>>>>>>>  3 files changed, 16 insertions(+)
>>>>>>>
>>>>>>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>>>>>>> index a52e249..72e4cfe 100644
>>>>>>> --- a/src/qemu/qemu_driver.c
>>>>>>> +++ b/src/qemu/qemu_driver.c
>>>>>>> @@ -20290,6 +20290,11 @@ qemuDomainGetStatsBlockExportDisk(virDomainDiskDefPtr disk,
>>>>>>>      const char *backendstoragealias;
>>>>>>>      int ret = -1;
>>>>>>>  
>>>>>>> +    if (!virDomainObjIsActive(dom) &&
>>>>>>> +        qemuProcessMissingLocalOptionalDisk(disk))
>>>>>>> +        return qemuDomainGetStatsBlockExportHeader(disk, disk->src, *recordnr,
>>>>>>> +                                                   records, nrecords);
>>>>>>> +
>>>>>>
>>>>>> From my quick read this part would seem reasonable since the *Frontend
>>>>>> and *Backend wouldn't have valid data and *GetStatsOneBlock is fetching
>>>>>> source sizing data which for an empty source would be unobtainable.
>>>>>>
>>>>>>>      for (n = disk->src; virStorageSourceIsBacking(n); n = n->backingStore) {
>>>>>>>          if (blockdev) {
>>>>>>>              frontendalias = QEMU_DOMAIN_DISK_PRIVATE(disk)->qomName;
>>>>>>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>>>>>>> index 9cf9718..802274e 100644
>>>>>>> --- a/src/qemu/qemu_process.c
>>>>>>> +++ b/src/qemu/qemu_process.c
>>>>>>> @@ -6070,6 +6070,15 @@ qemuProcessPrepareSEVGuestInput(virDomainObjPtr vm)
>>>>>>>  }
>>>>>>>  
>>>>>>>  
>>>>>>> +bool
>>>>>>> +qemuProcessMissingLocalOptionalDisk(virDomainDiskDefPtr disk)
>>>>>>
>>>>>> Curious why you chose qemu_process and not qemu_domain or even
>>>>>> virstoragefile?  This has nothing to do with whether the qemu process is
>>>>>> running or not.
>>>>>
>>>>> Yeah, we have nothing qemu specific here. Just not sure is this useful
>>>>> for other purpuses or not.
>>>>>
>>>>
>>>> Since your chosen filter is "domain startup policy", I think qemu_domain
>>>> is where it belongs.
>>>>
>>>>>>
>>>>>>> +{
>>>>>>> +    return disk->startupPolicy == VIR_DOMAIN_STARTUP_POLICY_OPTIONAL &&
>>>>>>
>>>>>> While I agree in principle about optional disks; however, since
>>>>>> startupPolicy is optional it would seem this particular check is chasing
>>>>>> a very specific condition. Would it be reasonable to use a flag instead?
>>>>>> Something passed on the command line to essentially say to only collect
>>>>>> the name/format the name of the local empty sources.
>>>>>
>>>>> Not sure I understand you. This patch does not change anything in respect
>>>>> to collected stats so I can not see why we need extra flags.
>>>>>
>>>>
>>>> To be clear, I'm not advocating some new domain attribute/flag here.
>>>>
>>>> The flag idea is something provided to virConnectGetAllDomainStats (or
>>>> of course virDomainListGetStats) rather than using startupPolicy. IOW,
>>>> some new VIR_CONNECT_GET_ALL_DOMAINS_STATS* flag.
>>>>
>>>> Just "odd" that using the startupPolicy flag to avoid stats collection
>>>> for inactive guests and in the next patch for backing chains.  Perhaps
>>>> something worth adding to the startupPolicy description if this is the
>>>> option taken.
>>>>  >>
>>>
>>> I agree that it is odd in case of stats collection but not in the second
>>> case because there we check startupPolicy flag during startup which 
>>> corresponds to flag's name and meaning.
>>>
>>> As to stats collection here we have discrepancy between name and usage.
>>> On the other hand optional policy makes disk kind of optional asset for domain so
>>> having screaming error messages in logs every several seconds is odd too.
>>> If this patch will be accepted it may make sense to introduce alias for
>>> for startupPolicy, something like 'presense' having in mind it already
>>> affects not only startup but migration/restore/revert.
>>>
>>
>> The qemuProcessMissingLocalOptionalDisk needs to at least move to
>> qemu_domain and be renamed in a v2.
>>
>> Noting in the description of startupPolicy in formatdomain.html.in that
>> collection of detailed disk stats for inactive guests when the path is
>> missing is something that I would think would be nice/good.
> 
> Ok
> 
>>
>> w/r/t Dan's comment about logging level using info vs error - could be a
>> challenge to do that given that you don't necessarily want to be carting
>> that around - although I suppose in the case where we know we're
>> "throwing away" an error (e.g. such as reset last error would do), we
>> could fetch that error and generate a VIR_INFO from it leaving it up to
>> the use to decide how chatty they'd like to have things.
> 
> With current approach this is difficult to archive. In both patches we skip
> code that can generate error. However I can add explicit INFO message
> that disk is skipped. Is this ok?
> 

Sure that's fine.  Although someone else may gripe that's too much
information (yes, a no win situation).

John

> Nikolay
> 
>>
>> John
>>
>>
>>> Nikolay
>>>
>>>>>> That way we're not reusing something that has other uses. Although I'm
>>>>>> open to other opinions.
>>>>>>
>>>>>>> +           virStorageSourceIsLocalStorage(disk->src) && disk->src->path &&
>>>>>>
>>>>>> Use virStorageSourceIsEmpty instead.
>>>>>
>>>>> But this is a bit different, say I use !virStorageSourceIsEmpty(src) instead of the line
>>>>> above, then I can not sure disk->src->path even makes sense in the below check.
>>>>>
>>>>
>>>> Ah, right the !virStorageSourceIsLocalStorage && src->path would pass...
>>>>  Cannot mistake virStorageSourceIsEmpty for (non existent)
>>>> virStorageLocalSourceIsEmpty
>>>>
>>>>>>
>>>>>>> +           !virFileExists(disk->src->path);
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> And while I believe I understand why you chose this, it would seem to me
>>>>>> that it might be useful to know if an optional local disk had a path
>>>>>> disappear.  IOW: If all the other conditions are met, I'd like to know
>>>>>> that the source path is missing. That might be a good thing to know if
>>>>>> I'm getting stats for a domain. Maybe that's just me.
>>>>>
>>>>> Yeah, its arguable. But as getting stats is periodic with rather high
>>>>> frequency seeing such log error over and over again in logs can be
>>>>> tedious.
>>>>>
>>>>
>>>> Playing devil's advocate for a moment - what if someone didn't know the
>>>> $path to that backing store was missing? By at least logging somewhere
>>>> (but not fatal), the awareness can be made. By filtering the message if
>>>> the unrelated in name startupPolicy was set to optional, this knowledge
>>>> could be missed.
>>>>
>>>> The concept of filtering is fine to me - the concern is really the usage
>>>> of the essentially unrelated startupPolicy to do so is what gives me
>>>> pause for concern. If there are others that are OK with this, it's not a
>>>> deal breaker for me.
>>>>
>>>> John
>>>>
>>>>>>
>>>>>> BTW: I fixed a bug in the domstats path recently where the "last error"
>>>>>> was lost (see commit e1fc7ec08).  Although I don't think that's what
>>>>>> you're chasing here.
>>>>>
>>>>> Nope, it's different case.
>>>>>
>>>>> Nikolay
>>>>>
>>>>>>
>>>>>> John
>>>>>>
>>>>>>
>>>>>>> +}
>>>>>>> +
>>>>>>> +
>>>>>>>  static int
>>>>>>>  qemuProcessPrepareHostStorage(virQEMUDriverPtr driver,
>>>>>>>                                virDomainObjPtr vm,
>>>>>>> diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h
>>>>>>> index 2037467..52d396d 100644
>>>>>>> --- a/src/qemu/qemu_process.h
>>>>>>> +++ b/src/qemu/qemu_process.h
>>>>>>> @@ -214,4 +214,6 @@ int qemuProcessStartManagedPRDaemon(virDomainObjPtr vm);
>>>>>>>  
>>>>>>>  void qemuProcessKillManagedPRDaemon(virDomainObjPtr vm);
>>>>>>>  
>>>>>>> +bool qemuProcessMissingLocalOptionalDisk(virDomainDiskDefPtr disk);
>>>>>>> +
>>>>>>>  #endif /* __QEMU_PROCESS_H__ */
>>>>>>>


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]