
Re: [dm-devel] lvmetad doesn't terminate with SIGTERM if thin volume used



On 3.9.2016 at 05:17, james harvey wrote:
On Tue, Aug 16, 2016 at 5:57 AM, Zdenek Kabelac <zkabelac redhat com> wrote:
On 6.8.2016 at 04:08, james harvey wrote:

Same problem and question about if an immediate SIGKILL is OK for
dmeventd.

On Thu, Aug 4, 2016 at 11:20 PM, james harvey <jamespharvey20 gmail com>
wrote:

Does it matter at all if lvmetad shuts down gracefully?

Can I safely just have systemd right off the bat send a SIGKILL?

Most things I wouldn't ask about, but I'm wondering if this is PURELY
a caching daemon where gracefully shutting down doesn't really do
anything.



SIGTERM/SIGINT is ignored by dmeventd while any device is monitored.

Before stopping dmeventd, devices must be unmonitored
(vgchange/lvchange).
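
As a rough sketch of that unmonitor step (not taken from the lvm2 sources): the helper below only builds the `vgchange --monitor n` or `lvchange --monitor n` command line and does not run it. The function name `unmonitor_cmd` and the example names `vg0` and `vg0/pool` are hypothetical.

```python
def unmonitor_cmd(target):
    """Return the argv that turns dmeventd monitoring off for a VG or LV.

    A target containing '/' is treated as VG/LV and handled by lvchange;
    anything else is treated as a VG name and handled by vgchange.
    """
    tool = "lvchange" if "/" in target else "vgchange"
    return [tool, "--monitor", "n", target]


if __name__ == "__main__":
    # Print the commands instead of running them; running needs root and LVM.
    for target in ("vg0", "vg0/pool"):
        print(" ".join(unmonitor_cmd(target)))
```

Only once nothing is monitored any more would a plain service stop be expected to let dmeventd exit on its own.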

Killing 'dmeventd' in the middle of, e.g., a recovery operation might leave your
system in a confused state (suspended devices), essentially useless.


Something similar currently applies to lvmetad: an lvm2 command will not
tolerate the death of lvmetad in the middle of an operation, and this may result in
operation failure (though the situation here might improve somewhat
over time...). But ATM, don't kill the daemons; just stop the services.

Fedora should be doing this properly on reboot: switching to a ramdisk and
continuing the shutdown sequence from there.  I'm unsure how other OSes solve
this.

Using 'kill -9' (SIGKILL) is in general unsupported and any reported
problems caused by this usage are ignored...

Regards

Zdenek


Got it.  Fedora defaults to having lvm2-monitor.service enabled, Arch
doesn't.  (I've asked for that to be fixed.)  Arch also uses a
shutdown ramdisk.

Using these device types WITHOUT monitoring is quite a 'crazy' idea...
Unless you are well aware of what you are doing, thin, raid, mirror, and
snapshot devices should always be monitored...

So IMHO a thing to fix in Arch....


1) Should the lvm2-lvmetad, dm-event, and lvm2-monitor unit files be
modified so they are never given a SIGKILL?  Even with
lvm2-monitor.service enabled, even on Fedora, if systemd sees they
don't exit on SIGTERM/SIGINT within 90 seconds (systemd v231 uses 90 seconds;
it was 10 seconds before), it sends them a SIGKILL.  I think adding
"SendSIGKILL=no" to the Service and Socket sections will do this, if I
understand it correctly.

That's a different story: if something is 'deadlocked' and
can't move forward, killing things after 90 seconds is unlikely to make
the situation any worse, especially if you are doing a shutdown...

So no, there is no plan to use such an option (SendSIGKILL=no) ATM.
(The state machine is pretty complex, and when some devices are 'forgotten' in suspend, it's quite hard to fix.)
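
To make the behaviour under discussion concrete, here is a small sketch of the SIGTERM-then-SIGKILL escalation that a service manager performs on stop. It is not systemd code: the child process is a hypothetical stand-in for a daemon that ignores SIGTERM, and the names `spawn_stubborn_child` and `stop_with_escalation` are invented for illustration. As I understand it, SendSIGKILL=no would correspond to skipping the `proc.kill()` fallback.

```python
import signal
import subprocess
import sys

# Stand-in for a daemon that ignores SIGTERM (hypothetical, not dmeventd).
STUBBORN_CHILD = (
    "import signal, time\n"
    "signal.signal(signal.SIGTERM, signal.SIG_IGN)\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def spawn_stubborn_child():
    """Start the child and wait until it has set SIGTERM to be ignored."""
    proc = subprocess.Popen([sys.executable, "-c", STUBBORN_CHILD],
                            stdout=subprocess.PIPE, text=True)
    proc.stdout.readline()  # blocks until the child prints 'ready'
    return proc


def stop_with_escalation(proc, timeout):
    """Send SIGTERM, wait up to `timeout` seconds, then fall back to SIGKILL."""
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()
        return "killed"


if __name__ == "__main__":
    child = spawn_stubborn_child()
    print(stop_with_escalation(child, timeout=1.0))
    child.stdout.close()
```

The child never gets a chance to clean up after the fallback, which is the risk described above for a daemon killed in the middle of an operation.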


2) Should lvm2-lvmetad and dm-event systemd unit files want
lvm2-monitor.service?

lvm2-lvmetad is unrelated to the monitoring service (dmeventd).


3) Could all LVM programs be changed so if they receive a
SIGTERM/SIGINT and choose to ignore it, they give a warn/info/debug
message?  Not doing so invites thinking a SIGKILL is the proper thing
to do.


SIGINT should be handled with logging (at least I've taken care of this in dmeventd; it should log this to syslog).

Both daemons should be able to gracefully shutdown if they are not in use
(i.e. no connection to lvmetad,  no monitored device in dmeventd).
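
The policy described here (ignore the signal but log it while something is still in use, exit gracefully otherwise) can be sketched as follows. This is a toy model under my own assumptions, not dmeventd's or lvmetad's actual code; the class `ToyDaemon`, its attributes, and the device name `vg0/pool` are hypothetical.

```python
import logging
import signal

log = logging.getLogger("toy-daemon")


class ToyDaemon:
    """Toy model: refuse to exit on SIGTERM/SIGINT while devices are monitored."""

    def __init__(self):
        self.monitored = set()       # names of devices currently monitored
        self.exit_requested = False  # set once a graceful exit is allowed
        self.ignored_signals = 0     # how many signals were logged and ignored

    def handle_signal(self, signum, frame=None):
        name = signal.Signals(signum).name
        if self.monitored:
            # Still in use: log why the signal is ignored instead of dying.
            self.ignored_signals += 1
            log.warning("%s ignored: %d device(s) still monitored",
                        name, len(self.monitored))
            return
        log.info("%s received, nothing monitored, exiting", name)
        self.exit_requested = True

    def install(self):
        """Register the handler for both signals (main thread only)."""
        for sig in (signal.SIGTERM, signal.SIGINT):
            signal.signal(sig, self.handle_signal)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    daemon = ToyDaemon()
    daemon.monitored.add("vg0/pool")
    daemon.handle_signal(signal.SIGTERM)  # logged and ignored
    daemon.monitored.clear()
    daemon.handle_signal(signal.SIGTERM)  # now allowed to exit
```

The logged warning is the point of question 3: it tells whoever sent the signal that it was seen and why it was not acted on.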

An lvm2 command usually blocks signal processing while it's holding a VG lock,
but it should be breakable (SIGINT) in those 'process_each_lv' loops or when the command prompts. Support for SIGTERM is planned, but low priority. It will happen, it's a known issue, but there are bigger fish to hunt ATM... :)
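
The general idea of blocking signals around a critical section can be sketched like this. It is only an illustration of the technique, not how lvm2 implements it; the helper name `signals_blocked` is hypothetical, and the "critical section" stands in for the time a VG lock is held.

```python
import contextlib
import os
import signal


@contextlib.contextmanager
def signals_blocked(signals):
    """Block `signals` for the calling thread; restore the old mask on exit."""
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, signals)
    try:
        yield
    finally:
        # Restoring the mask lets any signal that arrived meanwhile be delivered.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    seen = []
    signal.signal(signal.SIGINT, lambda signum, frame: seen.append(signum))
    with signals_blocked({signal.SIGINT}):
        os.kill(os.getpid(), signal.SIGINT)  # stays pending while blocked
        print("inside critical section, handled so far:", len(seen))
    print("after critical section, handled so far:", len(seen))
```

A signal sent during the blocked section is not lost, only deferred until the mask is restored, so the sender sees no reaction until the section ends.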

The best thing is to open a BZ if you find something breaking common logic.
(So it's not lost in mailing list noise.)

Regards

Zdenek

