[libvirt] [PATCH] Make virsh reconnect when loosing connection

Daniel Veillard veillard at redhat.com
Wed Mar 10 16:00:20 UTC 2010


On Fri, Mar 05, 2010 at 03:33:10PM -0500, Laine Stump wrote:
> I applied this patch and tried it out. It appears to work as
> advertised, which is useful. Would be even better if there was some
> higher level handling to automatically retry the commands that fail
> due to sigpipe.
> 
> Beyond that, I noticed a few typos, but don't have enough expertise
> about signals to know the definitive answer about resetting the
> signal handler (multiple times does work properly for me, so at
> least on Linux it seems to be unnecessary).
> 
> On 03/05/2010 05:11 AM, Daniel Veillard wrote:
> >   This is a usability issue for virsh in case of disconnections,
> >for example if the remote libvirtd is restarted:
> >   https://bugzilla.redhat.com/show_bug.cgi?id=526656
> >
> >the patch catch those and tries to automatically reconnect instead
> >of virsh exitting. The typical interaction with this patch is that
> 
> s/exitting/exiting/
> 
> >the command fails, but virsh automatically reconnects instead of
> >exitting, but it won't try to restart the failed command (since this
> 
> s/exitting/exiting/
> 
> >could have significant side effects). Example of such interraction:
> 
> s/interraction/interaction/
> 
> (your key repeat setting is too quick ;-)

  I blame my new expresso machine, delivering way more caffeeine for the
same amount (and brand) of coffee than the previous one, I'm still in the
tuning phase >:->

> >-------------------------------------------------------
> >
> >virsh # list --all
> >  Id Name                 State
> >----------------------------------
> >   - RHEL-5.4-64          shut off
> >   - WinXP                shut off
> >
> >virsh # list --all
> >error: Failed to list active domains
> >error: cannot send data: Broken pipe
> >error: Reconnected to the hypervisor
> >
> >virsh # list --all
> >  Id Name                 State
> >----------------------------------
> >   - RHEL-5.4-64          shut off
> >   - WinXP                shut off
> >
> >virsh #
> >
> >-------------------------------------------------------
> >
> >  The only thing I'm unsure is if the signal handler should be reset
> >once it was received once. I don't in this patch and that seems to
> >work fine, but I somehow remember the fact that in some circumstances
> >a signal handler needs to be rearmed when received once. As is this
> >seems to work fine with SIGPIPE and linux.
> >
> >
> >
> >Make virsh reconnect when loosing connection
> 
> s/loosing/losing/

  okay

> >Right now when the connected libvirtd restarts virsh gets a SIGPIPE
> >and dies, this change the behaviour to try to reconnect if the
> >signal was received or command error indicated a connection or RPC
> >failure. Note that the failing command is not restarted.
> >
> >* tools/virsh.c: catch SIGPIPE signals as well as connection related
> >   failures, add some automatic reconnection code and appropriate error
> >   messages.
> >
> >diff --git a/tools/virsh.c b/tools/virsh.c
> >index 65487ed..2ec9cfc 100644
> >--- a/tools/virsh.c
> >+++ b/tools/virsh.c
> >@@ -30,6 +30,7 @@
> >  #include<errno.h>
> >  #include<sys/stat.h>
> >  #include<inttypes.h>
> >+#include<signal.h>
> >
> >  #include<libxml/parser.h>
> >  #include<libxml/tree.h>
> >@@ -397,6 +398,58 @@ out:
> >      last_error = NULL;
> >  }
> >
> >+/*
> >+ * Detection of disconnections and automatic reconnection support
> >+ */
> >+static int disconnected = 0; /* we may have been disconnected */
> >+
> >+/*
> >+ * vshCatchDisconnect:
> >+ *
> >+ * We get here when a SIGPIPE is being raised, we can't do much in the
> >+ * handler, just save the fact it was raised
> >+ */
> >+static void vshCatchDisconnect(int sig, siginfo_t * siginfo,
> >+                               void* context ATTRIBUTE_UNUSED) {
> >+    if ((sig == SIGPIPE) || (siginfo->si_signo == SIGPIPE))
> >+        disconnected++;
> >+}
> >+
> >+/*
> >+ * vshSetupSignals:
> >+ *
> >+ * Catch SIGPIPE signals which may arise when desconnection from libvirtd occurs */
> 
> s/desconnection/disconnection/

 also split in 2 lines, it was too long

> >+static int
> >+vshSetupSignals(void) {
> >+    struct sigaction sig_action;
> >+
> >+    sig_action.sa_sigaction = vshCatchDisconnect;
> >+    sig_action.sa_flags = SA_SIGINFO;
> >+    sigemptyset(&sig_action.sa_mask);
> >+
> >+    sigaction(SIGPIPE,&sig_action, NULL);
> >+}
> >+
> >+/*
> >+ * vshReconnect:
> >+ *
> >+ * Reconnect after an
> >+ *
> >+ */
> >+static int
> >+vshReconnect(vshControl *ctl) {
> >+    if (ctl->conn != NULL)
> >+        virConnectClose(ctl->conn);
> >+
> >+    ctl->conn = virConnectOpenAuth(ctl->name,
> >+                                   virConnectAuthPtrDefault,
> >+                                   ctl->readonly ? VIR_CONNECT_RO : 0);
> >+    if (!ctl->conn)
> >+        vshError(ctl, "%s", _("Failed to reconnect to the hypervisor"));
> >+    else
> >+        vshError(ctl, "%s", _("Reconnected to the hypervisor"));
> >+    disconnected = 0;
> >+}
> >
> >  /* ---------------
> >   * Commands
> >@@ -8332,6 +8385,9 @@ vshCommandRun(vshControl *ctl, const vshCmd *cmd)
> >      while (cmd) {
> >          struct timeval before, after;
> >
> >+        if ((ctl->conn == NULL) || (disconnected != 0))
> >+            vshReconnect(ctl);
> >+
> >          if (ctl->timing)
> >              GETTIMEOFDAY(&before);
> >
> >@@ -8343,6 +8399,17 @@ vshCommandRun(vshControl *ctl, const vshCmd *cmd)
> >          if (ret == FALSE)
> >              virshReportError(ctl);
> >
> >+        /* try to catch automatically disconnections */
> 
> s/catch automatically/automatically catch/

  okay

> >+            ((disconnected != 0) ||
> >+             ((last_error != NULL)&&
> >+              (((last_error->code == VIR_ERR_SYSTEM_ERROR)&&
> >+                (last_error->domain == 13)) ||
> 
> 13 == VIR_FROM_REMOTE, right? Why use the literal?

  no reason, fixed !

> >+               (last_error->code == VIR_ERR_RPC) ||
> >+               (last_error->code == VIR_ERR_NO_CONNECT) ||
> >+               (last_error->code == VIR_ERR_INVALID_CONN)))))
> >+            vshReconnect(ctl);
> >+
> >          if (STREQ(cmd->def->name, "quit"))        /* hack ... */
> >              return ret;
> >
> >@@ -8673,9 +8740,11 @@ vshError(vshControl *ctl, const char *format, ...)
> >  {
> >      va_list ap;
> >
> >-    va_start(ap, format);
> >-    vshOutputLogFile(ctl, VSH_ERR_ERROR, format, ap);
> >-    va_end(ap);
> >+    if (ctl != NULL) {
> >+        va_start(ap, format);
> >+        vshOutputLogFile(ctl, VSH_ERR_ERROR, format, ap);
> >+        va_end(ap);
> >+    }
> >
> >      fputs(_("error: "), stderr);
> >
> >@@ -8751,6 +8820,9 @@ vshInit(vshControl *ctl)
> >      /* set up the library error handler */
> >      virSetErrorFunc(NULL, virshErrorHandler);
> >
> >+    /* set up the signals handlers to catch disconnections */
> >+    vshSetupSignals();
> >+
> >      ctl->conn = virConnectOpenAuth(ctl->name,
> >                                     virConnectAuthPtrDefault,
> >                                     ctl->readonly ? VIR_CONNECT_RO : 0);
> >

 Okay, applied and pushed,

  thanks for the feedback !

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel at veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/




More information about the libvir-list mailing list