[libvirt] [PATCH v4 4/4] gluster: cache glfs connection object per volume

Peter Krempa pkrempa at redhat.com
Wed Dec 7 17:08:06 UTC 2016


On Tue, Dec 06, 2016 at 22:52:01 +0530, Prasanna Kumar Kalever wrote:
> Currently, in case if we have 4 extra attached disks, then for each disk
> we need to call 'glfs_init' (over network) and friends which could be costly.
> 
> Additionally snapshot(external) scenario will further complex the situation.

This part was already extracted into a separate patch, so this one doesn't
have to deal with snapshots in any other way than with other gluster
connections.

> This patch maintain a cache of glfs objects per volume, hence the
> all disks/snapshots belonging to the volume whose glfs object is cached
> can avoid calls to 'glfs_init' and friends by simply taking a ref on
> existing object.
> 
> The glfs object is shared only if volume name and all the volfile servers match
> (includes hostname, transport and port number).
> 
> Thanks to 'Peter Krempa' for all the inputs.
> 
> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever at redhat.com>
> ---
>  src/storage/storage_backend_gluster.c | 279 +++++++++++++++++++++++++++++++---
>  1 file changed, 254 insertions(+), 25 deletions(-)
> 
> diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c
> index e0841ca..9f633ac 100644
> --- a/src/storage/storage_backend_gluster.c
> +++ b/src/storage/storage_backend_gluster.c
> @@ -36,6 +36,20 @@
>  
>  VIR_LOG_INIT("storage.storage_backend_gluster");
>  
> +
> +typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
> +typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
> +
> +typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
> +typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
> +
> +typedef struct _virStorageBackendGlusterCache virStorageBackendGlusterCache;
> +typedef virStorageBackendGlusterCache *virStorageBackendGlusterCachePtr;
> +
> +typedef struct _virStorageBackendGlusterCacheConn virStorageBackendGlusterCacheConn;
> +typedef virStorageBackendGlusterCacheConn *virStorageBackendGlusterCacheConnPtr;
> +
> +
>  struct _virStorageBackendGlusterState {
>      glfs_t *vol;
>  
> @@ -47,8 +61,205 @@ struct _virStorageBackendGlusterState {
>      char *dir; /* dir from URI, or "/"; always starts and ends in '/' */
>  };
>  
> -typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
> -typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
> +struct _virStorageFileBackendGlusterPriv {
> +    glfs_t *vol;

You should also keep a reference to the object in the cache, so that you
can return it and don't have to look it up again.
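
For illustration, the per-source private data could then look like this
(just a sketch, the 'conn' field name is mine):

    struct _virStorageFileBackendGlusterPriv {
        virStorageBackendGlusterCacheConnPtr conn; /* referenced cache entry */
        char *canonpath;                           /* per-source, not shared */
    };

Deinit can then simply unref priv->conn without searching the cache again.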

> +    char *canonpath;
> +};
> +
> +struct _virStorageBackendGlusterCacheConn {
> +    virObjectLockable parent;
> +    virStorageFileBackendGlusterPrivPtr priv;

You don't really need the complete object here; only the glfs_t part.
The 'canonpath' variable differs for each possible storage source
object.
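
The cache entry would then carry the connection handle directly, e.g.
(again just a sketch):

    struct _virStorageBackendGlusterCacheConn {
        virObjectLockable parent;

        glfs_t *vol;         /* the shared connection handle */

        char *volname;
        size_t nhosts;
        virStorageNetHostDefPtr hosts;
    };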

> +    size_t nhosts;
> +    virStorageNetHostDefPtr hosts;
> +    char *volname;
> +};
> +
> +struct _virStorageBackendGlusterCache {
> +    virMutex lock;
> +    size_t nConn;
> +    virStorageBackendGlusterCacheConnPtr *conn;
> +};
> +
> +
> +static virStorageBackendGlusterCachePtr virStorageBackendGlusterGlobalCacheObj;
> +static virClassPtr virStorageBackendGlusterCacheConnClass;
> +static virOnceControl virStorageBackendGlusterCacheOnce = VIR_ONCE_CONTROL_INITIALIZER;
> +static bool virStorageBackendGlusterCacheInitGlobalError;
> +static bool virStorageBackendGlusterCacheConnCleaned;

As said before: don't use global flags. You should be able to avoid both
of them.

Okay, I've now seen the code below. Not only are they avoidable, their
usage is outright dangerous.
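
The init error flag, for example, is not needed at all: after virOnce()
the caller can just check whether the one-time initialization produced
the cache object (sketch; this assumes virStorageBackendGlusterCacheInit()
frees the half-constructed object again when virMutexInit() fails):

    if (virOnce(&virStorageBackendGlusterCacheOnce,
                virStorageBackendGlusterCacheInit) < 0)
        return -1;

    if (!virStorageBackendGlusterGlobalCacheObj)
        return -1; /* one-time initialization failed */

The other flag can be dropped as described below.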

> +
> +
> +static void virStorageBackendGlusterCacheConnDispose(void *obj);
> +
> +
> +static int virStorageBackendGlusterCacheConnOnceInit(void)
> +{
> +    if (!(virStorageBackendGlusterCacheConnClass =
> +                                virClassNew(virClassForObjectLockable(),
> +                                "virStorageBackendGlusterCacheConnClass",
> +                                sizeof(virStorageBackendGlusterCacheConn),
> +                                virStorageBackendGlusterCacheConnDispose)))

I've noticed that you actually create a lockable object here, but don't
use the locks. This is currently okay, since it's write-only. The lock
might come in handy though ...

> +        return -1;
> +
> +    return 0;
> +}
> +
> +static void virStorageBackendGlusterCacheConnDispose(void *object)
> +{
> +    virStorageBackendGlusterCacheConnPtr conn = object;
> +
> +    glfs_fini(conn->priv->vol);
> +
> +    VIR_FREE(conn->priv->canonpath);
> +    VIR_FREE(conn->priv);
> +    VIR_FREE(conn->volname);
> +
> +    virStorageNetHostDefFree(conn->nhosts, conn->hosts);
> +
> +    /* all set to delete the connection from cache */
> +    virStorageBackendGlusterCacheConnCleaned = true;
> +}
> +
> +VIR_ONCE_GLOBAL_INIT(virStorageBackendGlusterCacheConn);
> +
> +static bool
> +virStorageBackendGlusterCompareHosts(size_t nSrcHosts,
> +                                     virStorageNetHostDefPtr srcHosts,
> +                                     size_t nDstHosts,
> +                                     virStorageNetHostDefPtr dstHosts)
> +{
> +    bool match = false;
> +    size_t i;
> +    size_t j;
> +
> +    if (nSrcHosts != nDstHosts)
> +        return false;
> +    for (i = 0; i < nSrcHosts; i++) {
> +        for (j = 0; j < nDstHosts; j++) {
> +            if (srcHosts[i].type != dstHosts[j].type)
> +                continue;
> +            switch (srcHosts[i].type) {
> +                case VIR_STORAGE_NET_HOST_TRANS_UNIX:
> +                    if (STREQ(srcHosts[i].u.uds.path, dstHosts[j].u.uds.path))
> +                        match = true;
> +                    break;
> +                case VIR_STORAGE_NET_HOST_TRANS_TCP:
> +                case VIR_STORAGE_NET_HOST_TRANS_RDMA:
> +                    if (STREQ(srcHosts[i].u.inet.addr, dstHosts[j].u.inet.addr)
> +                                               &&
> +                        STREQ(srcHosts[i].u.inet.port, dstHosts[j].u.inet.port))
> +                        match = true;
> +                    break;
> +                case VIR_STORAGE_NET_HOST_TRANS_LAST:
> +                    break;
> +            }
> +            if (match)
> +                break; /* out of inner for loop */
> +        }
> +        if (!match)
> +            return false;
> +        match = false; /* reset */
> +    }
> +    return true;
> +}
> +
> +static int
> +virStorageBackendGlusterCacheAdd(virStorageSourcePtr src,
> +                                 virStorageFileBackendGlusterPrivPtr priv)
> +{
> +    virStorageBackendGlusterCacheConnPtr newConn = NULL;
> +    virStorageBackendGlusterCacheConnPtr conn = NULL;
> +    size_t i;
> +
> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {

Umm ... you iterate the cache ...

> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
> +        if (STRNEQ(conn->volname, src->volume))
> +            continue;
> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
> +                                                 src->nhosts, src->hosts)) {
> +            return 0;
> +        }
> +    }
> +
> +    if (virStorageBackendGlusterCacheConnInitialize() < 0)
> +        return -1;

... before attempting to initialize the global cache?!? That doesn't
make sense. Also this call is not needed here, since the only entry
point is virStorageFileBackendGlusterInit, which already initializes it.

> +
> +    if (!(newConn = virObjectLockableNew(virStorageBackendGlusterCacheConnClass)))
> +        return -1;
> +
> +    if (VIR_STRDUP(newConn->volname, src->volume) < 0)
> +        goto error;
> +
> +    newConn->nhosts = src->nhosts;
> +    if (!(newConn->hosts = virStorageNetHostDefCopy(src->nhosts, src->hosts)))
> +        goto error;
> +    newConn->priv = priv;
> +
> +    if (VIR_APPEND_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn,
> +                           virStorageBackendGlusterGlobalCacheObj->nConn,
> +                           newConn) < 0)
> +        goto error;
> +
> +    return 0;
> +
> + error:
> +    virStorageNetHostDefFree(newConn->nhosts, newConn->hosts);
> +    VIR_FREE(newConn->volname);

You leak newConn in some cases.
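
Since newConn is a virObject, one way to fix this is to let the
unref/dispose path do the cleanup (sketch; it assumes the dispose
callback is made NULL-safe so it copes with a partially filled entry):

     error:
        newConn->priv = NULL;    /* the caller still owns priv and priv->vol */
        virObjectUnref(newConn); /* dispose frees volname and the host list */
        return -1;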

> +    return -1;
> +}
> +
> +static glfs_t *
> +virStorageBackendGlusterCacheQuery(virStorageSourcePtr src)

We usually use "Lookup".

> +{
> +    virStorageBackendGlusterCacheConnPtr conn;
> +    size_t i;
> +
> +    if (virStorageBackendGlusterGlobalCacheObj == NULL)
> +        return NULL;
> +
> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
> +        if (STRNEQ(conn->volname, src->volume))
> +            continue;
> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
> +                                                 src->nhosts, src->hosts)) {
> +            virObjectRef(conn);
> +            return conn->priv->vol;
> +        }
> +    }
> +
> +    return NULL;
> +}
> +
> +static void
> +virStorageBackendGlusterCacheRefresh(virStorageSourcePtr src)

This name is weird. Did you mean "Release"?

> +{
> +    virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
> +    virStorageBackendGlusterCacheConnPtr conn = NULL;
> +    size_t i;
> +
> +    virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
> +    for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
> +        conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
> +        if (STRNEQ(conn->volname, src->volume))
> +            continue;
> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
> +                                                 src->nhosts, src->hosts)) {
> +            virObjectUnref(conn);
> +            if (!virStorageBackendGlusterCacheConnCleaned) {

Umm, nope, that's not how it works. This is totally thread-unsafe.
Another thread can modify your global flag.

You probably did not check how virObjectUnref actually works. Its return
value notifies you whether the object was disposed of (you removed the
last reference) or not.
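
Roughly (a sketch):

    if (!virObjectUnref(conn)) {
        /* that was the last reference: the dispose callback has already
         * run, so just drop the dangling pointer from the cache array */
        VIR_DELETE_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn, i,
                           virStorageBackendGlusterGlobalCacheObj->nConn);
    }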

> +                if (priv && priv->canonpath)
> +                    VIR_FREE(priv->canonpath);
> +                goto unlock;
> +            }
> +            virStorageBackendGlusterCacheConnCleaned = false;
> +            VIR_DELETE_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn, i,
> +                               virStorageBackendGlusterGlobalCacheObj->nConn);
> +            VIR_FREE(src->drv);

This leaks src->drv in cases where the entry was used by two objects
simultaneously. This needs to be done in the calling function instead
(as also pointed out in one of the other patches).
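
The calling function should then always free its own per-source data,
no matter whether the shared entry was disposed of. A sketch, assuming
the per-source priv no longer lives inside the cache entry (as suggested
above) and the cache function is renamed to 'Release':

    static void
    virStorageFileBackendGlusterDeinit(virStorageSourcePtr src)
    {
        virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;

        /* debug output omitted for brevity */

        virStorageBackendGlusterCacheRelease(src); /* drop our reference */

        VIR_FREE(priv->canonpath); /* per-source data is always ours to free */
        VIR_FREE(priv);
        VIR_FREE(src->drv);
    }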

> +        }
> +    }
> +
> + unlock:
> +    virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
> +}
>  
>  static void
>  virStorageBackendGlusterClose(virStorageBackendGlusterStatePtr state)
> @@ -538,32 +749,15 @@ virStorageBackend virStorageBackendGluster = {
>  };
>  
>  
> -typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
> -typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
> -
> -struct _virStorageFileBackendGlusterPriv {
> -    glfs_t *vol;
> -    char *canonpath;
> -};
> -
> -
>  static void
>  virStorageFileBackendGlusterDeinit(virStorageSourcePtr src)
>  {
> -    virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
> -
>      VIR_DEBUG("deinitializing gluster storage file %p (gluster://%s:%s/%s%s)",
>                src, src->hosts->u.inet.addr,
>                src->hosts->u.inet.port ? src->hosts->u.inet.port : "0",
>                src->volume, src->path);
>  
> -    if (priv->vol)
> -        glfs_fini(priv->vol);
> -    VIR_FREE(priv->canonpath);
> -
> -    VIR_FREE(priv);
> -
> -    VIR_FREE(src->drv);
> +    virStorageBackendGlusterCacheRefresh(src);
>  }
>  
>  static int
> @@ -610,6 +804,20 @@ virStorageFileBackendGlusterInitServer(virStorageFileBackendGlusterPrivPtr priv,
>      return 0;
>  }
>  
> +static void
> +virStorageBackendGlusterCacheInit(void)
> +{
> +    if (VIR_ALLOC(virStorageBackendGlusterGlobalCacheObj) < 0) {
> +        virStorageBackendGlusterCacheInitGlobalError = false;
> +        return;
> +    }
> +
> +    if (virMutexInit(&virStorageBackendGlusterGlobalCacheObj->lock) < 0) {
> +        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +                       _("Unable to initialize mutex"));
> +        virStorageBackendGlusterCacheInitGlobalError = false;
> +    }
> +}
>  
>  static int
>  virStorageFileBackendGlusterInit(virStorageSourcePtr src)
> @@ -632,11 +840,32 @@ virStorageFileBackendGlusterInit(virStorageSourcePtr src)
>                src, priv, src->volume, src->path,
>                (unsigned int)src->drv->uid, (unsigned int)src->drv->gid);
>  
> +    if (virOnce(&virStorageBackendGlusterCacheOnce,
> +                virStorageBackendGlusterCacheInit) < 0)
> +        return -1;
> +
> +    if (virStorageBackendGlusterCacheInitGlobalError)
> +        return -1;
> +
> +    virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
> +
> +    priv->vol = virStorageBackendGlusterCacheQuery(src);
> +    if (priv->vol) {
> +        src->drv->priv = priv;
> +        virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
> +        return 0;
> +    }
> +
>      if (!(priv->vol = glfs_new(src->volume))) {
>          virReportOOMError();
> -        goto error;
> +        goto unlock;
>      }
>  
> +    if (virStorageBackendGlusterCacheAdd(src, priv) < 0)
> +        goto unlock;
> +
> +    virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
> +

So yet another thread-unsafe part. You correctly add the not-yet-opened
connection to the cache, but there is no mechanism to guard the stuff
after this.

If a second thread obtains the cache entry while the first one has not
yet finished opening the connection, it will get an invalid one.

Any thread obtaining an entry from the cache needs to check whether the
connection was already initialized as the first thing it does. It can
continue only if the connection is already open; otherwise it needs to
wait until the original thread finishes.

The first thread, which added the object into the cache, needs to open
the connection and then notify all other threads potentially waiting on
the condition that it was opened. A virCond should help here.
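
Roughly along these lines (only a sketch; the 'initialized' flag and the
'initCond' condition are mine and would live in the cache entry,
protected by the entry's own lock):

    /* consumer: a thread that found an existing entry in the cache */
    virObjectLock(conn);
    while (!conn->initialized)
        virCondWait(&conn->initCond, &conn->parent.lock);
    virObjectUnlock(conn);

    /* producer: the thread that created the entry, after it has opened
     * the connection (outside of the global cache lock) */
    virObjectLock(conn);
    conn->initialized = true;
    virCondBroadcast(&conn->initCond);
    virObjectUnlock(conn);

The failure path needs the same treatment: if opening the connection
fails, mark the entry accordingly and broadcast, so that waiters can
bail out instead of blocking forever.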

>      for (i = 0; i < src->nhosts; i++) {
>          if (virStorageFileBackendGlusterInitServer(priv, src->hosts + i) < 0)
>              goto error;