[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

[Linux-cluster] DRBD + GNBD + GFS Race Conditions?

Dear listmates,

Floating around the Internet are many howtos and references to backing GNBD with DRBD in order to get failover GNBD, with GFS mounted atop the GNBD device. Does anyone know how the following possible race condition is handled?
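For reference, the setup those howtos describe amounts to roughly the following DRBD resource definition (a sketch only; the resource name, devices, hostnames, and addresses are placeholders, not taken from any particular howto). Protocol C is what makes the writes synchronous: DRBD does not complete a write until it has reached stable storage on both nodes:

```
# /etc/drbd.conf -- illustrative fragment; all names/addresses are placeholders
resource r0 {
    protocol C;                    # synchronous: ack only after both nodes have the write

    on node1 {
        device    /dev/drbd0;      # the device GNBD exports
        disk      /dev/sda7;       # backing device (assumption)
        address   192.168.1.1:7788;
        meta-disk internal;
    }
    on node2 {
        device    /dev/drbd0;
        disk      /dev/sda7;
        address   192.168.1.2:7788;
        meta-disk internal;
    }
}
```

The GNBD server on the primary then exports /dev/drbd0, and GFS on the client mounts the imported GNBD device.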

1. GFS writes to its GNBD device.
   GNBD client node writes to GNBD server node.
   GNBD server writes to DRBD-primary.
   DRBD begins to write to itself and to DRBD-secondary.
Before DRBD completes the write to DRBD-secondary (thus, before it returns, since writes are synchronous), the DRBD-primary node loses power.
   The GNBD server dies with the power loss.
   GNBD client node drops connection to the GNBD server.

2. Heartbeat notices the death of DRBD-primary, switches the DRBD-secondary to DRBD-primary, re-exports /dev/drbd0 via GNBD, and re-creates the virtual IP which the GNBD client was connecting to.

3. The GNBD client writing on behalf of GFS reconnects.
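Step 2 corresponds roughly to a Heartbeat v1 configuration like the one below (a sketch under assumptions: `gnbd-export` is a hypothetical custom resource script that re-runs gnbd_export on /dev/drbd0; the node name and virtual IP are placeholders; `drbddisk` is the resource script shipped with DRBD):

```
# /etc/ha.d/haresources -- illustrative only; names and IP are placeholders
# On failover, Heartbeat runs the resource scripts left to right:
#   drbddisk  promotes the local DRBD device to primary,
#   IPaddr    brings up the virtual IP the GNBD clients connect to,
#   gnbd-export  is an assumed custom script re-exporting /dev/drbd0.
node1 drbddisk::r0 IPaddr::192.168.1.100/24/eth0 gnbd-export
```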

Now, what happens to the write originally in flight to the DRBD volume? Will the GNBD client retry the write? Are there situations where the write could be dropped altogether?
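Whether the write survives comes down to whether the client keeps each request in flight until it sees an acknowledgement. The semantics being asked about can be modelled like this (a toy sketch, not GNBD's actual code; all class and method names are made up for illustration):

```python
# Toy model of ack-tracked block writes with replay on reconnect.
# Illustrates the retry semantics in question; not GNBD source code.

class BlockClient:
    def __init__(self):
        self.pending = {}   # request id -> (sector, data), awaiting server ack
        self.next_id = 0

    def write(self, server, sector, data):
        """Queue a write and send it; keep it pending until acked."""
        req = self.next_id
        self.next_id += 1
        self.pending[req] = (sector, data)
        if server.alive:
            server.receive(req, sector, data)

    def ack(self, req):
        """Server confirmed the write reached stable storage; drop it."""
        self.pending.pop(req, None)

    def reconnect(self, server):
        """After failover, replay every write that was never acked."""
        for req, (sector, data) in sorted(self.pending.items()):
            server.receive(req, sector, data)


class Server:
    def __init__(self, alive=True):
        self.alive = alive
        self.disk = {}      # sector -> data

    def receive(self, req, sector, data):
        if self.alive:
            self.disk[sector] = data
```

If the client drops a write the moment the connection dies instead of keeping it in `pending`, the block is lost exactly as the scenario fears; if it replays unacknowledged writes, the failed-to server applies them on reconnect.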

Are there other kinds of race conditions which could take place? Other concerns outside of this scenario?

We are thinking about implementing DRBD+GNBD+GFS+Xen to support failover and domain migration. In the event of a failure like power loss, I would like to be certain that when the failed-to GNBD server node comes online, any GNBD clients that were halfway through a write will re-commit it.


