[Cluster-devel] [PATCH 17/18] [try #2] DLM: fix to reschedule rwork

Tue Sep 12 09:02:02 UTC 2017

When an error occurs in kernel_recvmsg or kernel_sendpage and
close_connection is called and receive work is already scheduled,
receive work is canceled. In that case, the receive work will not
be scheduled forever after reconnection, because CF_READ_PENDING
flag is established.

Signed-off-by: Tadashi Miyauchi <miyauchi at toshiba-tops.co.jp>
Signed-off-by: Tsutomu Owa <tsutomu.owa at toshiba.co.jp>
---
 fs/dlm/lowcomms.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 5d0de91..c64e39f 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -593,10 +593,14 @@ static void close_connection(struct connection *con, bool and_other,
 {
 	bool closing = test_and_set_bit(CF_CLOSING, &con->flags);
 
-	if (tx && !closing && cancel_work_sync(&con->swork))
+	if (tx && !closing && cancel_work_sync(&con->swork)) {
 		log_print("canceled swork for node %d", con->nodeid);
-	if (rx && !closing && cancel_work_sync(&con->rwork))
+		clear_bit(CF_WRITE_PENDING, &con->flags);
+	}
+	if (rx && !closing && cancel_work_sync(&con->rwork)) {
 		log_print("canceled rwork for node %d", con->nodeid);
+		clear_bit(CF_READ_PENDING, &con->flags);
+	}
 
 	mutex_lock(&con->sock_mutex);
 	if (con->sock) {
-- 
2.7.4