
[Cluster-devel] [RFC 0/3] GFS2 rename race



GFS is reported to have a rename race due to the following calling sequence:

1. vfs do_rename() does path lookup (on directories)
2. vfs do_rename() calls lock_rename() for local locking
3. vfs do_rename() does final file names lookup
4. vfs do_rename() calls GFS's rename function

Since (2) is not cluster-aware, by the time the GFS rename function is invoked in (4), another node could have altered (removed or created) the destination file *after* the dentry was obtained in (3). The stale dentry then fails the final inode sanity check, and the system call is aborted with ENOENT. Some applications (mostly mail servers) are said to have issues with this - the bug is reported as "error when two nodes rename two files to same new name".
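The window between steps (3) and (4) can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not GFS code: every name in it (struct disk_entry, struct cached_dentry, toy_lookup(), toy_remote_unlink(), toy_unlink_ok(), toy_rename_over()) is hypothetical, and the "other node" is simulated by a plain function call.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy model: one on-disk directory slot shared by all nodes. */
struct disk_entry {
    bool exists;
    unsigned long ino;   /* inode number currently bound to the name */
};

/* Hypothetical toy model of a node-local dentry: what lookup saw. */
struct cached_dentry {
    bool positive;       /* the name existed at lookup time */
    unsigned long ino;
};

/* Step (3): the final name lookup fills the node-local cache. */
static struct cached_dentry toy_lookup(const struct disk_entry *d)
{
    struct cached_dentry c = { d->exists, d->ino };
    return c;
}

/* Another node removes the file; nothing updates our cached copy. */
static void toy_remote_unlink(struct disk_entry *d)
{
    d->exists = false;
    d->ino = 0;
}

/* Stand-in for the inode sanity check done inside the rename operation. */
static int toy_unlink_ok(const struct disk_entry *d,
                         const struct cached_dentry *c)
{
    if (c->positive && (!d->exists || d->ino != c->ino))
        return -ENOENT;   /* cached dentry no longer matches the disk */
    return 0;
}

/* Runs steps (3) and (4), optionally letting a remote node race in between. */
static int toy_rename_over(struct disk_entry *dst, bool remote_races)
{
    struct cached_dentry c = toy_lookup(dst);   /* step (3) */

    if (remote_races)
        toy_remote_unlink(dst);   /* local locking from (2) cannot stop this */

    return toy_unlink_ok(dst, &c);              /* step (4) */
}
```

Calling toy_rename_over() on an existing entry with remote_races set to false returns 0; with remote_races set to true the same call returns -ENOENT, which is the failure the applications see.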

One possible (simple) fix for the subject bugzilla is to alter the VFS lock_rename() to take cluster locks. This requires an upstream VFS layer change (and it does make the described problem go away). Unfortunately, the issue turns out to be much more complicated than the bugzilla reports. It is not so much a rename race as a design issue inherent to GFS. Since GFS never asks other nodes to refresh their dentries when a file is removed (which leaves stale dentries hanging around on other nodes), each node is responsible for checking inode validity in the relevant operations (e.g. by calling gfs_unlink_ok()). This implies that extra locking in lock_rename() makes no difference to gfs_rename(): we still need to call gfs_unlink_ok(), and it can still return ENOENT *if the stale dentries existed before the rename system call was invoked*. So we ended up giving up the lock_rename() fix and going with the following three patches:
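To make the argument about pre-existing stale dentries concrete, here is a second toy model. It is a sketch only and does not reproduce the actual patches: struct shared_slot, struct node_dentry, struct cluster_lock, check_entry(), refresh_entry() and rename_locked() are all hypothetical names, and the refresh step is an assumption about the general direction (re-reading the entry before validating it), not a description of the code in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical toy types: a directory slot shared by all nodes,
 * and one node's cached view of that slot. */
struct shared_slot { bool exists; unsigned long ino; };
struct node_dentry { bool positive; unsigned long ino; };

/* Toy cluster lock: while held, no other node changes the slot. */
struct cluster_lock { bool held; };

/* Stand-in for the validity check each node must perform. */
static int check_entry(const struct shared_slot *s,
                       const struct node_dentry *d)
{
    if (d->positive && (!s->exists || s->ino != d->ino))
        return -ENOENT;
    return 0;
}

/* Re-read the shared slot into the node-local cache (assumed fix idea). */
static void refresh_entry(const struct shared_slot *s, struct node_dentry *d)
{
    d->positive = s->exists;
    d->ino = s->ino;
}

/* Rename with the cluster lock taken up front, as a cluster-aware
 * lock_rename() would do. The dentry was cached before this call. */
static int rename_locked(struct shared_slot *s, struct node_dentry *d,
                         bool refresh_first)
{
    struct cluster_lock lk = { true };   /* lock acquired early */
    int ret;

    if (refresh_first)
        refresh_entry(s, d);             /* drop the stale view first */

    ret = check_entry(s, d);             /* lock alone cannot fix a stale d */
    lk.held = false;                     /* lock released */
    return ret;
}
```

With a slot that another node removed earlier and a dentry that still caches inode 42, rename_locked() without the refresh returns -ENOENT even though the lock is held for the whole operation. With the refresh, the dentry becomes negative and the check returns 0.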

Patch 3-1: Yank inode refresh logic out of gfs(x)_get_dentry (currently only used by NFS server) so both NFS and rename can share the function.
Patch 3-2: Fix a possible lock-ordering deadlock in rename.
Patch 3-3: The core changes for the described bugzilla.

-- Wendy







