[Cluster-devel] Re: [PATCH 2/2] Fix lockd panic

Wed Jan 9 23:33:25 UTC 2008

Wendy Cheng wrote:
> Wendy Cheng wrote:
>
>> Neil Brown wrote:
>>
>>> Some options:
>>>
>>> Have an initial patch which removes all references to f_locks and
>>> includes the change in this patch. With that in place your main
>>> patch won't introduce a bug. If you do this, you should attempt to
>>> understand and justify the performance impact (does nlm_traverse_files
>>> become quadratic in number of locks. Is that acceptable?).
>>>
>>> Change the first patch to explicitly update f_count if you bypass the
>>> call to nlm_inspect_file.
>>>
>>> something else???
>>>
>>>
>>
>> Let's see what hch says in another email... will come back to this soon.
>>
>
> Will do a quick and dirty performance measure tomorrow when I get to 
> the office. Then we'll go from there.

The test didn't go well and I'm still debugging .. However, let's 
revisit the issue:

[quot from Neil's comment]
The important difference between the old code and the new code here is
that the old code tests "file->f_locks" while the new code iterates
through i_flock to see if there are any lockd locks.

f_locks is set to a count of lockd locks in nlm_traverse_locks which
*was* always called by nlm_inspect_file which is called immediately
before the code you are changing.
But since your patch, nlm_inspect_file does not always call
nlm_traverse_locks, so there is a chance that f_locks is wrong.

With this patch, f_locks it not used at all any more.
[end quot]

The point here is "with this patch, f_locks it not used at all any 
more." Note that we have a nice inline function "nlm_file_inuse", why 
should we use f_locks (that I assume people agree that it is awkward) ? 
Could we simply drop f_locks all together in this section of code?

-- Wendy