
Re: [NFS] nfs write problems



Trond Myklebust wrote:
> On Tue, 2007-10-02 at 15:56 -0400, Bob Kryger wrote:
>> So, I have a relatively new system on which I am seeing strange NFS behavior.
>>
>> In short I am getting seemingly random errors in files written via NFS.
>> [snip details]
>> Anyone ever seen anything like this before?
>> Suggest where I might look next?
>> Additional tests?

> Feel free to describe your test in a bit more detail. Without more
> information, we obviously can't rule out the existence of an NFS bug,
I was trying to be thorough, I hope I succeeded.

Is there anything else that might be helpful? I certainly would not jump to blaming a bug first, as I may very well have something misconfigured, but I cannot identify what that might be. I have about 8 other Linux NFS servers in production on different hardware, mostly SATA, where I am not seeing any issues. I don't think it's a hardware issue, though, as I cannot reproduce the problem without NFS. (Hmm, maybe if I NFS mount the export on the server itself. Would that prove anything?)
> however usually whenever people describe this sort of problem it is
> because they have failed to understand the NFS caching model as
> described in
>
>     http://nfs.sourceforge.net/#faq_a8
Excellent, Thanks for the lead and I will test these items shortly.

After reading the FAQ, I'm not sure I see how the cache consistency mechanisms apply to this problem. If I test the files after they are closed, shouldn't the data be consistent and written completely to the server? If there were a data write error, should I not see it somewhere? If so, where: on the client, or on the server? Would it be up to the client program to catch it? I wonder if dd would see it. For the purpose of testing, I have limited this server to serving only a single client at a time, so there are no other systems interfering.
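
On the question of where a write error would surface: on an NFS client, an error from a deferred write can be reported by a later write(), by fsync(), or by close(), so a program that ignores those results may never notice. Below is a minimal Python sketch of a writer that checks all three. The names write_checked and chunk_size are hypothetical, chosen for illustration only; this is not part of any NFS tool.

```python
import os


def write_checked(path, data, chunk_size=1 << 20):
    """Write data to path; any error from write, fsync or close raises OSError.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    offset = 0
    try:
        view = memoryview(data)
        while offset < len(view):
            # os.write may accept fewer bytes than requested, so loop.
            offset += os.write(fd, view[offset:offset + chunk_size])
        # Force dirty pages out to the server; deferred write errors
        # are typically reported here.
        os.fsync(fd)
    finally:
        # close() can also report a deferred error; let it propagate.
        os.close(fd)
    return offset
```

If a call like write_checked("/mnt/nfs/testfile", payload) returns without raising and the file is still corrupt, the corruption is happening silently rather than as a reported I/O error. The path shown is a placeholder.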

So to test this I read back the data of a newly written 256M file, right from the client that wrote it, in this case with the nocto mount option. This should take the client cache into account. I compared the results on the server side as well. The file had errors, the same errors in the same locations on both the client and the server. So this seems to indicate that the issue is on the NFS client, not the server. (Hmmm.) But the same client does not have a problem with any other server; at least, one has never been reported. I'll verify that rigorously.
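
A read-back test of this kind can be sketched as follows. This is a minimal Python sketch, not the actual test that was run: it writes a deterministic pattern and then reports the byte offsets of any blocks that do not match. BLOCK, pattern_block, write_pattern, and find_bad_blocks are hypothetical names, and total_bytes is assumed to be a multiple of the block size.

```python
import hashlib
import os

BLOCK = 4096  # assumed block size for the test pattern


def pattern_block(index, block_size=BLOCK):
    """Return deterministic content for block number `index`."""
    seed = hashlib.sha256(index.to_bytes(8, "big")).digest()
    reps = -(-block_size // len(seed))  # ceiling division
    return (seed * reps)[:block_size]


def write_pattern(path, total_bytes, block_size=BLOCK):
    """Write total_bytes of pattern data and flush it to stable storage."""
    with open(path, "wb") as f:
        for i in range(total_bytes // block_size):
            f.write(pattern_block(i, block_size))
        f.flush()
        os.fsync(f.fileno())


def find_bad_blocks(path, total_bytes, block_size=BLOCK):
    """Return byte offsets of blocks whose content differs from the pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(total_bytes // block_size):
            if f.read(block_size) != pattern_block(i, block_size):
                bad.append(i * block_size)
    return bad
```

Because the pattern depends only on the block index, find_bad_blocks can be run independently on the client and on the server against the same file, and the two offset lists compared. For a 256M file, total_bytes would be 256 * 1024 * 1024.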

I am not familiar with the mechanism that NFS uses to verify data validity between the client and the server. I assume that there is some sort of checksum. Did I mention that this is NFSv3? At least I have not specified v4.
> So please include a reproducible test for us.
Easily reproducible on this system. Short of providing access to this system, not sure what more to do. Oh, wait, was that humor? Indicating that I have provided significant detail? Dang, I've got to sharpen my international tongue-in-cheek detector.
> Cheers
>   Trond
Cool name

thanks
Bob


