
Re: [Linux-cluster] GFS and samba problem, again



Hi Abhi,

>I'm in the process of gathering a few windows boxes to run your test.
>I should hopefully have 4 windows clients tomorrow.

Please let me know how the results turn out.

>A warning first up, I'd recommend that you *not* use GFS2 and the
>latest cluster suite for your tests just yet. With constant
>development going on, some components are unstable and more problems
>is not what you need right now :-) . The RHEL4 tag in CVS has stable
>code from the most recent release. I'd suggest you compile gfs and the
>cluster suite from that CVS branch.

Yes, I've been installing GFS2 in our test environment, and it seems too experimental for production use. The installation is almost finished, so I'll try it with samba to see how it works.

>I'm running a 3-node x86 cluster with RHEL4. The cluster suite and
>gfs are from the RHEL4 branch of CVS along with some innocuous
>patches. The samba version is 3.0.10-1.4E.2. I'm using an smb.conf
>almost identical to the one you posted in your previous mail. I don't
>have any other kernel/samba locking settings that I'm aware of.

OK, I'll try the CVS version you suggest.

>You did mention in an email few weeks ago that you were trying to
>export the same GFS mount over multiple samba servers on multiple
>nodes simultaneously (active-active samba). I'm guessing you achieved
>this by setting the locking and pid directories of samba to be on the
>shared gfs filesystem. (This is a wrong approach and doesn't work.
>There's a lot of debate on this in the samba and samba-technical list
>archives at samba.org). I'm wondering if you still have these
>directories on the GFS filesystem, which could possibly be causing
>your hang?

Well, that was one of the unsuccessful tests I ran, but now I have samba on an ext3 filesystem (locking and pid directories included). A few days ago a proposal for a clustered samba was posted on the samba-technical list. Details are in the following document, if you're interested: http://wiki.samba.org/index.php/Samba_%26_Clustering
Of course, it's only a proposal and I guess it won't be operational soon.

>Also, do you see anything unusual in /var/log/messages on the GFS
>node when this hang occurs? I'm interested in any
>kernel-panic/assertion failures in GFS that might indicate some
>problem.

I don't see anything abnormal in the GFS logs when the samba hangs occur, but I ran strace on smbd and saw a lot of system calls that remained unfinished until samba was restarted.

4665  11:09:31.242316 <... geteuid32 resumed> ) = 503 <0.000118>
4665  11:09:31.242405 write(19, "close fd=22 fnum=6371 (numopen=2"..., 34) = 34 <0.000031>
4665  11:09:31.242572 nanosleep({0, 2000001},  <unfinished ...>
4667  11:09:31.245063 kill(4665, SIG_0) = 0 <0.000018>
4665  11:09:31.248047 <... nanosleep resumed> NULL) = 0 <0.005406>
4665  11:09:31.249355 nanosleep({0, 2000001}, NULL) = 0 <0.002621>
4665  11:09:31.252091 nanosleep({0, 2000001}, NULL) = 0 <0.003853>
4665  11:09:31.256088 nanosleep({0, 2000001}, NULL) = 0 <0.003906>
.................. a lot of nanosleeps ..............................
4665  11:10:04.887037 nanosleep({0, 2000001},  <unfinished ...>
4665  11:10:04.887219 <... nanosleep resumed> 0) = ? ERESTART_RESTARTBLOCK (To be restarted) <0.000111>
4665  11:10:04.888197 +++ killed by SIGKILL +++
4667  11:10:04.890712 kill(4665, SIG_0 <unfinished ...>
4666  11:10:04.920965 kill(4665, SIG_0) = -1 ESRCH (No such process) <0.000017>
4667  11:10:04.934486 kill(4665, SIG_0 <unfinished ...>
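
In case it helps with sifting through a longer trace, here is a minimal sketch that lists, per PID, the last system call marked "<unfinished ...>" that never got a matching "resumed" line. It is only an illustration, not part of samba or strace: the function name find_pending_calls is hypothetical, and it assumes the line layout shown above (PID, timestamp, then the call).

```python
import re

# Assumed layout, as in the excerpt above: "<pid>  <HH:MM:SS.usec> <rest of line>"
LINE_RE = re.compile(r"^(?P<pid>\d+)\s+(?P<ts>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<rest>.*)$")
# A call that strace interrupted: "nanosleep({0, 2000001},  <unfinished ...>"
UNFINISHED_RE = re.compile(r"^(?P<call>\w+)\(.*<unfinished \.\.\.>\s*$")
# Its continuation: "<... nanosleep resumed> NULL) = 0 <0.005406>"
RESUMED_RE = re.compile(r"^<\.\.\. (?P<call>\w+) resumed>")


def find_pending_calls(lines):
    """Return {pid: (syscall, timestamp)} for calls still unfinished at end of trace."""
    pending = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip blank lines, elisions, or anything in another format
        pid, ts, rest = m.group("pid"), m.group("ts"), m.group("rest")
        unfinished = UNFINISHED_RE.match(rest)
        if unfinished:
            # Remember the most recent unfinished call for this PID.
            pending[pid] = (unfinished.group("call"), ts)
        elif RESUMED_RE.match(rest) or rest.startswith("+++"):
            # The call resumed, or the process exited or was killed.
            pending.pop(pid, None)
    return pending


if __name__ == "__main__":
    import sys

    # Usage: python pending_calls.py < smbd.strace
    for pid, (call, ts) in sorted(find_pending_calls(sys.stdin).items()):
        print(f"pid {pid}: {call} unfinished since {ts}")
```

Only the latest unfinished call per PID is kept, so a trace that was trimmed by hand (like the excerpt above) still produces one entry per process.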

Many Thanks,

		Sandra Hernández


