Reproducible error on FC4 + rsync.

Naoki naoki at valuecommerce.com
Wed Aug 31 09:45:14 UTC 2005


I have two boxes, both running FC4.

I start an rsync of about 11GB between them, and hilarity ensues.  The
client process hangs and has to be killed with CTRL-C.  The server locks
up completely.  At the console, nothing comes back after I enter a
username and hit return.  It does still reply to ping, though.

Client - 2.6.12-1.1398_FC4 i386
Server - 2.6.12-1.1447_FC4smp x86_64 

I had to power cycle the server, and this is what 'sar' showed when it
came back:

proc/s = 100.00
cswch/s = 0.00
%iowait = 100.00
intr/s = 100

And so on.  There are no kernel messages, either on the console or in
the messages log.

I left top running on the server, and this was the last screen dump:


top - 16:27:01 up 16 min,  1 user,  load average: 0.38, 0.10, 0.06
Tasks:  58 total,   1 running,  57 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.0% us, 18.8% sy,  0.0% ni, 35.9% id, 41.6% wa,  0.0% hi,  0.7% si
Mem:   1023568k total,   508976k used,   514592k free,    40280k buffers
Swap:        0k total,        0k used,        0k free,    99340k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 2329 root      16   0  282m 266m  744 D 37.3 26.6   0:01.42 rsync
 2253 root      15   0 38524 2876 2096 S  6.3  0.3   0:01.62 sshd

Nothing crazy going on there and the cursor is still blinking.

The rsync and ssh processes on the client are both blocked in a
"select()" call that never returns.
tethereal shows no traffic between the two machines.  (By the way, how
do you get ethereal to show more tcpdump-style output, so I can actually
see which ports packets are coming from and going to?)
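
For anyone unfamiliar with the symptom, here is a minimal Python sketch of
what "stuck in select()" means.  It is not taken from rsync or ssh; the
helper name wait_readable and the 0.2 second timeout are made up for
illustration.  With a timeout of None the first call would block forever,
which is roughly what the client processes appear to be doing while the
server sends nothing.

```python
import select
import socket


def wait_readable(sock, timeout):
    """Return True if sock has data to read within `timeout` seconds.

    A timeout of None makes select() block until data arrives.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


# A connected socket pair stands in for the client and server ends.
a, b = socket.socketpair()

# The peer has sent nothing, so select() gives up after the timeout.
idle = wait_readable(a, 0.2)

# Once the peer writes a byte, select() reports the socket as readable.
b.sendall(b"x")
ready = wait_readable(a, 0.2)

a.close()
b.close()
```
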

So, what do you recommend?  Network kernel debugging, or oprofile perhaps?






