Mike McGrath wrote:
On Mon, 31 Mar 2008, Toshio Kuratomi wrote:
Someone just reported that the wiki was down, and I thought xen7 might have crashed again. When I tried to ssh to it, my connection timed out. Pinging xen7 was fine. I serial consoled in and then immediately tried to ssh to xen7 again. It worked. uptime showed that xen7 has been up since this morning, and the app servers are all running, etc. There are no iscsi errors in /var/log/messages. Is there a possibility that we're experiencing some sort of networking issue with xen7? Maybe that issue is exacerbating the iscsi bug that causes our xen hosts to crash?
You can ssh to (for example) fas1 and then ssh to xen7.
To clarify why this is, it's because fas1 is a xen guest on xen7.
Heads up for the list: Mike found a MAC address that's doing this: 00:1A:64:2C:81:02.
I discovered that just before I left on Friday. This leads me to believe that something else on our network is using that IP. We'll have to look at the ARP tables and see what's going on.
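For the ARP table check, a rough sketch of one way to do the comparison follows. It assumes the tables have been collected from a few hosts as text in the Linux /proc/net/arp layout. The IP addresses, the second and third MAC addresses, and the host names hostA/hostB are made up for illustration; only 00:1A:64:2C:81:02 comes from this thread.

```python
from collections import defaultdict

# Hypothetical sample data in /proc/net/arp layout (addresses are made up,
# except the MAC 00:1A:64:2C:81:02 reported in this thread).
SAMPLE_A = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.10       0x1         0x2         00:1A:64:2C:81:02     *        eth0
192.0.2.11       0x1         0x2         00:16:3e:00:00:01     *        eth0
"""

SAMPLE_B = """\
IP address       HW type     Flags       HW address            Mask     Device
192.0.2.10       0x1         0x2         00:16:3e:00:00:02     *        eth0
"""


def parse_arp_table(text):
    """Return a list of (ip, mac) pairs; MACs are lower-cased."""
    entries = []
    for line in text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 4:
            entries.append((fields[0], fields[3].lower()))
    return entries


def ips_for_mac(entries, mac):
    """Return the sorted IPs that the given MAC address answers for."""
    mac = mac.lower()
    return sorted({ip for ip, m in entries if m == mac})


def conflicting_ips(tables):
    """Given {host: entries}, return {ip: macs} for IPs seen with >1 MAC."""
    seen = defaultdict(set)
    for entries in tables.values():
        for ip, mac in entries:
            seen[ip].add(mac)
    return {ip: macs for ip, macs in seen.items() if len(macs) > 1}


if __name__ == "__main__":
    tables = {
        "hostA": parse_arp_table(SAMPLE_A),
        "hostB": parse_arp_table(SAMPLE_B),
    }
    print(ips_for_mac(tables["hostA"], "00:1A:64:2C:81:02"))
    print(conflicting_ips(tables))
```

An IP that shows up with more than one MAC across hosts would be the first place to look for a second machine claiming the same address.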
The good news is this shouldn't affect the xen guests, though in theory it is affecting the iscsi connection on xen7. xen1 and xen2 are the only hosts we've seen actually rebooting, though xen7 is new.
Clarification: xen7 has rebooted itself three times since you left :-/ And I've just had to reboot it because of this:
Apr 1 00:32:53 xen7 kernel: scsi 2:0:0:0: rejecting I/O to dead device
Apr 1 00:35:31 xen7 last message repeated 11 times
-Toshio