Outages

Mike McGrath mmcgrath at redhat.com
Mon Jan 7 01:27:51 UTC 2008


On Sat, 5 Jan 2008, Mike McGrath wrote:

>
>
> On Sat, 5 Jan 2008, Mike McGrath wrote:
>
> > We've had some strange hardware outages over the last week or so.
> > Basically 3 box reboots: one time xen6, and once or twice xen1 (I'll have
> > to comb the logs more carefully).
> >
> > They appear to just lose power, but the logs suggest they all lose
> > their iSCSI connection just beforehand, so something else might be
> > going on.  Just the same, keep an eye out for anything strange.  Those with
> > access, feel free to comb the logs for more information.  I've contacted
> > RHIS to see if they've seen any issues in the colo (brownouts, issues with
> > the NetApp, etc.).
> >
>
>
> OK, just as I was walking out the door this happened again.  I've moved
> the more important guests off of xen1 onto xen2.  xen2 is now running a
> lot of guests: app1, app2, app3, koji1, releng1, noc1, puppet1, and 3 test
> servers.  I'm going to be out for much of the night; any sysadmin-main
> guys interested in looking at what's going on with xen1, feel free.


I'm back and wired again, so I've moved those guests back to xen1.
We'll let it run again.  I had been running proxy4 off of xen1 this whole
time so it at least had some load (and clustered load, so it wouldn't
impact end users if it crashed again), and it seemed to run just fine.

I'm concerned about this box and am going to keep an eye on it.  I've
engaged the RH team to see if we experienced anything strange there
(brownouts, people in the cage, and other oddities).  If they have nothing
and the box continues to act up, we'll have to talk to Dell.

	-Mike



