
Re: [Linux-cluster] IP Relocate Error / IP Restart error



Lon Hohberger wrote:
On Mon, Jul 09, 2007 at 04:06:40PM +0200, dan deshayes algitech com wrote:
Hi,
thanks for the reply, but I'm not sure that's my problem.
I couldn't find the syntax for disabling exclusivity (I'm not using the GUI),
but as far as I've understood it's disabled by default. I tried with
exclusive="0" (not sure if it's the right syntax, though) but that didn't solve
my problem.
But if the cluster were running in exclusive mode, the relocation
shouldn't work either, right?
As stated earlier, the service restarts fine as long as the node already
has an external IP.
Anyone with other ideas? Maybe it's related to the "IP monitor failing
periodically"? But I don't have any problems running the cluster as long as
the bond0 interface goes down, so maybe not.

I haven't figured out the cause here, but disabling the 'ping' test
seems to fix it.

(edit ip.sh and change the 'ping' command to /bin/true or whatever)

I'm afraid it didn't help much.
I changed the pingcmd in the function ping_check to /bin/true and restarted rgmanager on the nodes, but it still didn't work.
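
For reference, the change described above amounts to something like the sketch below. This is an illustrative stand-in, not the actual contents of ip.sh: only the names ping_check and pingcmd come from the description above, and the function body, the arguments, and the address 192.0.2.10 are assumptions made up for the example.

```shell
# Illustrative stand-in for the ping_check function in ip.sh.
# The real function body differs; only the names ping_check and
# pingcmd are taken from the description above.
ping_check()
{
	# Was presumably a real ping command; replaced so that the
	# reachability test always reports success.
	pingcmd="/bin/true"

	# Arguments are still passed along, but /bin/true ignores them.
	$pingcmd "$@"
	return $?
}

# Hypothetical call: address family and address are placeholders.
ping_check inet 192.0.2.10 && echo "ping check passed"
```

With the check stubbed out like this, any failure that remains when the IP resource starts must come from some other step than the ping test.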

Here is my full configuration: http://nangilima.se/cluster.conf

I can have the full cluster running without problems when first starting it,
but when I then try to restart the service with 'clusvcadm -R' it says:
Jul 10 16:22:20 asl012 clurgmgrd[412]: <notice> Stopping service service:www-project1
Jul 10 16:22:31 asl012 clurgmgrd[412]: <notice> Service service:www-project1 is stopped
Jul 10 16:22:31 asl012 clurgmgrd[412]: <notice> Starting stopped service service:www-project1
Jul 10 16:22:32 asl012 clurgmgrd[412]: <notice> start on ip "<external ip 1>" returned 1 (generic error)
Jul 10 16:22:32 asl012 clurgmgrd[412]: <warning> #68: Failed to start service:www-project1; return value: 1
Jul 10 16:22:32 asl012 clurgmgrd[412]: <notice> Stopping service service:www-project1
Jul 10 16:22:32 asl012 clurgmgrd: [412]: <err> script:psql-db: stop of /etc/init.d/postgresql failed (returned 1)
Jul 10 16:22:32 asl012 clurgmgrd[412]: <notice> stop on script "psql-db" returned 1 (generic error)
Jul 10 16:22:32 asl012 clurgmgrd[412]: <crit> #12: RG service:www-project1 failed to stop; intervention required
Jul 10 16:22:32 asl012 clurgmgrd[412]: <notice> Service service:www-project1 is failed
Jul 10 16:22:32 asl012 clurgmgrd[412]: <crit> #13: Service service:www-project1 failed to stop cleanly

Then I disable the service and enable it on node usl001-mgmnt, which works fine (since it has network access through its own IP and route):
Jul 10 16:25:18 usl001 clurgmgrd[30130]: <notice> Starting disabled service service:www-project1
Jul 10 16:25:18 usl001 avahi-daemon[3533]: Registering new address record for <external ip 1> on bond0.
Jul 10 16:25:22 usl001 clurgmgrd[30130]: <notice> Service service:www-project1 started

Relocating it to node usl002-mgmnt also works, and so does relocating it back to usl001-mgmnt. But it never works back to asl012-mgmnt, except when I manually put back the IP and route.
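
To make "manually putting back the IP and route" concrete, it is roughly what the sketch below prints. The addresses are placeholders (192.0.2.10 standing in for <external ip 1>, 192.0.2.1 for <gw ip>), the /27 prefix corresponds to the 255.255.255.224 netmask, and the function name restore_cmds is made up for the example. Treating the route as a default route via the gateway is also an assumption. The function only prints the commands, since running them needs root.

```shell
# Placeholder values: substitute the real external IP and gateway.
EXT_IP="192.0.2.10"   # stands in for <external ip 1>
PREFIX="27"           # 255.255.255.224 expressed as a prefix length
GW="192.0.2.1"        # stands in for <gw ip>
DEV="bond0"

# Hypothetical helper: prints the commands instead of running them.
restore_cmds()
{
	echo "ip addr add ${EXT_IP}/${PREFIX} dev ${DEV}"
	echo "ip route add default via ${GW} dev ${DEV}"
}

restore_cmds
```

Once the address and route are in place on asl012, starting the service there succeeds, which is the behaviour described above.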

I'm using a bond0 interface configured as follows:
DEVICE=bond0
USERCTL=no
ONBOOT=yes
BROADCAST=<broadcast>
NETWORK=<network>.32
NETMASK=255.255.255.224
IPADDR=<external ip 1>
GATEWAY=<gw ip>

with slave interfaces eth0 and eth3 configured like this (DEVICE is eth0 or eth3 respectively):
DEVICE=eth0
USERCTL=no
ONBOOT=yes
MASTER=bond0
SLAVE=yes
BOOTPROTO=none

I can supply more info if anyone wants to give it a shot.
Sorry for repeating my question, but I'm approaching a deadline and walking blind ;)

Regards, Dan

