[Linux-cluster] Multiple communication channels

Wed Dec 29 19:46:52 UTC 2010

On 12/29/2010 02:33 PM, Digimer wrote:
> On 12/29/2010 02:02 PM, Stefan Lesicnik wrote:
>> Hi all,
>>
>> I am running RHCS 5 and have a two-node cluster with a shared qdisk. I have a bonded network, bond0, and a back-to-back crossover, eth1.
>>
>> Currently I have multicast cluster communication over the crossover, but was wondering if it was possible to use bond0 as an alternative / failover. So if eth1 was down, it could still communicate?
>>
>> I haven't been able to find anything in the FAQ / documentation that would suggest this, so I thought I would ask.
>>
>> Thanks a lot and I hope everyone has a great new year :)
>>
>> Stefan 
> 
> From: http://wiki.alteeve.com/index.php/Openais.conf
> 

I forgot to mention that there are the redundant ring options as well.

------------------------------------
	### Redundant Ring Protocol options are below. These are ignored if
	### only one 'interface' directive is defined.

	# This is used to control how the Redundant Ring Protocol is used. If
	# you only have one 'interface' directive, the default is 'none'. If
	# you have two, then please set 'active' or 'passive'. The trade-off
	# is that, when the network is degraded, 'active' provides lower
	# latency from transmit to delivery and 'passive' may nearly double the
	# speed of the totem protocol when not CPU bound.
	# Valid options: none, active, passive.
	rrp_mode: passive

	# The next three variables are relevant depending on which mode
	# 'rrp_mode' is set to. Both modes use 'rrp_problem_count_threshold'
	# but only 'active' uses 'rrp_problem_count_timeout' and
	# 'rrp_token_expired_timeout'.
	#
	# - In 'active' mode:
	# If a token doesn't arrive in 'rrp_token_expired_timeout' milliseconds
	# an internal counter called 'problem_count' is incremented by 1. If a
	# token arrives within 'rrp_problem_count_timeout' however, the
	# internal counter is decremented by 1. If the internal counter equals
	# or exceeds the 'rrp_problem_count_threshold' at any time, the
	# affected interface will be flagged as faulty and it will no longer be
	# used.
	#
	# - In 'passive' mode:
	# Each interface has internal counters called 'token_recv_count'
	# and 'mcast_recv_count' that are incremented by 1 each time a token
	# or multicast message is received, respectively. The counts for the
	# two interfaces are compared and, if they differ by more than
	# 'rrp_problem_count_threshold', the interface with the lower count
	# is flagged as faulty and it will no longer be used.
	#
	# If an interface is flagged as faulty, an administrator will need to
	# manually re-enable it.

	# The default problem count timeout is '1000' milliseconds.
	rrp_problem_count_timeout: 1000

	# The default problem count threshold is '20'.
	rrp_problem_count_threshold: 20

	# This is the time in milliseconds to wait before incrementing the
	# internal problem counter. Normally, this variable is automatically
	# calculated by openais and, thus, should not be defined here without
	# fully understanding the effects of doing so.
	#
	# In short: this value should always be at least
	# ('rrp_problem_count_timeout' - 50) / 'rrp_problem_count_threshold'
	# milliseconds, or else a reconfiguration can occur. Using the default
	# values, that is (1000 - 50) / 20 = 47.5, rounded down to '47'.
	#rrp_token_expired_timeout: 47
------------------------------------
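
As a rough sketch of the two-'interface' case those comments refer to, the totem section might look something like the block below. The network addresses, multicast addresses and ports are placeholder assumptions, not values from this thread. Ring 0 stands in for the eth1 crossover and ring 1 for bond0. On a cman-managed cluster these settings may be driven from cluster.conf rather than edited by hand, so treat this as illustrative only.

```
totem {
	version: 2

	# Two 'interface' directives are defined below, so the Redundant
	# Ring Protocol options apply.
	rrp_mode: passive

	# Ring 0: the back-to-back crossover (placeholder subnet).
	interface {
		ringnumber: 0
		bindnetaddr: 10.0.0.0
		mcastaddr: 226.94.1.1
		mcastport: 5405
	}

	# Ring 1: the bonded network (placeholder subnet).
	interface {
		ringnumber: 1
		bindnetaddr: 192.168.1.0
		mcastaddr: 226.94.1.2
		mcastport: 5407
	}
}
```

Each ring gets its own multicast address and port here so that traffic on the two networks is kept apart.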

Cheers

-- 
Digimer
E-Mail: digimer at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org