
[Linux-cluster] fence_ovh - Fence agent for OVH (Proxmox 3)



  I have improved my earlier fence_ovh script so that it works on Proxmox 3 and uses the suds library, as was suggested to me on the linux-cluster mailing list.

1) What is fence_ovh 

fence_ovh is a fence agent written in Python for the large French datacentre provider OVH. You can find information about OVH at http://www.ovh.co.uk/ . To be clear, I am not part of the official OVH staff.

2) Features 
The script has two main functions: 

* Reboot into rescue mode (action=off) 
* Reboot from the hard disk (action=on or action=reboot) 
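
The mapping between fence actions and OVH netboot modes can be sketched as a small lookup table. The two netboot IDs are the constants used in the attached script; the names ACTION_TO_NETBOOT and netboot_id_for are hypothetical and only serve this illustration.

```python
# Netboot IDs as they appear in the attached script.
OVH_RESCUE_PRO_NETBOOT_ID = '28'
OVH_HARD_DISK_NETBOOT_ID = '1'

# Hypothetical lookup table: which netboot mode each fence action selects.
ACTION_TO_NETBOOT = {
    'off': OVH_RESCUE_PRO_NETBOOT_ID,    # "power off" = reboot into rescue mode
    'on': OVH_HARD_DISK_NETBOOT_ID,      # boot from the hard disk
    'reboot': OVH_HARD_DISK_NETBOOT_ID,  # same as "on"
}

def netboot_id_for(action):
    """Return the netboot ID for a fence action, or None if unsupported."""
    return ACTION_TO_NETBOOT.get(action)
```

Any other action returns None, which corresponds to the "nothing to do" branch of the script.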

3) Technical details 
As you might deduce, the classical fence mechanism, which turns off the other node, is implemented here not by powering the machine off but by rebooting it into rescue mode. 

The script also verifies that the machine rebooted correctly into rescue mode, using an OVH API call that reports when the last hard reboot started and ended. The OVH API is also used for the main operation itself: setting the netboot mode and triggering the hard reboot. 
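
The verification step boils down to a time-window check: the reboot reported by OVH must start after the script requested it and end before the script finished waiting. Below is a minimal sketch of that check. The timestamp format matches the one the attached script parses; the function name reboot_confirmed is hypothetical, and the sketch assumes the local clock and the OVH timestamps use the same timezone.

```python
from datetime import datetime

# Timestamp format parsed by the attached script.
OVH_TIME_FORMAT = '%Y-%m-%d %H:%M:%S'

def reboot_confirmed(start, end, before, after):
    """Return True if the reported reboot lies inside the local time window.

    start, end    -- reboot timestamps as strings, as reported by the API
    before, after -- local datetimes taken just before the reboot request
                     and just after the waiting period
    """
    start_dt = datetime.strptime(start, OVH_TIME_FORMAT)
    end_dt = datetime.strptime(end, OVH_TIME_FORMAT)
    return before < start_dt and end_dt < after

# Usage with made-up timestamps:
before = datetime(2013, 7, 1, 12, 0, 0)
after = datetime(2013, 7, 1, 12, 2, 30)
ok = reboot_confirmed('2013-07-01 12:00:10', '2013-07-01 12:01:40', before, after)
```

If the reported reboot is older than the request, the check fails and the agent should report the fence operation as unsuccessful.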

4) How to use it 

4.1) Make sure the python-suds package is installed (Debian/Ubuntu).
4.2) Save fence_ovh in /usr/sbin 
4.3) Run ccs_update_schema so that the new metadata is added to cluster.rng 
4.4) If needed, validate your configuration: 
ccs_config_validate -v -f /etc/pve/cluster.conf.new 
4.5) Here is an example of how to use it in cluster.conf:

<?xml version="1.0"?>
<cluster name="ha-008-010" config_version="3">

<cman keyfile="/var/lib/pve-cluster/corosync.authkey" transport="udpu" two_node="1" expected_votes="1">
</cman>

<fencedevices>
        <fencedevice agent="fence_ovh" name="fence008" email="myadmin domain com" ipaddr="ns1234" login="ab12345-ovh" passwd="MYSECRET" />
        <fencedevice agent="fence_ovh" name="fence010" email="myadmin domain com" ipaddr="ns5678" login="ab12345-ovh" passwd="MYSECRET" />
</fencedevices>

  <clusternodes>
<clusternode name="nodeA.your.domain" nodeid="1" votes="1">
  <fence>
    <method name="1">
      <device name="fence008" action="off"/>
    </method>
  </fence>
</clusternode>
<clusternode name="nodeB.your.domain" nodeid="2" votes="1">
  <fence>
    <method name="1">
      <device name="fence010" action="off"/>
    </method>
  </fence>
</clusternode>
</clusternodes>



</cluster>
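
As a complement to ccs_config_validate, a quick consistency check can confirm that every device referenced by a cluster node is actually declared under fencedevices. The following is only a sketch using Python's standard xml.etree.ElementTree module; the function name undefined_fence_devices and the trimmed-down SAMPLE configuration are hypothetical.

```python
import xml.etree.ElementTree as ET

def undefined_fence_devices(conf_xml):
    """Return the sorted names of <device> entries with no matching <fencedevice>."""
    root = ET.fromstring(conf_xml)
    defined = {fd.get('name') for fd in root.iter('fencedevice')}
    used = {dev.get('name') for dev in root.iter('device')}
    return sorted(used - defined)

# Trimmed-down configuration with the same structure as the example above.
SAMPLE = """<cluster name="ha-008-010" config_version="3">
  <fencedevices>
    <fencedevice agent="fence_ovh" name="fence008" ipaddr="ns1234" />
    <fencedevice agent="fence_ovh" name="fence010" ipaddr="ns5678" />
  </fencedevices>
  <clusternodes>
    <clusternode name="nodeA.your.domain" nodeid="1" votes="1">
      <fence><method name="1"><device name="fence008" action="off"/></method></fence>
    </clusternode>
    <clusternode name="nodeB.your.domain" nodeid="2" votes="1">
      <fence><method name="1"><device name="fence010" action="off"/></method></fence>
    </clusternode>
  </clusternodes>
</cluster>"""

missing = undefined_fence_devices(SAMPLE)
```

An empty list means every node's fence device is declared; any name in the list points at a typo or a missing fencedevice entry.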



Finally, I attach to this email the first version of the fence_ovh script for Proxmox 3.

The original thread on the Proxmox forum, from which I adapted the original secofor script: http://forum.proxmox.com/threads/11066-Proxmox-HA-Cluster-at-OVH-Fencing?p=75152#post75152 

-- 
Adrián Gibanel 
I.T. Manager 

+34 675 683 301 
www.btactic.com 



#!/usr/bin/python
# Copyright 2013 Adrian Gibanel Lopez (bTactic)
# Adrian Gibanel improved this script
# in 2013 to add verification of success
# and to output metadata

# Based on:
# This is a fence agent for use at OVH.
# As there are no other fence devices available,
# we must use OVH's SOAP API.
# Quick-and-dirty, assembled by Dennis Busch, secofor GmbH,
# Germany
# This work is licensed under a
# Creative Commons Attribution-ShareAlike 3.0 Unported License.

# Example parameters for a manual call
#
# login=ab12345-ovh
# passwd=MYSECRET
# email=admin myadmin
# ipaddr=ns12345
# action=off

# where ipaddr is your server's OVH name

import sys
import atexit
import time
from datetime import datetime

# The fencing library lives outside the default module search path
sys.path.append("/usr/share/fence")
from fencing import *

from suds.client import Client
from suds.xsd.doctor import ImportDoctor, Import

OVH_RESCUE_PRO_NETBOOT_ID='28'
OVH_HARD_DISK_NETBOOT_ID='1'
STATUS_HARD_DISK_SLEEP=240 # Wait 4 minutes for the OS to boot
STATUS_RESCUE_PRO_SLEEP=150 # Wait 2 minutes 30 seconds for Rescue-Pro to come up
OVH_FENCE_DEBUG=False # Set to True to enable debug logging

def netboot_reboot(nodeovh,login,passwd,email,mode):
    imp = Import('http://schemas.xmlsoap.org/soap/encoding/')
    url='https://www.ovh.com/soapi/soapi-re-1.59.wsdl'
    imp.filter.add('http://soapi.ovh.com/manager')
    d = ImportDoctor(imp)
    soap = Client(url, doctor=d)
    session = soap.service.login(login, passwd, 'es', 0)
 
    #dedicatedNetbootModifyById changes the mode of the next reboot
    result = soap.service.dedicatedNetbootModifyById(session, nodeovh, mode, '', email)
 
    #dedicatedHardRebootDo initiates a hard reboot on the given node
    soap.service.dedicatedHardRebootDo(session, nodeovh, 'Fencing initiated by cluster', '', 'es')
 
    soap.service.logout(session)

def reboot_status(nodeovh,login,passwd):
    imp = Import('http://schemas.xmlsoap.org/soap/encoding/')
    url='https://www.ovh.com/soapi/soapi-re-1.59.wsdl'
    imp.filter.add('http://soapi.ovh.com/manager')
    d = ImportDoctor(imp)
    soap = Client(url, doctor=d)
    session = soap.service.login(login, passwd, 'es', 0)
 
    result = soap.service.dedicatedHardRebootStatus(session, nodeovh)
    tmpstart = datetime.strptime(result.start,'%Y-%m-%d %H:%M:%S')
    tmpend = datetime.strptime(result.end,'%Y-%m-%d %H:%M:%S')
    result.start = tmpstart
    result.end = tmpend

    soap.service.logout(session)
    return result

# Redirect stderr to a log file
save_stderr = sys.stderr
errlog = open("/var/log/fence_ovh_error.log","a")
sys.stderr = errlog

global all_opt

device_opt = [  "email", "ipaddr", "action" , "login" , "passwd" , "nodename" ]

ovh_fence_opt = {
        "email" : {
                "getopt" : "Z:",
                "longopt" : "email",
                "help" : "-Z, --email=<email>          email for reboot message: admin domain com",
                "required" : "1",
                "shortdesc" : "Reboot email",
                "default" : "",
                "order" : 1 },
}

all_opt.update(ovh_fence_opt)
all_opt["ipaddr"]["shortdesc"] = "OVH node name"

atexit.register(atexit_handler)
options=check_input(device_opt,process_input(device_opt))
# Not sure if I need this old notation
## Support for -n [switch]:[plug] notation that was used before
if ((options.has_key("-n")) and (-1 != options["-n"].find(":"))):
	(switch, plug) = options["-n"].split(":", 1)
	if ((switch.isdigit()) and (plug.isdigit())):
		options["-s"] = switch
		options["-n"] = plug

if (not (options.has_key("-s"))):
	options["-s"]="1"

docs = { }
docs["shortdesc"] = "Fence agent for OVH"
docs["longdesc"] = "fence_ovh is a Power Fencing agent \
which can be used within an OVH datacentre. \
Poweroff is simulated with a reboot into rescue-pro \
mode. \
 /usr/local/etc/ovhsecret example: \
 \
 [OVH] \
 Login = ab12345-ovh \
 Passwd = MYSECRET \
"
docs["vendorurl"] = "http://www.ovh.net"
show_docs(options, docs)


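# I use my own logfile for debugging purposes
if OVH_FENCE_DEBUG:
    logfile=open("/var/log/fence_ovh.log", "a")
    logfile.write(time.strftime("\n%d.%m.%Y %H:%M:%S \n"))
    logfile.write("Parameter:\t")
    for val in sys.argv:
        logfile.write(val + " ")
    # End the parameter line once, after all arguments have been written
    logfile.write("\n")
    # Debug only: the options include the password, so do not print them in normal operation
    print options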

action=options['--action']
email=options['--email']
login=options['--username']
passwd=options['--password']
nodeovh=options['--ip']
if nodeovh[-8:] != '.ovh.net':
    nodeovh += '.ovh.net'
    
# Save datetime just before changing netboot
before_netboot_reboot = datetime.now()

if action == 'off':
    netboot_reboot(nodeovh,login,passwd,email,OVH_RESCUE_PRO_NETBOOT_ID) #Reboot in Rescue-pro
elif action == 'on':
    netboot_reboot(nodeovh,login,passwd,email,OVH_HARD_DISK_NETBOOT_ID) #Reboot from HD
elif action == 'reboot':
    netboot_reboot(nodeovh,login,passwd,email,OVH_HARD_DISK_NETBOOT_ID) #Reboot from HD
else:
    if OVH_FENCE_DEBUG:
	logfile.write("nothing to do\n")
	logfile.close()
    errlog.close()
    sys.exit()

if action == 'off':
    time.sleep(STATUS_RESCUE_PRO_SLEEP) # Wait for Rescue-Pro to come up
elif action == 'on':
    time.sleep(STATUS_HARD_DISK_SLEEP) # Wait for the OS to boot from HD
elif action == 'reboot':
    time.sleep(STATUS_HARD_DISK_SLEEP) # Wait for the OS to boot from HD
else:
    if OVH_FENCE_DEBUG:
	logfile.write("No sense! Check script please!\n")
	logfile.close()
    errlog.close()
    
    sys.exit()

after_netboot_reboot = datetime.now()

# Verification of success

reboot_start_end=reboot_status(nodeovh,login,passwd)
if OVH_FENCE_DEBUG:
    logfile.write("reboot_start_end.start: " +reboot_start_end.start.strftime('%Y-%m-%d %H:%M:%S')+"\n")
    logfile.write("before_netboot_reboot: " +before_netboot_reboot.strftime('%Y-%m-%d %H:%M:%S')+"\n")
    logfile.write("reboot_start_end.end: " +reboot_start_end.end.strftime('%Y-%m-%d %H:%M:%S')+"\n")
    logfile.write("after_netboot_reboot: " +after_netboot_reboot.strftime('%Y-%m-%d %H:%M:%S')+"\n")

if ((reboot_start_end.start > before_netboot_reboot) and (reboot_start_end.end < after_netboot_reboot)):
    if OVH_FENCE_DEBUG:
	logfile.write("Netboot reboot went OK.\n")
else:
    if OVH_FENCE_DEBUG:
	logfile.write("ERROR: Netboot reboot wasn't OK.\n")
	logfile.close()
    errlog.close()
    sys.exit(1)


if OVH_FENCE_DEBUG:
    logfile.close()
errlog.close()
