Building an HA Cluster on CentOS 7 (2)

2019-05-24 14:24

2 Nodes configured
3 Resources configured

Online: [ node01 node02 ]

Full list of resources:

 vip (ocf::heartbeat:IPaddr2): Started node02
 Clone Set: dlm-clone [dlm]
     Started: [ node01 node02 ]    ## DLM status

PCSD Status:
  node01: Online
  node02: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

5. STONITH configuration

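List the fence agents supported on this system:
pcs stonith list

Show the details of the fence agent that is going to be used:
pcs stonith describe fence_ilo4

The servers are HP DL380 Gen8 machines, which support iLO 4, but in practice fencing through fence_ilo4 never worked. According to man fence_ilo4, fence_ipmilan can also be used to configure an iLO 4 fence device, provided the lanplus=\ parameter is added.

pcs cluster cib stonith_cfg ## generate the initial configuration file stonith_cfg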

pcs -f stonith_cfg stonith create ipmi-fence-node01 fence_ipmilan parms lanplus=\pcmk_host_list=\pcmk_host_check=\action=\ipaddr=\

pcs -f stonith_cfg stonith create ipmi-fence-node02 fence_ipmilan parms lanplus=\pcmk_host_list=\pcmk_host_check=\action=\pcmk_host_list=\ipaddr=\login=USERID passwd=password op monitor interval=60s

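Explanation: this creates a fence device named ipmi-fence-node01 that is used to fence node01. pcmk_host_list names the node the device may fence, and pcmk_host_check=\ controls how that list is applied, so together with ipaddr they tie node01 to 192.168.103.1. The trailing login=USERID passwd=password op monitor interval=60s needs no further explanation.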
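Since the two fence devices differ only in node name and management address, the command can be generated rather than typed twice. The following is a minimal sketch, not the author's exact command: the function name build_fence_cmd is hypothetical, lanplus=1 is an assumed value for the lanplus option, and the function only prints the command so it can be reviewed before being run.

```shell
#!/bin/bash
# build_fence_cmd NODE BMC_IP USER PASS
# Print (but do not run) a "pcs stonith create" command for one node.
# lanplus=1 is an assumption; check "pcs stonith describe fence_ipmilan"
# for the options your fence-agents version accepts.
build_fence_cmd() {
    local node="$1" bmc_ip="$2" user="$3" pass="$4"
    printf 'pcs -f stonith_cfg stonith create ipmi-fence-%s fence_ipmilan ' "$node"
    printf 'lanplus=1 pcmk_host_list=%s ipaddr=%s ' "$node" "$bmc_ip"
    printf 'login=%s passwd=%s op monitor interval=60s\n' "$user" "$pass"
}

# Print the command for node01 using the address and credentials named in the text.
build_fence_cmd node01 192.168.103.1 USERID password
```

Printing the command first keeps a typo in an address or password from ending up in the working file stonith_cfg unnoticed.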

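pcs -f stonith_cfg stonith
Check the STONITH configuration in stonith_cfg.

pcs -f stonith_cfg property set stonith-enabled=true
STONITH was disabled earlier; enable it now.

pcs -f stonith_cfg property
Check that STONITH is now enabled in stonith_cfg.

pcs cluster cib-push stonith_cfg
Write stonith_cfg into cib.xml.

Test fencing from node02:
stonith_admin --reboot node01

Test fencing from node01:
stonith_admin --reboot node02

pcs cluster standby node01
Put node01 into standby so that the service VIP migrates to node02, and check that the cluster still behaves normally.

After all cluster nodes were rebooted, the GFS file system was not mounted automatically, and an fstab entry did not help either. Analysis showed that pcs cluster start runs at boot and reports success almost immediately, while the system log shows that the cluster has still not finished starting. Because the cluster may not be fully up by the time the system has booted, the GFS file system naturally cannot be mounted.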

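If automatic mounting fails, the retry-mount script I wrote may serve as a reference. Create the script mountnas.sh:

#!/bin/bash
# Retry mounting the GFS volume, up to 49 attempts, 3 seconds apart.
i=1
while (( i < 50 ))
do
    # Stop as soon as the volume shows up in the mount table.
    if mount | grep -q nas; then
        exit 0
    fi
    mount /dev/my_vg/gfsdata /nas
    i=$((i + 1))
    sleep 3
done

chmod +x /etc/rc.local
On CentOS 7, rc.local must also be made executable, otherwise it is not run at boot.

Add the following line to /etc/rc.local:
bash /mountnas.sh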
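The same wait-and-retry idea can be factored into a small reusable helper. This is a sketch under assumptions rather than a replacement for the script above: the function name retry and its argument order are made up for illustration.

```shell
#!/bin/bash
# retry MAX DELAY CMD...
# Run CMD until it succeeds, at most MAX times, sleeping DELAY seconds
# after each failed attempt. Returns 0 on success, 1 if every attempt failed.
retry() {
    local max="$1" delay="$2" attempt=1
    shift 2
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Hypothetical usage with the device and mount point from this article:
#   mountpoint -q /nas || retry 50 3 mount /dev/my_vg/gfsdata /nas
```

Bounding the number of attempts matters here: if the cluster never comes up, an unbounded loop started from rc.local would keep running indefinitely.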

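6. Configuring redundant heartbeat links

In RHCS, multiple heartbeat paths can be configured: one over the fence network and one over the regular network. For corosync/pacemaker clusters, I searched many technical articles from China and abroad and found none that describe this. The main reason is probably that a corosync/pacemaker cluster is already quite reliable: before any second heartbeat was configured, taking the service NIC down caused node02 to drop out of the cluster at once, the cluster IP moved to node01 immediately, and no split-brain occurred. As an experiment of my own, I configured a second heartbeat anyway.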

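Corosync manages the heartbeat, so the change has to be made in its configuration file, /etc/corosync/corosync.conf. In corosync.conf before CentOS 7, multiple networks were defined with interface {} blocks, but on CentOS 7, where everything is managed through pcs, testing showed that only nodelist {} is honored for the networks. Is this yet another change? In totem {}, set:

rrp_mode: passive    # the default is none; passive is required to support two network segments

nodelist {
    node {
        ring0_addr: node01
        ring1_addr: test01    # test01 is the second heartbeat
    }
    node {
        ring0_addr: node02
        ring1_addr: test02    # test02 is the second heartbeat
    }
}

Remember to update the hosts file.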
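Because the ring addresses are given as host names, every node must be able to resolve all four of them before the cluster is restarted. A rough sketch of such a check follows; the function name missing_hosts is hypothetical, and it only inspects a hosts-format file, not DNS.

```shell
#!/bin/bash
# missing_hosts FILE NAME...
# Print each NAME that has no entry in FILE. FILE uses the hosts(5) layout:
# an address followed by one or more names, with "#" starting a comment.
missing_hosts() {
    local file="$1" name
    shift
    for name in "$@"; do
        awk -v n="$name" '
            { sub(/#.*/, "") }                                   # strip comments
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 } # match any name column
            END { exit (found ? 0 : 1) }
        ' "$file" || echo "$name"
    done
}

# Hypothetical usage on each node:
#   missing_hosts /etc/hosts node01 node02 test01 test02
```

No output means every name was found; any name that is printed still needs a line in the hosts file.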

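Restart the cluster for the change to take effect. node01 and node02 are now on the 192.168.102.0 network, and test01 and test02 are on the 192.168.103.0 network. A second service IP can also be created on the 192.168.103.0 network; before configuring that cluster IP, test01 and test02 must be authenticated.

[root@node01 ~]# pcs cluster auth test01 test02
Username: hacluster
Password:
test01: Authorized
test02: Authorized

Output like the following indicates that authentication has succeeded:

[root@node02 ~]# pcs cluster auth test01 test02
test01: Already authorized
test02: Already authorized

Configure the cluster IP:

pcs resource create testip ocf:heartbeat:IPaddr2 ip=192.168.103.10 cidr_netmask=24 op monitor interval=30s

Note that testip has a different name from the vip created earlier; two cluster IP resources cannot share a name.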

7. Configuring a clustered application

Apache is used as the example; its installation is omitted.

pcs resource create Web ocf:heartbeat:apache configfile=/etc/httpd/conf/httpd.conf statusurl=\

Check the Apache status with pcs status.

pcs constraint colocation add Web vip INFINITY
Keep Apache and vip on the same node.

pcs constraint order vip then Web
Start the cluster IP first, then Apache.

pcs constraint location Web prefers node01=200
Prefer node01 for running Apache.

crm_simulate -sL
Show the allocation scores, which reflect the location preference and resource stickiness; cluster services run preferentially on the node with the higher score.

Current cluster status:
Online: [ node01 node02 ]

 vip (ocf:heartbeat:IPaddr2): Started node01
 Web (ocf:heartbeat:apache): Started node01

Allocation scores:
native_color: vip allocation score on node01: 200
native_color: vip allocation score on node02: 50
native_color: Web allocation score on node01: 200
native_color: Web allocation score on node02: 50

Transition Summary:
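With more resources the score listing grows quickly, so it can help to reduce it to one compact line per resource and node. The sketch below assumes the line layout shown in this article ("native_color: RESOURCE allocation score on NODE: SCORE"); other pacemaker versions may label these lines differently, and the function name allocation_scores is made up for illustration.

```shell
#!/bin/bash
# allocation_scores
# Read "crm_simulate -sL" output on stdin and print "resource node score"
# for every native_color allocation line; all other lines are ignored.
allocation_scores() {
    awk '$1 == "native_color:" && $3 == "allocation" && $4 == "score" {
        node = $6
        sub(/:$/, "", node)      # drop the colon that follows the node name
        print $2, node, $7
    }'
}

# Hypothetical usage on a cluster node:
#   crm_simulate -sL | allocation_scores
```

The resulting columns can be passed to sort or grep, for example to list only the nodes on which a given resource scores highest.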

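Moving cluster resources manually

With the configuration above, the cluster assigns resources to node01 and node02 automatically. For maintenance, it is sometimes necessary to move resources to a specific node by hand.

# pcs constraint location Web prefers node01=INFINITY
# pcs constraint --full
Check the resulting constraints.

# pcs constraint remove location-Web-node01-INFINITY
Hand control back to the cluster.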
