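If Linux hangs for a long time, this kernel module automatically reboots the system. The module runs in kernel space and is not affected by system load.
Configuring the module requires two parameters:
hangcheck_tick: how often the check runs (30 seconds in this setup)
hangcheck_margin: the maximum tolerated delay (180 seconds in this setup)
The hangcheck-timer module checks the kernel periodically, at the interval set by hangcheck_tick. As long as the time between two consecutive checks is less than hangcheck_tick + hangcheck_margin (210 seconds with the values above), the kernel is considered healthy; otherwise the system is considered hung and the module reboots it automatically.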
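CRS has one more parameter of its own: MissCount.
Together, these three parameters govern RAC reconfiguration. When the heartbeat between nodes is lost, Clusterware must be certain that the failed node is really dead before it reconfigures the cluster.
The serious risk: a temporary load spike on a node causes missed heartbeats and the other nodes begin reconfiguration, but the node has not rebooted (it is not dead). This can corrupt the database.
MissCount must therefore be greater than the sum hangcheck_tick + hangcheck_margin. This guarantees that by the time the surviving nodes start reconfiguration, the failed node has already been rebooted by the hangcheck-timer module.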
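The constraint above can be checked mechanically. The following is a minimal sketch, not part of Clusterware: check_misscount is a hypothetical helper, and the MissCount value of 240 is an illustrative assumption (on a live cluster the value is typically read with `crsctl get css misscount`).

```shell
# check_misscount TICK MARGIN MISSCOUNT
# Hypothetical helper: succeeds only if MISSCOUNT > TICK + MARGIN.
check_misscount() {
    tick=$1
    margin=$2
    misscount=$3
    limit=$((tick + margin))
    if [ "$misscount" -gt "$limit" ]; then
        echo "OK: MissCount $misscount > $limit"
    else
        echo "UNSAFE: MissCount $misscount <= $limit"
        return 1
    fi
}

# tick=30 and margin=180 come from this document; 240 is an assumed MissCount.
check_misscount 30 180 240
```

A MissCount equal to the sum is rejected as well, because the requirement is strictly "greater than".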
[root@rac1 etc]# find /lib/modules/ -name hangcheck-timer.ko
/lib/modules/2.6.18-8.el5/kernel/drivers/char/hangcheck-timer.ko
[root@rac1 etc]# modprobe -v hangcheck-timer
insmod /lib/modules/2.6.18-8.el5/kernel/drivers/char/hangcheck-timer.ko hangcheck_tick=30 hangcheck_margin=180
***Configure the module to load automatically at boot. Add the following to /etc/rc.d/rc.local
echo \
[root@rac-02 ~]# vi /etc/modprobe.conf
alias scsi_hostadapter mptbase
alias scsi_hostadapter1 mptspi
alias eth0 pcnet32
alias eth1 pcnet32
options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
***Configure the module parameters
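Because the same options line has to be added on every node, it helps to append it only when it is missing. The sketch below is an assumption-laden illustration: ensure_line is a hypothetical helper, and a temporary file stands in for /etc/modprobe.conf so the example can be run without root.

```shell
# ensure_line FILE LINE
# Hypothetical helper: append LINE to FILE unless an identical line exists.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

conf=$(mktemp)    # stand-in for /etc/modprobe.conf
opts='options hangcheck-timer hangcheck_tick=30 hangcheck_margin=180'
ensure_line "$conf" "$opts"
ensure_line "$conf" "$opts"    # second call is a no-op
cat "$conf"                    # the options line appears exactly once
rm -f "$conf"
```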
[root@rac-02 ~]# modprobe hangcheck-timer
[root@rac-02 ~]# grep Hangcheck /var/log/messages |tail -2
Jun 19 04:14:13 rac-02 kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 30 seconds, margin is 180 seconds).
Jun 19 04:14:13 rac-02 kernel: Hangcheck: Using get_cycles().
The parameters had clearly been wrong; reloading the module resolved the problem. The same steps must be performed on both nodes.
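The check against the kernel log can be scripted rather than read by eye. This is a minimal sketch: hangcheck_params is a hypothetical helper, and the sample line is copied from the log output above.

```shell
# hangcheck_params
# Hypothetical helper: reads kernel log lines on stdin and prints
# "TICK MARGIN" for each hangcheck-timer startup message it finds.
hangcheck_params() {
    sed -n 's/.*tick is \([0-9][0-9]*\) seconds, margin is \([0-9][0-9]*\) seconds.*/\1 \2/p'
}

# On a real node the input would come from: grep Hangcheck /var/log/messages
sample='Jun 19 04:14:13 rac-02 kernel: Hangcheck: starting hangcheck timer 0.9.0 (tick is 30 seconds, margin is 180 seconds).'
printf '%s\n' "$sample" | hangcheck_params
```

For the sample line this prints `30 180`, which can then be compared with the values in /etc/modprobe.conf.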
//////////////////////////////////////////////////////////////////////////////////////////////////////////
***Configure raw devices (AS5)
ACTION==\ACTION==\ACTION==\ACTION==\KERNEL==\KERNEL==\KERNEL==\KERNEL==\
[root@rac2 ~]# vi /etc/rc.local
#!/bin/sh
#
# This script will be executed *after* all the other init scripts.
# You can put your own initialization stuff in here if you don't
# want to do the full Sys V style init stuff.
touch /var/lock/subsys/local
modprobe -v hangcheck-timer
partprobe
raw /dev/raw/raw1 /dev/sdb1
raw /dev/raw/raw2 /dev/sdb2
raw /dev/raw/raw3 /dev/sde1
raw /dev/raw/raw4 /dev/sde2
chown -R oracle.dba /dev/raw/
chmod 660 /dev/raw/raw1
chmod 660 /dev/raw/raw2
chmod 660 /dev/raw/raw3
chmod 660 /dev/raw/raw4
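***If the machine has not been rebooted, bind the raw devices manually first
[root@rac2 ~]# partprobe
[root@rac2 ~]# raw /dev/raw/raw1 /dev/sdb1
[root@rac2 ~]# raw /dev/raw/raw2 /dev/sdb2
[root@rac2 ~]# raw /dev/raw/raw3 /dev/sde1
[root@rac2 ~]# raw /dev/raw/raw4 /dev/sde2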
[root@rac2 ~]# ll /dev/raw
total 0
crw------- 1 root root 162, 1 Jun 17 13:22 raw1
crw------- 1 root root 162, 2 Jun 17 13:22 raw2
Run on the other node:
chkconfig --list rawdevices
service rawdevices restart
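The raw bindings in rc.local above follow a regular pattern, so they can be generated from a device list instead of being typed by hand. The sketch below is a dry run under stated assumptions: gen_raw_bindings is a hypothetical helper that only prints the commands, and the oracle.dba ownership and 660 mode are taken from the rc.local listing.

```shell
# gen_raw_bindings DEVICE...
# Hypothetical helper: prints (does not execute) the raw, chmod and chown
# commands for the given block devices, numbering raw devices from 1.
gen_raw_bindings() {
    n=1
    for dev in "$@"; do
        echo "raw /dev/raw/raw$n $dev"
        echo "chmod 660 /dev/raw/raw$n"
        n=$((n + 1))
    done
    echo "chown -R oracle.dba /dev/raw/"
}

# Device names as used in this document.
gen_raw_bindings /dev/sdb1 /dev/sdb2 /dev/sde1 /dev/sde2
```

Printing the commands first makes it easy to review them on both nodes before appending them to /etc/rc.local.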
//////////////////////////////////////////////////////////////////////////////////////////////////////////
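***Create ASM disks
There are several ways to create ASM disks. Here we use ASMLib, which requires the ASMLib RPM packages; these were installed earlier.
Create /u01/oracle/product/crs and /u01/oracle/product/database, and change the owner to oracle:dba
mkdir /u01/oracle/product/crs -p
mkdir /u01/oracle/product/database -p
chown -R oracle.dba /u01/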
***Before creating the ASM disks, first create the corresponding disk partitions: /dev/sdc1 on /dev/sdc and /dev/sdd1 on /dev/sdd
[root@rac1 ~]# fdisk /dev/sdc
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-391, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-391, default 391):
Using default value 391
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
***Create the ASM disks
[root@rac1 ~]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library driver. The following questions will determine whether the driver is loaded on boot and what permissions it will have. The current values will be shown in brackets ('[]'). Hitting
Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration: [ OK ]
Creating /dev/oracleasm mount point: [ OK ]
Loading module \ [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]
***Configure and enable the ASM driver.
/etc/init.d/oracleasm createdisk VOL1 /dev/sdc1
/etc/init.d/oracleasm createdisk VOL2 /dev/sdc2
/etc/init.d/oracleasm createdisk VOL3 /dev/sdd1
/etc/init.d/oracleasm createdisk VOL4 /dev/sdd2
[root@rac1 ~]# /etc/init.d/oracleasm createdisk VOL1 /dev/sdc1
Marking disk \ [ OK ]
[root@rac1 ~]# /etc/init.d/oracleasm createdisk VOL2 /dev/sdc1
Marking disk \ [ OK ]
***Two ASM disks have now been created on RAC1.
oracleasm listdisks
***Configure the ASM driver on RAC2. Be sure to include the directory path when running the command
[root@rac2 ~]# /etc/init.d/oracleasm enable
Writing Oracle ASM library driver configuration: [ OK ]
Creating /dev/oracleasm mount point: [ OK ]
Loading module \ [ OK ]
Mounting ASMlib driver filesystem: [ OK ]
Scanning system for ASM disks: [ OK ]
[root@rac2 ~]# /etc/init.d/oracleasm status
Checking if ASM is loaded: [ OK ]
Checking if /dev/oracleasm is mounted: [ OK ]
[root@rac2 ~]# /etc/init.d/oracleasm listdisks
VOL1
VOL2
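The second node should list the same volumes that were created on the first. A small sketch of that comparison follows; missing_asm_disks is a hypothetical helper, and the example feeds it the listdisks output shown above instead of calling oracleasm, so it runs without ASMLib installed.

```shell
# missing_asm_disks NAME...
# Hypothetical helper: reads listdisks-style output (one volume name per
# line) on stdin and prints every expected NAME that is not present.
missing_asm_disks() {
    found=$(cat)
    for vol in "$@"; do
        printf '%s\n' "$found" | grep -qx "$vol" || echo "$vol"
    done
}

# On a real node the input would come from: /etc/init.d/oracleasm listdisks
printf 'VOL1\nVOL2\n' | missing_asm_disks VOL1 VOL2
```

No output means every expected volume is visible on that node; any name printed still needs to be discovered there.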
[root@rac2 ~]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.
This will configure the on-boot properties of the Oracle ASM library driver. The following questions will determine whether the driver is loaded on boot and what permissions it will have. The current values will be shown in brackets ('[]'). Hitting
Default user to own the driver interface []: oracle