6.4. 分析日志组切换频率 SELECT TO_CHAR (first_time, 'yyyy-mm-dd') DAY, COUNT (*) switch_times, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '00', 1, 0)) h00, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '01', 1, 0)) h01, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '02', 1, 0)) h02, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '03', 1, 0)) h03, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '04', 1, 0)) h04, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '05', 1, 0)) h05, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '06', 1, 0)) h06, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '07', 1, 0)) h07, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '08', 1, 0)) h08, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '09', 1, 0)) h09, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '10', 1, 0)) h10, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '11', 1, 0)) h11, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '12', 1, 0)) h12, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '13', 1, 0)) h13, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '14', 1, 0)) h14, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '15', 1, 0)) h15, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '16', 1, 0)) h16, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '17', 1, 0)) h17, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '18', 1, 0)) h18, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '19', 1, 0)) h19, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '20', 1, 0)) h20, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '21', 1, 0)) h21, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '22', 1, 0)) h22, SUM (DECODE (TO_CHAR (first_time, 'hh24'), '23', 1, 0)) h23 FROM v$log_history WHERE first_time >TRUNC (SYSDATE - 30) GROUPBYROLLUP (TO_CHAR (first_time, 'yyyy-mm-dd')); 分析最近时间段redo的切换频率,如果过于频繁或间隔太长增加日志组大小
说明6.5. 查看等待事件 Select username,s.program,s.status,se.event,se.total_waits,se.total_timeouts,se.time_waited,se.average_wait from v$session s,v$session_event se Where s.sid=se.sid And se.event not like 'SQl*Net%' And s.status = 'ACTIVE' And s.username is not null; SQL> select * from (select event \ round(time_waited /(select sum(time_waited) from v$system_event),4) \ total_waits \ round(total_waits /(select sum(total_waits) from v$system_event),4) \ from v$system_event where wait_class!='Idle' order by 2 desc) where rownum <=30; 值%waited代表等待数据比,此值越小越好,此值越高代表等待事物数多,需要检查cp和io
说明6.6. 检查数据库cpu、I/O、内存性能 top - 10:29:35 up 73 days, 19:54, 1 user, load average: 0.37, 0.38, 0.29 Tasks: 353 total, 2 running, 351 sleeping, 0 stopped, 0 zombie Cpu(s): 1.2% us, 0.1% sy, 0.0% ni, 98.8% id, 0.0% wa, 0.0% hi, 0.0% si Mem: 16404472k total, 12887428k used, 3517044k free, 60796k buffers Swap: 8385920k total, 665576k used, 7720344k free, 10358384k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 30495 oracle 15 0 8329m 866m 861m R 10 5.4 7:53.90 oracle 32501 oracle 15 0 8328m 1.7g 1.7g S 2 10.6 1:58.38 oracle 32503 oracle 15 0 8329m 1.6g 1.6g S 2 10.2 2:06.62 oracle 注意上面的蓝色字体部分,此部分内容表示系统剩余的cpu,当其平均值下降至10%以下的时视为CPU使用率异常,需记录下该数值,并将状态记为异常。 6.7. 内存使用情况 [oracle@racdb1 racdb1]$ free -m totalused free shared buffers cached Mem: 2026 1958 67 0 76 1556 -/+ buffers/cache: 326 1700 Swap: 5992 92 5900 如上所示,蓝色部分表示系统总内存,红色部分表示系统使用的内存,黄色部分表示系统剩余内存,当剩余内存低于总内存的10%时视为异常。 6.8. 系统I/O情况 [oracle@racdb1 racdb1]$ iostat -k 1 3 Linux 2.6.32-504.1.3.el6.x86_64 (racdb1.us.example.com) 01/21/15 _x86_64_ (1 CPU) avg-cpu: %user %nice %system %iowait %steal %idle 0.27 0.00 0.51 1.69 0.00 97.53 Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn sda 5.07 14.59 46.92 458025 1472924 dm-0 0.10 1.02 0.09 32077 2800 说明说明dm-1 0.01 0.04 0.00 1408 0 dm-2 0.01 0.02 0.01 713 192 dm-3 11.97 11.39 46.15 357501 1448664 dm-4 0.18 0.21 0.60 6637 18788 dm-5 0.18 1.39 0.05 43749 1576 dm-6 0.02 0.04 0.03 1113 872 说明 如上所示,蓝色字体部分表示磁盘读写情况,红色字体部分为cpu IO等待情况。
重点关注的几个指标就是idle值和iowait,但是idle值非常小(小于20%)或者iowait值非常大 (大于70%)时,说明IO出现问题。
由于IO问题涉及操作系统,存储系统,cp负荷及应用系统等一些因素,故当发现问题请及时联系硬件厂家进行分析处理
6.9. 系统负载情况 [oracle@racdb1 racdb1]$ uptime 21:31:08 up 8:56, 2 users, load average: 0.00, 0.00, 0.00 如上所示,蓝体字部分表示系统负载,后面的3个数值如果有高于2.5的时候就表明系统在超负荷运转了,并将此值记录到巡检表,视为异常
说明6.10. 查看是否有僵死进程 SQL>select spid from v$process where addr not in (select paddr from v$session); SPID ------------------------ 4491 4493 有些僵尸进程有阻塞其他业务的正常运行,定期杀掉僵尸进程 6.11. 检查缓冲区命中率 SQL> SELECT a.VALUE + b.VALUE logical_reads, c.VALUE phys_reads, round(100*(1-c.value/(a.value+b.value)),4) hit_ratio FROM v$sysstat a, v$sysstat b, v$sysstat c WHERE a.NAME='db block gets' AND b.NAME ='consistent gets' AND c.NAME ='physical reads' ; LOGICAL_READS PHYS_READS HIT_RATIO ------------- ---------- ---------- 657342 13067 98.0121 说明说明
如果命中率低于90% 则需加大数据库参数db_cache_size。 6.12. 检查共享池命中率 SQL> select sum(pinhits)/sum(pins)*100 from v$librarycache; SUM(PINHITS)/SUM(PINS)*100 -------------------------- 95.4880803 如低于95%,则需要调整应用程序使用绑定变量,或者调整数据库参数shared pool的大小。 6.13. 检查排序区 SQL> select name,value from v$sysstat where name like '%sort%'; NAME VALUE ---------------------------------------------------------------- ---------- sorts (memory) 66230 sorts (disk) 0 sorts (rows) 408773 如果disk/(memoty+row)的比例过高,则需要调整sort_area_size(workarea_size_policy=false)或pga_aggregate_target(workarea_size_policy=true)。
说明说明 6.14. 检查日志缓冲区 SQL> select name,value from v$sysstat where name in ('redo entries','redo buffer allocation retries'); NAME VALUE ---------------------------------------------------------------- ---------- redo entries 45014 redo buffer allocation retries 1 如果redo buffer allocation retries/redo entries 超过1% ,则需要增大log_buffer。
说明6.15. 检查失效的索引 Sql>select index_name,table_name,tablespace_name,status owner='CTAIS2' And status<>'VALID'; From dba_indexes Where 注:分区表上的索引status为N/A是正常的,如有失效索引则对该索引做rebuild,如: Sql>alter index INDEX_NAME rebuild tablespace TABLESPACE_NAME;
6.16. 检查不起作用的约束 SELECT owner, constraint_name, table_name, constraint_type, status FROM dba_constraints WHERE status ='DISABLE' and constraint_type='P'; 如有失效约束则启用,如:
Sql>alter Table TABLE_NAME Enable Constraints CONSTRAINT_NAME;
6.17. 检查无效的trigger SELECT owner, trigger_name, table_name, status FROM dba_triggers WHERE status = 'DISABLED'; 如有失效触发器则启用,如: Sql>alter Trigger TRIGGER_NAME Enable; 6.18. 检查尚未建立索引的表 SELECT /*+ rule */ owner, segment_name, segment_type, tablespace_name, TRUNC (BYTES / 1024 / 1024, 1) size_mb FROM dba_segments t WHERE NOT EXISTS ( SELECT 'x' FROM dba_indexes i WHERE t.owner = i.table_owner AND t.segment_name = i.table_name) AND t.segment_type IN ('TABLE', 'TABLE PARTITION') AND t.owner NOT IN ('SYS', 'SYSTEM') ORDER BY 5 DESC; 不应该含有比较大而且又不含索引的业务中正式使用的表 6.19. 检查运行时间长的SQL SELECT sql_text \运行次数\/ 4000 \响应时间\FROM v$sql WHERE buffer_gets / decode(executions, 0,1, executions) / 4000 > 10 AND executions > 0; 说明说明
此项目查询结果可以提交开发人员,建议对这些SQL进行调整,或者结合等待事件top5,综
合判断效率低下的原因
6.20. 检查性能差的前10条SQL SELECT * FROM (SELECT PARSING_USER_ID EXECUTIONS,SORTS,COMMAND_TYPE,DISK_READS, SQL_TEXT FROM V$SQLAREA ORDER BY DISK_READS DESC) WHERE ROWNUM<10 ; 6.21. 查看占 io 较大的正在运行的 session SELECTse.sid,se.serial#,pr.SPID,se.username,se.status,se.terminal,se.program, se.MODULE,se.sql_address,st.event,st.p1text,si.physical_reads,si.block_changes FROM v$session se,v$session_wait st,v$sess_io si,v$process pr WHERE st.sid=se.sid AND st.sid=si.sid AND se.PADDR=pr.ADDR AND se.sid>6 AND st.wait_time=0 AND st.event NOT LIKE '%SQL%' ORDER BY physical_reads DESC; 6.22. 检查消耗CPU最高的PID对应的SQL SELECT P.pid pid,S.sid sid,P.spid spid,S.username username,S.osuser osname,P.serial# S_#,P.terminal,P.program program,P.background,S.status,RTRIM(SUBSTR(a.sql_text, 1, 80)) SQL FROM v$process P, v$session S,v$sqlarea A WHERE P.addr = s.paddr AND S.sql_address = a.address (+) AND P.spid LIKE '%&1%'; SELECT '++'||S.username username,RTRIM(REPLACE(a.sql_text,chr(10),''))||';' FROM v$process P, v$session S,v$sqlarea A WHERE P.addr = s.paddr AND S.sql_address = a.address (+) AND P.spid LIKE '%&&1%';