| « | 三月 2010 | » | ||||
|---|---|---|---|---|---|---|
| 一 | 二 | 三 | 四 | 五 | 六 | 日 |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| 15 | 16 | 17 | 18 | 19 | 20 | 21 |
| 22 | 23 | 24 | 25 | 26 | 27 | 28 |
| 29 | 30 | 31 | ||||
经常有人问temp表空间暴涨的问题,以及如何回收临时表空间,由于版本的不同,方法显然也多种多样,但这些方法显示是治标不治本的办法,只有深刻理解temp表空间快速增加的原因,才能从根本上解决temp ts的问题。
是什么操作在使用temp ts?
- 索引创建或重创建.
- ORDER BY or GROUP BY
- DISTINCT 操作.
- UNION & INTERSECT & MINUS
- Sort-Merge joins.
- Analyze 操作
- 有些异常将会引起temp暴涨
所以,在处理以上操作时,dba需要加倍关注temp的使用情况,v$sort_segment字典可以记载temp的比较详细的使用情况,而v$sort_usage将会告诉我们是谁在做什么.
sql>select tablespace_name,current_users,total_blocks,used_blocks,free_blocks from v$sort_segment;
TABLESPACE_NAME CURRENT_USERS TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS
------------------------------- ------------- ------------ ----------- -----------
TEMP 1 63872 30464 33408
sql>
SQL>select username,session_addr,sqladdr,sqlhash from v$sort_usage
USERNAME SESSION_ADDR SQLADDR SQLHASH
------------------------------ ---------------- ---------------- ----------
CYBERCAFE C0000000D7EF99E8 C0000000E1BFE970 4053158416
然后通过多表联接,我们可以找出更详细的操作:
SQL>select se.username,se.sid,su.extents,su.blocks*to_number(rtrim(p.value)) as Space,tablespace,segtype,sql_text from v$sort_usage su,v$parameter p,v$session se,v$sql s
where p.name='db_block_size' and su.session_addr=se.saddr and s.hash_value=su.sqlhash and s.address=su.sqladdr order by se.username,se.sid;
USERNAME SID EXTENTS SPACE TABLESPACE SEGTYPE
------------------------------ ---------- ---------- ---------- ------------------------------- ---------
SQL_TEXT
-------------------------------------------------------------------------------------------------------------------------
CYBERCAFE 42 238 249561088 TEMP SORT
select 1 from sys.streams$_prepare_ddl p where ((p.global_flag=1 and :1 is null) or (p.global_flag=0 and p.usrid=:2)) and rownum=1
本例应该是由一些异常引起的,其实大多数情况下sort都会在几乎内结束,如果在sort操作的若干秒内刚好就捕获了该SQL,应该走狗屎运的事情,即你知道某个SQL将会发生sort操作,当你想捕抓它们时,发现它们已经sort完了,排序完毕后sort segment会被smon清除。但很多时间,我们则会遇到临时段没有被释放,temp表空间几乎满的状况,这时该如何处理呢?
metalink上推荐的方法收集整理如下
-- 重启实例
重启实例重启时,smon进程会完成临时段释放,不过很多的时侯我们的库是不允许down的,
所以这种方法缺应用机会不多,不过这种方法还是很好用的,如果你的实例在重启后sort段
没有被释放,这种情况就需要慎重对待。
-- 修改参数 (仅适用于8i及8i以下版本)
SQL>alter tablespace temp increase 1;
SQL>alter tablespace temp increase 0;
-- 合并碎片
SQL>alter tablespace temp coalesce;
-- 诊断事件
SQL>alter session set events 'immediate trace name DROP_SEGMENTS level 4'
说明:temp表空间的TS#为3,So TS#+1=4
-- 重建temp
SQL>alter database temp tempfile '......' drop;
SQL>alter tablespace temp add tempfile '......';
可以说,以上的方法都是治标不治本的,因为temp增长过快显然是由于disk sort过多,造成disk
sort的原因也很多,比如sort area较小等原因,当然,sort area设置多大才合理?这个当然需要满足In-memory Sort大于99%以上哦。
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Buffer Nowait %: 100.00 Redo NoWait %: 99.99
Buffer Hit %: 99.36 In-memory Sort %: 100.00
Library Hit %: 99.87 Soft Parse %: 99.84
Execute to Parse %: 1.17 Latch Hit %: 99.96
Parse CPU to Parse Elapsd %: 92.00 % Non-Parse CPU: 94.59
排序区域的分配
- 专用服务器分配sort area.
排序区域在PGA.
- 共享服务器分配sort area.
排序区域在UGA. (UGA在shared pool中分配).
在9i以前的版本,由sort_area_size决定sort area的分配,在9i及以后的版本,当workarea_size_policy等auto时,由pga_aggregate_target参数决定sort
area的大于,这时的sort area应该是pga总内存的5%.当workarea_size_policy等manual时,sort area的大小还是于sort_area_size决定.
无论是那个版本,如果sort area开得过小,In-memory Sort率较低,那temp表空间肯定会增长得很快,如果开得较高,在C/S结构中将会导致内存消耗严重(长连接较多).
由于smon进程每隔5分钟都要对不再使用的sort segment进行回收,如果你不想让
smon回收sort segment的话,可以使用以下两个event写入初始化参数文件,然后重启实例,这样如果你的磁盘排序较多,很快就会涨暴磁盘......
event="10061 trace name context forever, level 10" //禁止加收
event="10269 trace name context forever, level 10" //禁止合并碎片
通过合理地设置pga或sort_area_size,可以消除大部分的dist sort,那其它的disk
sort该如何处理呢?从sort引起的原因来看,索引/分析/异常引起的disk sort应该是很少的一部分,其它的应该是select中的distinct/union/group by/order by以及
merge sort join啦,那我们如何捕获这些操作呢?通常如何有磁盘排序的SQL,它的逻辑读/物理读/排序/执行时间等都是比较大的,所以我们可以对v$sqlarea或v$sql字典进行过滤,经过长期地监控数据库,相信可以把这些害群之马找出来.即然找出这些引起disk sort的SQL后怎么办呢?当然是对SQL进行分析,尽而优化之。
[oracle@www1 sql]$ more show_sql.sh
#!/bin/bash
sqlplus -s aaa/bbb
col disk_reads format 999999.99
col bgets_per format 99999999.99
col "ELAPSD_TIME(s)" format 9999.99
col "cpu_time(s)" format 9999.99
set long 99999999999
set pagesize 9999
select address,hash_value,disk_reads/executions disk_reads,elapsed_time/1000000/executions as "ELAPSD_TIME(s)",
buffer_gets/executions bgets_per,executions,first_load_time as first_time,sql_text
from v$sql
where executions > 0 and (disk_reads/executions > 500 or buffer_gets/executions > 20000) and command_type = 3
order by 3,4;
--select s.disk_reads,s.buffer_gets/s.executions bgets_per,first_load_time,st.sql_text
-- from v$sql s,v$sqltext_with_newlines st
--where s.address=st.address and s.hash_value=st.hash_value
-- and s.disk_reads > 1000 or (s.executions > 0 and s.buffer_gets/s.executions > 50000)
--order by st.piece;
exit
!
总结,如何从根本上降低temp表空间的膨胀呢?方法有2个:
1 设置合理的pga或sort_area_size
2 优化引起disk sort的sql
xzh2000 | 19 一月, 2005 16:14
表结构优化实例
本方法适合表中存在大量的标志字段,如下表,很多字段就只存储Y/N或0/1/2/4等,
如结构如下,
create table agent_settime_onself
( agent_settime_onself_id number(11,0),
agent_card_type_id number(11,0) not null enable,
start_time_1 char(5) default '00:00' not null enable,
start_time_1_enabled char(1) default 'N',
stop_time_1 char(5) default '23:59' not null enable,
stop_time_1_enable char(1) default 'N',
start_time_2 char(5) default '00:00' not null enable,
start_time_2_enabled char(1) default 'N',
stop_time_2 char(5) default '23:59' not null enable,
stop_time_2_enable char(1) default 'N',
start_time_3 char(5) default '00:00' not null enable,
start_time_3_enabled char(1) default 'N',
stop_time_3 char(5) default '23:59' not null enable,
stop_time_3_enable char(1) default 'N',
start_time_4 char(5) default '00:00' not null enable,
start_time_4_enabled char(1) default 'N',
stop_time_4 char(5) default '23:59' not null enable,
stop_time_4_enable char(1) default 'N',
start_time_5 char(5) default '00:00' not null enable,
start_time_5_enabled char(1) default 'N',
stop_time_5 char(5) default '23:59' not null enable,
stop_time_5_enable char(1) default 'N',
start_time_6 char(5) default '00:00' not null enable,
start_time_6_enabled char(1) default 'N',
stop_time_6 char(5) default '23:59' not null enable,
stop_time_6_enable char(1) default 'N',
start_time_7 char(5) default '00:00' not null enable,
start_time_7_enabled char(1) default 'N',
stop_time_7 char(5) default '23:59' not null enable,
stop_time_7_enable char(1) default 'N')
表的访问SQL如下:
select a.*,sysdate now,to_char(sysdate,'D') as week
from agent_settime_onself a
where ((start_time_1_enabled = 'Y' OR start_time_2_enabled = 'Y'
OR start_time_3_enabled = 'Y' OR start_time_4_enabled = 'Y'
OR start_time_5_enabled = 'Y' OR start_time_6_enabled = 'Y'
OR start_time_7_enabled = 'Y')
OR (stop_time_1_enable = 'Y' OR stop_time_2_enable = 'Y'
OR stop_time_3_enable = 'Y' OR stop_time_4_enable = 'Y'
OR stop_time_5_enable = 'Y' OR stop_time_6_enable = 'Y'
OR stop_time_7_enable = 'Y'))
这么多用于判断的字段,如果都创建索引也是不合理的,再说这些字段的数据分布不均匀,
如何对这样的表结构进行优化呢?优化后的表结构如下:
create table agent_settime_onself
( agent_settime_onself_id number(11,0),
agent_card_type_id number(11,0) not null enable,
start_time_1 char(5) default '00:00' not null enable,
start_time_enabled char(7) default 'NNNNNNN',
stop_time_1 char(5) default '23:59' not null enable,
stop_time_enable char(7) default 'NNNNNNN',
start_time_2 char(5) default '00:00' not null enable,
stop_time_2 char(5) default '23:59' not null enable,
start_time_3 char(5) default '00:00' not null enable,
stop_time_3 char(5) default '23:59' not null enable,
start_time_4 char(5) default '00:00' not null enable,
stop_time_4 char(5) default '23:59' not null enable,
start_time_5 char(5) default '00:00' not null enable,
stop_time_5 char(5) default '23:59' not null enable,
start_time_6 char(5) default '00:00' not null enable,
stop_time_6 char(5) default '23:59' not null enable,
start_time_7 char(5) default '00:00' not null enable,
stop_time_7 char(5) default '23:59' not null enable)
create index idx_aso_sta_enabled on agent_settime_onself(start_time_enabled);
create index idx_aso_sto_enabled on agent_settime_onself(stop_time_enabled);
那以后访问该表的SQL如下:
select a.*,sysdate now,to_char(sysdate,'D') as week
from agent_settime_onself a
where start_time_enabled = 'YYYYYYY'
select a.*,sysdate now,to_char(sysdate,'D') as week
from agent_settime_onself a
where stop_time_enabled = 'YYYYYYY'
今天早上一来,数据库load就比往常高了许多。想想数据库唯一的变化是昨天早上我曾经重新分析过数据库对象。
发现数据库load很高,首先看top发现没有特别异常的进程,在数据库中适时抓取正在运行的sql也没发现异常(通常运行时间非常短的sql是不能被抓取到的)。询问相关应用程序人员,最近没有变动。检查应用程序日志发现今天早上跟往常也没有过多登陆和操作。基本上可以圈定是在数据库服务器本身上面。
但是这个时候我还没有办法确定到底是哪个应用的哪个查询的问题,因为数百个进程的几十台server连着,我不能去及时的追踪。打算等到10点过去后,抽取8/9/10高峰期的整点的statspack出来,跟上星期的这个时间产生的报告对比看看。
通过对比报告我们发现 CPU TIME 今天一小时内增加了大约1200秒(2,341 - 1,175 )。这是一个重大的变化,很显然是两种可能
1:今天过多地执行了某些sql
2:某些sql的执行计划发生变化导致cpu使用过多
Top 5 Timed Events
~~~~~~~~~~~~~~~~~~ % Total
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
CPU time 2,341 42.60
db file sequential read 387,534 2,255 41.04
global cache cr request 745,170 231 4.21
log file sync 98,041 229 4.17
log file parallel write 96,264 158 2.88
Event Waits Time (s) Ela Time
-------------------------------------------- ------------ ----------- --------
db file sequential read 346,851 1,606 47.60
CPU time 1,175 34.83
global cache cr request 731,368 206 6.10
log file sync 90,556 91 2.71
db file scattered read 37,746 90 2.66
接下来我对比了 sql 部分内容,发现
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
17,899,606 47,667 375.5 55.6 1161.27 1170.22 3481369999
Module: /home/oracle/AlitalkSrv/config/../../AlitalkSrv/
SELECT login_id, to_char(gmt_create, 'YYYY-MM-DD HH24:MI:SS')
from IM_BlackList where black_id = :b1
这条sql出现在了今天报告的前列,而以往的报告中该sql根本不排在 buffer gets 前面位置,显然这条sql消耗了大约 1161.27 秒 cpu time .检查原来的报告
Buffer Gets Executions Gets per Exec %Total Time (s) Time (s) Hash Value
--------------- ------------ -------------- ------ -------- --------- ----------
107,937 47,128 2.3 0.8 7.39 6.94 3481369999
Module: /home/oracle/AlitalkSrv/config/../../AlitalkSrv/
SELECT login_id, to_char(gmt_create, 'YYYY-MM-DD HH24:MI:SS')
from IM_BlackList where black_id = :b1
我们发现只消耗了 7.39 秒的 cpu time 。
到这个时候我基本可以断定,是由于这个sql没有走索引而走了全表扫描。但是为什么会走全表扫描呢,这是一个问题,接下来我检查了表的索引:
SQL> select index_name,column_name from user_ind_columns where table_name = 'IM_BLACKLIST';
IM_BLACKLIST_PK LOGIN_ID
IM_BLACKLIST_PK BLACK_ID
IM_BLACKLIST_LID_IND BLACK_ID
很显然存在着 black_id 的单独的索引,应该正常使用才对。于是我在生产库上执行这个sql一看,却发现走了全表扫描。为此我到一个周六的standby的opren read only 的数据库上查询了一下该索引字段的histogram(这个时候昨天早上分析对象的日志还没有被应用过去)
sys@OCN> select COLUMN_NAME ,ENDPOINT_NUMBER, ENDPOINT_VALUE , ENDPOINT_ACTUAL_VALUE from dba_histograms
2 where table_name = 'IM_BLACKLIST' and column_name = 'BLACK_ID';
COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
------------------------------ --------------- -------------- ------------------------------
BLACK_ID 0 2.5031E+35
BLACK_ID 1 2.6065E+35
BLACK_ID 2 2.8661E+35
BLACK_ID 3 5.0579E+35
BLACK_ID 4 5.0585E+35
BLACK_ID 5 5.0585E+35
BLACK_ID 6 5.0589E+35
BLACK_ID 7 5.0601E+35
BLACK_ID 8 5.1082E+35
BLACK_ID 9 5.1119E+35
BLACK_ID 10 5.1615E+35
BLACK_ID 11 5.1616E+35
BLACK_ID 12 5.1628E+35
BLACK_ID 13 5.1646E+35
BLACK_ID 14 5.2121E+35
BLACK_ID 15 5.2133E+35
BLACK_ID 16 5.2155E+35
BLACK_ID 17 5.2662E+35
BLACK_ID 18 5.3169E+35
BLACK_ID 19 5.3193E+35
BLACK_ID 20 5.3686E+35
BLACK_ID 21 5.3719E+35
BLACK_ID 22 5.4198E+35
BLACK_ID 23 5.4206E+35
BLACK_ID 24 5.4214E+35
BLACK_ID 25 5.4224E+35
BLACK_ID 26 5.4238E+35
BLACK_ID 27 5.4246E+35
BLACK_ID 28 5.4743E+35
BLACK_ID 29 5.5244E+35
BLACK_ID 30 5.5252E+35
BLACK_ID 31 5.5252E+35
BLACK_ID 32 5.5272E+35
BLACK_ID 33 5.5277E+35
BLACK_ID 34 5.5285E+35
BLACK_ID 35 5.5763E+35
BLACK_ID 36 5.6274E+35
BLACK_ID 37 5.6291E+35
BLACK_ID 38 5.6291E+35
BLACK_ID 39 5.6291E+35
BLACK_ID 40 5.6291E+35
BLACK_ID 42 5.6311E+35
BLACK_ID 43 5.6794E+35
BLACK_ID 44 5.6810E+35
BLACK_ID 45 5.6842E+35
BLACK_ID 46 5.7351E+35
BLACK_ID 47 5.8359E+35
BLACK_ID 48 5.8887E+35
BLACK_ID 49 5.8921E+35
BLACK_ID 50 5.9430E+35
BLACK_ID 51 5.9913E+35
BLACK_ID 52 5.9923E+35
BLACK_ID 53 5.9923E+35
BLACK_ID 54 5.9931E+35
BLACK_ID 55 5.9947E+35
BLACK_ID 56 5.9959E+35
BLACK_ID 57 6.0428E+35
BLACK_ID 58 6.0457E+35
BLACK_ID 59 6.0477E+35
BLACK_ID 60 6.0479E+35
BLACK_ID 61 6.1986E+35
BLACK_ID 62 6.1986E+35
BLACK_ID 63 6.1994E+35
BLACK_ID 64 6.2024E+35
BLACK_ID 65 6.2037E+35
BLACK_ID 66 6.2521E+35
BLACK_ID 67 6.2546E+35
BLACK_ID 68 6.3033E+35
BLACK_ID 69 6.3053E+35
BLACK_ID 70 6.3069E+35
BLACK_ID 71 6.3553E+35
BLACK_ID 72 6.3558E+35
BLACK_ID 73 6.3562E+35
BLACK_ID 74 6.3580E+35
BLACK_ID 75 1.1051E+36
然后对比了一下当前的histograms
COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
------------------------------ --------------- -------------- ------------------------------
BLACK_ID 0 1.6715E+35
BLACK_ID 1 2.5558E+35
BLACK_ID 2 2.7619E+35
BLACK_ID 3 2.9185E+35
BLACK_ID 4 5.0579E+35
BLACK_ID 5 5.0589E+35
BLACK_ID 6 5.0601E+35
BLACK_ID 7 5.1100E+35
BLACK_ID 8 5.1601E+35
BLACK_ID 9 5.1615E+35
BLACK_ID 10 5.1624E+35
BLACK_ID 11 5.1628E+35
BLACK_ID 12 5.1642E+35
BLACK_ID 13 5.2121E+35
BLACK_ID 14 5.2131E+35
BLACK_ID 15 5.2155E+35
BLACK_ID 16 5.2676E+35
BLACK_ID 17 5.3175E+35
BLACK_ID 18 5.3684E+35
BLACK_ID 19 5.3727E+35
BLACK_ID 20 5.4197E+35
BLACK_ID 21 5.4200E+35
BLACK_ID 22 5.4217E+35
BLACK_ID 23 5.4238E+35
BLACK_ID 24 5.4244E+35
BLACK_ID 25 5.4755E+35
BLACK_ID 26 5.5252E+35
BLACK_ID 27 5.5252E+35
BLACK_ID 28 5.5252E+35
BLACK_ID 29 5.5283E+35
BLACK_ID 30 5.5771E+35
BLACK_ID 31 5.6282E+35
BLACK_ID 32 5.6291E+35
BLACK_ID 33 5.6291E+35
BLACK_ID 34 5.6291E+35
BLACK_ID 35 5.6299E+35
BLACK_ID 36 5.6315E+35
BLACK_ID 37 5.6794E+35
BLACK_ID 39 5.6816E+35
BLACK_ID 40 5.6842E+35
BLACK_ID 41 5.7838E+35
BLACK_ID 42 5.8877E+35
BLACK_ID 43 5.8917E+35
BLACK_ID 44 5.9406E+35
BLACK_ID 45 5.9909E+35
BLACK_ID 46 5.9923E+35
BLACK_ID 47 5.9923E+35
BLACK_ID 48 5.9946E+35
BLACK_ID 49 5.9950E+35
BLACK_ID 50 5.9960E+35
BLACK_ID 51 5.9960E+35
BLACK_ID 52 5.9960E+35
BLACK_ID 53 5.9960E+35
BLACK_ID 54 5.9960E+35
BLACK_ID 55 5.9960E+35
BLACK_ID 56 5.9960E+35BLACK_ID 57 6.0436E+35
BLACK_ID 58 6.0451E+35
BLACK_ID 59 6.0471E+35
BLACK_ID 60 6.1986E+35
BLACK_ID 61 6.1998E+35
BLACK_ID 62 6.2014E+35
BLACK_ID 63 6.2037E+35
BLACK_ID 64 6.2521E+35
BLACK_ID 65 6.2544E+35
BLACK_ID 66 6.3024E+35
BLACK_ID 67 6.3041E+35
BLACK_ID 68 6.3053E+35
BLACK_ID 69 6.3073E+35
BLACK_ID 70 6.3558E+35
BLACK_ID 71 6.3558E+35
BLACK_ID 72 6.3558E+35
BLACK_ID 73 6.3558E+35
BLACK_ID 74 6.3580E+35
BLACK_ID 75 1.1160E+36
我们发现原来的histograms值分布比较均匀,而昨天分析后的值分布就有一些地方是集中的,参考上面红色部分。
于是我再做了个 10053 dump对比,昨天分析之前
通过 alter session set events '10053 trace name context forever';
然后执行相关的sql 再去看trace文件
Table stats Table: IM_BLACKLIST Alias: IM_BLACKLIST
TOTAL :: CDN: 57477 NBLKS: 374 AVG_ROW_LEN: 38
-- Index stats
INDEX NAME: IM_BLACKLIST_LID_IND COL#: 2
TOTAL :: LVLS: 1 #LB: 219 #DK: 17181 LB/K: 1 DB/K: 2 CLUF: 44331
INDEX NAME: IM_BLACKLIST_PK COL#: 1 2
TOTAL :: LVLS: 1 #LB: 304 #DK: 57477 LB/K: 1 DB/K: 1 CLUF: 55141
_OPTIMIZER_PERCENT_PARALLEL = 0
***************************************
SINGLE TABLE ACCESS PATH
Column: BLACK_ID Col#: 2 Table: IM_BLACKLIST Alias: IM_BLACKLIST
NDV: 17181 NULLS: 0 DENS: 5.8204e-05
NO HISTOGRAM: #BKT: 1 #VAL: 2
TABLE: IM_BLACKLIST ORIG CDN: 57477 ROUNDED CDN: 3 CMPTD CDN: 3
Access path: tsc Resc: 38 Resp: 38 Access path: index (equal)
Index: IM_BLACKLIST_LID_IND
TABLE: IM_BLACKLIST
RSC_CPU: 0 RSC_IO: 4 IX_SEL: 0.0000e+00 TB_SEL: 5.8204e-05
Skip scan: ss-sel 0 andv 27259
ss cost 27259
table io scan cost 38
Access path: index (no sta/stp keys)
Index: IM_BLACKLIST_PK
TABLE: IM_BLACKLIST
RSC_CPU: 0 RSC_IO: 309
IX_SEL: 1.0000e+00 TB_SEL: 5.8204e-05
BEST_CST: 4.00 PATH: 4 Degree: 1
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***********************
Join order[1]: IM_BLACKLIST [IM_BLACKLIST]
Best so far: TABLE#: 0 CST: 4 CDN: 3 BYTES: 75
Final:
CST: 4 CDN: 3 RSC: 4 RSP: 4 BYTES: 75
IO-RSC: 4 IO-RSP: 4 CPU-RSC: 0 CPU-RSP: 0
这是昨天分析之后的
SINGLE TABLE ACCESS PATH
Column: BLACK_ID Col#: 2 Table: IM_BLACKLIST Alias: IM_BLACKLIST
NDV: 17069 NULLS: 0 DENS: 1.4470e-03
HEIGHT BALANCED HISTOGRAM: #BKT: 75 #VAL: 75
TABLE: IM_BLACKLIST ORIG CDN: 57267 ROUNDED CDN: 83 CMPTD CDN: 83
Access path: tsc Resc: 38 Resp: 38
Access path: index (equal)
Index: IM_BLACKLIST_LID_IND
TABLE: IM_BLACKLIST
RSC_CPU: 0 RSC_IO: 65
IX_SEL: 0.0000e+00 TB_SEL: 1.4470e-03
Skip scan: ss-sel 0 andv 27151
ss cost 27151
table io scan cost 38
Access path: index (no sta/stp keys)
Index: IM_BLACKLIST_PK
TABLE: IM_BLACKLIST
RSC_CPU: 0 RSC_IO: 384
IX_SEL: 1.0000e+00 TB_SEL: 1.4470e-03
BEST_CST: 38.00 PATH: 2 Degree: 1
***************************************
OPTIMIZER STATISTICS AND COMPUTATIONS
***************************************
GENERAL PLANS
***********************
Join order[1]: IM_BLACKLIST [IM_BLACKLIST]
Best so far: TABLE#: 0 CST: 38 CDN: 83 BYTES: 2407
Final:
CST: 38 CDN: 83 RSC: 38 RSP: 38 BYTES: 2407
IO-RSC: 38 IO-RSP: 38 CPU-RSC: 0 CPU-RSP: 0
我发现分析之前之后全表扫描cost都是38,但是分析之后的根据索引扫描却成为了 65 ,而分析之前是4。
很显然是由于这个查询导致昨天早上分析之后走了全表扫描。于是我再对表进行了分析,只不过这次我没有分析索引字段,而是
analyze table im_blacklist compute statistics;
这样之后,dbms_histograms 中信息只剩下
COLUMN_NAME ENDPOINT_NUMBER ENDPOINT_VALUE ENDPOINT_ACTUAL_VALUE
------------------------------ --------------- -------------- ------------------------------
GMT_CREATE 0 2452842.68
GMT_MODIFIED 0 2452842.68
LOGIN_ID 0 2.5021E+35
BLACK_ID 0 1.6715E+35
GMT_CREATE 1 2453269.44
GMT_MODIFIED 1 2453269.44
LOGIN_ID 1 6.3594E+35
BLACK_ID 1 1.1160E+36
再执行该sql,就走了索引,从而使得数据库的load降了下来。
分析这整个过程中,我无法知道oracle的走索引cost 65 是怎么计算出来的,当然是跟 histograms有关,但计算方法我却是不清楚的。
这条sql是 bind var,但是却走了全表扫描,这是由于920中数据库在对bind var 的sql进行第一次解析的时候去histograms中窥视了数据分布从而根据cost选择了FTS。然后后面继续执行的sql呢,则不论是否该走索引,都走了FTS。这是920这个版本的特性的弊病。也就是说,这有偶然性因素的存在。 但是对于这个表,我做了分析(不分析索引字段)之后不存在histograms,则sql无论如何都走了索引扫描。
kill session[metalink]
Problem Description -------------------
Killing the session will not clear the locks. The session on the remote
database will remain idle waiting for input until the network read times
out. Only then the kill session is processed, and locks are released.
Solution Description
--------------------
Options:
========
You can shutdown and restart the database or use the ORAKILL utility to kill
threads.
Oracle has provided an ORAKILL utility that will kill shadow threads. Each
user's connection is represented by a thread in the Oracle process. If a
user's session is killed, then their Oracle session is killed - not the thread.
Oracle has provided an ORAKILL utility which can be passed a thread ID and will
kill the specified thread.
To make sure you do not kill a background process (which would crash your
database), you must perform a select to ensure you get the correct thread.
---
select p.spid "OS Thread", b.name "Name-User", s.osuser, s.program
from v$process p, v$session s, v$bgprocess b
where p.addr = s.paddr
and p.addr = b.paddr UNION ALL
select p.spid "OS Thread", s.username "Name-User", s.osuser, s.program
from v$process p, v$session s
where p.addr = s.paddr
and s.username is not null;
---
This will list all Shadow processes and backgound processes.
Each shadow process will show the thread ID - this is what must be killed via
the ORAKILL utility.
The kill session behavior is mentioned in the ORACLE7 Server Administrator's
Guide (pg 4-13). It does not explicitly indicate what happens to the session
while it is in the KILLED PSEUDO state. What's happening is that PMON
periodically checks to see if any sessions have been killed.
If it finds one, it attempts to rollback the transaction for that session
(that was in progress when it was killed). The reason this can take a long time is
because PMON may have more than one transaction to rollback at a time
(if other sessions have been killed, or if processes have died etc).
Thus, it may take a while to finally cleanup the killed session and have it
disappear from the session monitor. The system i/o monitor correctly shows
the reads and writes being performed by PMON in order to rollback the
session's transaction.
PMON will not delete the session object itself until the client connected to
that session notices that it has been killed. Therefore, the sequence of
events is:
1) alter system kill session is issued - the STATUS of the session object in
V$SESSION becomes KILLED, its server becomes PSEUDO.
2) PMON cleans up the *resources* allocated to the session
(i.e., rolls back its transaction, releases its locks, etc).
3) the entry in V$SESSION remains there until the client of that session (the
client is the process associated with the OSUSER,MACHINE,PROCESS columns in
the V$SESSION view) tries to do another request.
4) the client attempts another SQL statement and gets back ORA-28.
5) PMON can now remove the entry from V$SESSION.
This behavior is necessary because the client still has pointers to the
session object even though the session has been killed. Therefore, the
object cannot be deleted until the client is no longer pointing at it.
表碎片起因及解决办法
跟表碎片有关的基础知识:
什么是水线(High Water Mark)?
----------------------------
所有的oracle段(segments,在此,为了理解方便,建议把segment作为表的一个同义词) 都有一个在段内容纳数据的上限,我们把这个上限称为"high water mark"或HWM。这个HWM是一个标记,用来说明已经有多少没有使用的数据块分配给这个segment。HWM通常增长的幅度为一次5个数据块,原则上HWM只会增大,不会缩小,即使将表中的数据全部删除,HWM还是为原值,由于这个特点,使HWM很象一个水库的历史最高水位,这也就是HWM的原始含义,当然不能说一个水库没水了,就说该水库的历史最高水位为0。但是如果我们在表上使用了truncate命令,则该表的HWM会被重新置为0。
HWM数据库的操作有如下影响:
a) 全表扫描通常要读出直到HWM标记的所有的属于该表数据库块,即使该表中没有任何数据。
b) 即使HWM以下有空闲的数据库块,键入在插入数据时使用了append关键字,则在插入时使用HWM以上的数据块,此时HWM会自动增大。
如何知道一个表的HWM?
a) 首先对表进行分析:
ANALYZE TABLE
b) SELECT blocks, empty_blocks, num_rows
FROM user_tables
WHERE table_name =
BLOCKS 列代表该表中曾经使用过得数据库块的数目,即水线。
EMPTY_BLOCKS 代表分配给该表,但是在水线以上的数据库块,即从来没有使用的数据块。
让我们以一个有28672行的BIG_EMP1表为例进行说明:
1) SQL> SELECT segment_name,segment_type,blocks
FROM dba_segments
WHERE segment_name='BIG_EMP1';
SEGMENT_NAME SEGMENT_TYPE BLOCKS EXTENTS
----------------------------- ----------------- ---------- -------
BIG_EMP1 TABLE 1024 2
1 row selected.
2) SQL> ANALYZE TABLE big_emp1 ESTIMATE STATISTICS;
Statement processed.
3) SQL> SELECT table_name,num_rows,blocks,empty_blocks
FROM user_tables
WHERE table_name='BIG_EMP1';
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
------------------------------ ---------- ---------- ------------
BIG_EMP1 28672 700 323
1 row selected.
注意:
BLOCKS + EMPTY_BLOCKS (700+323=1023)比DBA_SEGMENTS.BLOCKS少个数据库块,这是因为有一个数据库块被保留用作segment header。DBA_SEGMENTS.BLOCKS 表示分配给这个表的所有的数据库块的数目。USER_TABLES.BLOCKS表示已经使用过的数据库块的数目。
4) SQL> SELECT COUNT (DISTINCT
DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)||
DBMS_ROWID.ROWID_RELATIVE_FNO(rowid)) "Used"
FROM big_emp1;
Used
----------
700
1 row selected.
5) SQL> DELETE from big_emp1;
28672 rows processed.
6) SQL> commit;
Statement processed.
7) SQL> ANALYZE TABLE big_emp1 ESTIMATE STATISTICS;
Statement processed.
8) SQL> SELECT table_name,num_rows,blocks,empty_blocks
FROM user_tables
WHERE table_name='BIG_EMP1';
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
------------------------------ ---------- ---------- ------------
BIG_EMP1 0 700 323
1 row selected.
9) SQL> SELECT COUNT (DISTINCT
DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)||
DBMS_ROWID.ROWID_RELATIVE_FNO(rowid)) "Used"
FROM big_emp1;
Used
----------
0 -- 这表名没有任何数据库块容纳数据,即表中无数据
1 row selected.
10) SQL> TRUNCATE TABLE big_emp1;
Statement processed.
11) SQL> ANALYZE TABLE big_emp1 ESTIMATE STATISTICS;
Statement processed.
12) SQL> SELECT table_name,num_rows,blocks,empty_blocks
2> FROM user_tables
3> WHERE table_name='BIG_EMP1';
TABLE_NAME NUM_ROWS BLOCKS EMPTY_BLOCKS
------------------------------ ---------- ---------- ------------
BIG_EMP1 0 0 511
1 row selected.
13) SQL> SELECT segment_name,segment_type,blocks
FROM dba_segments
WHERE segment_name='BIG_EMP1';
SEGMENT_NAME SEGMENT_TYPE BLOCKS EXTENTS
----------------------------- ----------------- ---------- -------
BIG_EMP1 TABLE 512 1
1 row selected.
注意:
TRUNCATE命令回收了由delete命令产生的空闲空间,注意该表分配的空间由原先的1024块降为512块。
为了保留由delete命令产生的空闲空间,可以使用
TRUNCATE TABLE big_emp1 REUSE STORAGE
用此命令后,该表还会是原先的1024块。
行链接(Row chaining) 与行迁移(Row Migration)当一行的数据过长而不能插入一个单个数据块中时,可能发生两种事情:行链接(row chaining)或行迁移(row migration)。
行链接
当第一次插入行时,由于行太长而不能容纳在一个数据块中时,就会发生行链接。在这种情况下,oracle会使用与该块链接的一块或多块数据块来容纳该行的数据。行连接经常在插入比较大的行时才会发生,如包含long, long row, lob等类型的数据。在这些情况下行链接是不可避免的。
行迁移
当修改不是行链接的行时,当修改后的行长度大于修改前的行长度,并且该数据块中的空闲空间已经比较小而不能完全容纳该行的数据时,就会发生行迁移。在这种情况下,Oracle会将整行的数据迁移到一个新的数据块上,而将该行原先的空间只放一个指针,指向该行的新的位置,并且该行原先空间的剩余空间不再被数据库使用,这些剩余的空间我们将其称之为空洞,这就是产生表碎片的主要原因,表碎片基本上也是不可避免的,但是我们可以将其降到一个我们可以接受的程度。注意,即使发生了行迁移,发生了行迁移的行的rowid 还是不会变化,这也是行迁移会引起数据库I/O性能降低的原因。其实行迁移是行链接的一种特殊形式,但是它的起因与行为跟行链接有很大不同,所以一般把它从行链接中独立出来,单独进行处理。
行链接和行迁移引起数据库性能下降的原因:
引起性能下降的原因主要是由于引起多余的I/O造成的。当通过索引访问已有行迁移现象的行时,数据库必须扫描一个以上的数据块才能检索到改行的数据。这主要有一下两种表现形式:
1) 导致row migration 或row chaining INSERT 或 UPDATE语句的性能比较差,因为它们需要执行额外的处理
2) 利用索引查询已经链接或迁移的行的select语句性能比较差,因为它们要执行额外的I/O
如何才能检测到行迁移与行链接:
在表中被迁移或被链接的行可以通过带list chained rows选项的analyze语句识别出来。这个命令收集每个被迁移或链接的行的信息,并将这些信息放到指定的输出表中。为了创建这个输出表,运行脚本UTLCHAIN.SQL。
SQL> ANALYZE TABLE scott.emp LIST CHAINED ROWS;
SQL> SELECT * FROM chained_rows;
当然你也可以通过检查v$sysstat视图中的'table fetch continued row'来检查被迁移或被链接的行。
SQL> SELECT name, value FROM v$sysstat WHERE name = 'table fetch continued row';
NAME VALUE
---------------------------------------------------------------- ---------
table fetch continued row 308
尽管行迁移与行链接是两个不同的事情,但是在oracle内部,它们被当作一回事。所以当你检测行迁移与行链接时,你应该仔细的分析当前你正在处理的是行迁移还是行链接。
解决办法
o 在大多数情况下,行链接是无法克服的,特别是在一个表包含象LONGS, LOBs 等这样的列时。当在不同的表中有大量的链接行,并且哪些表的行的长度不是很长时,你可以通过用更大的block size重建数据库的方法来解决它。
例如:当前你的数据库的数据块的大小为4K,但是你的行的平均长度为6k,那么你可以通过用8k大小的数据块来重建数据库的办法解决行链接现象。
o 行迁移主要是由于设置的PCTFREE参数过小,导致没有给update操作留下足够的空闲空间引起。为了避免行迁移,所有被修改的表应该设置合适的PCTFREE 值,以便在每个数据块内为数据修改保留足够的空间。可以通过增加PCTFREE值的办法来避免行迁移,但这种解决办法是以牺牲更多的空间为代价的,这也就是我们通常所说的以空间换效率。 而且通过增加PCTFREE值的办法只能缓解行迁移现象,而不能完全解决行迁移,所以较好的办法是在设置了合适的PCTFREE值的后,在发现行迁移现象比较严重时,对表的数据进行重组。
下面是对行迁移数据进行重组的步骤(这种方法也被成为CTAS):
-- Get the name of the table with migrated rows:
ACCEPT table_name PROMPT 'Enter the name of the table with migrated rows: '
-- Clean up from last execution
set echo off
DROP TABLE migrated_rows;
DROP TABLE chained_rows;
-- Create the CHAINED_ROWS table
@.../rdbms/admin/utlchain.sql
set echo on
spool fix_mig
-- List the chained and migrated rows
ANALYZE TABLE &table_name LIST CHAINED ROWS;
-- Copy the chained/migrated rows to another table
create table migrated_rows as
SELECT orig.*
FROM &table_name orig, chained_rows cr
WHERE orig.rowid = cr.head_rowid
AND cr.table_name = upper('&table_name');
-- Delete the chained/migrated rows from the original table
DELETE FROM &table_name WHERE rowid IN (SELECT head_rowid FROM chained_rows);
-- Copy the chained/migrated rows back into the original table
INSERT INTO &table_name SELECT * FROM migrated_rows;
spool off
当对一个表进行全表扫描时,我们实际上忽略行迁移中各个指向其它行的指针,因为我们知道,全表扫描会遍历全表,最终会读到发生行迁移的行的行数据,在此时才会处理这些行数据。因此,在全表扫描中,行迁移不会引发其它额外的工作。
当通过索引读一个表的数据时,被迁移的行会引起额外的I/O操作。这是因为从所引中我们会读到数据行的rowid,它告诉数据库到指定文件的指定数据块的指定slot上可以找到需要的数据,但是因为发生了行迁移,此处只存放一个指向数据的指针,而不是真正的数据,所以数据库又需要根据该指针(类似rowid)到指定文件的指定数据块的指定slot上去找真正的数据,重复上面的过程,知道找到真正的数据。我们可以看出,这会引入额外的I/O操作。
发现又严重表碎片的表的步骤:表需要整理原因有2:
a) 有太多的migration rows
b) 表经过删除数据后有大量的空块, 而全表扫描时,仍需要读这些空块
发现需要reorganization的表,需要从表的实际使用的空间与表的hwm入手
首先分析表:
Alter table emp compute statistics.
然后可以查询出有数据的数据块的个数:
For ORACLE 7:
SELECT COUNT(DISTINCT SUBSTR(rowid,15,4)||
SUBSTR(rowid,1,8)) "Used"
FROM schema.table;
For ORACLE 8+:
SELECT COUNT (DISTINCT
DBMS_ROWID.ROWID_BLOCK_NUMBER(rowid)||
DBMS_ROWID.ROWID_RELATIVE_FNO(rowid)) "Used"
FROM schema.table;
or
SELECT COUNT (DISTINCT SUBSTR(rowid,1,15)) "Used"
FROM schema.table;
查询出HWM以下的数据块的个数(可能由于delete, 数据块中并不包含数据):
This will update the table statistics. After generating the statistics, to determine the high water mark:
SELECT blocks, empty_blocks, num_rows
FROM user_tables
WHERE table_name =
下面给出一个综合的sql语句,它可以查询出浪费空间的表(浪费超过25%),而且还计算出其它信息(使用时根据具体情况修改where子句中的blocks,owner限制条件):
SELECT OWNER, SEGMENT_NAME TABLE_NAME, SEGMENT_TYPE,
GREATEST(ROUND(100 * (NVL(HWM - AVG_USED_BLOCKS,0)/GREATEST(NVL(HWM,1),1) ), 2), 0) WASTE_PER,
ROUND(BYTES/1024, 2) TABLE_KB, NUM_ROWS,
BLOCKS, EMPTY_BLOCKS, HWM HIGHWATER_MARK, AVG_USED_BLOCKS,
CHAIN_PER, EXTENTS, MAX_EXTENTS, ALLO_EXTENT_PER,
DECODE(GREATEST(MAX_FREE_SPACE - NEXT_EXTENT, 0), 0,'N','Y') CAN_EXTEND_SPACE,
NEXT_EXTENT, MAX_FREE_SPACE,
O_TABLESPACE_NAME TABLESPACE_NAME
FROM
(SELECT A.OWNER OWNER, A.SEGMENT_NAME, A.SEGMENT_TYPE, A.BYTES,
B.NUM_ROWS, A.BLOCKS BLOCKS, B.EMPTY_BLOCKS EMPTY_BLOCKS,
A.BLOCKS - B.EMPTY_BLOCKS - 1 HWM,
DECODE( ROUND((B.AVG_ROW_LEN * NUM_ROWS * (1 + (PCT_FREE/100)))/C.BLOCKSIZE, 0),
0, 1,
ROUND((B.AVG_ROW_LEN * NUM_ROWS * (1 + (PCT_FREE/100)))/C.BLOCKSIZE, 0)
) + 2 AVG_USED_BLOCKS,
ROUND(100 * (NVL(B.CHAIN_CNT, 0)/GREATEST(NVL(B.NUM_ROWS, 1), 1)), 2) CHAIN_PER,
ROUND(100 * (A.EXTENTS/A.MAX_EXTENTS), 2) ALLO_EXTENT_PER,A.EXTENTS EXTENTS,
A.MAX_EXTENTS MAX_EXTENTS, B.NEXT_EXTENT NEXT_EXTENT, B.TABLESPACE_NAME O_TABLESPACE_NAME
FROM SYS.DBA_SEGMENTS A,
SYS.DBA_TABLES B,
SYS.TS$ C
WHERE A.OWNER =B.OWNER and
SEGMENT_NAME = TABLE_NAME and
SEGMENT_TYPE = 'TABLE' AND
B.TABLESPACE_NAME = C.NAME
UNION ALL
SELECT A.OWNER OWNER, SEGMENT_NAME || '.' || B.PARTITION_NAME, SEGMENT_TYPE, BYTES,
B.NUM_ROWS, A.BLOCKS BLOCKS, B.EMPTY_BLOCKS EMPTY_BLOCKS,
A.BLOCKS - B.EMPTY_BLOCKS - 1 HWM,
DECODE( ROUND((B.AVG_ROW_LEN * B.NUM_ROWS * (1 + (B.PCT_FREE/100)))/C.BLOCKSIZE, 0),
0, 1,
ROUND((B.AVG_ROW_LEN * B.NUM_ROWS * (1 + (B.PCT_FREE/100)))/C.BLOCKSIZE, 0)
) + 2 AVG_USED_BLOCKS,
ROUND(100 * (NVL(B.CHAIN_CNT,0)/GREATEST(NVL(B.NUM_ROWS, 1), 1)), 2) CHAIN_PER,
ROUND(100 * (A.EXTENTS/A.MAX_EXTENTS), 2) ALLO_EXTENT_PER, A.EXTENTS EXTENTS,
A.MAX_EXTENTS MAX_EXTENTS, B.NEXT_EXTENT,
B.TABLESPACE_NAME O_TABLESPACE_NAME
FROM SYS.DBA_SEGMENTS A,
SYS.DBA_TAB_PARTITIONS B,
SYS.TS$ C,
SYS.DBA_TABLES D
WHERE A.OWNER = B.TABLE_OWNER and
SEGMENT_NAME = B.TABLE_NAME and
SEGMENT_TYPE = 'TABLE PARTITION' AND
B.TABLESPACE_NAME = C.NAME AND
D.OWNER = B.TABLE_OWNER AND
D.TABLE_NAME = B.TABLE_NAME AND
A.PARTITION_NAME = B.PARTITION_NAME),
(SELECT TABLESPACE_NAME F_TABLESPACE_NAME,MAX(BYTES)
MAX_FREE_SPACE
FROM SYS.DBA_FREE_SPACE
GROUP BY TABLESPACE_NAME)
WHERE F_TABLESPACE_NAME = O_TABLESPACE_NAME AND
GREATEST(ROUND(100 * (NVL(HWM - AVG_USED_BLOCKS, 0)/GREATEST(NVL(HWM, 1), 1) ), 2), 0) > 25
AND OWNER = '??' AND BLOCKS > 128
ORDER BY 10 DESC, 1 ASC, 2 ASC;
各列的说明:
WASTE_PER:已分配空间中水线以下的空闲空间(即浪费空间)的百分比。
TABLE_KB:该表目前已经分配的所有空间的大小,以k为单位。
NUM_ROWS:在在表中数据的行数
BLOCKS:该表目前已经分配的数据块的块数,包含水线以上的部分
EMPTY_BLOCKS:已分配空间中水线以上的空闲空间
HIGHWATER_MARK:目前的水线
AVG_USED_BLOCKS:理想情况下(没有行迁移),该表数据应该占用的数据块的个数
CHAIN_PER:发生行迁移现象的行占总行的比率
EXTENTS:该表目前已经分配的extent数
MAX_EXTENTS:该表可以分配的最大extent的个数
ALLO_EXTENT_PER:目前已分配的extent的个数占可以分配最大extent的比率
CAN_EXTEND_SPACE:是否可以分配下一个extent
NEXT_EXTENT:下一个extent的大小
MAX_FREE_SPACE:表的已分配空间中最大的空闲空间
一个关于ORACLE性能优化的好贴
ORACLE 8.0.X 版本
SGA=((db_block_buffers * block size)+(shared_pool_size+large_pool_size+log_buffers)+1MB
ORACLE 8.1.X 版本
SGA=((db_block_buffers * block size)+(shared_pool_size+large_pool_size+java_pool_size+log_buffers)+1MB
理论上SGA可占OS系统物理内存的1/2——1/3,我们可以根据需求调整
我推荐SGA=0.45*(OS RAM)
假设服务器运行ORACLE 8.1.X 版本, OS系统内存为2G MEM, db_block_size 是8192 bytes,
除了运行ORACLE数据库外, 没有其它的应用程序或服务器软件.
这样SGA合计约为921M ( 0.45*2048M ),
设shared_pool_size 300M (300*1024*1024 bytes)
设database buffer cache 570M (72960*8192 bytes)
initorasid.ora文件里具体各参数如下:
shared_pool_size = 314572800
# 300 M
db_block_buffers = 72960
# 570 M
log_buffer = 524288
# 512k (128K*CPU个数)
large_pool_size = 31457280
# 30 M
java_pool_size = 20971520
# 20 M
sort_area_size = 524288
# 512k (65k--2M)
sort_area_retained_size = 524288
# MTS 时 sort_area_retained_size = sort_area_size
SUN Solaris里/etc/system文件里的几个参数同样跟内存分配有关
ORACLE安装时缺省的设置: 建议修改的设置:
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=100
set shmsys:shminfo_shmseg=15
set semsys:seminfo_semmns=200
set semsys:seminfo_semmni=70
set ulimit=3000000
set semsys:seminfo_semmni=315
set semsys:seminfo_semmsl=300
set semsys:seminfo_semmns=630
set semsys:seminfo_semopm=315
set semsys:seminfo_semvmx=32767
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmni=315
set shmsys:shminfo_shmseg=10
set shmsys:shminfo_shmmin=1
其中这些参数的含义
shmmax - 共享内存段,建议设大点, 达到最大SGA
shmmin - 最小的共享内存段.
shmmni - 共享内存标志符的数量.
shmseg - 一个进程可分配的最大内存段数.
shmall - 最大可允许的内存数,比SGA还要大.
semmns - 信号灯,跟ORACLE的PROCESS数有关.
semmsl - 一个信号灯中最大的信号灯数.
转贴:诊断性能问题-----Oracle杂志上的一篇文章
诊断性能问题
作者:Cary Millsap
使用扩展SQL跟踪数据来了解是什么在耗费这么长的时间。
假如有一天你开车去上班,但最后还是没能及时参加一个重要会议。你无法将你的革命性的想法呈现给客户,所以他们也不会采用。你的拖拖拉拉使你感到沮丧,你发誓决不再犯同样的错误。那么,为了不再发生类似情况,你怎么判断问题的原因呢?按照下面这个列表进行检查怎么样?
检查汽车外表是否有缺陷,因为外表有缺陷会使汽车的最高速度降低1%或更多。
检查车轮定位,因为外倾角、后倾角或前束角不合适都会导致汽车的操纵不灵活并且耗费时间。
检查发动机,以确保达到额定马力的99%或更高。如果不是这样,则要考虑重装或更换发动机。
不,你可能不会采用这种检查方法;那样太可笑了。你可能会以完全不同的方式来判断问题之所在,可能只是问你自己一个简单的问题:什么事情让我花了这么长时间?
从这个角度出发,问题就迎刃而解了。如果开车需要40分钟,而你在会议开始前20分钟才动身,那么下次就要提前30分钟动身。如果因为交通拥堵浪费了20分钟,那么,下次要么再早一些动身,换条路线,要么更仔细地查看早7点的路况报告。如果是你迷了路,结果浪费了20分钟去兜圈子,那么下次你大概就要事先看看地图。如此等等。
我感到奇怪的是,那些擅长解决日常性能优化问题的数据库专业人员在工作中却使用完全不同的方法来解决数据库性能问题。许多数据库"调优人员"从来不问,"是什么让这个程序运行了这么长时间?"相反,他们会参考检查内容清单,并试图阻止错误发生:
检查所有Oracle块请求是否都由数据库缓存提供服务
检查是否有全表扫描
检查所有排序是否都在内存中进行
检查重做日志是否与其他所有数据库文件进行了适当的隔离
等等。
对于某些工作来说,使用检查内容清单也许很好。但是对于判断性能问题这样的工作,试图确定理论上可能会出错的每一件事,从而对这个问题进行处理的做法的效率会很低。更有效的方法就是找到这个简单问题的答案:
是什么花了这么长时间?
用于优化Oracle程序的好的策略就如同日常生活中用到的策略。就像这样:
1. 使用专门的仪器来测定程序的性能,从而监视运行速度慢的程序。
2. 为运行慢的程序创建资源描述,把程序的响应时间细分为几种有用的类型。
3. 通过首先处理响应时间最长的部分来缩短程序的响应时间。
当你了解了若干技术细节之后,这个方法就非常简单了。如果你真的这样做,那么每次你都能获得一个有用的方法,久而久之,你将能在进行性能改进之前预知其结果。
跟踪
如果你有用于收集程序中每个执行步骤的时间统计信息的高级工具,那就用吧。但只收集汇总数据(如通过对系统全局区[SGA]或其基础共享存储段采样获得的数据)的工具对于某些类型的问题就不适合。
使用昂贵的监控工具时最常见的汇总错误是它们会跨整个Oracle数据库实例来汇总某一给定时间间隔内资源的使用情况。但是,运行速度慢的程序实际上可能不受资源争用问题的影响,而这个问题却完全控制着系统中一些不太重要的程序的性能。
即便是那些在Oracle数据库会话级上汇总信息的工具在诊断一些重要的问题类型时也存在着缺陷。例如,假设一个程序运行10分钟,调用了10000次Oracle SQL*Net message from client 这一"等待事件",会话等待该事件的总用时为8.3分钟。这意味着会话对SQL*Net message from client事件的等待时间平均为3秒。但是单从汇总数据看,你无法知道这10000次调用是否每次都用3秒,还是这些调用中也许有一个用了5分钟,而其余9999次调用每次只用0.02秒。这两种情况需要进行完全不同的处理。
在这种情况下最能为你提供帮助的诊断数据是Oracle的扩展SQL跟踪数据。扩展SQL跟踪文件按时间顺序显示了Oracle数据库内核在指定时间内所完成工作的逐条记录。收集扩展SQL跟踪数据几乎是免费的。最大的花销是存储每一个需要引起注意的跟踪文件所需磁盘空间(很少超过几兆字节)的费用。
跟踪自己的代码。如果能访问程序的源代码,则打开其扩展SQL跟踪就非常容易。首先必须确保会话的TIMED_STATISTICS和MAX_DUMP_ FILE_SIZE参数设置正确:
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
| Code: | [Copy to clipboard] | |
| ||
ORACLE 锁简单介绍 (zt)
ORACLE 锁 (zt)
ORACLE数据库是现今数据库领域应用最广泛的,同时它也是一个庞大的系统,全面了解它、玩转它不但需要一定的理论知识,更需要开发经验与工程经验。本人是ORACLE一爱好者,以下是本人对ORACLE锁的一些经验,希望能与大家共同分享。
预备知识:
DDL(DATABASE DEFINITION LANGUAGE):数据库定义语言,如create table、drop table.....
DML(DATABASE MODIFICATION LANGUAGE):数据库修改语言,如insert、delete、update......
参考资料:Oracle8 Administrator"s Guide, Release 8.0
Oracle8 Tuning, Release 8.0
ORACLE锁具体分为以下几类:
1.按用户与系统划分,可以分为自动锁与显示锁
自动锁:当进行一项数据库操作时,缺省情况下,系统自动为此数据库操作获得所有有必要的锁。
显示锁:某些情况下,需要用户显示的锁定数据库操作要用到的数据,才能使数据库操作执行得更好,显示锁是用户为数据库对象设定的。
2.按锁级别划分,可分为共享锁与排它锁
共享锁:共享锁使一个事务对特定数据库资源进行共享访问——另一事务也可对此资源进行访问或获得相同共享锁。共享锁为事务提供高并发性,但如拙劣的事务设计+共享锁容易造成死锁或数据更新丢失。
排它锁:事务设置排它锁后,该事务单独获得此资源,另一事务不能在此事务提交之前获得相同对象的共享锁或排它锁。
3.按操作划分,可分为DML锁、DDL锁
+DML锁又可以分为,行锁、表锁、死锁
-行锁:当事务执行数据库插入、更新、删除操作时,该事务自动获得操作表中操作行的排它锁。
-表级锁:当事务获得行锁后,此事务也将自动获得该行的表锁(共享锁),以防止其它事务进行DDL语句影响记录行的更新。事务也可以在进行过程中获得共享锁或排它锁,只有当事务显示使用LOCK TABLE语句显示的定义一个排它锁时,事务才会获得表上的排它锁,也可使用LOCK TABLE显示的定义一个表级的共享锁(LOCK TABLE具体用法请参考相关文档)。
-死锁:当两个事务需要一组有冲突的锁,而不能将事务继续下去的话,就出现死锁。
如事务1在表A行记录#3中有一排它锁,并等待事务2在表A中记录#4中排它锁的释放,而事务2在表A记录行#4中有一排它锁,并等待事务; 1在表A中记录#3中排它锁的释放,事务1与事务2彼此等待,因此就造成了死锁。死锁一般是因拙劣的事务设计而产生。死锁只能使用SQL下:alter system kill session "sid,serial#";或者使用相关操作系统kill进程的命令,如UNIX下kill -9 sid,或者使用其它工具杀掉死锁进程。
+DDL锁又可以分为:排它DDL锁、共享DDL锁、分析锁
-排它DDL锁:创建、修改、删除一个数据库对象的DDL语句获得操作对象的 排它锁。如使用alter table语句时,为了维护数据的完成性、一致性、合法性,该事务获得一排它DDL锁。
-共享DDL锁:需在数据库对象之间建立相互依赖关系的DDL语句通常需共享获得DDL锁。
如创建一个包,该包中的过程与函数引用了不同的数据库表,当编译此包时,该事务就获得了引用表的共享DDL锁。
-分析锁:ORACLE使用共享池存储分析与优化过的SQL语句及PL/SQL程序,使运行相同语句的应用速度更快。一个在共享池中缓存的对象获得它所引用数据库对象的分析锁。分析锁是一种独特的DDL锁类型,ORACLE使用它追踪共享池对象及它所引用数据库对象之间的依赖关系。当一个事务修改或删除了共享池持有分析锁的数据库对象时,ORACLE使共享池中的对象作废,下次在引用这条SQL/PLSQL语句时,ORACLE重新分析编译此语句。
4.内部闩锁
内部闩锁:这是ORACLE中的一种特殊锁,用于顺序访问内部系统结构。当事务需向缓冲区写入信息时,为了使用此块内存区域,ORACLE首先必须取得这块内存区域的闩锁,才能向此块内存写入信息。
以上是 本人对ORACLE锁的一些总结,不足之处还望大家海涵,同时也希望大家多提出自己对ORACLE锁的一些看法。
回复: (转贴 )
ORACLE里锁有以下几种模式
0:none
1:null 空
2:Row-S 行共享(RS):共享表锁
3:Row-X 行专用(RX):用于行的修改
4:Share 共享锁(S):阻止其他DML操作
5:S/Row-X 共享行专用(SRX):阻止其他事务操作
6:exclusive 专用(X):独立访问使用
数字越大锁级别越高, 影响的操作越多。
一般的查询语句如select ... from ... ;是小于2的锁, 有时会在v$locked_object出现。
select ... from ... for update; 是2的锁。
当对话使用for update子串打开一个游标时,
所有返回集中的数据行都将处于行级(Row-X)独占式锁定,
其他对象只能查询这些数据行,不能进行update、delete或select...for update操作。
insert / update / delete ... ; 是3的锁。
没有commit之前插入同样的一条记录会没有反应,
因为后一个3的锁会一直等待上一个3的锁, 我们必须释放掉上一个才能继续工作。
创建索引的时候也会产生3,4级别的锁。
locked_mode为2,3,4不影响DML(insert,delete,update,select)操作,
但DDL(alter,drop等)操作会提示ora-00054错误。
有主外键约束时 update / delete ... ; 可能会产生4,5的锁。
DDL语句时是6的锁。
以DBA角色, 查看当前数据库里锁的情况可以用如下SQL语句:
select object_id,session_id,locked_mode from v$locked_object;
select t2.username,t2.sid,t2.serial#,t2.logon_time
from v$locked_object t1,v$session t2
where t1.session_id=t2.sid order by t2.logon_time;
如果有长期出现的一列,可能是没有释放的锁。
我们可以用下面SQL语句杀掉长期没有释放非正常的锁:
alter system kill session 'sid,serial#';
如果出现了锁的问题, 某个DML操作可能等待很久没有反应。
当你采用的是直接连接数据库的方式,
也不要用OS系统命令 $kill process_num 或者 $kill -9 process_num来终止用户连接,
因为一个用户进程可能产生一个以上的锁, 杀OS进程并不能彻底清除锁的问题。
在数据库级别可用alter system kill session 'sid,serial#';杀掉不正常的锁。
对2楼的转贴表示疑问
4級鎖有:Create Index, Lock Share
locked_mode爲2,3,4不影響DML(insert,delete,update,select)操作, 但DDL(alter,drop等)操作會提示ora-00054錯誤。
我测试过,当对表加SHARE锁时,在另外一SESSION,是无法
对该表进行UPDATE的。 照道理说,当对一表的记录进行修改时,
是对该记录加“独占”锁,对表结构加“共享”锁,这与SHARE锁
--本身的定义:“共享锁”并无冲突,为什么会这样?
就是说:对表加SHARE锁时,并不影响DML操作,但实际并不是。
另外,这一点,经测试,我看到的情况也不似楼主所说:
5級鎖有:Lock Share Row Exclusive
具體來講有主外鍵約束時update / delete ... ; 可能會産生4,5的鎖。
我测试的结果是:
当对父表进行修改时,若子表对应的记录存在,则产生ORA-02292
异常;否,则只对父表加TX锁(级别6)和TM锁(级别3);
当对子表进行修改时,则除了对子表加加TX锁和TM锁外;对父表
也加了加SS(级别2)锁,这符合逻辑,因为父表此时必须满足参考完整性;
也就是,对父表加SS锁,避免此时对父表进行修改操作。
上面锁说的修改,都是对子父表存在参考完整性的字段操作的。
| Originally posted by ZALBB at 2004-9-24 11:37: 4級鎖有:Create Index, Lock Share locked_mode爲2,3,4不影響DML(insert,delete,update,select)操作, 但DDL(alter,drop等)操作會提示ora-00054錯誤。 locked_mode为4时在表上会设置共享锁(shared lock), 在一个对象上的share lock与exclusive lock是有冲突的. 而进行ddl操作需要表级的独占锁. 所以ddl操作就会提示ORA-00054: resource busy and acquire with NOWAIT specified错误信息了 我测试过,当对表加SHARE锁时,在另外一SESSION,是无法 对该表进行UPDATE的。 照道理说,当对一表的记录进行修改时, 是对该记录加“独占”锁,对表结构加“共享”锁,这与SHARE锁 --本身的定义:“共享锁”并无冲突,为什么会这样? 具体的测试过程能不能贴出来看看, ^_^ 就是说:对表加SHARE锁时,并不影响DML操作,但实际并不是。 另外,这一点,经测试,我看到的情况也不似楼主所说: 5級鎖有:Lock Share Row Exclusive 具體來講有主外鍵約束時update / delete ... ; 可能會産生4,5的鎖。 有主外键约束的时候, 锁机制还是比较复杂的.. 我测试的结果是: 当对父表进行修改时,若子表对应的记录存在,则产生ORA-02292 异常;否,则只对父表加TX锁(级别6)和TM锁(级别3); 当对子表进行修改时,则除了对子表加加TX锁和TM锁外;对父表 也加了加SS(级别2)锁,这符合逻辑,因为父表此时必须满足参考完整性; 也就是,对父表加SS锁,避免此时对父表进行修改操作。 上面锁说的修改,都是对子父表存在参考完整性的字段操作的。 |
如何监控索引的使用
如何监控索引的使用?
研究发现,oracle数据库使用的索引不会超过总数的25%,或者不易他们期望被使用的方式使用。通过 监控数据库索引的使用,释放那些未被使用的索引,从而节省维护索引的开销,优化性能。
1、在oracle8i中,确定使用了那个索引的方法意味着要对存在语共享SQL区中的所有语句运行EXPLIAN PALN,然后查询计划表中的OPERATION列,从而识别有OBJECT_OWNER和OBJECT_NAME列所确定的那个索引上的索引访问。
下面是一个监控索引使用的脚本,这个脚本仅仅是一个样品,在某种条件下成立:
条件:
运行这个脚本的用户拥有权限解释所有的v$sqlarea中的sql,除了不是被SYS装载的。
plan_table.remarks能够别用来决定与特权习惯的错误。
对所有的共享池中SQL,参数OPTIMIZER_GOAL是一个常量,无视v$sqlarea.optimizer_mode。
两次快照之间,统计资料被再次分析过。
没有语句别截断。
所有的对象都是局部的。
所有被引用的表或视图或者是被运行脚本的用户所拥有,或者完全有资格的名字或同义词被使用。
自从上次快照以来,没有不受"欢迎"的语句被冲洗出共享池(例如,在装载)。
对于所有的语句, v$sqlarea.version_count = 1 (children)。
脚本:
| Code: | [Copy to clipboard] | |
| ||
ORACLE碎片整理[转帖]
转自oracle数据库在线(http://www.dbonline.cn/index.htm)
===========================================
我们知道, Oracle 作为一种大型数据库,广泛应用于金融、邮电、电力、民航等数据吞吐量巨大,计算机网络广泛普及的重要部门。对于系统管理员来讲,如何保证网络稳定运行,如何提高数据库性能,使其更加安全高效,就显得尤为重要。作为影响数据库性能的一大因素 -- 数据库碎片,应当引起 DBA 的足够重视,及时发现并整理碎片乃是 DBA 一项基本维护内容。
1 、碎片是如何产生的
---- 当生成一个数据库时,它会分成称为表空间( Tablespace )的多个逻辑段( Segment ),如系统( System )表空间 , 临时( Temporary )表空间等。一个表空间可以包含多个数据范围( Extent )和一个或多个自由范围块,即自由空间( Free Space )。
---- 表空间、段、范围、自由空间的逻辑关系如下:
---- 当表空间中生成一个段时,将从表空间有效自由空间中为这个段的初始范围分配空间。在这些初始范围充满数据时,段会请求增加另一个范围。这样的扩展过程会一直继续下去,直到达到最大的范围值,或者在表空间中已经没有自由空间用于下一个范围。最理想的状态就是一个段的数据可被存在单一的一个范围中。这样,所有的数据存储时靠近段内其它数据,并且寻找数据可少用一些指针。但是一个段包含多个范围的情况是大量存在的,没有任何措施可以保证这些范围是相邻存储的,如图〈 1 〉。当要满足一个空间要求时,数据库不再合并相邻的自由范围(除非别无选择), 而是寻找表空间中最大的自由范围来使用。这样将逐渐形成越来越多的离散的、分隔的、较小的自由空间,即碎片。例如:
2 、碎片对系统的影响
---- 随着时间推移,基于数据库的应用系统的广泛使用,产生的碎片会越来越多,将对数据库有以下两点主要影响:
---- ( 1 )导致系统性能减弱
---- 如上所述,当要满足一个空间要求时,数据库将首先查找当前最大的自由范围,而 " 最大 " 自由范围逐渐变小,要找到一个足够大的自由范围已变得越来越困难,从而导致表空间中的速度障碍,使数据库的空间分配愈发远离理想状态;
---- ( 2 )浪费大量的表空间
---- 尽管有一部分自由范围(如表空间的 pctincrease 为非 0 )将会被 SMON (系统监控)后台进程周期性地合并,但始终有一部分自由范围无法得以自动合并,浪费了大量的表空间。
3 、自由范围的碎片计算
---- 由于自由空间碎片是由几部分组成,如范围数量、最大范围尺寸等,我们可用 FSFI--Free Space Fragmentation Index (自由空间碎片索引)值来直观体现:
FSFI=100*SQRT(max(extent)/sum(extents))*1/SQRT(SQRT(count(extents)))
---- 可以看出, FSFI 的最大可能值为 100 (一个理想的单文件表空间)。随着范围的增加, FSFI 值缓慢下降,而随着最大范围尺寸的减少, FSFI 值会迅速下降。
---- 下面的脚本可以用来计算 FSFI 值:
rem FSFI Value Compute
rem fsfi.sql
column FSFI format 999,99
select tablespace_name,sqrt(max(blocks)/sum(blocks))*
(100/sqrt(sqrt(count(blocks)))) FSFI
from dba_free_space
group by tablespace_name order by 1;
spool fsfi.rep;
/
spool off;
---- 比如,在某数据库运行脚本 fsfi.sql, 得到以下 FSFI 值:
TABLESPACE_NAME FSFI
------------------------------ -------
RBS 74.06
SYSTEM 100.00
TEMP 22.82
TOOLS 75.79
USERS 100.00
USER_TOOLS 100.00
YDCX_DATA 47.34
YDCX_IDX 57.19
YDJF_DATA 33.80
YDJF_IDX 75.55
---- 统计出了数据库的 FSFI 值,就可以把它作为一个可比参数。在一个有着足够有效自由空间,且 FSFI 值超过 30 的表空间中,很少会遇见有效自由空间的问题。当一个空间将要接近可比参数时,就需要做碎片整理了。
4 、自由范围的碎片整理
---- ( 1 )表空间的 pctincrease 值为非 0
---- 可以将表空间的缺省存储参数 pctincrease 改为非 0 。一般将其设为 1 ,如:
alter tablespace temp
default storage(pctincrease 1);
---- 这样 SMON 便会将自由范围自动合并。也可以手工合并自由范围:
alter tablespace temp coalesce;
5 、段的碎片整理
---- 我们知道,段由范围组成。在有些情况下,有必要对段的碎片进行整理。要查看段的有关信息,可查看数据字典 dba_segments ,范围的信息可查看数据字典 dba_extents 。如果段的碎片过多, 将其数据压缩到一个范围的最简单方法便是用正确的存储参数将这个段重建,然后将旧表中的数据插入到新表,同时删除旧表。这个过程可以用 Import/Export (输入 / 输出)工具来完成。
---- Export ()命令有一个(压缩)标志,这个标志在读表时会引发 Export 确定该表所分配的物理空间量,它会向输出转储文件写入一个新的初始化存储参数 -- 等于全部所分配空间。若这个表关闭, 则使用 Import ()工具重新生成。这样,它的数据会放入一个新的、较大的初始段中。例如:
exp user/password file=exp.dmp compress=Y grants=Y indexes=Y
tables=(table1,table2);
---- 若输出成功,则从库中删除已输出的表,然后从输出转储文件中输入表:
imp user/password file=exp.dmp commit=Y buffer=64000 full=Y
---- 这种方法可用于整个数据库。
---- 以上简单分析了 Oracle 数据库碎片的产生、计算方法及整理,仅供参考。数据库的性能优化是一项技术含量高,同时又需要有足够耐心、认真细致的工作。 对数据库碎片的一点探讨,
回复: ORACLE碎片整理[转帖]
下面是一种如何自动处理表空间碎片的代码,希望对上大家看上文有用
=========================================
Coalesce Tablespace Automatically
This technique comes from Sandeep Naik, a database administrator for GSXXI, Inc. in New York City, New York
Here is a handy script which can be scheduled to automatically run and coalesces the tablespaces.
This script is designed to run in NT but can be run in any operating system by slight modifications in the path where the file spools from the SQLPLUS environment. It assumes that the user who runs the script has priviledges to view the data dictionary.
Start of code
--------------------------------------
sqlplus /
prompt this script will coalesce the tablespace automatically
set verify off;
set termout off;
set head off;
spool c:tempcoalesce.log
select 'alter tablespace '||TABLESPACE_NAME||' coalesce ;'
from DBA_FREE_SPACE_COALESCED where PERCENT_EXTENTS_COALESCED <100
or PERCENT_BLOCKS_COALESCED<100 ;
spool off;
@ c:tempcoalesce.log
set head on;
set termout on;
set verify on;
prompt Tablespaces are coalesced successfully
Through the years, we've been writing scripts to do all kinds of things. The ones included in this section are our favorite database tuning scripts--developed, begged, and borrowed over the last few years. These scripts aren't definitive in any way; they could have been, and probably have been, written in a dozen alternative ways. But they do get the work done. (You will find all of these scripts on the O'Reilly web site.)
The version of Oracle that you are running can have a significant impact on your performance. Use the following command to show version information:
SELECT *FROM v$version;--------------------------------------------------------------Oracle7 Server Release 7.2.3.2.0 - Production ReleasePL/SQL Release 2.2.2.2.0 - ProductionCORE Version 2.3.7.1.0 - Production (LSF Alpha)TNS for SVR4: Version 2.1.6.0.0 - ProductionNLSRTL Version 2.3.6.0.0 - Production
In addition to the standard INIT.ORA parameters, Oracle also has a number of undocumented parameters, that appear with an underscore in front of them in parameter displays. Several of the undocumented parameters, such as _DB_BLOCK_WRITE_BATCH and _LOG_ENTRY_PREBUILD_THRESHOLD, can be used to improve your performance. (See the discussion of such parameters in Chapters and .)
Run this script to show your system's parameter setting; then compare your settings to the recommendations in this book.
# undoc.sqlSELECT name, valueFROM v$parameter;SELECT ksppinm, ksppivlFROM x$ksppiWHERE SUBSTR(ksppinm,1,1) = '_';
We don't show the output from this script because of its excessive length.
This script shows details of the System Global Area in memory. The most significant information is the reading for "free memory," which may indicate that the SHARED_POOL_SIZE INIT.ORA parameter should be reduced if the free memory is excessive. If the parameter setting is low, you should not be decreasing the SHARED_POOL_SIZE.
WARNING: Oracle tends to maintain some free memory even when the shared pool size is flooded with activity and needs to be made larger.
Other figures that are useful are the "db_block_buffers," size of the buffer cache, which is usually required to be at least 20 megabytes for optimum performance; the "sql area," which is where all of the shared SQL is placed; the dictionary cache, which is where Oracle's dictionary is placed; and the "log buffer," which is where all changes are written prior to writing to your redo logs.
The log buffer should typically be at least 32,078 bytes or larger. The "shared_sql" and "dictionary cache" sizes are affected by the size of your SHARED_POOL_SIZE INIT.ORA parameter. Unfortunately, the dictionary cache is tuned automatically and not very well by the kernel. The majority of sites operate most efficiently with a shared pool size of at least 30,000,000 bytes.
# sgastat.sqlSELECT *FROM v$sgastatWHERE name IN ('free memory', 'db_block_buffers', 'log_buffer'`dictionary cache', `sql area', `library cache');NAME BYTES---------------------------------- ----------free memory 88652db_block_buffers 20480000log_buffer 512000dictionary cache 2528868sql area 43658416library cache 13177800
One of the most common activities you'll find yourself doing as a DBA is scanning the physical database, looking for new table and index extents. You ought to do this on a regular basis, ideally as part of an automated daily or weekly overnight procedure. Your goal is to minimize the number of extents on disk. Access to contiguous areas of disk is much faster than access to noncontiguous areas. In one test that we did on a 4,000-row table, we found that when the entire table fit on one extent, it took 0.76 second to scan it; when the table was spread over 10 extents, it took 3.62 seconds.
We have found that the existence of a small number of such extents (fewer than five) doesn't seem to affect performance very much, but it is still good practice to store your objects in a single extent. (In the next section, we describe how to size the table for the reorganization while not wasting valuable disk space.)
The following script assumes that the operating system Oracle block size is 4 kilobytes and that all ROLLBACK segments were created with 10 initial extents (MINEXTENTS parameter).
# objstor.sqlSELECT SUBSTR(s.segment_name,1,20) OBJECT_NAME,SUBSTR(s.segment_type,1,5) TYPE,SUBSTR(s.tablespace_name,1,10) T_SPACE,NVL(NVL(t.initial_extent, i.initial_extent),r.initial_extent)/ 4096FST_EXT,NVL(NVL(t.next_extent,i.next_extent),R.NEXT_EXTENT) / 4096 NXT_EXT,s.extents - 1 tot_ext,s.blocks tot_blksFROMdba_rollback_segs R,dba_indexes I,dbs_tables T,dba_segments SWHERE s.segment_name LIKE UPPER('&S_NAME') || '%'AND s.tablespace_name LIKE UPPER('&T_SPACE') || '%'AND s.extents > 1AND s.owner = t.owner (+)AND s.segment_name = t.table_name (+)AND s.tablespace_name = t.tablespace_name (+)AND s.owner = i.owner (+)AND s.segment_name = i.index_name (+)AND s.tablespace_name = i.tablespace_name (+)AND s.owner = r.owner (+)AND s.segment_name = r.segment_name (+)AND s.tablespace_name = r.tablespace_name (+)ORDER BY s.segment_name, s.segment_type;OBJECT_NAME TYPE T_SPACE FST NXT TOT TOTEXT EXT EXT BLKS------------------ ------ -------- --- --- --- ----ALL_TRAN_AUDX_INDX INDEX DEV_IDX 125 63 2 251OBJ$ TABLE SYSTEM 13 13 1 26PRODUCT_PROFILE TABLE SYSTEM 13 13 1 26RBACK1 ROLLB RBK 25 25 9 300RBACK2 ROLLB RBK 25 25 9 525RBACK_BIG ROLLB RBK 256 256 9 2560XREF$ TABLE SYSTEM 13 13 1 26
This section contains several scripts that we use to size a database.
When tables, indexes, and rollback segments are created, they are preassigned a storage allocation (extent), which is reserved and cannot be used by any other object. Although the objects may not use all of the space allocated at the start, as more information is placed into the area the amount of available free space diminishes. This query helps you instantly find application problems resulting from space shortages. Run it at regular intervals for the best information. Note that this script assumes that you have Oracle data blocks of 4 kilobytes (4,096 bytes). This size is operating system dependent, and you will have to modify the query if your block sizes differ.
## tspuse.sqlSELECT SUBSTR(D.tablespace_name,1,15) TSPACE,D.file_id FILE_ID,D.bytes / 1024 / 1024 TOT_MB,D.bytes / 4096 ORA_BLKS,SUM(E.blocks) TOT_USED,ROUND(SUM(E.blocks) / D.bytes / 4096, 4 * 100 PCT_USEDFROM sys.dba_extents E,sys.dba_data_files DWHERE D.file_id = E.file_id (+)GROUP BY D.tablespace_name, D.file_id, D.bytes /TSPACE FILE_ID TOT_MB ORA_BLKS TOT_USED PCT_USED------- ------- ------ -------- -------- --------DEV 4 250 64000 36633 57.2DEV_AUD 6 100 25600 3691 14.4DEV_IDX 5 300 76800 61317 79.8HST 7 200 51200 38400 75.0INV 8 80 20480 13739 67.1INV_IDX 9 50 12800 7673 59.9RBK 3 25 6400 4110 64.2SYSTEM 1 20 5120 2366 46.2TMP 2 50 12800
This query of the database gives a detailed breakdown of the fragmentation of each tablespace file within the database.
# fragment.sqlSELECT SUBSTR(ts.name,1,10) TSPACE,tf.blocks BLOCKS,SUM(f.length) FREE,COUNT(*) PIECES,MAX(f.length) BIGGEST,MIN(f.length) SMALLEST,ROUND(AVG(f.length)) AVERAGE,SUM(DECODE(SIGN(f.length-5), -1, f.length, 0)) DEADFROM sys.fet$ F,sys.file$ TF,sys.ts$ TSWHERE ts.ts# = f.ts#AND ts.ts# = tf.ts#GROUP BY ts.name, tf.blocks;Tspace Blocks Free Pieces Biggest Smallest Average DeadDEV 64000 27366 9 25614 105 3041 0DEV_AUD 25600 21908 1 21908 21908 21908 0DEV_IDX 76800 15482 16 175 4 968 2HST 51200 12799 1 12799 12799 12799 0INV 20480 6740 12 6740 6740 6740 0INV_IDX 12800 5126 4 2565 63 1282 0RBK 6400 2289 1 2289 2289 2289 0SYSTEM 5120 2753 74 487 3 16 12TMP 12800 12799 41 1536 11 312 0
The last column, "Dead," is based on the assumption that any contiguous block smaller than five Oracle blocks (20 kilobytes for the operating system that we used for testing) cannot be used. That is, no table or index has an INITIAL or NEXT extent size less than 20 kilobytes.
This query reports how full a particular table actually is. It compares the number of Oracle blocks that have at least one record against the total number of blocks allocated to the table extent(s). You can use this query to interrogate table after table; the table name replaces the name &TAB_NAME in each statement execution.
# tabused.sqlSELECT BLOCKS ALLOCATED_BLKS,COUNT(DISTINCT SUBSTR(T.ROWID,1,8)|| SUBSTR(T.ROWID,15,4)) USED,(COUNT(DISTINCT SUBSTR(T.ROWID,1,8)|| SUBSTR(T.ROWID,15,4))/ BLOCKS) * 100 PCT_USEDFROM SYS.DBA_SEGMENTS E,&TAB_NAME TWHERE E.SEGMENT_NAME = UPPER ('&TAB_NAME')AND E.SEGMENT_TYPE = 'TABLE'GROUP BY E.BLOCKS;ALLOCATED_BLKS USED PCT_USED-------------- ---- --------2560 1728 67.50
This query reports the number of rows physically residing in Oracle blocks of a table. This query can be used to calculate how much space a table will ultimately require.
# blokrows.sqlSELECT SUBSTR(T.ROWID,1,8) || '-' ||SUBSTR(T.ROWID,15,4) BLOCK,COUNT(*) ROW_CNT,FROM &TAB_NAME TWHERE ROWNUM < 2000GROUP BY SUBSTR(T.ROWID,1,8) || '-' || SUBSTR(T.ROWID,15,4);
Output from this query is as follows:
BLOCK ROW_CNT------------- -------00001F52-0002 9300001F53-0002 8500001F54-0002 8200001F55-0002 10000001F56-0002 8300001F57-0002 7100001F58-0002 8200001F59-0002 9100001F5A-0002 9300001F5B-0002 9100001F5C-0002 6300001F5D-0002 6900001F5E-0002 7500001F5F-0002 100001F60-0002 400001F61-0002 5
By looking at the results of the set of scripts that are included in this section, you can do a good job of calculating future table requirements:
We advise you to be relatively conservative in making estimates so that you don't consume needless amounts of disk space, while being sensible enough not to run out of space too soon. This is not always an easy task! Our calculation is as follows:
Total records = 40,000 x 10= 400,000 records (adjust to 500,000)Records Per Block = 80 . . . 90 (adjust to 75)Space Requirements = (Total Records / Records Per Block) x Block Size = ( 500,000 / 75 ) * ( 1024 * 4 )= 27,306,667 bytes = 26.4 megabytes
If the default extent sizes and PCTINCREASE on your tablespaces are set incorrectly, these can have a marked impact on your performance. The SYSTEM tablespace has a default PCTINCREASE of 50%. If you decrease that value to 0, as many sites do, the number of extents on the dictionary objects becomes excessive and degrades performance. The default INITIAL and NEXT parameters on your temporary tablespace can also cause a lot of extents to be thrown if they are set to a value smaller than the SORT_AREA_SIZE. Ideally, the NEXT extent sizes should be equal to, or a multiple of, the SORT_AREA_SIZE.
SELECT SUBSTR(tablespace_name, 1,18), initial_extent,next_extent, pct_increaseFROM dba_tablespacesORDER BY tablespace_name;
Another troublesome problem that can affect performance is when a user other than SYSTEM or SYS has his or her default tablespace set to the SYSTEM tablespace. Use the following query to ensure that all of your users have been assigned to an appropriate tablespace:
SELECT usernameFROM dba_usersWHERE username NOT IN (`SYS', `SYSTEM')AND (default_tablespace = `SYSTEM'ORtemporary_tablespace = `SYSTEM');
Some scripts list all objects that will not have a free extent that is large enough, assuming that the object has filled its current space allocation and is forced to throw an extent. The following script provides the same information but runs 20 times faster:
# exent.sqlSELECT seg.owner, seg.segment_name,seg.segment_type, seg.tablespace_name, t.next_extentFROM sys.dba_segments seg, sys.dba_tables tWHERE (seg.segment_type = 'TABLE'AND seg.segment_name = t.table_nameAND seg.owner = t.ownerAND NOT EXISTS(SELECT tablespace_nameFROM dba_free_space freeWHERE free.tablespace_name = t.tablespace_nameAND bytes >= t.next_extent ))UNIONSELECT seg.owner, seg.segment_name,seg.segment_type, seg.tablespace_name,DECODE (seg.segment_type,'CLUSTER', c.next_extent)FROM sys.dba_segments seg,sys.dba_clusters cWHERE (seg.segment_type = 'CLUSTER'AND seg.segment_name = c.cluster_nameAND seg.owner = c.ownerAND NOT EXISTS(SELECT tablespace_namefrom dba_free_space freeWHERE free.tablespace_name = c.tablespace_nameAND bytes >= c.next_extent ))UNIONSELECT seg.owner, seg.segment_name,seg.segment_type, seg.tablespace_name,DECODE (seg.segment_type,'INDEX', i.next_extent )FROM sys.dba_segments seg,sys.dba_indexes iWHERE (seg.segment_type = 'INDEX'AND seg.segment_name = i.index_nameAND seg.owner = i.ownerAND NOT EXISTS(SELECT tablespace_nameFROM dba_free_space freeWHERE free.tablespace_name = i.tablespace_nameAND bytes >= i.next_extent ))UNIONSELECT seg.owner, seg.segment_name, seg.segment_type,seg.tablespace_name,DECODE (seg.segment_type, 'ROLLBACK', r.next_extent)FROM sys.dba_segments seg, sys.dba_rollback_segs rwhere (seg.segment_type = 'ROLLBACK'AND seg.segment_name = r.segment_nameAND seg.owner = r.ownerAND NOT EXISTS(SELECT tablespace_nameFROM dba_free_space freeWHERE free.tablespace_name = r.tablespace_nameAND bytes >= r.next_extent ));
From both a performance and recovery perspective, it's important to put the archive logs on a different disk from that used by your other data files and redo logs. If the archive logs are on the same disk as other data files or the redos, there is a high likelihood of disk I/O bottlenecks.
# datafile.sqlSELECT valueFROM v$parameterWHERE name like 'log_archive_dest'UNIONSELECT nameFROM v$datafileUNIONSELECT memberFROM v$logfile/
The following query lists four users' usage of the CPU, ordered by largest usage first. You must have TIMED_STATISTICS set to TRUE for the readings to appear. If TIMED_STATISTICS is set to FALSE, all of the readings will be zero.
# sesscpu.sqlSELECT SUBSTR(name,1,30) parameter,ss.username||'('||se.sid||') ' user_process, valueFROM v$session ss, v$sesstat se, v$statname snWHERE se.statistic# = sn.statistic#AND name like '%CPU used by this session%'AND se.sid = ss.sidORDER BY substr(name,1,25), value DESC/
Throughout this book, we have emphasized how proper sizing of the Oracle cache buffers can help to reduce I/O. Computing the hit ratio is a very helpful way to do this sizing. The hit ratio tells us how many times Oracle has needed to retrieve a database block and has found it in memory (rather than having to access it on disk). Because memory access is so much faster than disk access, the higher the hit ratio, the better your performance.
You can ordinarily obtain the hit ratio for your application only by looking at either the UTLBSTAT/UTLESTAT statistics or the DBA MONITOR screens, as described earlier in this chapter. The script included below shows how to get the hit ratio from SQL*Plus. If you do this, you can automatically schedule hit ratio queries and can direct the output to a report or another database table. That table can then be used to produce application statistics or management reporting.
# hitrate.sqlSELECTSUM(DECODE(name, 'consistent gets',value, 0)) "Consis Gets",SUM(DECODE(name, 'db block gets',value, 0)) "DB Blk Gets",SUM(DECODE(name, 'physical reads',value, 0)) "Phys Reads",(SUM(DECODE(name, 'consistent gets',value, 0))+ SUM(DECODE(name, 'db block gets',value, 0))- SUM(DECODE(name, 'physical reads',value, 0)))/(SUM(DECODE(name, 'consistent gets',value, 0))+ SUM(DECODE(name, 'db block gets',value, 0)) ) * 100 "Hit Ratio"FROM v$sysstat;
The following output is taken from an actual site. Before the buffer cache was increased and a handful of SQL statements were tuned, the hit ratio was at 26%.
Consis Gets DB Blk Gets Phys Reads Hit Ratio---------- ----------- ---------- ----------436987321 877262 2142974 99. 5105852
The next step you must take is to find out which user is causing the poor hit ratio.
# userhit.sqlSELECT se.username||'('|| se.sid||')' "User Session",SUM(DECODE(name, 'consistent gets',value, 0)) "Consis Gets",SUM(DECODE(name, 'db block gets',value, 0)) "DB Blk Gets",SUM(DECODE(name, 'physical reads',value, 0)) "Phys Reads",(SUM(DECODE(name, 'consistent gets',value, 0))+ SUM(DECODE(name, 'db block gets',value, 0))- SUM(DECODE(name, 'physical reads',value, 0)))/(sum(DECODE(name, 'consistent gets',value, 0))+ SUM(DECODE(name, 'db block gets',value, 0)) ) * 100 "Hit Ratio"FROM v$sesstat ss, v$statname sn, v$session seWHERE ss.sid = se.sidAND sn.statistic# = ss.statistic#AND value != 0AND sn.name IN ('db block gets', 'consistent gets', 'physical reads')GROUP BY se.username||'('|| se.sid||')' ;User Session Consis Gets DB Blk Gets Phys Reads Hit Ratio------------- ----------- ----------- ---------- ---------(5) 27679 8934 11012 69.92(6) 36 272 24 92.21CORRIGANP(16) 173176 385 521 99.70GURRYM(18) 1265544 2187 11959 99.06OREILLYT(21) 22705 149 21 99.91RUSSELLD(61) 128754 317 185 99.86
This script lets you interrogate the Oracle data dictionary performance tables via SQL*Plus. These tables give you information about all the objects stored in your dictionary (tablespaces, files, users, rollback segments, constraints, synonyms, etc.). This information is available in other ways, but getting at it through SQL*Plus lets you automate your queries, as we described for the hit ratio in the previous section.
If the dictionary cache is perfectly tuned, the query below would return no rows. When entries are loaded into the dictionary cache for the first time, the "Getmisses" figure is incremented by one. If the "Count," which is the number of entries set aside for each dictionary cache type, is set too small, entries will have to be thrown out to make way for new dictionary entries being read from disk into memory. This will cause rows to appear in the following query.
# rowcache.sqlSELECT parameter, count, getmissesFROM v$rowcacheWHERE getmisses > count;Dictionary Cache (Part of Shared Buffer Pool)PARAMETER COUNT GETMISSES-------------------------------- ---------- ----------dc_free_extents 41 172dc_used_extents 18 150dc_segments 125 202dc_objects 1798 1815dc_columns 4428 4639
Some people prefer to work with ratios and percentages. The dictionary cache miss ratio should ideally be less than 1%, although when the database is first started, the miss ratio will be higher because each dictionary item loaded into memory will record a miss. If the miss ratio is greater than 2% and you have spare memory, increase your SHARED_POOL_SIZE. If the ratio has decreased, you should have improved your performance.
# rowcache.sqlSELECT SUM(gets) "Gets", SUM(misses) "Misses",TO_CHAR(SUM(getmisses) / SUM(gets) * 100 , `999.99')||'%' "Miss Ratio"FROM v$rowcache;Gets Misses Miss Ratio-------------------------------------------------------------------------119,929,181 314,100 2.61%
It is sometimes useful to know which users are accessing the rollback segments. This is important information when a user is continually filling the rollbacks and causing extents to be thrown.
# rolbusrs.sqlSELECT r.name "Rollback Segment Name ",p.spid "System Process Id ",s.username||'(`||l.sid||')' "Oracle User Pid"FROM v$lock l, v$process p, v$rollname r, v$session sWHERE l.sid = p.pid(+)AND s.sid=l.sidAND TRUNC (l.id1(+)/65536) = r.usnAND l.type(+) = 'TX'AND l.lmode(+) = 6ORDER BY r.name/
It's useful to know the foreign keys and the unique or primary keys to which they relate. Foreign keys produce potentially damaging locking problems if the foreign key columns on the child table are not indexed, as we describe in Chapter 8, Selecting a Locking Strategy. The first query below lists all of the foreign keys and the parent table and columns to which they relate.
# forgnkey.sqlSELECT a.owner , a.table_name , c.column_name ,b.owner , b.table_name , d.column_nameFROM dba_constraints a, dba_constraints b,dba_cons_columns c, dba_cons_columns dWHERE a.r_constraint_name = b.constraint_nameAND a.constraint_type = 'R'AND b.constraint_type = 'P'AND a.r_owner=b.ownerAND a.constraint_name = c.constraint_nameAND b.constraint_name=d.constraint_nameAND a.owner = c.ownerAND a.table_name=c.table_nameAND b.owner = d.ownerAND b.table_name=d.table_name;
The second query lists all of the foreign keys that do not have the appropriate indexes in place on the child table. It shows the foreign key constraints that cause locking problems.
# forgnkey.sqlSELECT acc.owner||'-> '||acc.constraint_name||'('||acc.column_name||'['||acc.position||'])'||' ***** Missing Index'FROM all_cons_columns acc, all_constraints acWHERE ac.constraint_name = acc.constraint_nameAND ac.constraint_type = 'R'AND (acc.owner, acc.table_name, acc.column_name, acc.position)IN(SELECT acc.owner, acc.table_name, acc.column_name, acc.positionFROM all_cons_columns acc, all_constraints acWHERE ac.constraint_name = acc.constraint_nameAND ac.constraint_type = 'R'MINUSSELECT table_owner, table_name, column_name, column_positionFROM all_ind_columns)ORDER BY acc.owner, acc.constraint_name,acc.column_name, acc.position;
The following query lists all columns that have differing lengths or data types but that have the same column name. For example, ACCOUNT_NO may be NUMBER(9) in one table and VARCHAR(9) in another. Having different data types can cause data casting problems and result in indexes not being used. (See Chapter 6 for details on data casting.)
# coldiffs.sqlSELECT owner, column_name, table_name||' '||data_type||'('||DECODE(data_type, 'NUMBER', data_precision, data_length)||')'"Characteristics"FROM all_tab_columnsWHERE (column_name, owner)IN(SELECT column_name, ownerFROM all_tab_columnsGROUP BY column_name, ownerHAVING MIN(DECODE(data_type, 'NUMBER', data_precision,data_length))< MAX(DECODE(data_type, 'NUMBER', data_precision,data_length)) )AND owner NOT IN ('SYS', 'SYSTEM')
Oracle7.1 and later provide a mechanism for caching tables in the buffer cache, using the command
ALTER TABLE tablename CACHE
Caching tables will speed up data access and improve performance by finding the data in memory and avoiding disk reads. To determine which tables have been cached, run the following command:
# tabcache.sqlSELECT owner, table_name, cacheFROM all_tablesWHERE owner not in ('SYS', 'SYSTEM')AND cache not like 'N%';
Having invalid objects in your database usually indicates that your underlying tables have been altered to add a new column or have had DDL operations performed on them. The most common objects that become invalid are views, packages, and procedures. Invalid packages and procedures can cause a long response delay because they have to be recompiled. The user is forced to wait for the recompilation to complete. If you do alter your tables, you should always recompile your invalid packages and procedures to avoid user frustration. To obtain a list of all of the invalid objects, run the following query:
# invalobj.sqlSELECT owner, object_type, object_name, statusFROM all_objectsWHERE status = 'INVALID'ORDER BY owner, object_type, object_name/
Triggers can significantly affect performance. They can cause major problems if the trigger has been disabled and someone has forgotten to reenable the trigger. In this case, the changes that the trigger was to have made will be difficult to recreate, since it is difficult to determine what updates have been applied to the table after the trigger was disabled. To obtain a list of all of the triggers and their status, run this query:
# triggers.sqlTTITLE 'List All Triggers'SELECT table_name, trigger_name, statusFROM all_triggersORDER BY table_name, trigger_name;
Latches are low-level locking mechanisms that are used to protect Oracle data and memory structures, such as the least-recently-used list in the buffer cache or the redo allocation of space in the log buffer. (We describe latch tuning in Chapter 11.) The following script investigates who is holding the latches:
# latchhld.sqlSELECT l.name "Latch Held", p.username "User Holding the Latch"FROM v$process p,v$latchholder lWHERE l.pid = p.pid;
This listing provides you with a list of the number of objects on a per-user basis. It can be run regularly to make sure that your database is operating as you expect it to.
# objcount.sqlSELECT username,COUNT(DECODE(o.type, 2, o.obj#, `')) Tab,COUNT(DECODE(o.type, 1, o.obj#, `')) Ind,COUNT(DECODE(o.type, 5, o.obj#, `')) Syn,COUNT(DECODE(o.type, 4, o.obj#, `')) Vew,COUNT(DECODE(o.type, 6, o.obj#, `')) Seq,COUNT(DECODE(o.type, 7, o.obj#, `')) Prc,COUNT(DECODE(o.type, 8, o.obj#, `')) Fun,COUNT(DECODE(o.type, 9, o.obj#, `')) Pck,COUNT(DECODE(o.type,12, o.obj#, `')) Trg,COUNT(DECODE(o.type,10, o.obj#, `')) DepFROM sys.obj$ o, sys.dba_users UWHERE u.user_id = o.owner# (+)GROUP BY username;
http://www.oreilly.com/catalog/oracle2/chapter/ch10.html#73980








