`
itspace
  • 浏览: 961196 次
  • 性别: Icon_minigender_1
  • 来自: 杭州
社区版块
存档分类
最新评论

记一次exp hang故障处理

阅读更多
今天早上上班,刚刚上QQ,就有客户发来了离线消息。
引用
*** 8:26:38
   ora9i  8585370 33161666   0 02:00:00      -  0:00 exp '/ as sysdba' full=y compress=n consistent=y feedback=100000 direct=y file=/bak9i/backup/logical/full01.dmp,/bak9i/backup/logical/full02.dmp,/bak9i/backup/logical/full03.dmp,/bak9i/backup/logical/full04.dmp,/bak9i/backup/logical/full05.dmp,/bak9i/backup/logical/full06.dmp,/bak9i/backup/logical/full07.dmp,/bak9i/backup/logical/full08.dmp,/bak9i/backup/logical/full09.dmp,/bak9i/backup/logical/full10.dmp,/bak9i/backup/logical/full11.dmp,/bak9i/backup/logical/full12.dmp,/bak9i/backup/logical/full13.dmp,/bak9i/backup/logical/full14.dmp,/bak9i/backup/logical/full15.dmp,/bak9i/backup/logical/full16.dmp,/bak9i/backup/logical/full17.dmp,/bak9i/backup/logical/full18.dmp,/bak9i/backup/logical/full19.dmp,/bak9i/backup/logical/full20.dmp,/bak9i/backup/logical/full21.dmp,/bak9i/backup/logical/full22.dmp,/bak9i/backup/logical/full23.dmp,/bak9i/backup/logical/full24.dmp buffer=52428800 filesize=2000M log=/bak9i/backup/logical/full.log
   ora9i 22937640 28180504   0   Mar 14      -  0:00 exp '/ as sysdba' full=y compress=n consistent=y feedback=100000 direct=y file=/bak9i/backup/logical/full01.dmp,/bak9i/backup/logical/full02.dmp,/bak9i/backup/logical/full03.dmp,/bak9i/backup/logical/full04.dmp,/bak9i/backup/logical/full05.dmp,/bak9i/backup/logical/full06.dmp,/bak9i/backup/logical/full07.dmp,/bak9i/backup/logical/full08.dmp,/bak9i/backup/logical/full09.dmp,/bak9i/backup/logical/full10.dmp,/bak9i/backup/logical/full11.dmp,/bak9i/backup/logical/full12.dmp,/bak9i/backup/logical/full13.dmp,/bak9i/backup/logical/full14.dmp,/bak9i/backup/logical/full15.dmp,/bak9i/backup/logical/full16.dmp,/bak9i/backup/logical/full17.dmp,/bak9i/backup/logical/full18.dmp,/bak9i/backup/logical/full19.dmp,/bak9i/backup/logical/full20.dmp,/bak9i/backup/logical/full21.dmp,/bak9i/backup/logical/full22.dmp,/bak9i/backup/logical/full23.dmp,/bak9i/backup/logical/full24.dmp buffer=52428800 filesize=2000M log=/bak9i/backup/logical/full.log 
*** 8:26:44
看来我只有kill掉了
*** 8:28:55
ora9i@p750:/ora9i/admin/ytyy/bdump$ ps -ef | grep exp
   ora9i  8585370 33161666   0 02:00:00      -  0:00 exp '/ as sysdba' full=y compress=n consistent=y feedback=100000 direct=y file=/bak9i/backup/logical/full01.dmp,/bak9i/backup/logical/full02.dmp,/bak9i/backup/logical/full03.dmp,/bak9i/backup/logical/full04.dmp,/bak9i/backup/logical/full05.dmp,/bak9i/backup/logical/full06.dmp,/bak9i/backup/logical/full07.dmp,/bak9i/backup/logical/full08.dmp,/bak9i/backup/logical/full09.dmp,/bak9i/backup/logical/full10.dmp,/bak9i/backup/logical/full11.dmp,/bak9i/backup/logical/full12.dmp,/bak9i/backup/logical/full13.dmp,/bak9i/backup/logical/full14.dmp,/bak9i/backup/logical/full15.dmp,/bak9i/backup/logical/full16.dmp,/bak9i/backup/logical/full17.dmp,/bak9i/backup/logical/full18.dmp,/bak9i/backup/logical/full19.dmp,/bak9i/backup/logical/full20.dmp,/bak9i/backup/logical/full21.dmp,/bak9i/backup/logical/full22.dmp,/bak9i/backup/logical/full23.dmp,/bak9i/backup/logical/full24.dmp buffer=52428800 filesize=2000M log=/bak9i/backup/logical/full.log
   ora9i 22937640 28180504   0   Mar 14      -  0:00 exp '/ as sysdba' full=y compress=n consistent=y feedback=100000 direct=y file=/bak9i/backup/logical/full01.dmp,/bak9i/backup/logical/full02.dmp,/bak9i/backup/logical/full03.dmp,/bak9i/backup/logical/full04.dmp,/bak9i/backup/logical/full05.dmp,/bak9i/backup/logical/full06.dmp,/bak9i/backup/logical/full07.dmp,/bak9i/backup/logical/full08.dmp,/bak9i/backup/logical/full09.dmp,/bak9i/backup/logical/full10.dmp,/bak9i/backup/logical/full11.dmp,/bak9i/backup/logical/full12.dmp,/bak9i/backup/logical/full13.dmp,/bak9i/backup/logical/full14.dmp,/bak9i/backup/logical/full15.dmp,/bak9i/backup/logical/full16.dmp,/bak9i/backup/logical/full17.dmp,/bak9i/backup/logical/full18.dmp,/bak9i/backup/logical/full19.dmp,/bak9i/backup/logical/full20.dmp,/bak9i/backup/logical/full21.dmp,/bak9i/backup/logical/full22.dmp,/bak9i/backup/logical/full23.dmp,/bak9i/backup/logical/full24.dmp buffer=52428800 filesize=2000M log=/bak9i/backup/logical/full.log
   ora9i 31916520 13173082   0 08:25:29  pts/0  0:00 grep exp
ora9i@p750:/ora9i/admin/ytyy/bdump$ kill -9 8585370
ora9i@p750:/ora9i/admin/ytyy/bdump$ kill -9 22937640
ora9i@p750:/ora9i/admin/ytyy/bdump$ ps -ef | grep exp
   ora9i 24445036 13173082   0 08:27:58  pts/0  0:00 grep exp 
*** 8:29:15
9i上逻辑导出不成功

于是我让他查看/bak9i/backup/logical/full.log看看是什么内容,日志显示
引用
. exporting database links
. exporting sequence numbers
. exporting directory aliases
. exporting context namespaces
. exporting foreign function library names
. exporting PUBLIC type synonyms
. exporting private type synonyms
. exporting object type definitions
. exporting system procedural objects and actions
. exporting pre-schema procedural objects and actions
. exporting cluster definitions

也就意味着Oracle 9208在exp cluster定义的时候hang住了。我感觉这个问题,比较怪异,于是上metalink搜寻了一番,只有Bug 5035017: EXPORT HANGS ON CLUSTER DEFINITIONS - ORA-4021和此案例比较类似,此bug是发生在Oracle 10.1.0.4.0版本上,但目前的数据库是Oracle 9208且运行在AIX 6106之上。而且此bug exp错误时还伴随着ORA-4021产生。
引用
EXP-56: ORACLE error 4021 encountered
ORA-4021: timeout occurred while waiting to lock object
ORA-6512: at "SYS.DBMS_METADATA", line 1511
ORA-6512: at "SYS.DBMS_METADATA", line 1548
ORA-6512: at "SYS.DBMS_METADATA", line 1864
ORA-6512: at "SYS.DBMS_METADATA", line 3707
ORA-6512: at "SYS.DBMS_METADATA", line 3689
ORA-6512: at line 1
EXP-0: Export terminated unsuccessfully

这和本案例中有较大不同,但据bug描述,此问题和xml组件有一定的关系。于是进一步查看其生产库xml组件状态,果然处于invliad状态。
引用
Oracle9i Catalog Views 9.2.0.8.0 VALID
Oracle9i Packages and Types 9.2.0.8.0 VALID
Oracle Workspace Manager 9.2.0.1.0 VALID
JServer JAVA Virtual Machine 9.2.0.8.0 VALID
Oracle XDK for Java 9.2.0.10.0 VALID
Oracle9i Java Packages 9.2.0.8.0 VALID
Oracle interMedia 9.2.0.8.0 VALID
Spatial 9.2.0.8.0 VALID
Oracle Text 9.2.0.8.0 VALID
Oracle XML Database 9.2.0.8.0 INVALID
Oracle Ultra Search 9.2.0.8.0 VALID
Oracle Data Mining 9.2.0.8.0 VALID
OLAP Analytic Workspace 9.2.0.8.0 UPGRADED
Oracle OLAP API 9.2.0.8.0 UPGRADED
OLAP Catalog 9.2.0.8.0 VALID

但这只能怀疑exp hang和xml有关。于是我又再次让客户运行exp脚本,观察v$session_wait等待事件出现library cache pin 事件。进一步细分查询
select * from v$lock where block=1;为空
引用
select sid, event, p1raw, seconds_in_wait, wait_time  from v$session where event='library cache pin';
333 library cache pin 07000003DD14C3A0 26530 0
387 library cache pin 07000003B8E14D60 255 0
646 library cache pin 07000003DD14C3A0 35987 0

进一步查询x$kglob,查看其等待对象,出现关键字XM
引用
col object for a200
SELECT kglnaown AS owner, kglnaobj as Object,kglhdadr
FROM sys.x$kglob
WHERE kglhdadr='07000003DD14C3A0';
1 SELECT VALUE(KU$) XM 07000003DD14C3A0

是以至此,基本上可以判断此问题和xml组件失效有关系。于是询问客户是否可以将此组件删除,确定之后执行以下脚本:
引用
SQL> set echo on
SQL> set echo on
SQL> spool xdb_remove.log
SQL> @?/rdbms/admin/catnoqm.sql
SQL> Rem
SQL> Rem $Header: catnoqm.sql 03-jan-2002.17:32:31 spannala Exp $
SQL> Rem
SQL> Rem catnoqm.sql
SQL> Rem
SQL> Rem Copyright (c) 2001, 2002, Oracle Corporation.  All rights reserved.
SQL> Rem
SQL> Rem    NAME
SQL> Rem         catnoqm.sql - CATalog script for removing (NO) XDB
SQL> Rem
SQL> Rem    DESCRIPTION
SQL> Rem         this script drops the metadata created for SQL XML management
SQL> Rem         This scirpt must be invoked as sys. It is to be invoked as
SQL> Rem
SQL> Rem          @@catnoqm
SQL> Rem    NOTES
SQL> Rem
SQL> Rem    MODIFIED   (MM/DD/YY)
SQL> Rem    spannala    01/03/02 - tables are not handled by xdb
SQL> Rem    spannala    01/02/02 - registry
SQL> Rem    spannala    12/20/01 - passing in the resource tablespace name
SQL> Rem    tsingh      11/17/01 - remove connection string
SQL> Rem    tsingh      06/30/01 - XDB: XML Database merge
SQL> Rem    amanikut    02/13/01 - Creation
SQL> Rem
SQL> Rem
SQL>
SQL> execute dbms_registry.removing('XDB');

PL/SQL procedure successfully completed.

SQL> drop user xdb cascade;

但是在drop user xdb cascade时又出现hang住现象。继续查询v$session_wait,出现
library cache lock等待事件。
引用
SQL> select sid,event from v$session_wait where event not like '%SQL%';

       SID EVENT
---------- ----------------------------------------------------------------
         1 pmon timer
         2 rdbms ipc message
         3 rdbms ipc message
         4 rdbms ipc message
         5 rdbms ipc message
        10 rdbms ipc message
        12 rdbms ipc message
        11 rdbms ipc message
         9 rdbms ipc message
         6 rdbms ipc message
         7 rdbms ipc message

       SID EVENT
---------- ----------------------------------------------------------------
        21 db file sequential read
       392 db file sequential read
         8 smon timer
       652 library cache lock
       122 jobq slave wait
       462 jobq slave wait
       591 jobq slave wait
       327 jobq slave wait
       457 jobq slave wait

20 rows selected.

接下来就轮到解决library cache lock,具体过程看日志了。
引用
SQL> select spid from v$process where addr =(select paddr from v$session where sid=652);

SPID
------------
15925756

SQL> !ps -ef|grep 15925756
   ora9i 41615506 27918796   0 10:19:13  pts/0  0:00 grep 15925756
   ora9i 15925756 43254130   0 10:15:22      -  0:00 oracleytyy (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

SQL> select b.sid,a.user_name,a.kglnaobj
from x$kgllk a , v$session b
where a.kgllkhd  2    3  l in
(select p1raw from v$session_wait
where wait_time=0 and event = 'library ca  4    5  che lock')
and a.kgllkmod <> 0
and b.saddr=a.kgllkuse  6    7  ;

       SID USER_NAME
---------- ------------------------------
KGLNAOBJ
------------------------------------------------------------
        21 SYS
XDB$RESOURCE

       392 YTYY
XDB$RESOURCE


SQL> select sid,serial# from v$session where sid in (21,392);

       SID    SERIAL#
---------- ----------
        21      31951
       392       5343

SQL> select spid from v$process where addr in (select paddr from v$session where sid in (21,392));

SPID
------------
18547020
23200210

SQL> !ps -ef|grep 18547020
   ora9i 38600852 27918796   0 10:23:02  pts/0  0:00 grep 18547020
   ora9i 18547020        1   0   Mar 14      -  0:01 oracleytyy (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))

SQL> !ps -ef|grep 23200210
   ora9i 23200210        1   0 09:06:05      -  0:00 oracleytyy (LOCAL=NO)
   ora9i 42140070 27918796   0 10:23:13  pts/0  0:00 grep 23200210

SQL> alter system kill session '21,31951';

alter system kill session '21,31951'
*
ERROR at line 1:
ORA-00031: session marked for kill


SQL> SQL> alter system kill session '392,5343';
 
alter system kill session '392,5343'
*
ERROR at line 1:
ORA-00031: session marked for kill


SQL> !ps -ef|grep 18547020
   ora9i 18547020        1   0   Mar 14      -  0:01 oracleytyy (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
   ora9i 20709834 27918796   0 10:26:22  pts/0  0:00 grep 18547020

SQL> !kill -9 18547020

SQL> !ps -ef|grep 23200210
   ora9i 18547022 27918796   0 10:26:48  pts/0  0:00 grep 23200210
   ora9i 23200210        1   0 09:06:05      -  0:00 oracleytyy (LOCAL=NO)

SQL> !kill -9 23200210

SQL> select spid from v$process where addr in (select paddr from v$session where sid in (21,392));

SPID
------------
18547020
23200210

SQL> select b.sid,a.user_name,a.kglnaobj
from x$kgllk a , v$session b
where a.kgllkhd  2    3  l in
(select p1raw from v$session_wait
where wait_time=0 and event = 'library cache lock')
and a.kgllkmod <> 0
and b.saddr=a.kgllkuse  4    5    6    7  ;

no rows selected


解决了library cache lock,xdb用户也顺利删除。最终exp导出正常。

附:
9201客户端导9208数据库时出现
引用
Table ODM_PMML_DTD will be exported in conventional path.
. . exporting table                   ODM_PMML_DTD
EXP-00003: no storage definition found for segment(7, 267)
. . exporting table             ODM_P_I_ITEM_RULES






分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics