Saturday, August 11, 2007

ORA-19511: VERITAS NetBackup: Status 25: Cannot connect on socket

Status 25: "Cannot connect on socket"

Exact Error Message
25 cannot connect on socket

Details:
Overview:
When performing backups or restores, socket errors are being produced.

Troubleshooting:
Please follow all steps within the VERITAS NetBackup (tm) Troubleshooting Guide or the NetBackup Troubleshooter within the Activity Monitor for this status code before continuing.

Please confirm hardware and software compatibility before continuing. A list of compatible hardware and software may be obtained within the VERITAS NetBackup Release Notes or on the VERITAS Support Web site.

If the above does not resolve the issue, please continue.


1. Status code 25 caused by no available sockets on the Master/Media Server

Please look for the following log messages:

Media Server Log Files:
bpbrm:
02:44:21.877 [1092.2944] <2> nb_bind_on_port_addr: port 4998 unavailable
02:44:21.877 [1092.2944] <2> nb_getsockconnected: cannot connect() to MasterServer, Only one usage of each socket address (protocol/network address/port) is normally permitted.
02:44:21.877 [1092.2944] <2> db_begin: nb_getsockconnected() failed: Only one usage of each socket address (protocol/network address/port) is normally permitted. (10048)
02:44:21.877 [1092.2944] <2> db_FLISTsend: db_begin() failed: cannot connect on socket
02:44:21.877 [1092.2944] <16> bpbrm handle_backup: db_FLISTsend failed: cannot connect on socket (25)
02:44:21.877 [1092.2944] <2> nb_bind_on_port_addr: port 4999 unavailable

Resolution:
A socket in the TIME_WAIT state may be made available sooner by reducing the tcp_time_wait_interval parameter. The default for this parameter on Solaris is 240000 milliseconds (4 minutes); on Hewlett-Packard (HP) it is 60000 milliseconds (1 minute). On AIX the parameter is set in increments of 15 seconds, and the default of 1 (i.e. 15 seconds) is already the lowest value it may be set to. VERITAS has found that tcp_time_wait_interval can often be set to 1 second with no adverse effects. However, customers should lower the value in increments until backups no longer fail with status 25.
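
Before lowering the interval, it is worth confirming that TIME_WAIT sockets are actually accumulating. A minimal shell check on a UNIX master or media server (a sketch; exact netstat output varies slightly by platform):

# count sockets currently stuck in TIME_WAIT
netstat -an | grep -c TIME_WAIT
# show sockets on the ports reported unavailable in the bpbrm log above
netstat -an | egrep '\.(4998|4999)'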

The value of this parameter may be obtained and changed by running the following command:

For Solaris 2.6 or earlier:

#ndd -get /dev/tcp tcp_close_wait_interval
#240000
#ndd -set /dev/tcp tcp_close_wait_interval 10000 (10 seconds)


For Solaris 7 or later:

#ndd -get /dev/tcp tcp_time_wait_interval
#240000
#ndd -set /dev/tcp tcp_time_wait_interval 10000 (10 seconds)


The ndd command makes the change immediately, without a reboot. The setting reverts to the default after a reboot; to make the change permanent, add the command to the appropriate TCP/IP startup script. On Solaris, this is /etc/rc2.d/S69inet.
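
For example, a sketch of the line one might append to that startup script so the 10-second value chosen above survives a reboot:

# /etc/rc2.d/S69inet excerpt - re-apply the reduced TIME_WAIT interval at boot
/usr/sbin/ndd -set /dev/tcp tcp_time_wait_interval 10000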

For HP-UX 11, use the following command:

#ndd -get /dev/tcp tcp_time_wait_interval
#60000
#ndd -set /dev/tcp tcp_time_wait_interval 10000 (10 seconds)

Note: The equivalent command on HP-UX 10.x is nettune instead of ndd.

The ndd command makes the change immediately, without a reboot. The setting reverts to the default after a reboot; to make the change permanent, add the command to the appropriate TCP/IP startup script. On HP-UX 11, see /etc/rc.config.d/nddconf for examples of how to set it.
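
A sketch of a possible nddconf entry (the index [0] assumes no other entries already exist in the file):

# /etc/rc.config.d/nddconf excerpt - applied automatically at boot
TRANSPORT_NAME[0]=tcp
NDD_NAME[0]=tcp_time_wait_interval
NDD_VALUE[0]=10000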

It is not necessary to change the default tcp_timewait parameter on AIX, but for reference:

To display the value, enter:
#no -a
or
#no -o tcp_timewait

To change the value, enter:
#no -o tcp_timewait=1

The value is a positive integer, with each increment representing 15 seconds; the default of 1 means 15 seconds. The change takes effect immediately and remains effective until the next boot. A permanent change is made by adding the command to /etc/rc.net.
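
A sketch of the corresponding /etc/rc.net addition (only needed if the default of 1 was ever raised):

# /etc/rc.net excerpt - reset tcp_timewait to one 15-second increment at boot
if [ -f /usr/sbin/no ] ; then
    /usr/sbin/no -o tcp_timewait=1 >/dev/null 2>&1
fi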

Windows System:
Ports in the TIME_WAIT state can be seen by running netstat -a from a command prompt on the master or media server. A large number of them can be an indication that ports are not being released quickly enough to allow new connections. The registry value TcpTimedWaitDelay can be added to reduce the time a port stays in this state.

1. Go to Start | Run, type regedit, and click OK
2. Navigate to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
3. Highlight the Parameters key, right-click and select New | DWORD Value
4. Type the name TcpTimedWaitDelay
5. Double-click TcpTimedWaitDelay and enter a decimal value of 30
By default, this value is 240 seconds (4 minutes). It is recommended that this be changed to a decimal value between 30 and 60 (seconds) to decrease the wait before a port becomes available again.

Description:
This parameter determines how long a connection stays in the TIME_WAIT state while being closed. While a connection is in TIME_WAIT, the socket pair cannot be reused. This is also known as the "2MSL" state because, per the RFC, the value should be twice the maximum segment lifetime on the network. See RFC 793 for further details.

IBM: TSM (ADSM) back-up - TCP/IP connection failure

J1gh2 (MIS)
11 Aug 04 7:15
Hi folks

For a few months now a level 0 backup fails about once a week with "ANS1017E (RC-50) Session rejected: TCP/IP connection failure". This error appears in the database log file but there is nothing in the dsmerror.log or dsmsched.log.

The TSM server is in a healthy state in all other respects, and all other scheduled backups do get completed after this. We are running TSM 5.1 on AIX 5.2. We also run TSM 4.2 on AIX 4.3, but that system does not exhibit this problem.

Since it happens intermittently, it's difficult to set a trace. The workload on the server is not particularly challenging, and topas shows ample idle time.

I would appreciate any help and suggestions. Thanks a lot

LED888 (TechnicalUser)
12 Aug 04 22:17

Session rejected: TCP/IP connection failure [Same as ANS1017E]
This is what the client sees and reports, but has no idea why.
The cause is best sought in the ADSM server Activity Log for that time.
Could be a real datacomm problem; or...
Grossest problem: the TSM server is down.

If you get this condition after supposedly changing the client and server to use a different port number (e.g., 1502), and the Activity Log has no significant information about the attempted session, use 'netstat' or 'lsof' or a similar utility in the server operating system to verify that the *SM server is actually serving the port number that you believe it should be. (You *did* code the port numbers into both the client and server options files, right?)
An administrator may have done a 'CANcel SEssion'.
If it happens during a Backup, it is likely the server cancelling it due to a higher priority task, like a DB Backup starting and needing a tape drive...particularly when there is a drive shortage. Look in the server Activity Log around that time and you will likely see "ANR0492I All drives in use. Session 22668 for node ________ (AIX) being preempted by higher priority operation.".

Or look in the Activity Log for a "ANR0481W Session NNN for node () terminated - client did not respond within NN seconds." message, which reflects a server COMMTimeout value that is too low.

Message "ANR0482W Session for name () terminated - idle for more than N minutes." is telling you that the sever IDLETimeout value is too low. Remember that longstanding clients may take considerable time to rummage around in their file systems looking for new files to back up.

Another problem is starting your client scheduler process from /etc/inittab but failing to specify redirection - you need:
dsmc::once:/usr/bin/dsmc sched > /dev/null 2>&1 # TSM scheduler

An unusual cause is in having the client and server defined to use the same port number!

Might also be a firewall rejecting the TSM client as it tries to reach the server through that firewall.
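
To chase the Activity Log messages mentioned above, a minimal sketch using the dsmadmc administrative client (the admin ID, password, and time window are placeholders):

# search the last day of the TSM server Activity Log for the messages above
dsmadmc -id=admin -password=secret "query actlog begindate=today-1 search=ANR0492I"
dsmadmc -id=admin -password=secret "query actlog begindate=today-1 search=ANR0481W"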


J1gh2 (MIS)
13 Aug 04 5:03
Thanks a lot, LED888

I will follow up all the leads that you have given me and let you know.

Cheers
J1gh2 (MIS)
17 Aug 04 8:29
Hi LED888

I did indeed find "ANR0492I All drives in use..." in the activity log at the time of the TCP/IP connection failure. We are now installing more drives...

Thanks a lot for your help
LED888 (TechnicalUser)
19 Aug 04 20:37
That's great news!

Friday, May 11, 2007

ORA-1861 Using Date Format in Rman Recovery 'SET UNTIL TIME'

You are trying to use the RMAN SET UNTIL TIME command to perform a Point-In-Time Recovery. The following errors are reported:

Recovery Manager: Release 8.0.4.0.0 - Production

RMAN-06005: connected to target database: TARGET
RMAN-06008: connected to recovery catalog database

RMAN> run {
2> allocate channel t1 type disk;
3> set until time "1998-08-03:15:52:50";
4> .....
5> .....
6> release channel t1;
7> }

RMAN-03022: compiling command: allocate
RMAN-03023: executing command: allocate
RMAN-08030: allocated channel: t1
RMAN-08500: channel t1: sid=9 devtype=DISK
RMAN-03022: compiling command: set
RMAN-03026: error recovery releasing channel resources
RMAN-08031: released channel: t1
RMAN-00569: ================error message stack follows============
RMAN-03002: failure during compilation of command
RMAN-03013: command type: set
RMAN-06003: ORACLE error from target database: ORA-01861: literal does not match format string

RMAN>


Problem Explanation:
====================

The 'SET UNTIL TIME' format in the RMAN script does not match the NLS_DATE_FORMAT. Most likely, the script writer did not use the easy, format-independent method of specifying dates, i.e. the TO_DATE function.

Search Words:
=============

RECOVERY

NLS_DATE_FORMAT

ORA-01861

Solution Description:
=====================

Use the TO_DATE function in your RMAN scripts to specify dates that are independent of the current NLS_DATE_FORMAT setting.

For example:

set until time "to_date('2002 feb 20 15:33','yyyy mon dd hh24:mi')";

If you do not use the TO_DATE function, then the following applies.

The NLS_DATE_FORMAT variable must be set to the same format used in the 'SET
UNTIL TIME' RMAN command.

UNIX:

In the C shell, use the setenv command to set the NLS_DATE_FORMAT variable to the desired format before logging on to RMAN, e.g.:

% setenv NLS_DATE_FORMAT YYYY-MM-DD:HH24:MI:SS
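
For Bourne or Korn shell users, an equivalent sketch (the recovery catalog connect string is a placeholder):

# sh/ksh equivalent of the csh setenv above, followed by the RMAN logon
NLS_DATE_FORMAT='YYYY-MM-DD:HH24:MI:SS'
export NLS_DATE_FORMAT
rman target / catalog rman/rman@rcat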

WinNT:

Use the set command to set the NLS_DATE_FORMAT variable to the desired format before logging on to RMAN, e.g.:

C:\> set NLS_DATE_FORMAT=YYYY-MM-DD:HH24:MI:SS

Or

Create an NLS_DATE_FORMAT entry in the WinNT registry with the desired format. Open the Registry Editor and create the entry under the HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE key.

* Important Note * If there is more than one ORACLE_HOME, you must create the entry under the correct ORACLE_HOME subkey.

i.e. HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOMEn, where n designates the ORACLE_HOME number.


Solution Explanation:
=====================

After taking any of the above steps, RMAN will recognize the SET UNTIL TIME command format.

Monday, February 19, 2007

EXP-00003: no storage definition found for segment(#,#)

VERY IMPORTANT!!!!!! EXP ERROR
Cause
This problem during export has been introduced with the 9.2.0.5 patchset. In that patchset, a fix has been included for Bug 2948717 "Queries against DBA_SEGMENTS can be slow accessing BYTES/BLOCKS/EXTENTS columns". Note that this fix has also been implemented in Oracle10g Release 1 (10.1.0.2.0).

As a result of that fix, an export from a 9.2.0.5.0 or any higher release database (e.g. 9.2.0.6.0 or 10.1.0.4.0) with a lower release export utility (e.g. 9.2.0.4.0 or 8.1.7.4.0) will now produce the EXP-3 error. For details see:
Bug 3593227 "EXP-3 WHEN EXPORTING A TABLE WITH MORE THAN 1 EXTENT USING EXP BELOW 9.2.0.5"
Bug 3784697 "EXPORT OF LOB FROM 9205 WITH PRE-9205 EXP GIVES EXP-3 (NO STORAGE DEF FOUND)"

These bugs were closed as duplicates of:
Bug 3291851 "EXP-3: NO STORAGE DEFINITION FOUND FOR SEGMENT" (not a public bug)
Solution
The EXP-3 error only occurs when the export from a 9.2.0.5.0 or any higher release database (e.g. 9.2.0.6.0 or 10.1.0.4.0) is done with a lower release export utility, e.g. 9.2.0.4.0. The reason to run an export from such a database with a 9.2.0.4.0 export utility is to import the export dump file later into a target database of that lower release.

Note: if your intention is to export from a 9.2.0.6.0 database in order to import (back) into a 9.2.0.6.0 target database, there is no need to use a pre-9.2.0.5 release export utility. In that situation use the 9.2.0.6.0 export utility.

Workaround 1
-------------

Rerun the export and specify the parameter COMPRESS=Y (this is the default value for COMPRESS).
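
For example, a sketch of such an export run (the connect string, owner, and file names are placeholders):

# re-run the export with extent compression enabled (COMPRESS=Y)
exp system/manager@orcl owner=scott file=scott.dmp log=scott_exp.log compress=y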

If you have a table with a secondary object, e.g. a LOB column, this workaround might not work and you will reproduce the EXP-3 error during export. In that case, use workaround 2.

Workaround 2
-------------

a. Make a note of the export definition of exu9tne from $ORACLE_HOME/rdbms/admin/catexp.sql

b. Copy this to a new file and add:
"UNION ALL select * from sys.exu9tneb" to the end of the definition

e.g. In the 9.2.0.5.0 or any higher release source database (e.g. 9.2.0.6.0 or 10.1.0.4.0), the workaround view would be:


CREATE OR REPLACE VIEW exu9tne (
tsno, fileno, blockno, length) AS
SELECT ts#, segfile#, segblock#, length
FROM sys.uet$
WHERE ext# = 1
UNION ALL
SELECT * from sys.exu9tneb
/


c. Run this as the SYS user (!) against the database that needs to be exported.

d. Re-run the export as required.

e. Put back the original definition of exu9tne as obtained in step a.

Note: only use this workaround when absolutely necessary, and don't forget to put back the original definition of the view SYS.exu9tne as mentioned in step e.
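
A sketch of step a, saving the current definition so it can be restored in step e (the spool file name is arbitrary; run as SYS):

# capture the present definition of SYS.EXU9TNE before replacing it
sqlplus -s "/ as sysdba" <<'EOF'
set long 20000 pagesize 0 heading off feedback off
spool exu9tne_orig.sql
select text from dba_views where owner = 'SYS' and view_name = 'EXU9TNE';
spool off
EOF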

The above is from a MetaLink note.

Jaffar

Monday, February 12, 2007

Bind Variable Peeking

Happy New Peek

By Jaromir D.B. Nemec

It is 6 A.M. on New Year’s Day; some DBAs are recapitulating their resolutions while others are already asleep. In neither situation is an alerting call welcome. Yet exactly that happens. “Our nightly Oracle job is hanging, but nothing was changed!” is the plain message. “Nothing was changed, nothing was changed,” flashes through the DBA’s head on the way to the problem, “why do we ever need a sophisticated database if our business never changes!”


The Problem

Let’s demonstrate the troubleshooting step by step[1]. If you want to execute the example in your own environment, skip to the Example section at the end, where you will find the scripts to perform the setup.

The first step is to understand the logic of the running batch. The algorithm is rather simple; the status of some entities (selected based on defined business rules) in the fact table should be updated.

The tricky part is that the fact table is versioned, i.e. each entity consists of a number of versions. The business policy says that only the status of the last version should be updated.

As our business is rather dynamic, each entity contains up to a thousand versions. In order to handle such a large number of records, the table is partitioned by business year, defined as the year of the transaction date.

The next step in the troubleshooting is to get the SQL statement that is currently running in the session of the hanging batch. This is not a big task: by querying the views v$session and v$sqltext you get the "hanging" statement. See Figure 1 (a sketch of the lookup follows the figure).


UPDATE MY_FACT A
SET STATUS = 'O'
WHERE ENTITY_ID = :B2 AND BUSINESS_YEAR = :B1 AND
VERSION =
(SELECT MAX(VERSION) FROM MY_FACT B
WHERE A.ENTITY_ID = B.ENTITY_ID AND
A.BUSINESS_YEAR = B.BUSINESS_YEAR)
;


Figure 1 The “hanging” Statement
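
A minimal sketch of that v$session/v$sqltext lookup, assuming the SID of the suspect batch session has already been identified (the SID 42 is a placeholder):

# show the statement currently executing in the suspect session
sqlplus -s "/ as sysdba" <<'EOF'
select t.sql_text
from v$session s, v$sqltext t
where s.sid = 42
and s.sql_address = t.address
and s.sql_hash_value = t.hash_value
order by t.piece;
EOF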


The statement is a relatively simple update corresponding well to the job description above. The entity and the business year are fixed with bind variables (the fact table is partitioned by a business_year – see the table DDL in the Example section). The last version is determined by using a subquery. Apparently the update statement we see is performed in a loop over all selected business entities.

The next thing you should do is to check the execution plan of this statement. With the explain plan statement and dbms_xplan utility this task is quickly done. See Figure 2.


EXPLAIN PLAN set statement_id = 'N3' into plan_table
FOR
UPDATE MY_FACT A
SET STATUS = 'O'
WHERE ENTITY_ID = :B2 AND BUSINESS_YEAR = :B1 AND
VERSION =
(SELECT MAX(VERSION) FROM MY_FACT B
WHERE A.ENTITY_ID = B.ENTITY_ID AND
A.BUSINESS_YEAR = B.BUSINESS_YEAR)
;

SELECT * FROM table(DBMS_XPLAN.DISPLAY('plan_table', 'N3', 'ALL'));

-------------------------------------------------------------------------------------
| Id | Operation                            | Name        | Rows | Cost |Pstart|Pstop|
-------------------------------------------------------------------------------------
|  0 | UPDATE STATEMENT                     |             |    1 |   16 |      |     |
|  1 |  UPDATE                              | MY_FACT     |      |      |      |     |
|* 2 |   TABLE ACCESS BY LOC.IND.ROWID      | MY_FACT     |    1 |    3 |      |     |
|  3 |    NESTED LOOPS                      |             |    1 |   16 |      |     |
|  4 |     VIEW                             | VW_SQ_1     |    2 |   10 |      |     |
|  5 |      SORT GROUP BY                   |             |    2 |   10 |      |     |
|  6 |       PARTITION RANGE SINGLE         |             |  604 |   10 | KEY  | KEY |
|* 7 |        TABLE ACCESS BY LOC.IND.ROWID | MY_FACT     |  604 |   10 | KEY  | KEY |
|* 8 |         INDEX RANGE SCAN             | MY_FACT_IX1 | 1208 |    6 | KEY  | KEY |
|  9 |     PARTITION RANGE SINGLE           |             |    1 |    2 | KEY  | KEY |
|*10 |      INDEX RANGE SCAN                | MY_FACT_IX1 |    1 |    2 | KEY  | KEY |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
 2 - filter("BUSINESS_YEAR"=:B1)
 7 - filter("B"."BUSINESS_YEAR"=:B1)
 8 - access("B"."ENTITY_ID"=TO_NUMBER(:B2))
10 - access("ENTITY_ID"=TO_NUMBER(:B2) AND "VERSION"="VW_COL_1")

Figure 2 The Execution Plan (output modified to optimise presentation)


What is wrong with this execution plan? First an index range scan on one index partition of our fact table is performed to access all versions of the processed entity (see the access and filter predicates in lines 8 and 7 of the execution plan). Then, using a sort group by operation, the maximal version is found. The nested loops operation (line 3) is performed exactly once, because the version number is unique within each entity, so there exists only one record with the maximal version.

Why should it be a problem to access a thousand records via an index, sort them, and get the record with the maximal version? This execution plan should not cause a hanging job, even if applied in a loop to process a number of entities.

Well, explain plan apparently doesn’t provide the clue, so let’s actually execute the statement and observe the execution plan with the autotrace option. Of course, it is not possible to execute the statement without initialising the bind variables. We simply choose some random entity_id and use the current business year. We will see the importance of this choice of bind variables later on.



SQL> set autotrace traceonly;
SQL> --
SQL> variable ent_id number;
SQL> variable bus_year varchar2(10);
SQL> --
SQL> execute :ent_id := 4; :bus_year := '2006-01-01';

PL/SQL procedure successfully completed.

SQL> --
SQL> UPDATE MY_FACT A
  2  SET STATUS = 'O' WHERE ENTITY_ID = :ent_id AND BUSINESS_YEAR = TO_DATE(:bus_year,'YYYY-MM-DD') AND
  3  VERSION = (SELECT MAX(VERSION) FROM MY_FACT B
  4  WHERE A.ENTITY_ID = B.ENTITY_ID AND A.BUSINESS_YEAR = B.BUSINESS_YEAR)
  5  ;

1 row updated.

Execution Plan
----------------------------------------------------------
 0      UPDATE STATEMENT Optimizer=ALL_ROWS (Cost=16 Card=1 Bytes=52)
 1    0   UPDATE OF 'MY_FACT'
 2    1     TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'MY_FACT' (TABLE) (Cost=3 Card=1 Bytes=17)
 3    2       NESTED LOOPS (Cost=16 Card=1 Bytes=52)
 4    3         VIEW OF 'VW_SQ_1' (VIEW) (Cost=10 Card=2 Bytes=70)
 5    4           SORT (GROUP BY) (Cost=10 Card=2 Bytes=32)
 6    5             PARTITION RANGE (SINGLE) (Cost=10 Card=604 Bytes=9664)
 7    6               TABLE ACCESS (BY LOCAL INDEX ROWID) OF 'MY_FACT' (TABLE) (Cost=10 Card=604 Bytes=9664)
 8    7                 INDEX (RANGE SCAN) OF 'MY_FACT_IX1' (INDEX) (Cost=6 Card=1208)
 9    3         PARTITION RANGE (SINGLE) (Cost=2 Card=1)
10    9           INDEX (RANGE SCAN) OF 'MY_FACT_IX1' (INDEX) (Cost=2 Card=1)

Statistics
----------------------------------------------------------
220  recursive calls
  4  db block gets
125  consistent gets
  0  physical reads
 64  redo size
637  bytes sent via SQL*Net to client
830  bytes received via SQL*Net from client
  3  SQL*Net roundtrips to/from client
 11  sorts (memory)
  0  sorts (disk)
  1  rows processed

SQL> --
SQL> quit;

Figure 3 The Autotrace Run



Surprisingly, what we see is exactly the same execution plan as the one before. So again, even autotrace provides no hint of the root of the problem. Luckily, the next thought directs the troubleshooting in the right direction: the database version is 10g, so the system view v$sql_plan (introduced in 9i) shows the execution plan of the currently running statement. For further details on how to extract the execution plan directly from the view v$sql_plan, see [4].



-----------------------------------------------------------------------
| Id | Operation                  | Name    | Rows | Cost |Pstart|Pstop|
-----------------------------------------------------------------------
|  0 | UPDATE STATEMENT           |         |      |   15 |      |     |
|  1 |  UPDATE                    | MY_FACT |      |      |      |     |
|* 2 |   HASH JOIN                |         |    1 |   15 |      |     |
|  3 |    PARTITION RANGE SINGLE  |         |  998 |    7 | KEY  | KEY |
|* 4 |     TABLE ACCESS FULL      | MY_FACT |  998 |    7 | KEY  | KEY |
|  5 |    VIEW                    | VW_SQ_1 |  998 |    7 |      |     |
|  6 |     SORT GROUP BY          |         |  998 |    7 |      |     |
|  7 |      PARTITION RANGE SINGLE|         |  998 |    7 | KEY  | KEY |
|* 8 |       TABLE ACCESS FULL    | MY_FACT |  998 |    7 | KEY  | KEY |
-----------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("VERSION"="VW_COL_1" AND "A"."ENTITY_ID"="ENTITY_ID" AND "A"."BUSINESS_YEAR"="BUSINESS_YEAR")
4 - filter(("ENTITY_ID"=:B2 AND "BUSINESS_YEAR"=:B1))
8 - filter(("B"."ENTITY_ID"=:B2 AND "B"."BUSINESS_YEAR"=:B1))

Figure 4 The Real Execution Plan
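
The plan above was extracted from v$sql_plan; on 10g a convenient way to display it is dbms_xplan.display_cursor, sketched here (the sql_id is a placeholder; on 9i query v$sql_plan directly as described in [4]):

# display the actual cached plan of the running statement (10g only)
sqlplus -s "/ as sysdba" <<'EOF'
select * from table(dbms_xplan.display_cursor('abcd1234efgh5', 0, 'ALL'));
EOF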


The Cause

Apparently, the real execution plan is completely different from both previous plans generated by explain plan and autotrace. What is the cause of this difference? The bind variables contained in the SQL statement. In Oracle 9i a feature called "bind variable peek" was introduced. For a full explanation of this feature see [1] in the Reference Section. Here we give only a brief overview.

In some situations, when a statement is parsed Oracle peeks (hence the name of the feature) at the actual values of the bind variables and uses this information while optimising the execution plan. This is certainly important information that can significantly influence the decisions made by the optimiser.

Bind variable peeking is not always possible; a typical example of a situation where a peek is not performed is the parsing of a statement within the explain plan statement. The syntax of explain plan allows a query containing bind variables to be processed without actually defining those variables, so there is no chance to perform the peek. Neither the values nor the data types of those bind variables are known when the statement is parsed. The optimiser must use the default rules to calculate the selectivity of the predicates with bind variables. The result can be very different from the "real" execution plan based on the peeked values and known data types of the bind variables. This is why the execution plan generated by explain plan was so different from the actual one.

What was wrong with autotrace? A run with autotrace actually executes the statement, so the bind variables in this case must be well defined. Still, the presented execution plan is not the "real" executed plan: after the execution, an ordinary explain plan of the executed statement is performed and the resulting plan is sent to the output. There is no peeked information in that plan.

We see that in this respect there is no difference between explain plan and the autotrace option; both can produce misleading execution plans if bind variables are used.

The Trap

The plan switch dependent on the actual peeked value is not an unknown issue. It is well described for the case of a histogram defined on the predicate column. Of course, if a particular column contains skewed data, there is probably more than one method of accessing the table based on a predicate on this column (e.g. an index range scan and a full table scan). Depending on the actual value of the bind variable used the first time the statement is parsed, one of those access methods will be chosen. This decision will then be used in all subsequent executions of our statement; executions that potentially contain different values of the bind variables, making the formerly optimal plan rather suboptimal. At some point the cursor will be aged out of the shared pool and the game starts from the beginning. Again, the first parse wins and defines the execution plan for all its successors.

The Partition Peek

Less well known is the case in which the peek enriches a predicate relevant to partition pruning. Without the peek, such a predicate (e.g. partition_key_column = :bind_variable) leads to runtime pruning, marked as KEY-KEY in the execution plan, and the optimiser uses statistics from the table level. If a peek is performed, the situation is completely different. Using the peeked value, the exact partition is known (marked as N-N in the Pstart and Pstop columns of the execution plan, where N is the partition number). In the latter case, the optimiser statistics from the partition level are used. Depending on the actual peeked value, the statistics of that particular partition are used.

This is the last piece of the puzzle needed to solve the problem.

1. The explain plan statement (as no peek could be done) uses the table level statistics, producing an acceptable execution plan (see Figure 2).
2. Throughout the year 2005 the bind variable peek results in pruning to the 2005 partition (i.e. the peek returns the value 2005 for the business_year), which means that the optimiser statistics from the 2005 partition are used. This leads to the same execution plan as in point 1.
3. Shortly after the partition for the year 2006 is allocated, a situation may occur in which the bind variable peek hits the business year 2006. This leads to the use of statistics from the 2006 partition and, in the end, to a different execution plan. As the 2006 partition is nearly empty, a hash join with a full scan is considered the best access.
4. So far this is not bad, as this execution plan is the right one for the 2006 partition. The problem is that the same plan is used repeatedly for all subsequent executions, especially for those entities stored in the 2005 partition. This makes the processing appear to hang, as the partition for the year 2005 is heavily populated. A full partition scan performed in a loop is a deadly scenario (a sketch for inspecting the partition statistics follows this list).
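
The partition-level statistics behind this plan switch can be inspected directly; a small sketch (MY_FACT is the example table):

# compare the optimiser statistics of the MY_FACT partitions
sqlplus -s "/ as sysdba" <<'EOF'
select partition_name, num_rows, blocks, last_analyzed
from dba_tab_partitions
where table_name = 'MY_FACT'
order by partition_position;
EOF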

The Example

The whole story is demonstrated using an example. This example was tested in Oracle 9i and 10g (versions 9.2.0.5.0 and 10.2.0.1.0).

To create the environment use the script set_up.sql.

The batch behaviour is simulated in the script run.sql. Figure 5 shows the difference between a good and a bad run.



SQL> @run
SQL> -- run bind variable peek example
SQL> -- J.Nemec 2006
SQL> -- http://www.db-nemec.com
SQL> -- this script demonstrates an instability caused by bind
SQL> -- variable peeking.
SQL> -- Due to the peek, partition level statistics from a different
SQL> -- partition are used, leading to a dramatic plan change
SQL> --
SQL> -- Statement 1
SQL> --- peek decides to use statistics from the partition 2005 (good)
SQL> -- Statement 2
SQL> --- peek decides to use statistics from the partition 2006 (bad)
SQL> --
SQL> begin
  2  declare
  3  i number;
  4  cursor c is
  5  select entity_id, max(business_year) business_year
  6  from my_fact where entity_id between 1 and 1000
  7  group by entity_id order by entity_id;
  8  begin
  9  for x in c loop
 10  update my_fact a
 11  set status = 'D'
 12  where entity_id = x.entity_id and business_year = x.business_year
 13  and version = (select max(version) from my_fact b
 14  where a.entity_id = b.entity_id and
 15  a.business_year = b.business_year);
 16  dbms_output.put_line('updating ' || x.entity_id ||
 17  ' ' || x.business_year);
 18  end loop;
 19  end;
 20  end;
 21  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:09.86

SQL> -- now define the loop with the cursor returning the date - 2006 first
SQL> -- note that the statement must differ from the previous one
SQL> -- to force the statement to be parsed with a bind variable peek
SQL> begin
  2  declare
  3  i number;
  4  cursor c is
  5  select entity_id, max(business_year) business_year
  6  from my_fact where entity_id between 4 and 1000
  7  group by entity_id order by entity_id;
  8  begin
  9  for x in c loop
 10  update my_fact a
 11  set status = 'O' --- status was changed
 12  where entity_id = x.entity_id and business_year = x.business_year
 13  and version = (select max(version) from my_fact b
 14  where a.entity_id = b.entity_id and
 15  a.business_year = b.business_year);
 16  dbms_output.put_line('updating ' || x.entity_id ||
 17  ' ' || x.business_year);
 18  end loop;
 19  end;
 20  end;
 21  /

PL/SQL procedure successfully completed.

Elapsed: 00:27:24.03

Figure 5 Two completely different executions of the same statement

Figure 5 Two completely different executions of the same statement



Updating one thousand records in the fact table takes approximately ten seconds (see the first run). The second run, differing only in the fact that the first update peeks into the new partition, needs nearly half an hour to complete. This degradation factor is the reason why the job seems to be hanging.

To get an overview of the engaged execution plans use the script generate_plan.sql.


The Peek

To make the story even worse, the value that would be used in a bind variable peek isn’t the only important thing to check. Even more important is whether the peek will be performed at all. In other words, not all database clients perform the bind variable peek. For example, when dbms_sql is used to parse a statement, no bind variable peek is performed (by design). Other clients may omit the peek because of bugs – see [2].

How can we see if a bind variable peek was performed? The best way is to use a 10053 trace while parsing the statement. Figure 6 shows the relevant part of the trace file, which contains the evidence that a peek was done.



*******************************************
Peeked values of the binds in SQL statement
*******************************************
kkscoacd
Bind#0
oacdty=02 mxl=22(21) mxlc=00 mal=00 scl=00 pre=00
oacflg=13 fl2=206001 frm=00 csi=00 siz=24 off=0
kxsbbbfp=07ac1c54 bln=22 avl=02 flg=09
value=1
Bind#1
oacdty=12 mxl=07(07) mxlc=00 mal=00 scl=00 pre=00
oacflg=13 fl2=206001 frm=00 csi=00 siz=8 off=0
kxsbbbfp=07ac455c bln=07 avl=07 flg=09
value="1/1/2005 0:0:0"

Figure 6 – bind variable peek in a 10053 trace file
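
A sketch of producing such a trace for the update from Figure 1 (the bind values are examples; the statement must be hard parsed for the trace, which appears in user_dump_dest, to be written):

# generate a 10053 optimiser trace while the statement is parsed
sqlplus -s "/ as sysdba" <<'EOF'
alter session set events '10053 trace name context forever, level 1';
variable b2 number
variable b1 varchar2(10)
exec :b2 := 1; :b1 := '2005-01-01';
update my_fact a set status = 'O'
where entity_id = :b2
and business_year = to_date(:b1, 'YYYY-MM-DD')
and version = (select max(version) from my_fact b
where a.entity_id = b.entity_id
and a.business_year = b.business_year);
rollback;
alter session set events '10053 trace name context off';
EOF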



A particularly interesting question is how to get the correct execution plan of a statement with bind variables without executing the statement (this could be desired for long running statements). An answer (for select statements) can be found in [3] in the Reference Section.

Conclusion

In addition to the well-known case of a predicate on a column with a histogram, there exists at least one other case in which the execution plan may become unstable because of bind variable peeking: partition pruning based on a predicate with bind variables. This is a particularly important scenario for range partitioned fact tables organized as rolling windows. These tables contain at least two types of partitions: those already filled and those pre-allocated for future loads. As the optimiser statistics of the two types are very different, the risk of getting the wrong execution plan by peeking “into the wrong partition” is relatively high. The interesting thing is that the problem can be quasi pre-programmed, as the allocation of new partitions is time-dependent.



References

[1] Bind variable peek http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14211/optimops.htm#sthref1254


[2] Thin driver doesn’t peek
http://asktom.oracle.com/pls/ask/f?p=4950:8:::::F4950_P8_DISPLAYID:19398056075583#48426602526853


[3] Generation of an explain plan without actually executing the statement. http://www.jlcomp.demon.co.uk/faq/how_to_explain.html


[4] Using V$SQL_PLAN to extract the execution plan http://www.oracleadvice.com/Tips/plantip.htm

Jaromir D.B. Nemec is a freelancer specializing in data warehouse and integration solutions. He can be reached at http://www.db-nemec.com

Example

The attachment contains the following scripts:

set_up.sql – script to set up the environment

run.sql – script simulating the batch job

generate_plan.sql – script generating the involved execution plans


Last revision 14.7.2006



[1] Note that the example is strongly simplified to demonstrate the core principles clearly. The troubleshooting steps are also artificially chosen to demonstrate the pitfalls.

Friday, January 26, 2007

Limit of 1050 columns for ANSI joins


Please reference:

http://www.halfcooked.com/mt/archives/000936.html

There's a MetaLink bug, "4369235 - ANSI joins may fail with ORA-1445", which states that this was fixed in the 10.2.0.2 patchset and that the workaround is to "Transform the ansi join queries manually".