Troubleshooting ORA-1555 (Doc ID 1307334.1)



Concepts/Definitions

An ORA-1555 error occurs when a query is unable to access enough undo to build
a copy of the data as it existed at the time the query started. Committed “versions” of blocks are
maintained along with newer uncommitted “versions” of those blocks so that queries can
access data as it existed in the database when the query began. These are referred to as
“consistent read” blocks and are maintained using Oracle undo management.

See Note 40689.1 - ORA-1555 "Snapshot too old" - Detailed Explanation for more about
these errors.

Diagnosing

Due to space limitations, it is not always feasible to keep undo blocks on hand for the life of the instance. Oracle Automatic Undo Management (AUM) helps manage how long undo blocks are kept; this time frame is the “retention” time for those blocks.

There are several ways to investigate an ORA-1555 error. In most cases, the error is a legitimate problem reaching an undo block that has been overwritten because the undo “retention” period has passed.

AUM automatically tunes the “retention” period up and down, but space limitations or the configuration of the undo tablespace will often throttle back continuous increases to the “retention” period.
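
As a first check, you can review the current undo configuration with something like the following (adjust names to your environment):

    SQL> show parameter undo

    SQL> select tablespace_name, retention
           from dba_tablespaces
          where contents = 'UNDO';

The RETENTION column shows whether the undo tablespace is running with GUARANTEE or NOGUARANTEE.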

The error message is reported in the user session and often is not captured in the alert log. The user could see a message like

Using rollback segment functionality:
ORA-1555: snapshot too old (rollback segment too small)

or

Using AUM:
ORA-01555: snapshot too old: rollback segment number 9 with name "_SYSSMU9$" too small

If the error is captured in the alert.log, you would see something like

Tue May 26 16:16:57 2009
ORA-01555 caused by SQL statement below (SQL ID: 54yn3n36w24ft, Query Duration=922 sec, SCN: 0x0007.8a55f4e3)

Initial Investigation

Rollback Segments:
With Oracle 10g and later versions, you can still use a manual Rollback Segments configuration. ORA-1555 errors in that environment still follow the older guidelines described in

Note 10579.1 - How many Rollback Segments to Have
Note 107085.1 - Tuning Rollback Segments
Note 69464.1 - Rollback Segment Configuration & Tips

Automatic Undo Management:
The database is self-tuning for undo when using Automatic Undo Management. This does not eliminate ORA-1555 completely, but it does minimize ORA-1555 as long as there is adequate space in the undo tablespace and workloads tend to follow repeatable patterns. In some cases, with periodic changes to workload (large data updates, particularly with LOB data), the self-tuning of undo can become aggressive and lead to undo issues.

Note 461480.1 - FAQ Automatic Undo Management (AUM) / System Managed Undo (SMU)
Note 135053.1 - How to Create a Database with Automatic Undo Management
Note 268870.1 - How to Shrink the datafile of Undo Tablespace
Note 231776.1 - How to switch a Database from Automatic Undo Management (AUM) back to using Rollback Segments
Note 296863.1 - How to Keep All UNDO Segments from Being Offlined in Oracle 10g - Fast Ramp-Up

LOB Issues:
Out-of-row LOB undo is maintained in the LOB segment, so the UNDO tablespace and undo retention are not associated with most LOB ORA-1555 issues. Instead, the LOB column is created using either PCT_VERSION or RETENTION to manage how much space within blocks, or how much time, transpires before the LOB undo is overwritten. In environments with frequent updates and deletes on rows that include LOBs, the chance of ORA-1555 on LOB undo is very high.

PCT_VERSION and RETENTION are not auto-tuned. To “tune” those configuration settings, you must change the values for PCT_VERSION or RETENTION. Changes to UNDO_RETENTION do not change LOB retention time frames.
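
To see how each LOB column is configured, and to raise PCT_VERSION on a problem LOB, statements along these lines can be used (the owner, table, and column names below are examples only):

    SQL> select table_name, column_name, pctversion, retention
           from dba_lobs
          where owner = 'SCOTT';

    SQL> alter table scott.mytab modify lob (mylob) (pctversion 20);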

Note 162345.1 - LOBS - Storage, Read-consistency and Rollback
Note 386341.1 - How to determine the actual size of the LOB segments and how to free the deleted/unused space above/below the HWM
Note 563470.1 – Lob retention not changing when undo_retention is changed
Note 422826.1 – How to identify LOB Segment Use PCTVERSION or RETENTION from Data Dictionary

Error Tracing

Undo error tracing can be done for normal undo operations using the following events:

NOTE: Normal undo operations are indicated when the error message includes a segment name, for example

…. name "_SYSSMU1$" too small

If the error doesn’t show a segment name

… name "" too small

the problem is often related to LOB undo.

If using pfile:

event="10442 trace name context forever, level 10"

If using spfile:

Alter system set events '10442 trace name context forever, level 10';

Reproduce the ORA-1555 error and upload the trace file to Oracle Support.

LOB undo error tracing is more difficult. Set additional tracing events as follows:

Start Session 1
Alter session set events '10046 trace name context forever, level 12';
Reproduce the error
Exit Session 1

Start Session 2
Alter session set events '10051 trace name context forever, level 1';
Reproduce the error
Exit Session 2

Start Session 3
Alter session set events '1555 trace name errorstack forever, level 3';
Reproduce the error
Exit Session 3

Additional resources to review:
Note 846079.1 – LOBs and ORA-1555 troubleshooting
Note 253131.1 – Concurrent Writes May Corrupt LOB Segment When Using Auto Segment Space Management
Note 467872.1 – TROUBLESHOOTING GUIDE (TSG) – ORA-1555

V$UNDOSTAT Analysis

The V$UNDOSTAT view holds undo statistics in ten-minute intervals. The view
represents statistics per instance, so each row is a unique interval for that
instance, with its own begin time, end time, and statistic values.
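
A query along the following lines compares the tuned retention against the longest running query and shows how many ORA-1555 and out-of-space errors were recorded in each interval:

    SQL> select begin_time, end_time, tuned_undoretention,
                maxquerylen, ssolderrcnt, nospaceerrcnt
           from v$undostat
          order by begin_time;

Intervals where MAXQUERYLEN exceeds TUNED_UNDORETENTION and SSOLDERRCNT is non-zero point to a legitimate retention shortfall.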

Note that V$UNDOSTAT does not track undo related to LOBs.

Note 262066.1 – How To Size UNDO Tablespace For Automatic Undo Management
Note 1112363.1 – When Does Undo Used Space Become Available?
Note 240746.1 – 10g NEW FEATURE on AUTOMATIC UNDO RETENTION

Diagnostics Scripts

Refer to Note 746173.1 : Common Diagnostic Scripts for AUM problems
and Note 877613.1 : AUM Common Analysis/Diagnostic Scripts

Common Causes/Solutions

Using Rollback Segments functionality:

* The problem happens on the SYSTEM tablespace, which still uses the old Rollback Segment functionality even when the database is configured for Automatic Undo Management (AUM).

* There are not enough rollback segments to manage the undo needed for long running queries.

* Rollback Segments are too small and undo is overwritten before long running queries complete.

Reference:
Note 69464.1 – Rollback Segment Configuration & Tips
Note 10630.1 – ORA-1555: “Snapshot too old” – Overview
Note 862469.1 – ORA-604 & ORA-1555 Rollback Segment 0 with Name “System” Too Small

Using Automatic Undo Management (AUM):

* TUNED_UNDORETENTION in V$UNDOSTAT around the time of the error is lower than the QUERY DURATION indicated in the error message. This is a legitimate ORA-1555, and if queries are going to run for very long time frames, UNDO_RETENTION may need to be larger. Auto-tuned retention may not be able to keep up with the undo workload while staying within the space limitations of the UNDO tablespace.

* LOB updates and/or deletes are frequent and a higher PCT_VERSION is required to provide enough space in the LOB Segment to accommodate the LOB undo. RETENTION on LOBs that are updated or deleted frequently can run into problems holding UNDO long enough for queries.

* QUERY DURATION shown in the error message is 30+ years and therefore, no amount of undo will satisfy the consistent read blocks.

Note 750195.1 – ORA-1555 Shows Unrealistic Query Duration (billions of seconds)

* QUERY DURATION shown in the error message is 0. NOTE: This has been filed as a bug on many release levels and has been very difficult to narrow down to a specific problem.

Note 761128.1 – ORA-1555 Error when Query Duration as 0 Seconds 

* QUERY DURATION is lower than TUNED_UNDORETENTION. Undo header information can sometimes get overwritten, or you could be seeing a bug.

* TUNED_UNDORETENTION stays very high and the UNDO tablespace continues to grow, or space errors occur.

Note 1112431.1 – Undo Remains Unexpired When Using Non-autoextensible Datafiles for Undo Tablespace.

================================================================================

Rollback Segment Configuration & Tips (Doc ID 69464.1)
======================================================

Good rollback segment configuration is crucial to a well tuned Oracle database.
The following should help you recognize and solve problems arising from 
inappropriate numbers or sizes of rollback segments. 

This bulletin will answer the following questions for Oracle databases prior to 
use of Automatic Undo Management (AUM) with Oracle9i and Oracle10g:

     -What is a Rollback Segment?

     -How many concurrent transactions can a rollback segment handle?

     -How do you find out the max extents for a rollback segment?

     -How do you determine which user is using what rollback segment?

     -How does Oracle determine which rollback segment to use?

     -Why the ORA-1555 snapshot too old problem?

     -How many rollback segments do you need to have?

     -How do you look for rollback segment contention?

     -How do you decide the size of your rollback segments?

     -What are the guidelines on setting the OPTIMAL parameter for rollback
      segments?

     -How do you allocate a rollback segment for a transaction?

     -How do you bring rollback segments on line automatically?


What is a Rollback Segment ?
============================

A rollback segment is made up of multiple extents that consist of several 
'rollback entries', which keep undo information for transactions. Rollback 
segments are needed for:

   1. Read-consistent database information.
   2. Database recovery.
   3. Rolling back uncommitted transactions.


How many transactions can a rollback segment handle concurrently ?
==================================================================

The maximum number of transactions allowed for a rollback segment is dependent 
on the 'transaction table slots' found in the header of the rollback segment. 
The first block of the rollback segment is called the segment header. The 
segment header contains a structure called a transaction table. The transaction
table contains slots (or rows) which keep track of all active transactions 
within that rollback segment. Each active transaction assigned to this rollback 
segment will also have a corresponding slot or row assigned to it in the
transaction table. Thus, with a larger Oracle block size there is more room 
for a larger transaction table; in other words, more slots or rows can be 
inserted into the transaction table. The extent map is also kept in the 
segment header.
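
The number of active transactions currently holding a transaction table slot in each rollback segment can be seen with a query such as:

    SQL> select n.name, s.xacts
           from v$rollstat s, v$rollname n
          where s.usn = n.usn;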

 * Before 7.3

For example, on Solaris:


TRANSACTIONS          BLOCKSIZE
------------          ---------
    32                    2K
    69                    4K
   142                    8K
 
 * For 7.3

    Also for Solaris :

TRANSACTIONS          BLOCKSIZE
------------          ---------
    31                    2K
    67                    4K
   140                    8K


How do you find out the maximum extents for a rollback segment?
===============================================================

Run the following query:

        SQL> select SEGMENT_NAME, STATUS, MAX_EXTENTS
                from dba_rollback_segs;

This will tell you the MAX_EXTENTS that the rollback segment was created with; 
however, the first block of each segment contains an extent map for each extent. 
The maximum number of extents allowed is therefore a function of the database 
block size and the size of each extent map entry. This is a common extent
mapping table:

DATA BLOCK SIZE     GREATEST MAXEXTENT VALUE
---------------     ------------------------
512                           25
1K                            57
2K                            121
4K                            249
8K                            505

Therefore, the MAX_EXTENTS value in "dba_rollback_segs" may not be accurate, 
because MAX_EXTENTS cannot exceed the GREATEST MAXEXTENT VALUE.


How to determine which user is using what rollback segment?
===========================================================

SELECT   r.name "RB NAME ", p.pid "ORACLE PID",
         p.spid "SYSTEM PID ", NVL (p.username, 'NO TRANSACTION') "OS USER", 
         p.terminal
FROM v$lock l, v$process p, v$rollname r, v$session s
WHERE    l.sid = s.sid(+)
AND      s.paddr = p.addr
AND      TRUNC (l.id1(+)/65536) = r.usn
AND      l.type(+) = 'TX'
AND      l.lmode(+) = 6
ORDER BY r.name;


How does Oracle determine which rollback segment to use?
========================================================

The rules are:

   1. Always assign to the rollback segment which has the least number of 
      active transactions.

   2. If two or more rollback segments have the same "least number of active 
      txns", then assign to the one after the last one used. This ensures 
      that undo is kept for a longer time.


Why the ORA-1555 snapshot too old problem?
==========================================

  See Note.40689.1: CAUSES AND SOLUTIONS TO ORA-1555 "SNAPSHOT TOO OLD".
  See Note.45895.1: ORA-1555 "SNAPSHOT TOO OLD" in Very Large Databases (VLDB).


How MANY rollback segments do you need to have?
===============================================

Oracle7 keeps a transaction table in the header of every rollback segment. 
Every transaction must have update access to the transaction table for its 
rollback segment. You need enough rollback segments to prevent transactions 
contending for the transaction table.


How do you find out about transaction table contention?
========================================================

Any non-zero value for 'undo header' in the CLASS column of "v$waitstat" 
indicates contention for rollback segment header blocks.

Example:

SVRMGR> select * from v$waitstat;

CLASS                   COUNT         TIME
------------------   ----------    ----------
data block               0              0
sort block               0              0
save undo block          0              0
segment header           0              0
save undo header         0              0
free list                0              0
system undo header       0              0
system undo block        0              0
undo header              0              0
undo block               0              0


Note that the 'undo header' value is zero, hence NO contention.
Another way to find out is by running the following query; a non-zero 
value for the WAITS column indicates rollback segment contention.


SVRMGR> select name, waits
     2> from v$rollstat s, v$rollname n
     3> where s.usn=n.usn;


NAME                           WAITS
------------------------------ ----------
SYSTEM                            0
R01                               0
R02                               0
R03                               0


To calculate the number of rollback segments, you need to know how many 
transactions are likely to be active at any given time. This depends on what 
users are doing. Note that queries do not need transaction table access, so not 
all active users will have active transactions (OLTP applications tend to have 
many short transactions).

General recommendation for how many rollback segments:

  For OLTP : One rollback segment for every ten users.
  For BATCH jobs : One rollback segment for each concurrent job.
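
To sanity-check these numbers against the real workload, you can sample how many transactions are active at a point in time, for example:

    SQL> select count(*) "ACTIVE TXNS" from v$transaction;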


How do you decide what SIZE your rollback segments should be?
=============================================================

There are two issues that need to be considered when deciding if your segment 
is large enough. First, you want to make sure that the transactions will not 
cause the head of the rollback segment to wrap around too fast and catch the 
tail. This causes the segment to extend in size. Second, if you have long 
running queries that access data that frequently changes, you want to make sure 
that the rollback segment doesn't wrap around and prevent the construction of a 
read consistent view (look at "Why the ORA-1555 snapshot too old problem?" 
above).


Determining the proper rollback segment size:
---------------------------------------------

The size needed for a rollback segment depends directly on the transaction 
activity of your database. You need to be concerned about the activity during 
normal processing of the database, not with rare or semi-frequent large 
transactions. We will deal with these special cases separately.


Same size extents:
------------------

For sizing rollback segment extents, Oracle strongly recommends that each 
extent be of the same size.


INITIAL extent size:
--------------------

Choose the INITIAL storage parameter from the list 2KB, 4KB, 8KB, 16KB, 32KB, 
etc. This will ensure that when you drop the extent you can reuse all the 
freed space without waste.


NEXT extent size:
-----------------

Use the same value for NEXT as INITIAL.


MINEXTENTS:
-----------

Set MINEXTENTS to 20. This will make it unlikely that the rollback segment 
needs to grab another extent because the extent it should move into is still 
being used by an active transaction.
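
Putting the INITIAL, NEXT, and MINEXTENTS guidelines together, a rollback segment for testing might be created like this (the segment and tablespace names and the sizes are examples only):

    SQL> create rollback segment r01
            tablespace rbs
            storage (initial 256K next 256K minextents 20);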

To find out the size of the rollback segments needed to handle normal 
processing on the database, you need to do some testing. A good test is to 
start with small rollback segments and allow your application to force them 
to extend. Here are the steps to run such a test:

   1. Create a rollback segment tablespace.

   2. Select a number of rollback segments to test and create them in the 
      tablespace.

   3. Create the rollback segments so that all extents are the same size. 
      Choose an extent size that you suspect will need between 10 and 30 
      extents when the segments grow to full size.

   4. Each rollback segment should start with two extents before the test is 
      done. This is the minimum number of extents any rollback segment can have.

   5. Activate only the rollback segments that you are testing by making the 
      status "online". The only other segment that should be "online" is the 
      system rollback segment.

   6. Run transactions and load data typical of the application.

   7. Watch for rollback segment contention (see the queries above for how to detect it).

   8. Watch for the maximum size a rollback extends to.

The maximum size any one of the rollback segments reaches during the test is 
the size you want to use when configuring. We will call this size the "minimum 
coverage size." If you see rollback contention, adjust the number of the 
rollback segments (increase) and rerun the test. Also, if the largest segment 
requires fewer than 10 or more than 30 extents, it is a good idea to lower or 
raise the extent size, respectively, and rerun the test.
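
During such a test, the current size, high-water size, extent count, and waits for each segment can be monitored with:

    SQL> select n.name, s.rssize, s.hwmsize, s.extents, s.waits
           from v$rollstat s, v$rollname n
          where s.usn = n.usn;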


Sizing rollback segments for STEADY AVERAGE transaction rate:
-------------------------------------------------------------

For databases where the transaction rate base has NO fluctuation, there is a 
straightforward way to configure the tablespace:

Create a tablespace that will fit your calculated number of  rollback 
segments with the "minimum coverage size" you have determined. Follow the 
guidelines above for INITIAL and NEXT extents.

As a safety net, allocate some additional space in the tablespace to allow 
segments to grow if necessary. If you elect to do this, use the OPTIMAL 
feature to force the rollback segments to free up any additional space they 
allocate beyond their determined size requirement.


Sizing rollback segments for FREQUENT LARGE transaction rate:
-------------------------------------------------------------

A large transaction is one for which there is not enough space to create all 
rollback segments large enough to hold its rollback information. 
Since we can't depend on the segment shrinking in time to allow repeated large 
transactions, OPTIMAL is not really an option for this environment. There are 
basically two options that you can choose from for your rollback segment 
tablespace.

A) Reduce the number of segments so that all are large enough to hold the 
   largest transactions. This option will introduce contention and will cause 
   some degradation in performance. It is a reasonable choice if performance is
   not extremely critical.

B) Build one or more large rollback segments and make sure that large 
   transactions use these segments. The SET TRANSACTION USE ROLLBACK SEGMENT 
   command is necessary to control the placement of these large transactions. 
   This option is difficult to implement if large transactions are being run 
   with ad hoc queries and there is no systematic control of large transactions. 
   This option is recommended in an environment where the large transactions 
   are issued from a controlled environment. In other words, an application 
   which will set the transaction to the appropriate rollback segment.


Sizing rollback segments for INFREQUENT LARGE transaction rate:
---------------------------------------------------------------

Use the OPTIMAL feature to set up a flexible rollback segment scheme, one 
in which you are not concerned about which rollback segment the large 
transaction falls upon. The key is to leave enough free space in the rollback 
tablespace that the largest transaction's rollback information can fit entirely 
into it. To do this, create the rollback tablespace with the space needed for 
your calculated number of segments and their "minimum coverage size" plus this 
additional space. Set the OPTIMAL for each segment equal to the minimum 
coverage size.

What you will see is that the large transaction will randomly make one of the 
segments grow and consume the free space, but the segment will release the 
space before the next large transaction comes along. Note that you are 
sacrificing some performance for this flexibility.


What are the guidelines on setting the OPTIMAL parameter for rollback segments?
===============================================================================

When you create or alter a rollback segment, you can use the storage 
parameter OPTIMAL, which applies only to rollback segments, to specify the 
optimal size of the rollback segment in bytes. You should carefully assess the 
kind of transactions the system runs when setting the OPTIMAL parameter for 
each rollback segment.

For a system that executes long running transactions frequently, OPTIMAL 
should be large so that Oracle does not have to shrink and allocate extents 
frequently. Also, for a system that executes long queries on active database, 
OPTIMAL should be large to avoid "snapshot too old" ORA-1555 errors.

OPTIMAL should be smaller for a system that mainly executes short transactions 
and queries so that the rollback segments remain small enough to be cached in  
memory, thus improving system performance. You should not make OPTIMAL smaller 
than the "minimum coverage size". Otherwise, performance will suffer due to 
excessive segment resizing.
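
OPTIMAL can be set at creation time or changed later with ALTER; for example (the size shown is illustrative only):

    SQL> alter rollback segment r01
            storage (optimal 4M);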


How do you allocate a rollback segment for a transaction?
=========================================================

Oracle assigns a transaction to a rollback segment using simple rules. You 
can issue a SET TRANSACTION statement with the USE ROLLBACK SEGMENT clause to 
choose a specific rollback segment for your transaction.

Example : SET TRANSACTION USE ROLLBACK SEGMENT large_rs1;

To assign a transaction to a rollback segment explicitly, the rollback 
segment must be online and the SET TRANSACTION ... statement must be the first 
statement of the transaction.

After the transaction is committed, Oracle will automatically assign the 
next transaction to any available rollback segment unless the new transaction 
is explicitly assigned to a specific rollback segment by the user.


How do you bring rollback segments on line automatically?
=========================================================

Private rollback segments could be brought online automatically at database 
startup only if they are listed in the init.ora initialization parameter 
rollback_segments.

The number of public rollback segments that will be brought online at database 
startup depends on the values of the initialization parameters transactions 
and transactions_per_rollback_segment. The minimum number of public rollback 
segments acquired by the instance will be transactions/transactions_per_rollback_segment.

Public rollback segments can also be brought online by specifying them in the 
rollback_segments parameter.
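
For example, with the following init.ora settings (the values are illustrative only), the instance would acquire at least 100/20 = 5 public rollback segments at startup:

    transactions = 100
    transactions_per_rollback_segment = 20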


=================================================================================

ORA-01555 "Snapshot too old" in Very Large Databases (if using Rollback Segments) (Doc ID 45895.1)


The following causes are discussed:

Cause  Description
-----  -----------
  1    A long query takes place at the same time as an update
  2    Read consistent view during "insert"
  3    Delayed block cleanout on old committed updates
  4    Fetch across commit
  5    Rollback segment corruption


Cause #1:  
=========  
        
A long query takes place at the same time as an update.  When a query reads 
a block that has been updated or committed since the query began, it attempts 
to construct a "read consistent view" of the block by fetching the old value 
from the rollback segment.  The intention is to get a "snapshot" of the 
database as of a single point in time - the time the query started. However,
if the transaction commits, the rollback segment slot containing the old value 
will be freed for reuse.  If the query is very "long", as is frequently the  
case in VLDB (Very Large Databases), there may be enough time for some other 
transaction to overwrite the old value in the slot (because rollback segments 
are circular).  Hence, when we revisit the slot to retrieve the old value,  
we find that the read consistent "snapshot" we are trying to construct is 
"too old", so we get an ORA-01555.  
  
    Solution 1a:
    ------------
    In VLDB, the first and most important solution is to prevent updates 
    from occurring simultaneously with long queries on the same tables. 
    Such "workload partitioning" (separating read-only DSS operations 
    from read-write OLTP operations) is a fundamental principle of data 
    warehousing regardless of the database software.  Partitioning can be  
    physical, by separating data into "read-only" and "read-write"  
    tablespaces, or temporal, by scheduling updates and queries on the 
    same table at different times.  
    Recommended reading:
    "The Data Warehouse Toolkit", by Ralph Kimball, John Wiley & Sons, 1996.  
  
  
    Solution 1b:
    ------------
    Increase the rollback segment(s).  Larger segments allow more time for
    undo writes to wrap around, decreasing the possibility of an overwrite
    during a query. This is probably the easiest solution to an ORA-01555, 
    but it can be expensive in VLDB.  It is especially inappropriate when
    the volume of data being updated is low relative to the amount of data 
    being queried from the same table.  
  
  
    Solution 1c:
    ------------
    Add more rollback segments.  Additional rollback segments can help 
    because each new transaction has a bigger "pool" of rollback segments
    to choose from, diluting the odds that committed values will be 
    overwritten by new undo. This approach can be effective in VLDB if 
    some control is exercised over which rollback segments are used for 
    which transactions (which is another way to partition work).  It can 
    be done with SET TRANSACTION USE ROLLBACK SEGMENT or with Oracle 
    Parallel Server (OPS). See Solutions 3a and 3b for more information.  
  
  
    Solution 1d:
    ------------
    Don't use the OPTIMAL storage parameter in the rollback segment.  
    Avoiding use of OPTIMAL means that rollback segment extents are not 
    returned to the tablespace after use (and thus remain usable for read
    consistent view).  This is yet a third way to hold onto the committed
    values in the rollback segments as long as possible.  
  
  
    Solution 1e:
    ------------
    Delay commits.  As long as an update is not committed, the rollback
    segment slot containing the old value is not released and therefore 
    cannot be overwritten.  This will work in VLDB if the application 
    can enforce delayed commits and the number of updates is relatively 
    low (or else you might run out of rollback space).  
  
  
    Solution 1f:
    ------------
    Write shorter well-tuned, parallel queries.  Tune the database.  
    Utilize Oracle's data warehousing features such as partition views 
    and disk affinity.  Utilize Oracle Parallel Server (OPS) for internode
    query parallelism.  The point being: The less time it takes for a 
    query to complete, the lesser the chance an old value in the rollback
    segment will be overwritten during the query.  On the other hand, if 
    queries, loads and updates can be tuned well enough to outrun an 
    ORA-01555 error, such that updates can run simultaneously with queries,
    then perhaps these operations can be tuned well enough to implement  
    temporal partitioning such that updates "don't have to" run  
    simultaneously with queries.  
  
  
    Solution 1g:
    ------------
    Use read-only snapshots.  This solution may suffice in cases where a
    table is not too large to support a second copy of the data:  the 
    snapshot.  Here, the original table accepts ad hoc updates but not 
    queries while the snapshot accepts only queries.  At a chosen interval,
    all updates against the original table are automatically propagated  
    to the snapshot. For VLDB, this can be a nice solution for relatively
    smaller tables which might be joined with very large tables.  
  
  
Cause #2:
=========  
  
Read consistent view during "insert".  Read consistency is usually 
unnecessary when querying a table into which rows are simultaneously being
inserted.  When performing a full table scan, Oracle always scans up to 
whatever the "High Water Mark" (HWM) was when the query began, so any rows
inserted above the HWM are simply ignored by the query.  But inserts can 
still be the cause of an ORA-01555 if rows are inserted under the HWM. This
is especially true in VLDB and OPS where it is common practice to 
pre-allocate extents to tables. Pre-allocation of extents, as well as large
deletes, can leave considerable free space under the HWM.  Any new rows 
inserted into these free blocks after the query begins will be read by the
query and consequently require read consistent view.  
  
All solutions discussed in Cause #1 also apply here.  In addition, the 
following solutions can reduce or eliminate read consistent view during 
inserts:  
  
  
    Solution 2a:
    ------------
    Again, partition the workload so that queries do not run against 
    tables simultaneously with inserts.  Short PARALLEL loads can 
    minimize the amount of time a table is unavailable for queries.  
  
  
    Solution 2b:
    ------------
    Don't pre-allocate extents, or pre-allocate only before a large 
    insert operation and only as much space as will be used immediately. 
    In OPS, consider using the "!" option with GC_FILES_TO_LOCKS to 
    dynamically allocate blocks to the free list as needed.  
  
  
    Solution 2c:
    ------------
    Don't "delete *" from a table.  Use "truncate" instead, which frees
    all blocks and flattens the HWM.  In general, be aware that large 
    deletes leave gaps under the HWM that can invite ORA-01555 errors.  
  
  
    Solution 2d:
    ------------
    Use SQL*Loader direct path, which writes above the HWM.  
  
  
Cause #3:
=========  
  
Delayed block cleanout on old committed updates.  An update operation 
completes and commits; the updated blocks are not touched again until a 
long-running query begins, so Delayed Block Cleanout (DBC) has never been 
done on the blocks.  Under specific circumstances in VLDB, this can cause 
ORA-01555 errors even when NO updates or inserts are being committed on 
the same blocks the query is retrieving.  
  
All of the following must be true for an ORA-01555 to occur in this case:  
  
    (i) An update completes and commits and the blocks are not  
    touched again until...  
  
    (ii) A long query begins against the previously updated blocks.  
  
    (iii) During the query, a considerable amount of DML takes place, 
    though not on the previously updated blocks which the query is 
    currently fetching.  
  
    (iv) Under condition (iii) there is so much DML relative to available 
    rollback space that the rollback segment used in the first update 
    wraps around, probably several times.  
  
    (v) Under condition (iv), the commit SCN of the first update is 
    cycled out of the rollback segment.  
  
    (vi) Under condition (iv) the lowest SCN in the rollback segment is 
    pushed higher than the read consistent SCN in the query.   
  
    (Note:  The read consistent SCN is what the query uses to construct 
     a read consistent view.  Any block which has an SCN higher than this
     was obviously updated after the query started and requires rollback).  
  
The above conditions imply that when a query reaches a block that has been 
updated but not cleaned out, the query quickly learns that the update 
committed, and accordingly cleans out the block.  But because the update 
SCN is no longer in the rollback segment (condition (v)), the query doesn't
know WHEN the update committed.  This is important because if the commit 
happened before the query began, the current value in the block can be used 
by the query; but if the commit happened after, the old value must be fetched
from the rollback segment. Now, because the rollback segment wrapped in (iv),
we know that the update SCN can't be higher than the lowest SCN in the 
rollback segment, which gives us a nice upper bound.  If we only knew that 
the read consistent SCN was higher than this upper bound, we would know that
the update committed before the query started.  But we don't know this 
because of condition (vi), so we can't even accurately "estimate" the update
SCN.  Hence, we get an ORA-01555.  
  
Most of the rollback space solutions discussed in #1 also apply here, although
generally speaking they will be more difficult to implement successfully.  
Unless it is clear that the available rollback space was grossly underestimated
relative to normal DML activity levels, it is best not to attempt to resolve
ORA-01555 on DBC issues in VLDB by increasing the size or number of rollback 
segments. The following solutions are recommended:  
  
  
    Solution 3a: 
    ------------
    If the above described conditions occur regularly, consider 
    implementing OPS and partitioning off excessive DML to separate 
    instances.  Since each instance writes to its own set of rollback 
    segments, you can avoid overwriting committed undo that queries on 
    other instances still need.  Careful 
    planning and partitioning are critical for success and to avoid 
    introducing pinging costs.  A similar and perhaps simpler option 
    is to partition some of the DML onto a completely separate database.  
  
  
    Solution 3b:
    ------------
    Judicious use of SET TRANSACTION USE ROLLBACK SEGMENT in batch 
    processing can partition updates as effectively as 3a.  
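A minimal sketch of this technique, assuming a dedicated rollback segment named rbs_batch has already been created and brought online (table and column names hypothetical):

```sql
-- Bind this transaction's undo to a dedicated rollback segment so a
-- large batch update does not cycle the segments used by other work.
SET TRANSACTION USE ROLLBACK SEGMENT rbs_batch;

UPDATE accounts
   SET balance = balance * 1.01;

COMMIT;  -- the binding lasts only until the transaction ends
```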
  
  
    Solution 3c:
    ------------
    As of 7.3, set DELAYED_LOGGING_BLOCK_CLEANOUTS = true (the default)
    and commit more frequently during updates. This feature was added 
    in 7.3 to boost Oracle Parallel Server performance because it helps
    avoid pinging on DBC.  It also helps avoid ORA-01555 errors because
    it causes blocks in the buffer cache to be cleaned out immediately 
    after a commit.  Setting this parameter to false facilitates the 
    same behavior as in previous versions of Oracle.  
  
  
    Solution 3d:
    ------------
    Prior to 7.3, run full table scans and analyze indexes immediately 
    after a large update (or at least scan through the updated/inserted
    blocks).  This also forces immediate cleanout.  
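One way to touch every block and force the cleanout is a hinted full scan immediately after the load (table name hypothetical):

```sql
-- Visiting each block performs the delayed cleanout right away,
-- while the commit SCN is still cheap to look up.
SELECT /*+ FULL(t) */ COUNT(*)
  FROM big_table t;
```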
  
  
  
Cause #4:
=========  
  
"Fetch across commit".  This is where a program performs a looping operation
involving repeated reads of a datum which it is also changing and committing.
The ORA-01555 is signaled because each read must be consistent with the 
read-consistent SCN from the time the cursor was opened, and the undo for 
the session's own committed changes is eventually overwritten.  
  
"Fetch across commit" is not supported by the ANSI standard. Oracle allows it,
but it is the DBA's/application developer's responsibility to understand the
implications.  There are no special considerations for VLDB; this section is
included only for completeness.  More extensive discussions of this problem,
along with detailed solutions, can be found in the articles referenced below.  
  
  
    Solution 4a:
    ------------
    Rewrite the program to avoid fetch across commit.  
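One hedged sketch of such a rewrite in PL/SQL: instead of committing inside a loop over an open cursor, do the work in self-contained batches so that no fetch ever crosses a commit (table and column names are hypothetical):

```sql
DECLARE
  l_done BOOLEAN := FALSE;
BEGIN
  WHILE NOT l_done LOOP
    -- Each iteration is a fresh statement; nothing is fetched from a
    -- cursor opened before the previous commit.
    UPDATE orders
       SET processed = 'Y'
     WHERE processed = 'N'
       AND ROWNUM <= 1000;          -- one batch per transaction
    l_done := SQL%ROWCOUNT = 0;
    COMMIT;                         -- safe: no open cursor spans this commit
  END LOOP;
END;
/
```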
  
  
    Solution 4b:
    ------------
    Commit less often.  
  
  
    Solution 4c:
    ------------
    Add an "order by 1" clause to the query.  This creates a temp 
    segment to hold the results of the order by and will not require 
    consistent reads of the base table during later fetches.  
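For example (table and column names hypothetical):

```sql
-- The ORDER BY forces the full result set to be materialized in a
-- temporary segment when the cursor is opened; later fetches read
-- the temp segment instead of revisiting the base table blocks.
SELECT order_id, status
  FROM orders
 ORDER BY 1;
```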
  
  
Cause #5:  
=========
  
  
Rollback segment corruption.  This is a very rare cause of an ORA-01555.  
There are no VLDB-specific considerations, although some older operating 
systems may have problems managing extremely large files reliably, which 
can lead to corruption.  
  
  
    Solution 5a:
    ------------
    Check for undocumented parameters in the init.ora, particularly those
    that include the word CORRUPT, and report them to Oracle Support 
    Services.  If such a parameter exists, the "solution" will be to 
    recreate the database.  
  
  
    Solution 5b:
    ------------
    Make sure you're running a certified system configuration for your 
    release of Oracle.  Check system logs and Oracle logs for errors.  
  
  
    Solution 5c:
    ------------
    Contact Oracle Support Services for further assistance on suspected 
    corruption after all possible causes discussed in this paper have 
    been rigorously investigated.  
  


Additional Notes:
=================
=================================================================================
The ORA-01555 is caused by the Oracle read consistency mechanism.  If you have a long-running SQL statement that starts at 10:30 AM, Oracle ensures that all rows are as they appeared at 10:30 AM, even if the query runs until noon.
Oracle does this by reading the "before image" of changed rows from the online undo segments.  If you have lots of updates, long-running SQL, and a too-small UNDO tablespace, the ORA-01555 error will appear.
From the docs we see that the ORA-01555 error relates to insufficient undo storage or a too-small value for the undo_retention parameter:
ORA-01555: snapshot too old: rollback segment number string with name "string" too small.  Cause: Rollback records needed by a reader for consistent read are overwritten by other writers.  Action: If in Automatic Undo Management mode, increase the setting of UNDO_RETENTION.  Otherwise, use larger rollback segments.
You can get an ORA-01555 error with a too-small undo_retention, even with a large undo tablespace.  Conversely, you can set a very high value for undo_retention and still get an ORA-01555 error.  Also see the notes below on commit frequency and the ORA-01555 error.
The ORA-01555 snapshot too old error can be addressed by several remedies:
  1. Re-schedule long-running queries for times when the system has less DML load.
  2. Increase the size of your rollback segment (undo) allocation.  The ORA-01555 snapshot too old error also relates to your setting for automatic undo retention.
  3. Don't fetch between commits.
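Under AUM, remedy 2 usually amounts to raising UNDO_RETENTION and then verifying how the instance is actually behaving; a sketch (the value 3600 is illustrative):

```sql
-- Ask Oracle to try to keep committed undo for at least one hour.
ALTER SYSTEM SET UNDO_RETENTION = 3600;

-- v$undostat shows, per 10-minute interval, the tuned retention and
-- how many "snapshot too old" errors actually occurred.
SELECT begin_time,
       tuned_undoretention,
       maxquerylen,
       ssolderrcnt
  FROM v$undostat
 ORDER BY begin_time;
```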

Avoiding the ORA-01555 error

Steve Adams has good notes on avoiding the ora-1555 snapshot too old error:
  • Do not run discrete transactions while sensitive queries or transactions are running, unless you are confident that the data sets required are mutually exclusive.
  • Schedule long running queries and transactions out of hours, so that the consistent gets will not need to rollback changes made since the snapshot SCN. This also reduces the work done by the server, and thus improves performance.
  • Code long running processes as a series of restartable steps.
  • Shrink all rollback segments back to their optimal size manually before running a sensitive query or transaction to reduce risk of consistent get rollback failure due to extent deallocation.
  • Use a large optimal value on all rollback segments, to delay extent reuse.
  • Don't fetch across commits. That is, don't fetch on a cursor that was opened prior to the last commit, particularly if the data queried by the cursor is being changed in the current session.
  • Use a large database block size to maximize the number of slots in the rollback segment transaction tables, and thus delay slot reuse.
  • Commit less often in tasks that will run at the same time as the sensitive query, particularly in PL/SQL procedures, to reduce transaction slot reuse.
  • If necessary, add extra rollback segments (undo logs) to make more transaction slots available.  

Oracle ACE Steve Karam also has advice on avoiding the ORA-01555: Snapshot too old, rollback segment too small with UNDO sizing.
Question: I am updating 1 million rows on Oracle 10g, and I run it as a batch process, committing after each batch to avoid undo generation.  But in Oracle 10g I am told undo management is automatic and I do not need to run the update as a batch process.

Answer: Automatic undo management was available in 9i as well, and my guess is you were probably using it there.  However, I'll assume for the sake of this writing that you were using manual undo management in 9i and are now on automatic.

Automatic undo management depends upon the UNDO_RETENTION parameter, which defines how long Oracle should try to keep committed transactions in UNDO segments.  However, the UNDO_RETENTION parameter is only a suggestion.  You must also have an UNDO tablespace that's large enough to handle the amount of UNDO you will be generating/holding, or you will get "ORA-01555: Snapshot too old, rollback segment too small" errors.  You can use the UNDO advisor to find out how large this tablespace should be given a desired UNDO retention, or look online for some scripts...just Google for: oracle undo size.

Oracle 10g also gives you the ability to guarantee undo.  This means that instead of throwing an error on SELECT statements, it guarantees your UNDO retention for consistent reads and instead errors your DML that would cause UNDO to be overwritten.

Now, for your original question...yes, it's easier for the DBA to minimize the issues of UNDO when using automatic undo management.  If you set the UNDO_RETENTION high enough with a properly sized undo tablespace, you shouldn't have as many issues with UNDO.
How often you commit should have nothing to do with it, as long as your DBA has properly set UNDO_RETENTION and has an optimally sized UNDO tablespace. Committing more often will only result in your script taking longer, more LGWR/DBWR issues, and the “where was I” problem if there is an error (if it errors, where did it stop?). Lastly (and true even for manual undo management), if you commit more frequently, you make it more possible for ORA-01555 errors to occur. Because your work will be scattered among more undo segments, you increase the chance that a single one may be overwritten if necessary, thus causing an ORA-01555 error for those that require it for read consistency. It all boils down to the size of the undo tablespace and the undo retention, in the end…just as manual management boiled down to the size, amount, and usage of rollback segments. Committing frequently is a peroxide band-aid: it covers up the problem, tries to clean it, but in the end it just hurts and causes problems for otherwise healthy processes.
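The retention guarantee mentioned above is a tablespace attribute; a sketch, assuming an undo tablespace named UNDOTBS1:

```sql
-- Guarantee the retention period: queries are protected from
-- ORA-01555, at the cost of DML failing if the tablespace fills.
ALTER TABLESPACE undotbs1 RETENTION GUARANTEE;

-- Revert to the default best-effort behavior.
ALTER TABLESPACE undotbs1 RETENTION NOGUARANTEE;
```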

Oracle guru Joel Garry offers another great explanation of  the machinations of the ORA-01555 error:
You have to understand, in general, ORA-01555 means something else is causing it to die - Oracle needs to be able to create a read-consistent view of the table for the query as it looked at the start of the query, and it is unable to because something has overwritten the undo necessary to create such a view. Since you have the same table over and over in your alert log, that probably means the something is the previous queries your monitoring software is making, not ever releasing the transaction. Something like:
  1. 10AM query starts, never ends
  2. 11AM query starts, never ends
  3. Noon query starts, never ends
  4. 1PM query starts
Meanwhile, the undo needed from the 10AM query for the 1PM query gets overwritten, and the 1PM query dies with ORA-01555, since it needs to know what the table looked like before the 10AM query started mucking with it.  Also, if the query is a loop with a commit in it, it can do the same thing without other queries: eventually the next iteration requires looking back at its own previous first generation, can't do it, and barfs.  Upping undo_retention may help, or may not, depending on the real cause.  Also check v$undostat; you may still have information in there if this is ongoing (or may not, since by the time you check it the needed info may be gone).
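To look for such never-ending monitoring queries, one possible starting point is v$session (the one-hour threshold is arbitrary):

```sql
-- Active sessions whose current call has been running for over an
-- hour; these are the queries that force old undo to stay available.
SELECT s.sid,
       s.serial#,
       s.last_call_et AS seconds_active,
       q.sql_text
  FROM v$session s
  JOIN v$sql     q ON q.sql_id = s.sql_id
 WHERE s.status = 'ACTIVE'
   AND s.last_call_et > 3600;
```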
=======================================================================

Comments

  1. Hi, guys!
    Here is really good explanation how to fix the issue (ORA-01555 due to corrupted lobs)
    http://dbpilot.net/2018/02/05/ora-1555-snapshot-too-old-due-to-corrupted-lobs/
