CLUSTERING FACTOR DEMYSTIFIED PART – I

The clustering_factor measures how synchronized an index is with the data in a table.  A table with a high clustering factor is out-of-sequence with the rows, and large index range scans will consume lots of I/O.  Conversely, an index with a low clustering_factor is closely aligned with the table, and related rows reside together on each data block, making the index very desirable for optimal access.
Rules for Oracle indexing

To understand how Oracle chooses the execution plan for a query, you need to first learn how the SQL optimizer decides whether or not to use an index.

Oracle provides a column called clustering_factor in the dba_indexes view that indicates how well the table rows are synchronized with the index. The table rows are synchronized with the index when the clustering_factor is close to the number of data blocks in the table; the column values are not row-ordered when the clustering_factor approaches the number of rows in the table.
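For example, a quick way to see where an index falls on this spectrum is to compare its clustering_factor against the table's blocks and num_rows. A minimal sketch (the CUSTOMER table name is just a placeholder for the example used later in this article):

select i.index_name,
       i.clustering_factor,
       t.blocks,
       t.num_rows
  from dba_indexes i,
       dba_tables  t
 where i.table_owner = t.owner
   and i.table_name  = t.table_name
   and i.table_name  = 'CUSTOMER';   -- placeholder table name

A clustering_factor near t.blocks suggests well-ordered rows; a value near t.num_rows suggests the rows are scattered.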
For queries that access common rows within a table (e.g. get all items in order 123), unordered tables can cause huge I/O because the index fetches a separate data block for each row requested.
If we group like rows together (as measured by the clustering_factor in dba_indexes), we can get all of the rows with a single block read because the rows are stored together.
Note:  As we see, grouping related rows together can make a huge reduction in disk I/O, and Oracle has embraced this row-sequencing idea in 10g and beyond with the sorted hash cluster, a fully supported way to ensure that related rows always reside together on the same data block.
Today we have choices for row sequencing.  We can group related rows from several tables together with multi-table hash clusters, or we can use single-table clusters, or manual row re-sequencing (CTAS with ORDER BY), to achieve this goal.
To illustrate, consider this query that filters the result set using a column value:

select
   customer_name
from
   customer
where
   customer_state = 'New Mexico';


Here, the decision to use an index vs. a full-table scan is at least partially determined by the percentage of customers in New Mexico. An index scan is faster for this query if the percentage of customers in New Mexico is small and the values are clustered on the data blocks.

Why, then, would the CBO choose to perform a full-table scan when only a small number of rows are retrieved? Perhaps it is because the CBO is considering the clustering of column values within the table.

Four factors work together to help the CBO decide whether to use an index or a full-table scan: the selectivity of a column value, the db_block_size, the avg_row_len, and the cardinality. An index scan is usually faster if a data column has high selectivity and a low clustering_factor.
 
This column has small rows, large blocks, and a low clustering factor.
In the real world, many Oracle databases use the same index for the vast majority of queries.  If those queries always do an index range scan (e.g. select all orders for a customer), then row re-sequencing for a better clustering_factor can greatly reduce Oracle overhead:


Oracle provides several storage mechanisms to fetch a customer row and all related orders with just a few row touches:
  • Sorted hash clusters - New in 10g, a great way to sequence rows for super-fast SQL
     
  • Multi-table hash cluster tables - This will cluster the customer rows with the order rows, often on a single data block.
     
  • Periodic reorgs in primary index order - You can use the dbms_redefinition utility to periodically re-sequence rows into index order.
     
To maintain row order, the DBA will periodically re-sequence table rows (or use a single-table or multi-table cluster) in cases where a majority of the SQL references a column with a high clustering_factor, a large db_block_size, and a small avg_row_len. This removes the full-table scan, places all adjacent rows in the same data block, and can make the query up to thirty times faster.
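As a rough, hedged sketch of the periodic reorg idea mentioned above, an online redefinition in primary-key order might look something like this. The schema, table, interim table, and column names are hypothetical, the interim table must be created beforehand with the same column layout, and the orderby_cols parameter of dbms_redefinition.start_redef_table is assumed to be available in your release:

declare
  l_errors pls_integer;
begin
  -- verify the table can be redefined using its primary key
  dbms_redefinition.can_redef_table('SCOTT', 'CUSTOMER');

  -- start the redefinition, copying rows in CUST_ID order
  dbms_redefinition.start_redef_table(uname        => 'SCOTT',
                                      orig_table   => 'CUSTOMER',
                                      int_table    => 'CUSTOMER_INT',
                                      orderby_cols => 'CUST_ID');

  -- copy indexes, triggers, constraints and grants to the interim table
  dbms_redefinition.copy_table_dependents(uname      => 'SCOTT',
                                          orig_table => 'CUSTOMER',
                                          int_table  => 'CUSTOMER_INT',
                                          num_errors => l_errors);

  -- swap the tables
  dbms_redefinition.finish_redef_table('SCOTT', 'CUSTOMER', 'CUSTOMER_INT');
end;
/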

On the other hand, as the clustering_factor nears the number of rows in the table, the rows fall out of sync with the index. This high clustering_factor, where the value is close to the number of rows in the table (num_rows), indicates that the rows are out of sequence with the index and an additional I/O may be required for index range scans.

Even when a column has high selectivity, a high clustering_factor and small avg_row_len indicate that the column values are randomly distributed through the table, and additional I/O will be required to fetch the rows. An index range scan would cause a huge amount of unnecessary I/O, as shown below, making a full-table scan more efficient.
This column has large rows, small blocks, and a high clustering factor.
In sum, the CBO's decision to perform a full-table scan vs. an index range scan is influenced by the clustering_factor, db_block_size, and avg_row_len. It is important to understand how the CBO uses these statistics to determine the fastest way to deliver the desired rows.


=========================================================

SYMPTOMS:
There are several symptoms that point to the buffer cache as a problem. If the following are observed in the Top 5 Timed Events (v$session_event, v$system_event), then investigate further:
- LATCH: CACHE BUFFER CHAINS (CBC): This generally points to hot blocks in the cache.
- LATCH: CACHE BUFFER LRU CHAINS: Contention indicates that the buffer cache is too small and block replacement is excessive.
- BUFFER BUSY WAITS: This is almost always an application tuning issue; multiple processes are requesting the same block.
- DB FILE SEQUENTIAL READS: Wait for a single block to be read synchronously from disk.
- DB FILE SCATTERED READS: Wait for multiple blocks to be read concurrently from disk.
- DB FILE PARALLEL READS: Wait for a synchronous multiblock read.
- FREE BUFFER WAITS: Seen when the DBWn processes do not make free buffers fast enough to meet demand.
   Let's discuss the above wait events in detail.
  As I discussed in my earlier article on buffer cache architecture, the buffer cache holds in-memory versions of data blocks for faster access. Each buffer in the buffer cache has an associated element in the buffer header array, externalized as x$bh. Buffer headers keep track of various attributes and the state of buffers in the buffer cache.  These buffer headers are chained together in a doubly linked list and linked to a hash bucket. There are many hash buckets (the number of buckets is derived from, and governed by, the _db_block_hash_buckets parameter). Access to these hash chains (both to inspect and to change them) is protected by cache buffers chains latches.
Further, buffer headers can be linked and delinked from hash buckets dynamically.
Here is a simple algorithm to access a buffer:
1. Hash the data block address (DBA: a combination of tablespace, file_id and block_id) to find the hash bucket.
2. Get the CBC latch protecting that hash bucket.
3. If success (CBC latch obtained), walk the hash chain, reading buffer headers to see if a specific version of the block is already on the chain.
      If the specific version of the block is found,
         access the buffer in the buffer cache, under the protection of buffer pin/unpin actions.
      If the specific version of the block is not found,
          get the cache buffers LRU chain latch to find a free buffer in the buffer cache,
          unlink the buffer header for that buffer from its current chain,
          link that buffer header to this hash chain,
          release the latch, and
          read the block into that free buffer in the buffer cache with the buffer header pinned.
   If not success (CBC latch not obtained),
      spin for spin_count times and
      go to step 2.
      If the latch was still not obtained after spinning, then
         sleep (with exponentially increasing sleep time, up to an upper bound),
         wake up, and go to step 2.
Let's talk about these wait events one by one.
                                       LATCH : CACHE BUFFER CHAINS
The CBC latches are used when searching for, adding, or removing a buffer from the buffer cache.
  Multiple users look for the same block, or for blocks in the same hash bucket, and hence try to obtain the latch on the same bucket. Such blocks are called hot blocks. This wait event can occur if:
- The latch can't be obtained due to contention.
- The latch was obtained by one session and held for a long time while it walked a long chain, so other sessions trying to obtain the latch had to wait.
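To get a feel for how much CBC latch contention the instance is seeing, the latch statistics can be checked first. A minimal sketch ('cache buffers chains' is the latch name as it appears in v$latch):

select name, gets, misses, sleeps
  from v$latch
 where name = 'cache buffers chains';

-- top 10 child latches by sleeps, to home in on the hottest hash chains
select *
  from (select addr, gets, misses, sleeps
          from v$latch_children
         where name = 'cache buffers chains'
         order by sleeps desc)
 where rownum <= 10;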
Some of the activities that can cause this wait event are:
1. Multiple users trying to read code/description etc. from look up tables.
Soln:
- Identify the hot block
  . Segments by logical reads in AWR report
  . v$segment_statistics (object_name, statistic_name, value)
    This query shows the top 10 statistic values, by object and statistic name.
      SELECT *
        FROM (SELECT owner, object_name, object_type, statistic_name, SUM(value)
                FROM v$segment_statistics
               GROUP BY owner, object_name, object_type, statistic_name
               ORDER BY SUM(value) DESC)
       WHERE ROWNUM <= 10;
- Find out the file# (P1) and block# (P2) from v$session_wait
  SQL> select sid, event, p1 file#, p2 block#
         from v$session_wait
        where event like '%buffer busy%';
- Find out the segment whose data is on this file/block
SQL>select owner, segment_name
            from dba_extents
            where file_id = &file#
                and &block# between block_id and block_id+blocks-1;
Modify the application to use PL/SQL to read the look-up table once and store the codes/descriptions in local variables that can be accessed later, rather than reading from the table each time.
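A minimal sketch of that idea, assuming a hypothetical LOOKUP_CODES(code, description) table; the codes are loaded once per session into a PL/SQL associative array, and subsequent look-ups never touch the buffer cache:

create or replace package lookup_cache as
  function get_desc(p_code in varchar2) return varchar2;
end lookup_cache;
/
create or replace package body lookup_cache as
  type t_desc_tab is table of varchar2(100) index by varchar2(30);
  g_descs t_desc_tab;

  -- read the look-up table once and keep it in session memory
  procedure load is
  begin
    for r in (select code, description from lookup_codes) loop
      g_descs(r.code) := r.description;
    end loop;
  end load;

  function get_desc(p_code in varchar2) return varchar2 is
  begin
    if g_descs.count = 0 then
      load;                     -- populate on first call in the session
    end if;
    return g_descs(p_code);
  end get_desc;
end lookup_cache;
/

The application would then call lookup_cache.get_desc('XYZ') instead of selecting from the look-up table on every row.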
2. Simultaneous update/select operations on the same block:
 CBC latch contention can become worse if a session is modifying a data block that other users are reading, because readers clone the block and roll back the uncommitted changes in the cloned copy. Since all these clone copies go into the same bucket and are protected by the same latch, the CBC chain gets longer, it takes more time to walk the chain, the latch is held longer, and hence sessions trying to obtain the latch have to wait longer.
Soln:
- Modify the application to commit frequently so that CR clones need not be created.
3. When multiple users are running nested loop joins on a table and accessing the driven-into table via an index. Since an NL join is basically
  For all rows in i
     look up a value in j  where j.field1 = i.val
  end loop
table j's index on field1 will get hit for every row returned from i. Now if the lookup on i returns a lot of rows, and multiple users are running this same query, then the root block of the index on j(field1) is going to get hammered.
e.g.
  select t1.val, t2.val
    from t1, t2
   where t1.c1 = {value}
     and t2.id = t1.id;
Here, if there are a large no. of rows in table t1 where c1 = value, Oracle will repeatedly hit the index on table t2 (id).
In this case SQL statements with high BUFFER_GETS (logical reads) per EXECUTION are the main culprits. Multiple concurrent sessions are executing the same inefficient SQL that is going after the same data set.
SOLUTION:
———-
  Reducing contention for the cache buffers chains latch will usually require reducing logical I/O rates by tuning and minimizing the I/O requirements of the SQL involved. High I/O rates could be a sign of a hot block (meaning a highly accessed block).
- Find the SQL (why is the application hitting the block so hard?): SQL ordered by buffer gets in the AWR report
- Replace Nested loop joins with hash join
- Hash Partition the index with hot block
- Create a reverse key index
- Use Hash clusters
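For reference, hedged syntax sketches for two of the options above, using the t2(id) index from the earlier example (the index names are made up, and a column would normally carry only one of these, not both):

create index t2_id_rk on t2(id) reverse;       -- reverse key index

create index t2_id_hp on t2(id)
  global partition by hash (id) partitions 8;  -- hash-partitioned index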
4. This can also result from the use of sequences if the CACHE option is not used.
Soln:
- Create sequences with cache option.
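For example (the sequence name is hypothetical, and the cache size should be chosen to match the insert rate):

create sequence order_seq cache 1000;

-- or, for an existing sequence:
alter sequence order_seq cache 1000;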
5. Inefficient SQL
   Multiple concurrent sessions are executing the same inefficient SQL that is going after the same data set.
Soln:
 Reducing contention for the cache buffer chains latch will usually require reducing logical I/O rates by tuning and minimizing the I/O requirements of  the SQL involved. High I/O rates could be a sign of a hot block (meaning a block highly accessed).
Spread data out to reduce contention
- Export the table, increase PCTFREE significantly, and import the data. This minimizes the number of rows per block, spreading them over many blocks. Of course, this comes at the expense of storage, and full table scan operations will be slower.
- For indexes, you can rebuild them with higher PCTFREE values, bearing in mind that this may increase the height of the index.
- Consider reducing the block size: if the current block size is 16K, you may move the table or recreate the index in a tablespace with an 8K block size. This too will negatively impact full table scan operations, and using multiple block sizes increases management complexity.
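Hedged examples of the rebuild options above (object names are hypothetical; note that ALTER TABLE ... MOVE changes rowids, so every index on the table must be rebuilt afterwards):

alter table orders move pctfree 40;          -- fewer rows per block
alter index orders_pk rebuild pctfree 30;    -- more free space per leaf block

alter index orders_cust_idx rebuild;         -- rebuild the remaining indexes after the MOVE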
6. Index leaf block chasing from very many processes scanning the same unselective index with very similar predicates.
Soln:
- Modify the application to use more selective predicate/index.
7. A large buffer cache, which increases the length of the hash chains and hence the time taken to walk a chain, thereby increasing latch contention.
Soln:
- Increase the parameter _db_block_hash_buckets so that the number of buckets increases and hence the length of each hash chain decreases.
                      BUFFER BUSY WAITS
                      —————————
  This is the second wait event encountered in the case of logical I/O. It is a wait for a buffer that is being used in an unshareable way or is being read into the buffer cache. Buffer busy waits occur when the latch on the hash bucket has been obtained but the buffer itself is “busy” because:
- Another session is reading the block into the buffer
- Another session holds the buffer in an incompatible mode to our request
These waits indicate read/read, read/write, or write/write contention. Buffer busy waits are common in an I/O-bound Oracle system. This wait can be intensified by a large block size, as more rows can be contained within each block.
  These waits can be on :
- Data blocks        : multiple sessions trying to update/delete rows in the same block
- Segment header     : multiple sessions trying to insert records into the same segment
- Undo header        : a lot of undo is being generated, causing contention on the undo segment header
- Undo block         : multiple users querying records in the same undo block
Diagnosis :
- Top 5 events in AWR report
- Buffer wait statistics section of the AWR report : shows the class of block the wait is on, i.e. data block, segment header or undo block (v$waitstat)
- V$session_event
- V$system_event
- v$waitstat
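A simple way to see which class of block the waits are on, straight from v$waitstat:

select class, count, time
  from v$waitstat
 order by count desc;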
Solution :
- Wait on data blocks (v$waitstat) : If the wait is on data blocks, we can move “hot” data to another block to avoid the hot block, or use smaller blocks (to reduce the number of rows per block, making each block less “hot”).
    . Identify segments with buffer busy waits
         v$segment_statistics (object_name, statistic_name, value)
    . Eliminate HOT blocks from the application.
    . Check for repeatedly scanned / unselective indexes.
    . Reduce the number of rows per block : Try rebuilding the object with a higher PCTFREE and lower PCTUSED
     . Check for ‘right-hand indexes’ (indexes that get inserted into at the same point by many processes) – rebuild them as reverse key indexes
     . Increase INITRANS and MAXTRANS (where users are simultaneously accessing the same block).
       When a DML occurs on a block, the lock byte is set in the block, and any user accessing the record(s) being changed must check the Interested Transaction List (ITL) for information needed to build the before image of the block. The Oracle database writes information into the ITL about all users who are “interested” in the state of the block. Increasing INITRANS creates space in the block for multiple ITL slots, allowing multiple concurrent users to access the block.
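A hedged sketch of raising INITRANS (object names are hypothetical; the new value applies only to newly formatted blocks, so existing blocks must be rebuilt to pick it up):

alter table orders initrans 10;
alter table orders move;                       -- reformat existing blocks with the new setting
alter index orders_pk rebuild initrans 10;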
       
- Wait on segment header (v$waitstat)
  . If using manual segment space management,
      – Increase the number of FREELISTs and FREELIST GROUPs
      – Increase the PCTUSED-to-PCTFREE gap
      – Use ASSM
- Wait on undo header
  . Use automatic undo management.
  . If using manual undo management, increase the number of rollback segments.
- Wait on undo block
  . Try to commit more often
  . Use larger rollback segments or a larger undo tablespace.
To practically simulate buffer busy wait and identify the hot block, visit this link.
                              LATCH: CACHE BUFFER LRU CHAIN
  This is the first wait that occurs in the case of physical I/O, i.e. the server process is unable to find the desired block in the buffer cache and needs to read it from disk into the cache. Before reading the block, it searches for a free buffer by scanning the LRU list, and to scan the LRU list it must obtain the latch on the LRU chain. The cache buffers lru chain latch is also acquired when writing a buffer back to disk, specifically when scanning the LRU (least recently used) chain containing the dirty blocks in the buffer cache (to delink them from the LRU list and link them to the WRITE LIST – MAIN).
Diagnosis :
- Top 5 timed events in AWR report
- v$session_event
- v$system_event
   This wait is encountered when
1. Multiple users issuing statements that repeatedly scan large unselective indexes or perform full table scans, and hence scan the LRU chain looking for free buffers.
Soln:
- Identify SQL causing large amount of logical I/O –  SQL by buffer gets in AWR report
- Tune SQL to reduce no. of buffers required
2. An undersized buffer cache, which leads to frequent aging out of cached blocks; these then have to be reread from disk, which requires a search for free buffers and hence the LRU latch.
Soln:
- Use AMM/ASMM
- Increase DB buffer cache size as per
  . ADDM recommendations
  . v$db_cache_advice (see the sample query after this list)
-  Increase the parameter _db_block_lru_latches to increase the number of latches protecting the LRU chains, so that contention on the LRU chains is reduced.
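A sample query against v$db_cache_advice, as referenced in the list above (DEFAULT pool, current block size). The row with size_factor = 1 is the current cache size, and the surrounding rows show the estimated physical reads at smaller and larger sizes:

select size_for_estimate, size_factor, estd_physical_reads
  from v$db_cache_advice
 where name = 'DEFAULT'
   and block_size = (select value from v$parameter where name = 'db_block_size')
 order by size_for_estimate;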
                                                 FREE BUFFER WAITS
  After the latch has been obtained on the LRU list and the server process is scanning the LRU list for free buffers, this wait occurs because we are waiting for a free buffer but none are available, since there are too many dirty buffers in the cache.
Causes :
1. Buffer cache is too small
Soln:
- Use AMM/ASMM
- Increase DB buffer cache size as per
  . ADDM recommendation
  . v$db_cache_advice
2. DBWR is slow in writing modified buffers to disk and is unable to keep up with the write requests.
Soln:
- Increase I/O bandwidth by striping the datafiles.
- If asynchronous I/O is supported,
     . Enable asynchronous I/O (DISK_ASYNCH_IO = true)
       – If multiple CPUs are there,
            increase the no. of database writers (DB_WRITER_PROCESSES)
         else
            configure I/O slaves (set DBWR_IO_SLAVES to non-zero)
  else
     . Configure I/O slaves (set DBWR_IO_SLAVES to non-zero)
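Hedged examples of the two settings above (the values are illustrative, both parameters are static so they take effect only after an instance restart, and normally only one of the two approaches is used):

alter system set db_writer_processes = 4 scope = spfile;   -- with async I/O and multiple CPUs

alter system set dbwr_io_slaves = 4 scope = spfile;        -- where async I/O is not available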
       
3. Large sorts, full table scans, and/or scans of unselective indexes are filling the cache with modified blocks faster than DBWR is able to write them to disk. Because the data is not pre-sorted, the data for a key value is scattered across many blocks, and hence more blocks have to be read to get the records for a key.
Soln:
- Pre-sorting or reorganizing data can help
                        READ WAITS
  Waits encountered when data blocks are being read physically from disk. These waits are always present, even in well-tuned databases. The following strategy should be employed to decide whether tuning is required:
- Compare total read time with the baseline value.
- If total wait time is an excessive fraction of total DB time, there can be two cases:
  1. Average time for each read is normal but the no. of disk reads is high
     Soln:
     – Try to decrease the no. of reads: find the SQL statements issuing the most logical/physical reads (AWR report) and tune them.
     – A large no. of physical reads may be due to a small DB_CACHE_SIZE – use the advisor to increase the cache size.
  2. Average time for each read is high (> 15ms)
     Soln:
     – Avoid reading from disk 
           . Cache tables/indexes
           . Use KEEP / RECYCLE pool
     – Stripe the datafiles for more bandwidth.
    – Reduce checkpointing. Whenever there is a checkpoint, data is written to disk. Since disk is busy servicing writes, reads get delayed.
 Various read waits can be separately tuned as follows:
                  DB FILE SCATTERED READ
  Waits for multiple blocks (up to DB_FILE_MULTIBLOCK_READ_COUNT) to be read from disk while performing full table scans / index fast full scans (no ORDER BY). This wait is encountered because:
- As the blocks of a full table scan are pulled into memory, they are scattered throughout the buffer cache, since it is highly unlikely that they fall into contiguous buffers.
- Since a large number of blocks have to be read into the buffer cache, the server process has to search for a large number of free/usable buffers, which leads to waits.
Solutions :
- Try to cache frequently used small tables to avoid reading them into memory over and over again (see the sketch after this list).
- Optimize multi-block I/O by setting the parameter DB_FILE_MULTIBLOCK_READ_COUNT
- Partition pruning: Partition tables/indexes so that only a partition is scanned.
- Consider the usage of multiple buffer pools
- Optimize the SQL statement that initiated most of the waits. The goal is to minimize the number of physical and logical reads.
     . Should the statement access the data by a full table scan or index FFS?
     . Would an index range or unique scan  be more efficient?
     . Does the query use the right driving table?
     . Are the SQL predicates appropriate for hash or merge join?
     . If full scans are appropriate, can parallel query improve the response time?
- Make sure all statistics are representative of the actual data. Check the LAST_ANALYZED date
The objective is to reduce the demands for both the logical and physical I/Os, and this is best achieved through SQL and application tuning.
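Hedged examples of the caching and multiblock-read suggestions in the list above (object names and sizes are hypothetical; in recent releases DB_FILE_MULTIBLOCK_READ_COUNT is usually best left at its self-tuned default):

alter system set db_keep_cache_size = 256M;                 -- size a KEEP pool first

alter table lookup_codes storage (buffer_pool keep);        -- pin a small, hot table
alter index lookup_codes_pk storage (buffer_pool keep);

alter system set db_file_multiblock_read_count = 64;        -- tune multi-block I/O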
                  DB FILE SEQUENTIAL READ
  These waits generally indicate single block reads (e.g. an index full scan: an index read with an ORDER BY clause).
Causes and solutions :
1. Use of an unselective index
Soln:
- Check indexes on the table to ensure that the right index is being used
- Check the column order of the index with the WHERE clause of the Top SQL statements
2. Fragmented indexes : If DBA_INDEXES.CLUSTERING_FACTOR of the index approaches the number of blocks in the table, then most of the rows in the table are ordered. This is desirable.
 However, if the clustering factor approaches the number of rows in the table, the rows in the table are randomly ordered, and thus more I/Os are required to complete the operation.
Soln:
You can improve the index’s clustering factor by rebuilding the table so that rows are ordered according to the index key and rebuilding the index thereafter.
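A rough CTAS-based sketch of that rebuild (table, column, and index names are hypothetical; constraints, grants, and triggers would also have to be carried across in a real reorganization):

create table customer_sorted as
  select * from customer order by cust_id;

rename customer to customer_old;
rename customer_sorted to customer;

create index customer_cust_id_idx on customer(cust_id);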
3. High I/O on a particular disk or mount point
Soln:
- Use partitioning to reduce the number of blocks being visited
- Make sure optimizer statistics are up to date
- Relocate ‘hot’ datafiles : place the tables used in the SQL statement on a faster part of the disk.
- Consider the usage of multiple buffer pools and cache frequently used indexes/tables in the KEEP pool
- Inspect the execution plans of the SQL statements that access data through indexes
- Tune the I/O subsystem to return data faster.


4. Bad application design
Soln:
- Optimize the SQL statement that initiated most of the waits. The goal is to minimize the number of physical and logical reads. Inspect the execution plans of the SQL statements that access data through indexes.
  Examine the SQL statement to see if:
  . it is doing a full-table scan when it should be using an index,
  . it is using a wrong index,
  . it can be rewritten to reduce the amount of data it retrieves,
  . it is appropriate for the statement to access data through index lookups, or whether a full table scan would be more efficient, and
  . the statement uses the right driving table, i.e. whether the join order is proper.

5. Range scans on data spread across many different blocks
Soln:
- Check that range scans are not using reverse key indexes.
- Load the data in sorted order on the columns on which range scans will be performed.
6.  Consider increasing the buffer cache to see if the expanded size will accommodate the additional blocks, therefore reducing the I/O and the wait.
                  DB FILE PARALLEL READ
  Waits for synchronous multiblock reads. The process has issued multiple I/O requests in parallel to read blocks from datafiles into memory (not during parallel query/DML), i.e. the process batches many single block I/O requests together and issues them in parallel.
Some of the most common wait problems and potential solutions are outlined here:
- Sequential Read : indicates many index reads - tune the code (especially joins)
- Scattered Read  : indicates many full table scans - tune the code; cache small tables
- Free Buffer     : increase the DB_CACHE_SIZE; shorten the checkpoint; tune the code
- Buffer Busy (segment header) : add freelists or freelist groups
- Buffer Busy (data block)     : separate “hot” data; use reverse key indexes; use smaller blocks; increase INITRANS (debatable); reduce block popularity; make I/O faster
- Buffer Busy (undo header)    : add rollback segments or areas
- Buffer Busy (undo block)     : commit more; use larger rollback segments or areas
