Troubleshooting 'latch: cache buffers chains' Wait Contention
High cache buffers chains latch indicates that there is likely to be something reading a lot of buffers. Typically the SQL with the most gets is likely to be that which is contending:
The Query with SQL_ID a9nchgksux6x2 is reading 100x more buffers than the 2nd most 'hungry' statement and CPU and Elapsed are off the 'scale' of the report. This is a prime candidate for the cause of the CBC latch issues.
You can also link this information to the Top Segments by Logical Reads:
The top object read is SALES and the top SQL is a select from SALES which appears to correlate towards this being a potential problem select.
This SQL should be investigated to see if the Gets per Exec or the Executions figure per hour has changed in any way (comparison to previous reports would show this) and if so the reasons for that change investigated and resolved.
In this case the statement is reading > 10,000 buffers per execution and executing > 15,000 times
so both of these may need to be adjusted to get better performance.
NOTE:163424.1 - How To Identify a Hot Block Within The Database Buffer Cache.
===========================================================
How to Identify Hot Blocks Within the Database Buffer Cache that may be Associated with 'latch: cache buffers chains' Wait Contention
Run the query above a few times to establish the addresses (ADDR) that consistently has the highest sleeps.
The TCH column identifies the number of times each block has been hit by a SQL statement, so you can use this information to identify a hot block. The higher the value of the TCH column, the more frequent the block is accessed by SQL statements.
The following query joins with DBA_OBJECTS to find the objects waiting, the misses, sleeps, etc:
The output is something like following:
Note---> This informationmation taken from oracle metalink. all copy rights oracle only.
PURPOSE
This article describes how to troubleshoot issues where there are significant waits for 'latch: cache buffers chains'.
TROUBLESHOOTING STEPS
"latch: cache buffers chains" contention is typically encountered because SQL statements read more buffers than they need to and multiple sessions are waiting to read the same block.
If you have high contention, you need to look at the statements that perform the most buffer gets and then look at their access paths to determine whether these are performing as efficiently as you would like.
Typical solutions are:-
If you have high contention, you need to look at the statements that perform the most buffer gets and then look at their access paths to determine whether these are performing as efficiently as you would like.
Typical solutions are:-
- Look for SQL that accesses the blocks in question and determine if the repeated reads are necessary. This may be within a single session or across multiple sessions.
- Check for suboptimal SQL (this is the most common cause of the events) - look at the execution plan for the SQL being run and try to reduce the gets per executions which will minimize the number of blocks being accessed and therefore reduce the chances of multiple sessions contending for the same block.
- If you can identify a poor SQL and have identified a better plan, you can direct the optimizer to use this plan using the following article:Document 1400903.1 Directing Plans with Baselines/Profiles Using coe_load_sql_baseline.sql / coe_load_sql_profile.sql (shipped with SQLT)
Further information can be found in:
Document 1476044.1 Resolving Issues Where Waits for 'latch: cache buffers chains' Seen Due to Poorly Tuned SQL
Document 390374.1 Oracle Performance Diagnostic Guide (OPDG)
Document 163424.1 How To Identify a Hot Block Within The Database Buffer Cache.
Document 62172.1 Understanding and Tuning Buffer Cache and DBWR
Document 390374.1 Oracle Performance Diagnostic Guide (OPDG)
Document 163424.1 How To Identify a Hot Block Within The Database Buffer Cache.
Document 62172.1 Understanding and Tuning Buffer Cache and DBWR
Worked example:
Problem: Database is slow and 'latch: cache buffers chains' is high in the waits in AWR.
Start with Top 5 Waits:
Top 5 Timed Events Avg %Total ~~~~~~~~~~~~~~~~~~ wait Call Event Waits Time (s) (ms) Time Wait Class ------------------------------ ------------ ----------- ------ ------ ---------- latch: cache buffers chains 74,642 35,421 475 6.1 Concurrenc CPU time 11,422 2.0 log file sync 34,890 1,748 50 0.3 Commit latch free 2,279 774 340 0.1 Other db file parallel write 18,818 768 41 0.1 System I/O -------------------------------------------------------------
High cache buffers chains latch indicates that there is likely to be something reading a lot of buffers. Typically the SQL with the most gets is likely to be that which is contending:
SQL ordered by Gets DB/Inst: Snaps: 1-2 -> Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code. -> Total Buffer Gets: 265,126,882 -> Captured SQL account for 99.8% of Total Gets CPU Elapsed Buffer Gets Executions per Exec %Total Time (s) Time (s) SQL Id -------------- ------------ ------------ ------ -------- --------- ------------- 256,763,367 19,052 13,477.0 96.8 ######## ######### a9nchgksux6x2 Module: JDBC Thin Client SELECT * FROM SALES .... 1,974,516 987,056 2.0 0.7 80.31 110.94 ct6xwvwg3w0bv SELECT COUNT(*) FROM ORDERS ....
The Query with SQL_ID a9nchgksux6x2 is reading 100x more buffers than the 2nd most 'hungry' statement and CPU and Elapsed are off the 'scale' of the report. This is a prime candidate for the cause of the CBC latch issues.
You can also link this information to the Top Segments by Logical Reads:
Segments by Logical Reads -> Total Logical Reads: 265,126,882 -> Captured Segments account for 98.5% of Total Tablespace Subobject Obj. Logical Owner Name Object Name Name Type Reads %Total ---------- ---------- -------------------- ---------- ----- ------------ ------- DMSUSER USERS SALES TABLE 212,206,208 80.04 DMSUSER USERS SALES_PK INDEX 44,369,264 16.74 DMSUSER USERS SYS_C0012345 INDEX 1,982,592 .75 DMSUSER USERS ORDERS_PK INDEX 842,304 .32 DMSUSER USERS INVOICES TABLE 147,488 .06 -------------------------------------------------------------
The top object read is SALES and the top SQL is a select from SALES which appears to correlate towards this being a potential problem select.
This SQL should be investigated to see if the Gets per Exec or the Executions figure per hour has changed in any way (comparison to previous reports would show this) and if so the reasons for that change investigated and resolved.
In this case the statement is reading > 10,000 buffers per execution and executing > 15,000 times
so both of these may need to be adjusted to get better performance.
Note: This is a simple example where there is a high likelihood that the 'biggest' query is the culprit but it is not always the 'Top' SQL that causes the problem. For example, contention may occur on a statement with a smaller total if it is only executed a small number of times so that it may not appear as the top sql. It may still make millions of buffer gets, but will appear lower in the list because other sqls are performing many times, just not contending.
So, if the first SQL is not the culprit then look at the others.
So, if the first SQL is not the culprit then look at the others.
REFERENCES
NOTE:62172.1 - Understanding and Tuning Buffer Cache and DBWRNOTE:163424.1 - How To Identify a Hot Block Within The Database Buffer Cache.
===========================================================
How to Identify Hot Blocks Within the Database Buffer Cache that may be Associated with 'latch: cache buffers chains' Wait Contention
GOAL
Identify blocks which cause latch contention on the 'cache buffers chains' latch.
NOTE: To solve a hot block, the application may need to be reviewed because the contention may be caused by inefficient SQL repeatedly accessing the same buffers unnecessarily.
For more details see:
For more details see:
Document 1342917.1 Troubleshooting 'latch: cache buffers chains' Wait Contention
SOLUTION
Possible hot blocks in the buffer cache normally can be identified by a high or rapid increasing wait count on the CACHE BUFFERS CHAINS latch. This latch is acquired when searching for data blocks cached in the buffer cache. The Buffer cache is implemented as a chain of blocks linked together under a hash value. When a buffer is requested the hash value of the chain that this buffer will be in is calculated and that chain is scanned. Each chain is protected by a latch to prevent it from being changed while it is scanned. Contention in this latch can be caused by very heavy access to a single block. This can require the application to be reviewed.
By examining the waits for this latch, information about the segment and the specific block can be obtained using the following queries.
By examining the waits for this latch, information about the segment and the specific block can be obtained using the following queries.
Identify high contention latch addresses
First determine which latch addresses (ADDR) are interesting by examining the number of sleeps for this latch. The higher the sleep count, the more processes are having to wait for that latch, so the higher the sleeps, the higher the contention. If we are investigating contention, the higher the sleeps the more interesting the latch addresses (ADDR) is. The following select returns latch addresses (ADDR) ordered by sleeps:
SQL> SELECT child# "cCHILD" , addr "sADDR" , gets "sGETS" , misses "sMISSES" , sleeps "sSLEEPS" FROM v$latch_children WHERE name = 'cache buffers chains' ORDER BY 5, 1, 2, 3;
Run the query above a few times to establish the addresses (ADDR) that consistently has the highest sleeps.
Identify the blocks protected under that address
Once the addresses (ADDR) with the highest sleep count have been determined, these latch addresses can be used to get more detail about the blocks currently in the buffer cache protected by this latch. The following query identifies the blocks protected by a particular ADDR and should be run just after determining the ADDR with the highest sleep count:
SQL> column segment_name format a35 SELECT /*+ RULE */ e.owner || '.' || e.segment_name segment_name, e.extent_id extent#, x.dbablk - e.block_id + 1 block#, x.tch, l.child# FROM sys.v$latch_children l, sys.x$bh x, sys.dba_extents e WHERE x.hladdr = '&ADDR' AND e.file_id = x.file# AND x.hladdr = l.addr AND x.dbablk BETWEEN e.block_id AND e.block_id + e.blocks -1 ORDER BY x.tch DESC ;
Example output :
SEGMENT_NAME EXTENT# BLOCK# TCH CHILD# -------------------------------- ------------ ------------ ------ ---------- SCOTT.EMP_PK 5 474 17 7,668 SCOTT.EMP 1 449 2 7,668
The TCH column identifies the number of times each block has been hit by a SQL statement, so you can use this information to identify a hot block. The higher the value of the TCH column, the more frequent the block is accessed by SQL statements.
The following query joins with DBA_OBJECTS to find the objects waiting, the misses, sleeps, etc:
SQLgt; WITH bh_lc AS (SELECT /*+ ORDERED */ lc.addr, lc.child#, lc.gets, lc.misses, lc.immediate_gets, lc.immediate_misses, lc.spin_gets, lc.sleeps, bh.hladdr, bh.tch tch, bh.file#, bh.dbablk, bh.class, bh.state, bh.obj FROM x$kslld ld, v$session_wait sw, v$latch_children lc, x$bh bh WHERE lc.addr =sw.p1raw AND sw.p2 = ld.indx AND ld.kslldnam='cache buffers chains' AND lower(sw.event) LIKE '%latch%' AND sw.state ='WAITING' AND bh.hladdr=lc.addr ) SELECT bh_lc.hladdr, bh_lc.tch, o.owner, o.object_name, o.object_type, bh_lc.child#, bh_lc.gets, bh_lc.misses, bh_lc.immediate_gets, bh_lc.immediate_misses, spin_gets, sleeps FROM bh_lc, dba_objects o WHERE bh_lc.obj = o.object_id(+) UNION SELECT bh_lc.hladdr, bh_lc.tch, o.owner, o.object_name, o.object_type, bh_lc.child#, bh_lc.gets, bh_lc.misses, bh_lc.immediate_gets, bh_lc.immediate_misses, spin_gets, sleeps FROM bh_lc, dba_objects o WHERE bh_lc.obj = o.data_object_id(+) ORDER BY 1,2 DESC;
The output is something like following:
HLADDR TCH OWNER OBJECT_NAME OBJECT_TYP CHILD# GETS MISSES IMMEDIATE_GETS IMMEDIATE_MISSES SPIN_GETS SLEEPS ---------------- ----- -------- ------------------------- ---------- ------- ------ ------ ---------------- ---------------- --------- ------ 0000000621ED8B00 127 SABXFER PCBDEMPRELEVEMENT TABLE 80750 703326 1649 8007 0 1642 7 0000000621ED8B00 127 80750 703326 1649 8007 0 1642 7 0000000621ED8B00 74 SABXFER V_HIERARCHIEENTREPRISE_B TABLE 80750 703326 1649 8007 0 1642 7 0000000621ED8B00 1 SABXFER SACNTPR_PK INDEX 80750 703326 1649 8007 0 1642 7 0000000621ED8B00 1 SABXFER SAENCR2_1 TABLE 80750 703326 1649 8007 0 1642 7 0000000621ED8B00 1 80750 703326 1649 8007 0 1642 7 0000000621ED8B00 0 80750 703326 1649 8007 0 1642 7
Reducing contention
In order to reduce contention for this object the following mechanisms can be put in place:
- Examine the application to see if the execution of certain DML and SELECT statements can be reorganized to eliminate contention on the object. This may be because individual SQL statements are reading more buffers than they need to or SQL affecting specific buffers does so at the same time. For more details see:Document 1342917.1 Troubleshooting 'latch: cache buffers chains' Wait Contention
- Decrease the buffer cache. If less buffers are stored then there is less change of them contending. This may only help in a small amount of cases and may just move the problem elsewhere.
- DBWR throughput may have a factor in this as well because this determines how quickly buffers are flushed to disk. If dirty buffers are present then processes requesting those buffers may need to find earlier version meaning that they hold latches longer increasing the chances of contention. If using multiple DBWR's then increase the number of DBWR's. See:Document 62172.1 - Understanding and Tuning Buffer Cache and DBWR
- Increase the PCTFREE for the table storage parameters via ALTER TABLE or rebuild. This will result in less rows per block such that there is less chance of the same buffers being requested by processes.
- Consider implementing reverse key indexes (if range scans aren't commonly used against the segment)
=======================================================================
Resolving Issues Where Waits for 'latch: cache buffers chains' Seen Due to Poorly Tuned SQL
PURPOSE
Disclaimer: This note is written in the context that the Performance Guided Resolution tool was used and recommended this article. It may not make as much sense if read standalone or by other audiences.
TROUBLESHOOTING STEPS
Brief Definition:
Block headers are placed on the buffer cache in linked lists (cache buffer chains) which are accessed through a hash table. One or more hash chains are protected by one child of this latch. Processes need to get the child latch to scan for a buffer. This prevents the linked list from changing while scanning.
'latch: cache buffers chains' contention is typically encountered because SQL statements read more buffers than they need to and multiple sessions are waiting to read the same block.
Problem Confirmation:
AWR/statspack report top timed event shows significant percentage of database time spent waiting for 'latch: cache buffers chains'
Reducing Waits:
Contention on the cache buffers chains latch may occur when block headers are accessed very frequently (via logical reads). These are often referred to as Hot blocks. Look for SQLs that access the blocks in question and determine if the repeated reads are necessary. This may be within a single session or across multiple sessions. In order to identify the contending SQLs:
Check ASH data to find the SQLs most often waited for this latch.
Check SQL ordered by logical reads to find out the top SQLs performing extensive logical reads.
Check Segments by logical reads to find the segments upon which most of the logical reads occurred.
- Inefficient execution plans may perform many logical reads while they visit many blocks. If this query is executed by many sessions concurrently (or other similar queries against the same blocks), then there will be contention on these blocks. Check for suboptimal SQL (this is the most common cause of the events) - look at the execution plan for the SQL being run and try to reduce the gets per executions which will minimize the number of blocks being accessed and therefore reduce the chances of multiple sessions contending for the same block. If you can identify a poor SQL and have identified a better plan, you can direct the optimizer to use this plan using the following article:Document 1400903.1 Directing Plans with Baselines/Profiles Using coe_load_sql_baseline.sql / coe_load_sql_profile.sql (shipped with SQLT)
- When execution plans are optimal, then further checks are needed from your application side to validate whether the concurrency on blocks imposed via repeated execution of sqls are justified.
- If you have assigned a very large buffer cache, then reducing the size may help, in very rare case.
- Index leaf blocks may see contention due to key values that are increasing steadily (using a sequence) and concentrated in a leaf block on the "right-hand side" of the index. Look at using reverse key indexes if range scans aren't commonly used against the segment (remember that this change means that range scan won’t be used for this index).
- Use hash partitioning to spread values across blocks.
- Rebuilding table with more PCTFREE may distribute rows in multiple blocks. This will reduce the number of rows per block and hopefully, spread out contention for the blocks (at the expense of wasting space). This may cause some queries to run slower because they will need to access more blocks to obtain the same number of rows.
Once you have applied the changes to resolve the issues you have found, compare the latest AWR to the AWR that led you here via Guided Resolution (that AWR becomes your baseline). Look at the percentage decrease total wait time for this event. If there are still issues, re- evaluate those and address them according to the specific symptom.
Note---> This informationmation taken from oracle metalink. all copy rights oracle only.
VheliXsuffbu Dan Hacker https://wakelet.com/wake/UBsn3rgeM_kkv8ckfnwNf
ReplyDeletebreathracmofan