SRDC - Data Collection for RAC Instance Crash Issues (Doc ID 1675164.1)

Applies To
Product:
  • Oracle Database - Enterprise Edition - Version 10.1.0.2 to 12.1.0.1 [Release 10.1 to 12.1]
  • Oracle Database - Enterprise Edition
  • Oracle Database - Standard Edition
  • GENERIC (All Platforms)
What is being collected and why?
Log and trace files record the errors and activity around the time of the crash. Support uses this information to determine the cause of the problem and to guide the investigation.
Action Plan

PREFERRED OPTION:  COLLECT TRACE FILES USING TFA COLLECTOR

Oracle strongly recommends using TFA Collector to collect the data. TFA Collector collects only the information relevant to the time of the event, so the collected data is much smaller. It collects all CRS log files, ASM trace files, and database trace files, as well as OSWatcher and CHM (Cluster Health Monitor) output if those tools are installed.
Reference: Document 1513912.1 TFA Collector - Tool for Enhanced Diagnostic Gathering
TFA Collector is installed in the GI HOME and ships with GI 11.2.0.4 and higher. For GI 11.2.0.3 or lower, download and install TFA Collector by following the instructions in Document 1513912.1.
$GI_HOME/tfa/bin/tfactl diagcollect -from "MMM/dd/yyyy hh:mm:ss" -to "MMM/dd/yyyy hh:mm:ss"
  
Format example: "Jul/1/2014 21:00:00"
Specify the "from time" to be 4 hours before and the "to time" to be 4 hours after the time of error.
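The 4-hour window described above can be computed rather than worked out by hand. The following is a minimal sketch, assuming GNU date (as found on Linux); the function name tfa_window_cmd is hypothetical, and the script only prints the tfactl command so that it can be reviewed before it is run. It emits a two-digit day ("Jul/01/2014"), following the dd in the format string.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of TFA): print a tfactl diagcollect command
# covering N hours before and after the time of the error (default 4).
# Assumes GNU date for the -d option and the "@epoch" input form.
tfa_window_cmd() {
    local error_time="$1"      # e.g. "2014-07-01 21:00:00"
    local hours="${2:-4}"      # half-width of the window in hours
    local epoch from to

    # Convert the error time to seconds since the epoch.
    epoch=$(date -d "$error_time" +%s) || return 1

    # LC_ALL=C forces English month abbreviations (Jan, Feb, ...).
    from=$(LC_ALL=C date -d "@$((epoch - hours * 3600))" "+%b/%d/%Y %H:%M:%S")
    to=$(LC_ALL=C date -d "@$((epoch + hours * 3600))" "+%b/%d/%Y %H:%M:%S")

    # Print only; $GI_HOME is left literal for the reader's shell to expand.
    printf '$GI_HOME/tfa/bin/tfactl diagcollect -from "%s" -to "%s"\n' "$from" "$to"
}
```

For an error at 21:00:00 on July 1, 2014, calling tfa_window_cmd "2014-07-01 21:00:00" prints a command with -from "Jul/01/2014 17:00:00" and -to "Jul/02/2014 01:00:00".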

ALTERNATIVE OPTION:  COLLECT AND UPLOAD THE FOLLOWING FILES
(IF THE TFA COLLECTOR IS NOT FOUND AND CANNOT BE DOWNLOADED AND INSTALLED)

    The collected files will be larger than those gathered by TFA Collector because entire trace files, rather than relevant excerpts, are collected.
    1) Run diagcollection as root by following the instructions in Document 330358.1 Oracle Clusterware 10gR2/ 11gR1/ 11gR2/ 12cR1 Diagnostic Collection Guide.
     For 11.2 and higher, issue "$GRID_HOME/bin/diagcollection.sh" as root on all nodes.

     For 10.2 and 11.1, issue "$CRS_HOME/bin/diagcollection.pl -crshome=$CRS_HOME --collect" as root on all nodes.
    2) Collect OSWatcher output.  Refer to Document 301137.1 OSWatcher (Includes: [Video]) to install OSWatcher and learn about it.
    3) Collect CHM (Cluster Health Monitor) output.  Refer to Document 1328466.1 Cluster Health Monitor (CHM) FAQ.
     On 11.2.0.2 or higher on Linux and Solaris and 11.2.0.3 or higher on AIX and Windows,
     issue $GI_HOME/bin/diagcollection.pl --collect --chmos [--incidenttime <incidenttime> --incidentduration 04:00]

    The above command collects 4 hours' worth of CHM data starting from the incidenttime.
    The incidenttime must be in the format MM/DD/YYYYHH:MN:SS, where MM is the month, DD is the day, YYYY is the year, HH is the hour in 24-hour format, MN is the minute, and SS is the second.
    For example, for an incident starting at 10:15 PM on June 01, 2011, the incidenttime is 06/01/201122:15:00.

    Both incidenttime and incidentduration are optional and can be changed to capture more data.
    4) Upload the alert.log and background process trace files from the database and ASM instances.
     To find the location of the files, connect to the database and ASM instances and issue "show parameter background_dump_dest".

Upload all files from the above steps to the SR.
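For step 3, the incidenttime string has no separator between the date and the time, which makes it easy to mistype. The following is a minimal sketch, assuming GNU date; the function names chm_incident_time and chm_collect_cmd are hypothetical, and the script only prints the diagcollection.pl command shown in step 3 rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Grid Infrastructure).
# Assumes GNU date for the -d option.

# Format an incident start time as MM/DD/YYYYHH:MN:SS (no space between
# the date and the time), as required by --incidenttime.
chm_incident_time() {
    date -d "$1" "+%m/%d/%Y%H:%M:%S"
}

# Print the CHM collection command for a given start time and duration.
# $1: incident start, e.g. "2011-06-01 22:15:00"
# $2: duration as HH:MM (default 04:00, as in step 3)
chm_collect_cmd() {
    local incident_time
    incident_time=$(chm_incident_time "$1") || return 1
    # $GI_HOME is left literal for the reader's shell to expand.
    printf '$GI_HOME/bin/diagcollection.pl --collect --chmos --incidenttime %s --incidentduration %s\n' \
        "$incident_time" "${2:-04:00}"
}
```

Calling chm_incident_time "2011-06-01 22:15:00" prints 06/01/201122:15:00, which matches the example in step 3.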

