Hash Join Processing

DETAILS

A Hash Join (HJ) is a type of join processing introduced in Oracle 7.3. The goal of the implementation is to provide a better alternative to the Sort Merge Join (SMJ) and Nested Loops (NL) Join.

Hash Joins should be more efficient than both SMJ and NL since:
  • SMJ is comparatively slow as potentially both row sources have to be sorted to enable the merge phase.
  • NL is inefficient if the driving table contains lots of rows (i.e. high cardinality) and the inner table is expensive to visit, thus forcing multiple probes of the expensive inner table.
A Hash Join, by contrast, requires only a single pass of each table with no sorting.

Hash Joins are only implemented when using the CBO and are selected by comparing the cost of a Hash Join with SMJ and NL during query optimization. Initially, many poorly performing SQL statements were seen to be choosing inefficient hash joins; however, in almost all cases this was down to the choice of an access path based on poor or inaccurate statistics. As statistics collection has become more widespread and the quality of the statistics has improved, the choice of efficient hash joins has been made much more appropriately.

The use of Hash Joins is controlled by the initialisation parameter HASH_JOIN_ENABLED. When this is set to FALSE, the optimizer will not consider the use of hash joins when evaluating query plans. Note that a /*+ USE_HASH */ hint will override this parameter.

As with all other joins in Oracle, Hash Joins only ever join 2 row sources at any one time.

Processing

  • Decide on the number of partitions (buckets) that hashed rows will be placed into. This is determined using the hash_area_size, db_block_size and hash_multiblock_io_count parameters. The hash_area_size is adjusted by 20% to allow for the overhead of storing partitions, the bitmap of unique left input and the hash table - these will be discussed later. Formula (a worked sketch follows this list):
    Number of Partitions = (0.8 x hash_area_size) / (db_block_size x hash_multiblock_io_count)
  • The smallest row source (called R from now on) is read and each row is subjected to a hashing algorithm. Two hashing algorithms are used to hash the rows, in an attempt to get a better spread of values across buckets. Collectively these hash buckets (partitions) are called the Hash Table.
  • The hashed rows are then placed in the appropriate partitions based on the first hash value; the second hash value is stored with the row in case re-hashing is required later.
  • At the same time a bitmap of the hashed values is built using both hash values.
  • If the hash area fills up then the largest partition is chosen and written out to disk. Any new rows that hash to a partition on disk will then update the on-disk partition. It is possible to have only part of a single partition in memory, with the rest of this partition and all the others residing on disk. Hopefully this will not happen, as it will have a negative effect on performance.
  • This continues until the initial row source R is exhausted.
  • At this point processing takes place to try to fit the maximum possible partitions in memory. Partitions are sorted in order of size and then as many of these as possible are placed in memory.
  • Now that these partitions are in memory, reading from the second row source (henceforth 'S') commences. Each row from S is hashed in turn and compared to determine if the hash value hits a partition that is currently in memory. If it does, then this row is returned as a joined row. If it does not match a partition in memory, then the row is written to a new partition that stores rows from row source S. This partition is built using the same hash algorithm as was used to hash R. This means that rows from S that hash to a given value end up in the same partition number in the new set of partitions as the corresponding rows from R occupy in the original set of partitions. This partition is stored on disk.
  • Processing continues through S until every row from that row source has been examined.
  • At this stage the process is left with a set of rows that have already joined and a set of partitions on disk that have been built from source R and from source S.
  • To process these remaining on-disk partitions, the smallest partition from either of the original row sources' partition sets (this could be from either R or S, depending on the size of the on-disk footprint) is used to build a new hash table in memory, using the second hash function so as to distribute the rows more evenly. Note that because either row source can be used, the 'sides' of the joined tables can swap. This is known as 'side swapping'.
  • The new hash table in memory is probed using the matching partition from the other (larger) row source. Any joining rows are returned.
  • The process continues until all partitions have been exhausted.
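
The flow above can be pictured with a small Python sketch. This is illustrative only and not Oracle's implementation: everything stays in memory, Python's built-in hash() stands in for Oracle's two hash functions, spilling partitions to disk is ignored, and the function names (number_of_partitions, hash_join) are invented for the example. The first helper simply evaluates the partition formula given earlier, taking the init parameter values as plain arguments.

from collections import defaultdict

def number_of_partitions(hash_area_size, db_block_size, hash_multiblock_io_count):
    # 80% of the hash area is usable for partition data; the remaining 20%
    # covers the overhead described above (partition storage, bitmap, hash table).
    return max(1, int((0.8 * hash_area_size)
                      // (db_block_size * hash_multiblock_io_count)))

def hash_join(r_rows, s_rows, r_key, s_key, n_partitions):
    # Build phase: hash the smaller row source R into partitions (buckets)
    # and record which buckets actually received rows.
    partitions = defaultdict(list)
    bitmap = [False] * n_partitions
    for r_row in r_rows:
        bucket = hash(r_key(r_row)) % n_partitions
        partitions[bucket].append(r_row)
        bitmap[bucket] = True

    # Probe phase: hash each row of S; empty buckets are skipped outright,
    # otherwise the real key values are compared because different keys can
    # hash to the same bucket.
    for s_row in s_rows:
        bucket = hash(s_key(s_row)) % n_partitions
        if not bitmap[bucket]:
            continue
        for r_row in partitions[bucket]:
            if r_key(r_row) == s_key(s_row):
                yield r_row, s_row

if __name__ == "__main__":
    emp = [(10, "SMITH"), (20, "JONES"), (30, "ADAMS")]
    dept = [(10, "SALES"), (30, "IT"), (40, "HR")]
    n = number_of_partitions(hash_area_size=131072,
                             db_block_size=8192,
                             hash_multiblock_io_count=1)
    for pair in hash_join(emp, dept, lambda e: e[0], lambda d: d[0], n):
        print(pair)  # (10, 'SMITH') joins (10, 'SALES'); (30, 'ADAMS') joins (30, 'IT')

Note how the bitmap built during the build phase lets the probe phase skip S rows whose bucket never received an R row - the same idea described in the next section.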

Bitmap for unique join keys

When the R row source is hashed initially, a bitmap is created that reflects the unique values that could be part of a join. This bitmap is effectively a set of flags indicating which hash buckets in the hash table actually contain rows. The point of this is to strip out rows from S that cannot join (because no row from R hashed to their bucket) before they are written to a partition on disk. The bitmap is consulted before rows from S are put into partitions on disk, so there is no overhead of writing out and then reading back in rows that are never going to join.

The filter tells the engine what is NOT stored on disk as opposed to what is. Since different values can hash to the same hash value, a match on a hash value does not guarantee that the actual value is the one that caused the hash partition to be populated. However it can filter a good proportion of rows, hence its usage. If the hash value is not present then the value is definitely not in the hash table.

It is possible to get false positives: keys that hash to a value in the hash value set but are not contained in the data set. False positives become less likely as the hash value range increases.
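
As a small illustration (again only a sketch, with made-up key values, an invented passes_filter helper and Python's hash() standing in for Oracle's hash functions), the bitmap below is built from three build-side keys. A probe key whose bucket bit is clear is discarded immediately; a probe key that collides with a populated bucket passes the filter even though it can never join - a false positive:

N_BUCKETS = 8

def build_bitmap(r_keys, n_buckets=N_BUCKETS):
    # Set a bit for every bucket that at least one build-side (R) key hashes to.
    bitmap = [False] * n_buckets
    for key in r_keys:
        bitmap[hash(key) % n_buckets] = True
    return bitmap

def passes_filter(s_key, bitmap):
    # Clear bit: no R key hashed here, so the S row can never join and need
    # not be written to (or later read back from) disk.
    # Set bit: the row might join, so keep it - but this can be a false
    # positive, because different keys can hash to the same bucket.
    return bitmap[hash(s_key) % len(bitmap)]

bitmap = build_bitmap([10, 20, 30])
for probe_key in [10, 25, 18]:
    verdict = "keep for probing" if passes_filter(probe_key, bitmap) else "discard now"
    print(probe_key, verdict)
# 10 -> keep (genuine match), 25 -> discard (empty bucket),
# 18 -> keep (false positive: with 8 buckets, 18 and 10 land in the same bucket)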

Poorly Performing Hash Joins - What to do about them?

Historically, the Hash Join has been wrongly maligned when the fault lay with non-existent, substandard or inaccurate statistics causing these access paths to be chosen when they were inappropriate. If you have queries that choose hash joins inappropriately, then you should follow the normal steps for tuning any query, starting with a healthcheck of the environment to help you find and address missing statistics and other issues in your queries.
Document 1366133.1 SQL Tuning Health-Check Script
Document 1226841.1 How To: Gather Statistics for the Cost Based Optimizer
If these do not help, then follow the steps in the following documents:

Document 742112.1 * Query Performance Degradation - Recommended Actions
Document 745216.1 * Query Performance Degradation - Upgrade Related - Recommended Actions

=============================================================

Note: This information is taken from Oracle Metalink. All copyrights belong to Oracle.
