Improve SQL Query Performance by Using Bind Variables

Improve SQL Query Performance by Using Bind Variables


Why do bind variables matter for performance?

Before Oracle runs a SQL statement it checks it's valid and determines how to access the tables and join them together. This is called parsing. The optimizer has the task of figuring out which table access and join methods to use. This produces an execution plan. When Oracle sends a statement to the optimizer to do this it's called a hard parse.
If a plan already exists for a query, Oracle doesn't need to go through the optimization process again. It can reuse the existing plan. This is referred to as soft parsing.
Oracle considers even minor formatting differences between two SQL queries to be different statements. For example, Oracle sees:
select * from orders where order_id = :ord
and
SELECT * FROM orders WHERE order_id = :ord
as different statements, even though it's obvious (to us) that these are equivalent. Therefore it hard parses the first execution of both.

Where do bind variables come into this?

When you're querying a table by primary key the execution plan will invariably be:
  • Perform a unique scan of the primary key index
  • Access the corresponding row in the table using the rowid found in the index
This applies whatever value you're supplying for the primary key. For example, we would expect both of the following queries to use the plan outlined above (assuming order_id is the primary key):
select * from orders where order_id = 1234;
select * from orders where order_id = 9876;
The texts of these two are different however. Therefore both require a hard parse. This applies even if Oracle has already generated the plan for one of them.
This is a bit like opening your favorite mapping application to find your route to work just because you're using a different car. In the vast majority of cases this is wasted effort. The brand and model of car has no impact on the direction you take.
To avoid the extra hard parse, we need to replace the numbers in the statements above with a bind variable. This gives a single statement:
select * from orders where order_id = :ord;
If we run this query twice in succession with different values for :ord there's one hard parse and one soft parse.

What's the impact of hard parsing?

To answer this, let's look at some numbers. Let's take a basic orders table:
create table orders (
 order_id       integer not null primary key ,
 customer_id    integer not null ,
 order_datetime date not null ,
 status         varchar2(10) not null 
);
We'll query this by primary key 50,000 times. Each query will compare order_id to a different value. We'll take two approaches. One will build the SQL string dynamically, so each query is different. The other will use bind variables. Both will use execute immediate.
/* No bind variables */
declare
   order_rec orders%rowtype;
 begin
   for i in 1 .. 50000 loop
     begin
       execute immediate
         'select * from orders where order_id = ' || i
         into order_rec;
     exception
       when no_data_found then null;
     end;
   end loop;
 end;
/

PL/SQL procedure successfully completed.

Elapsed: 00:01:33.94

/* Bind variables */
declare
   order_rec orders%rowtype;
 begin
   for i in 1 .. 50000 loop
     begin
       execute immediate
         'select * from orders where order_id = :i'
         into order_rec
         using i;
     exception
       when no_data_found then null;
     end;
   end loop;
 end;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.49
Ninety seconds down to less than four! The overhead for each individual execution is small (around additional 2ms). When we're executing queries thousands of times this adds up. Hard parsing is expensive relative to soft parsing.

How does cursor_sharing = force affect this?

The behavior described above relies on us setting the parameter cursor_sharing to exact. This is the default, so it's likely your databases have this setting.
We can change this mode of operation by setting cursor_sharing to force. When we do this, Oracle strips out all the literals from the query and replaces them with bind variables in the optimization phase. For example, the previous two order queries now both become:
select * from orders where order_id = :"SYS_B_0"
In effect Oracle has transformed the query into one that does use bind variables.

Problems with cursor_sharing = force

Given that changing cursor_sharing is easy to do, it's tempting to say we should just set it to force. This introduces some other issues however. Let's say we have a process to find all the open orders for a customer. The query is:
select * from orders
where status = 'ACTIVE'
and   customer_id = :customer;
The status will always be 'ACTIVE' for this query. Yet Oracle still changes this query to become:
select * from orders
where status = :"SYS_B_0"
and   customer_id = :customer;
Why is this an issue?
We may have a similar query elsewhere in the application to find all the completed orders for a customer, such as:
select * from orders
where status = 'COMPLETE'
and   customer_id = :customer;
Oracle will also transform this to the bound version shown above. Instead of two distinct queries we've ended up with one.
To see why this could be an issue, let's return to the car analogy. You don't want to change your route just because you're using a different car. It's likely you would want to take a different path if using another form of transport however (e.g. bike, walking, bus, …). If you don't know which type of transport you're using, how do you select which route to take?
Oracle addressed this issue with bind variable peeking(introduced in 9i). This enables the optimizer to look the actual values passed when parsing a query and produce a plan suited to them.
When the database first started peeking it only did this check on the first execution of a query. Subsequent executions of it would use this first plan. This applied even if there were more efficient plans for the values passed.
For example, imagine you get a new job. No matter which transport method you use, you're forced to take the same route you took to get there on your first day. If your first journey was on foot, it's likely you'll take paths which are difficult (or impossible!) to traverse in a car!
To reduce this problem, Oracle introduced Adaptive Cursor Sharing in 11g. This enabled the database to use multiple plans for the same query. It's able to identify when supplying different values for bind variables should result in different plans. In terms of your journey to work, you no longer need to force your car down a footpath because you walked to work on your first day :)
These features reduce problems associated with forced cursor sharing merging queries that are genuinely different. They're not entirely gone however. The more information you can provide to Oracle, the more likely it is to choose the optimal execution plan. Converting a fixed literal to a bind variable removes some information. This increases the chance that the optimizer produces a sub-optimal query plan.
This isn't the only issue with forced cursor sharing however. It still performs worse than using bind variables. Let's see what happens when we run the previous 50,000 order id lookups without binding and forced cursor sharing:
alter session set cursor_sharing = force;

Session altered.

Elapsed: 00:00:00.00

declare
   order_rec orders%rowtype;
 begin
   for i in 1 .. 50000 loop
     begin
       execute immediate
         'select * from orders where order_id = ' || i
         into order_rec;
     exception
       when no_data_found then null;
     end;
   end loop;
 end;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:08.92
It's a lot faster than the default (exact). It's still noticeably slower than using bind variables however.
Setting cursor_sharing to force may be useful as a quick fix if you have a poorly performing application that doesn't use binds. It's still slower than using bind variables and may adversely affect queries with literals. The best long-term solution is to write your queries to use bind variables.

The advantage of using PL/SQL

To finish off it's worth exploring a final way of implementing the query. In the story Dave was writing SQL queries directly in the middle tier. An alternative method is to place all your SQL in PL/SQL stored procedures. The application then just calls these modules.
Let's repeat the test above, this time with the SQL statement written directly in the PL/SQL block:
declare
   order_rec orders%rowtype;
 begin
   for i in 1 .. 50000 loop
     begin
       select *
       into   order_rec
       from   orders
       where  order_id = i;
     exception
       when no_data_found then null;
     end;
   end loop;
 end;
/

PL/SQL procedure successfully completed.

Elapsed: 00:00:03.48
As you can see this gives similar performance to the bound version of execute immediate.
Why this is? Provided you're writing static SQL in PL/SQL blocks (i.e. not as part of an execute immediate or dbms_sql statement) you're already using bind variables! Additionally, Oracle doesn't replace genuine literals in PL/SQL when cursor_sharing = force (MOS note 285447.1). This makes it more likely Oracle will choose the optimal execution plan.

Conclusion

As we've seen, not using bind variables causes Oracle to do more work when determining how to execute queries. Instead of coming up with an execution plan once and reusing it, the optimizer regenerates the plan for each execution with different input values.
=====================================================

Bind variables - The key to application performance



To understand bind variables, consider an application that generates thousands of SELECT statements against a table; for example:
SELECT fname, lname, pcode FROM cust WHERE id = 674;
SELECT fname, lname, pcode FROM cust WHERE id = 234;
SELECT fname, lname, pcode FROM cust WHERE id = 332;
Each time the query is submitted, Oracle first checks in the shared pool to see whether this statement has been submitted before. If it has, the execution plan that this statement previously used is retrieved, and the SQL is executed. If the statement cannot be found in the shared pool, Oracle has to go through the process of parsing the statement, working out the various execution paths and coming up with an optimal access plan before it can be executed. This process is know as a «hard parse» and for OLTP applications can actually take longer to carry out that the DML instruction itself.
When looking for a matching statement in the shared pool, only statements that exactly match the text of the statements are considered; so, if every SQL statement you submit is unique (in that the predicate changes each time, from id = 674 to id=234 and so on) then you'll never get a match, and every statement you submit will need to be hard parsed. Hard parsing is very CPU intensive, and involves obtaining latches on key shared memory areas, which whilst it might not affect a single program running against a small set of data, can bring a multi-user system to it's knees if hundreds of copies of the program are trying to hard parse statements at the same time. The extra bonus with this problem is that contention caused by hard parsing is pretty much immune to measures such as increasing available memory, numbers of processors and so on, as hard parsing statements is one thing Oracle can't do concurrently with many other operations, and it's a problem that often only comes to light when trying to scale up a development system from a single user working on subset of records to many hundreds of users working on a full data set.
The way to get Oracle to reuse the execution plans for these statements is to use bind variables. Bind variables are «substituion» variables that are used in place of literals (such as 674, 234, 332) and that have the effect of sending exactly the same SQL to Oracle every time the query is executed. For example, in our application, we would just submit
SELECT fname, lname, pcode FROM cust WHERE id = :cust_no;
and this time we would be able to reuse the execution plan every time, reducing the latch activity in the SGA, and therefore the total CPU activity, which has the effect of allowing our application to scale up to many users on a large dataset.
Bind Variables in SQL*Plus
In SQL*Plus you can use bind variables as follows:
SQL> variable deptno numberSQL> exec :deptno := 10SQL> select * from emp where deptno = :deptno;
What we've done to the SELECT statement now is take the literal value out of it, and replace it with a placeholder (our bind variable), with SQL*Plus passing the value of the bind variable to Oracle when the statement is processed. This bit is fairly straighforward (you declare a bind variable in SQL*Plus, then reference the bind variable in the SELECT statement)
Bind Variables in PL/SQL
Taking PL/SQL first of all, the good news is that PL/SQL itself takes care of most of the issues to do with bind variables, to the point where most code that you write already uses bind variables without you knowing. Take, for example, the following bit of PL/SQL:
create or replace procedure dsal(p_empno in number)
as
  begin
    update emp
    set sal=sal*2
    where empno = p_empno;
    commit;
  end;
/
Now you might be thinking that you've got to replace the p_empno with a bind variable. However, the good news is that every reference to a PL/SQL variable is in fact a bind variable.
Dynamic SQL
In fact, the only time you need to consciously decide to use bind variables when working with PL/SQL is when using Dynamic SQL.
Dynamic SQL, allows you to execute a string containing SQL using the EXECUTE IMMEDIATE command. For next example would always require a hard parse when it is submitted:
create or replace procedure dsal(p_empno in number)
as
  begin
    execute immediate
     'update emp set sal = sal*2 where empno = '||p_empno;
  commit;
  end;
/
The way to use bind variables instead is to change the EXECUTE IMMEDIATE command as follows:
create or replace procedure dsal(p_empno in number)
as
  begin
    execute immediate
     'update emp set
     sal = sal*2 where empno = :x' using p_empno;
  commit;
  end;
/
And that's all there is to it. One thing to bear in mind, though, is that you can't substitute actual object names (tables, views, columns etc) with bind variables - you can only subsitute literals. If the object name is generated at runtime, you'll still need to string concatenate these parts, and the SQL will only match with those already in the shared pool when the same object name comes up. However, whenever you're using dynamic SQL to build up the predicate part of a statement, use bind variables instead and you'll reduce dramatically the amount of latch contention going on.
The Performance Killer
Just to give you a tiny idea of how huge of a difference this can make performance wise, you only need to run a very small test:
Here is the Performance Killer ....
SQL> alter system flush shared_pool;
SQL> set serveroutput on;

declare
      type rc is ref cursor;
      l_rc rc;
      l_dummy all_objects.object_name%type;
      l_start number default dbms_utility.get_time;
  begin
      for i in 1 .. 1000
      loop
          open l_rc for
          'select object_name
             from all_objects
            where object_id = ' || i;
          fetch l_rc into l_dummy;
          close l_rc;
          -- dbms_output.put_line(l_dummy);
      end loop;
      dbms_output.put_line
       (round((dbms_utility.get_time-l_start)/100, 2) ||
        ' Seconds...' );
  end;
/
101.71 Seconds...
... and here is the Performance Winner:
declare
      type rc is ref cursor;
      l_rc rc;
      l_dummy all_objects.object_name%type;
      l_start number default dbms_utility.get_time;
  begin
      for i in 1 .. 1000
      loop
          open l_rc for
          'select object_name
             from all_objects
            where object_id = :x'
          using i;
          fetch l_rc into l_dummy;
          close l_rc;
          -- dbms_output.put_line(l_dummy);
      end loop;
      dbms_output.put_line
       (round((dbms_utility.get_time-l_start)/100, 2) ||
        ' Seconds...' );
end;
/
1.9 Seconds...
That is pretty dramatic.  The fact is that not only does this execute much faster (we spent more time PARSING our queries then actually EXECUTING them!) it will let more users use your system simultaneously.
Bind Variables in VB, Java and other applications
The next question is though, what about VB, Java and other applications that fire SQL queries against an Oracle database. How do these use bind variables? Do you have to in fact split your SQL into two statements, one to set the bind variable, and one for the statement itself?
In fact, the answer to this is actually quite simple. When you put together an SQL statement using Java, or VB, or whatever, you usually use an API for accessing the database; ADO in the case of VB, JDBC in the case of Java. All of these APIs have built-in support for bind variablesand it's just a case of using this support rather than just concatenating a string yourself and submitting it to the database.
For example, Java has PreparedStatement, which allows the use of bind variables, and Statement, which uses the string concatenation approach. If you use the method that supports bind variables, the API itself passes the bind variable value to Oracle at runtime, and you just submit your SQL statement as normal. There's no need to separately pass the bind variable value to Oracle, and actually no additional work on your part. Support for bind variables isn't just limited to Oracle - it's common to other RDBMS platforms such as Microsoft SQL Server, so there's no excuse for not using them just because they might be an Oracle-only feature.
Conclusion
Lastly, it's worth bearing in mind that there are some instances where bind variables are probably not appropriate, usually where instead of your query being executed many times a second (as with OLTP systems) your query in fact actually takes several seconds, or minutes, or hours to execute - a situation you get in decision support and data warehousing. In this instance, the time taken to hard parse your query is only a small proportion of the total query execution time, and the benefit of avoiding a hard parse is probably outweighed by the reduction in important information you're making available to the query optimizer - by substituting the actual predicate with a bind variable, you're removing the ability for the optimiser to compare your value with the data distribution in the column, which might make it opt for a full table scan or an index when this isn't appropriate. Oracle 9i helps deal with this using a feature known as bind variable peeking, which allows Oracle to look at the value behind a bind variable to help choose the best execution plan.
Another potential drawback with bind variables and data warehousing queries is that the use of bind variables disallows the potential for star transformations, taking away this powerful option for efficiently joining fact and dimension tables in a star schema.
===========================================================

Oracle SQL Bind Variable Tips


Comments